Parallelized Data Extraction with Pipelines

A pipeline extracts data from a source in parallel, following these general rules:

  • The pipeline pairs n source partitions or objects with p SingleStore leaf node partitions.

  • Each leaf node partition runs its own extraction process independently of other leaf nodes and their partitions.

  • Extracted data is stored on the leaf node where the extracting partition resides until it can be written to the destination table. Depending on how the destination table is sharded, the extracted data may be stored on this leaf node only temporarily.
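
The rules above can be illustrated with a small simulation. This is only a sketch: the text does not say how SingleStore decides which source partition goes to which leaf partition, so the round-robin policy used here is an assumption. The names pair_partitions, extract, and run_batch are hypothetical, and threads stand in for the independent per-partition extraction processes.

```python
from concurrent.futures import ThreadPoolExecutor


def pair_partitions(source_partitions, leaf_partitions):
    """Pair n source partitions with p leaf partitions.

    Round-robin is an assumed policy for illustration only; the
    actual assignment used by a pipeline is not described here.
    """
    assignment = {leaf: [] for leaf in leaf_partitions}
    for i, src in enumerate(source_partitions):
        leaf = leaf_partitions[i % len(leaf_partitions)]
        assignment[leaf].append(src)
    return assignment


def extract(leaf, sources):
    """Stand-in for one leaf partition's extraction process.

    It touches only its own sources, so it does not depend on
    any other leaf partition.
    """
    return leaf, [f"data from {src}" for src in sources]


def run_batch(source_partitions, leaf_partitions):
    """Run every leaf partition's extraction concurrently."""
    assignment = pair_partitions(source_partitions, leaf_partitions)
    with ThreadPoolExecutor(max_workers=len(leaf_partitions)) as pool:
        futures = [
            pool.submit(extract, leaf, sources)
            for leaf, sources in assignment.items()
        ]
        # Each result is held per leaf partition, mirroring how extracted
        # data stays on its leaf until written to the destination table.
        return dict(f.result() for f in futures)
```

For example, run_batch(["s0", "s1", "s2"], ["p0", "p1"]) pairs s0 and s2 with p0, and s1 with p1, then extracts both groups concurrently.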

Note

The term batch partition is used below and elsewhere in the documentation. A batch partition is a partition in the data source. If the data source does not contain partitions, then a batch partition refers to a single object in the data source.

Last modified: January 20, 2022

