Load CSV Files Examples

Note

For more information on the LOAD DATA command, refer to LOAD DATA.

Examples

Loading Data When the Order of the Columns in the Destination Table and Source File Is Different

If the order of columns in the table is different from the order in the source file, you can name them explicitly. In this example, the columns are loaded in reverse order:

LOAD DATA INFILE 'foo.tsv'
INTO TABLE foo (fourth, third, second, first);

Skipping Columns in the Source File

You can skip columns in the source file using the @ sign. In this example only the first and fourth columns are imported into table foo:

LOAD DATA INFILE 'foo.tsv'
INTO TABLE foo (bar, @, @, baz);

Specifying the Column Delimiter

The default column delimiter is the tab (\t) character, ASCII code 9. You can specify a different delimiter, including multi-character delimiters, with the COLUMNS TERMINATED BY clause:

LOAD DATA INFILE 'foo.csv'
INTO TABLE foo
COLUMNS TERMINATED BY ',';

In the following example, field and line delimiters are used to read a file that contains fields separated by commas and lines terminated by carriage return/newline pairs:

LOAD DATA INFILE 'foo.csv' INTO TABLE foo FIELDS TERMINATED BY ',' LINES TERMINATED BY '\r\n';

Source File with Unusual Column Separators

The following example demonstrates loading a file that has unusual column separators (|||):

LOAD DATA INFILE 'foo.oddformat'
INTO TABLE foo
COLUMNS TERMINATED BY '|||';

Loading Data from Multiple Files

Using globbing, you can load data from multiple files in a single LOAD DATA query.

The following query loads data from all the .csv files with names starting with a digit:

LOAD DATA INFILE "[0-9]*.csv"
INTO TABLE cust(ID,NAME,ORDERS);

The following query loads data from all the .csv files whose names consist of exactly four characters followed by the .csv extension:

LOAD DATA INFILE "????.csv"
INTO TABLE cust(ID,NAME,ORDERS);

The following query loads data from all the .csv files with filenames not starting with a digit:

LOAD DATA INFILE "[!0-9]*.csv"
INTO TABLE cust(ID,NAME,ORDERS);

Note

LOAD DATA LOCAL INFILE does not support globbing.

LOAD DATA INFILE supports globbing in filenames, but not in directory names.

CREATE PIPELINE contains a LOAD DATA clause; in that context, LOAD DATA supports globbing in both directory names and filenames.
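For example, a filesystem pipeline can glob over both directories and filenames. The following is a minimal sketch, assuming a hypothetical pipeline named cust_pipeline and data files laid out under /data:

CREATE PIPELINE cust_pipeline AS
LOAD DATA FS '/data/region_*/cust_[0-9]*.csv'
INTO TABLE cust
FIELDS TERMINATED BY ',';

START PIPELINE cust_pipeline;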

Using the TRAILING NULLCOLS Clause

The following example demonstrates how to use the TRAILING NULLCOLS clause with the file numbers.csv, which has the following content:

1,2,3
4,5
6

Run the following commands:

CREATE TABLE foo(a INT, b INT, c INT);
LOAD DATA INFILE 'numbers.csv' INTO TABLE foo COLUMNS TERMINATED BY ',' TRAILING NULLCOLS;
SELECT * FROM foo;
+------+------+------+
| a    | b    | c    |
+------+------+------+
|    1 |    2 |    3 |
|    4 |    5 | NULL |
|    6 | NULL | NULL |
+------+------+------+

Using the NULL DEFINED BY Clause

The following example demonstrates how to use the NULL DEFINED BY clause with the data.csv file.

cat data.csv
DTB,'',25
SPD,,40
The stockN table initially contains the following rows:

SELECT * FROM stockN;
+------+-------------+-------+
| ID   | City        | Count |
+------+-------------+-------+
| XCN  | new york    |    45 |
| ZDF  | washington  |    20 |
| XCN  | chicago     |    32 |
+------+-------------+-------+

The following query inserts the unenclosed empty field as a NULL value and the enclosed empty field as an empty string:

LOAD DATA INFILE '/data.csv'
INTO TABLE stockN
COLUMNS TERMINATED BY ','
OPTIONALLY ENCLOSED BY "'"
NULL DEFINED BY '';
SELECT * FROM stockN;
+------+-------------+-------+
| ID   | City        | Count |
+------+-------------+-------+
| XCN  | new york    |    45 |
| ZDF  | washington  |    20 |
| XCN  | chicago     |    32 |
| DTB  |             |    25 |
| SPD  | NULL        |    40 |
+------+-------------+-------+

If you add the OPTIONALLY ENCLOSED option to the NULL DEFINED BY clause in the query above and run the following query instead, both empty fields are inserted as NULL values:

LOAD DATA INFILE '/data.csv'
INTO TABLE stockN
COLUMNS TERMINATED BY ','
OPTIONALLY ENCLOSED BY "'"
NULL DEFINED BY '' OPTIONALLY ENCLOSED;
SELECT * FROM stockN;
+------+-------------+-------+
| ID   | City        | Count |
+------+-------------+-------+
| XCN  | new york    |    45 |
| ZDF  | washington  |    20 |
| XCN  | chicago     |    32 |
| DTB  | NULL        |    25 |
| SPD  | NULL        |    40 |
+------+-------------+-------+

Using the IGNORE LINES Clause

In the following example, the IGNORE LINES clause is used to skip the header line that contains column names in the source file:

LOAD DATA INFILE '/tmp/data.txt' INTO TABLE City IGNORE 1 LINES;
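As a fuller sketch, assume a hypothetical /tmp/data.txt whose first line is a comma-separated header row, and a City table with matching columns. IGNORE 1 LINES skips the header so that only the data rows are loaded:

cat /tmp/data.txt
ID,Name
1,Boston
2,Chicago

LOAD DATA INFILE '/tmp/data.txt'
INTO TABLE City
FIELDS TERMINATED BY ','
IGNORE 1 LINES;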

Using the ESCAPED BY Clause

The following example demonstrates how to load data into the loadEsc table using the ESCAPED BY clause from the file contacts.csv, whose contents are shown below.

GALE\, ADAM, Brooklyn
FLETCHER\, RON, New York
WAKEFIELD\, CLARA, DC
DESC loadEsc;
+-------+-------------+------+------+---------+-------+
| Field | Type        | Null | Key  | Default | Extra |
+-------+-------------+------+------+---------+-------+
| Name  | varchar(40) | YES  |      | NULL    |       |
| City  | varchar(40) | YES  |      | NULL    |       |
+-------+-------------+------+------+---------+-------+

Execute the following query:

LOAD DATA INFILE '/contacts.csv'
INTO TABLE loadEsc COLUMNS TERMINATED BY ',' ESCAPED BY '\\' ;
SELECT * FROM loadEsc;
+-------------------+-----------+
| Name              | City      |
+-------------------+-----------+
| GALE, ADAM        |  Brooklyn |
| FLETCHER, RON     |  New York |
| WAKEFIELD, CLARA  |  DC       |
+-------------------+-----------+

In this query, the \ character escapes the comma (,) between the first two fields of the contacts.csv file. (The \ (backslash) is the default escape character in a SQL query; hence, the \\ (double backslash) is used to escape the backslash itself inside the query.)

Warning

If you accidentally escape the TERMINATED BY character in a file, the SQL query may return an error. For example, if you escape both commas in the first row of the contacts.csv file mentioned above, as:

GALE\, ADAM\, Brooklyn
FLETCHER\, RON, New York
WAKEFIELD\, CLARA, DC

and then execute the following query

LOAD DATA INFILE '/contacts.csv'
INTO TABLE loadEsc COLUMNS TERMINATED BY ',' ESCAPED BY '\\' ;

it returns the following error: ERROR 1261 (01000): Row 1 does not contain data for all columns. This is because the \ (backslash) escapes both commas, so LOAD DATA perceives the first row as a single column.

Using the STARTING BY Clause

The following example demonstrates how to skip the prefix ### in the stockUpd.txt data file using the STARTING BY clause.

cat stockUpd.txt
###1,"xcg",10
3,"dfg"
new product###4,"rfk",5
LOAD DATA INFILE 'stockUpd.txt'
INTO TABLE stock
FIELDS TERMINATED BY ','
LINES STARTING BY '###';
SELECT * FROM stock;
+----+------+----------+
| ID | Code | Quantity |
+----+------+----------+
|  1 |  xcg |       10 |
|  4 |  rfk |        5 |
+----+------+----------+

In this example, the STARTING BY clause skips the prefix ### in the first and third lines, along with anything before it. It skips the second line entirely because it does not contain ###.

Filtering out Rows from the Source File

You can also filter out unwanted rows using the WHERE clause. In this example, only rows where bar is equal to 5 will be loaded. All other rows will be discarded:

LOAD DATA INFILE 'foo.oddformat'
INTO TABLE foo (bar, baz)
WHERE bar = 5;

Filtering out and Transforming Rows From the Source File

Complex transformations can be performed in both the SET and WHERE clauses. For example, suppose you have an input file with an EventDate field and an EventId field:

10-1-2016,1
4-15-2016,2
1-10-2017,3
4-10-2017,4

Suppose you want to load only the rows with a date within three months of a certain date, for instance 10/15/2016. This can be accomplished as follows:

CREATE TABLE foo (EventDate date, EventId int);
LOAD DATA INFILE 'date_event.csv'
INTO TABLE foo
FIELDS TERMINATED BY ','
(@EventDate, EventId)
SET EventDate = STR_TO_DATE(@EventDate, '%m-%d-%Y')
WHERE ABS(MONTHS_BETWEEN(EventDate, date('2016-10-15'))) < 3;
SELECT * FROM foo;
+------------+---------+
| EventDate  | EventId |
+------------+---------+
| 2016-10-01 |       1 |
| 2017-01-10 |       3 |
+------------+---------+

While both column names and variables can be referenced in the WHERE clause, column names can only be assigned to in the SET clause. The scope of these clauses is restricted to the current row, so SELECT statements cannot be evaluated within them.
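For example, a user-defined variable can be tested in the WHERE clause before the SET conversion is applied. The following sketch reuses date_event.csv from above; the LIKE pattern on the raw field is illustrative only:

LOAD DATA INFILE 'date_event.csv'
INTO TABLE foo
FIELDS TERMINATED BY ','
(@EventDate, EventId)
SET EventDate = STR_TO_DATE(@EventDate, '%m-%d-%Y')
WHERE @EventDate LIKE '%-2016' AND EventId > 0;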

Using REPLACE

This example uses the cust table, which is defined as a columnstore table as follows:

CREATE TABLE cust(name VARCHAR(32), id INT(11), orders INT(11), SORT KEY(id), UNIQUE KEY(id) USING HASH, SHARD KEY(id));

Assume the directory /order_files has one file orders.csv, which contains the following data:

Chris,7214,6
Elen,8301,4
Adam,3412,5
Rachel,9125,2
Susan,8301,7
George,3412,9

Create a LOAD DATA statement with a REPLACE clause:

LOAD DATA INFILE '/order_files/orders.csv' REPLACE INTO TABLE cust FIELDS TERMINATED BY ',';

As LOAD DATA ingests the data from orders.csv into the cust table, it encounters the fifth and sixth records in the file, which contain the duplicate keys 8301 and 3412. The second and third records containing those duplicate keys (which have already been imported into cust) are replaced with the fifth and sixth records.

SELECT * FROM cust ORDER BY name;
+--------+------+--------+
| name   | id   | orders |
+--------+------+--------+
| Chris  | 7214 |      6 |
| George | 3412 |      9 |
| Rachel | 9125 |      2 |
| Susan  | 8301 |      7 |
+--------+------+--------+

Note

If you want to see more examples of loading data with vectors, refer to How to Bulk Load Vectors.

Loading a Fixed Length File

This example demonstrates how to load the file fixed_length.csv, whose contents are shown below.

APE602020-06-01
TR 252019-08-07
HSW8 2019-10-11
YTR122020-09-02

LOAD DATA inserts each extracted row from fixed_length.csv into the table foo. Define the table as follows:

CREATE TABLE foo(a CHAR(3), b INT, c DATETIME);

Run the LOAD DATA statement:

LOAD DATA INFILE '/fixed_length.csv'
INTO TABLE foo (@current_row)
SET a = TRIM(SUBSTR(@current_row,1,3)),
b = TRIM(SUBSTR(@current_row,4,2)),
c = TRIM(SUBSTR(@current_row,6,10));

SUBSTR() extracts a substring from a string, and TRIM() removes the padding (spaces, in this case) from the beginning and end of a string. For example, after the LOAD DATA statement extracts the line HSW8 2019-10-11 from fixed_length.csv, it does the following to set b:

  • It extracts, from HSW8 2019-10-11, the substring starting at position 4 with a length of 2. The resulting substring is "8 " (the digit 8 followed by a space).

  • It removes the trailing whitespace from "8 " to yield 8.

Retrieve the data from foo:

SELECT * FROM foo ORDER BY a;
+------+------+---------------------+
| a    | b    | c                   |
+------+------+---------------------+
| APE  |   60 | 2020-06-01 00:00:00 |
| HSW  |    8 | 2019-10-11 00:00:00 |
| TR   |   25 | 2019-08-07 00:00:00 |
| YTR  |   12 | 2020-09-02 00:00:00 |
+------+------+---------------------+

Loading Data using Hex Field Terminator Syntax

Loading data into a table can be performed using a hexadecimal field terminator.

Syntax

CREATE TABLE test(a INT, b INT);
LOAD DATA INFILE "/tmp/hex_test.csv"
INTO TABLE test (a, b) FIELDS TERMINATED BY 0x2c;
SELECT * FROM test;
+------+------+
| a    | b    |
+------+------+
|    1 |    2 |
+------+------+

The hex_test.csv file was created via the CLI using the following commands:

echo 1, 2 > hex_test.csv

hexdump -C hex_test.csv
00000000  31 2c 20 32 0a                            |1, 2.|
00000005
