Load CSV Files Examples
On this page
Note
For more information on the LOAD DATA command, refer to LOAD DATA.
Examples
Loading Data when the Order of the Columns in the Destination Table and Source File are Different
If the order of columns in the table is different from the order in the source file, you can name them explicitly.
LOAD DATA INFILE 'foo.tsv'INTO TABLE foo (fourth, third, second, first);
Skipping Columns in the Source File
You can skip columns in the source file using the @
sign.
LOAD DATA INFILE 'foo.tsv'INTO TABLE foo (bar, @, @, baz);
Specifying the Column Delimiter
The default column delimiter is the tab (t
) character, ASCII code 09.COLUMNS TERMINATED BY
clause:
LOAD DATA INFILE 'foo.csv'INTO TABLE fooCOLUMNS TERMINATED BY ',';
In the following example, field and line delimiters are used to read a file that contains fields separated by commas and lines terminated by carriage return/newline pairs:
LOAD DATA INFILE 'foo.csv' INTO TABLE foo FIELDS TERMINATED BY ',' LINES TERMINATED BY '\r\n';
Source File with Unusual Column Separators
The following example demonstrates loading a file that has unusual column separators (|||
):
LOAD DATA INFILE 'foo.oddformat'INTO TABLE fooCOLUMNS TERMINATED BY '|||';
Loading Data from Multiple Files
Using globbing, you can load data from multiple files in a single LOAD DATA
query.
The following query loads data from all the .
files with names starting with a digit:
LOAD DATA INFILE "[0-9]*.csv"INTO TABLE cust(ID,NAME,ORDERS);
The following query loads data from all the .
files with filenames having four characters:
LOAD DATA INFILE "????.csv"INTO TABLE cust(ID,NAME,ORDERS);
The following query loads data from all the .
files with filenames not starting with a number:
LOAD DATA INFILE "[!0-9]*.csv"INTO TABLE cust(ID,NAME,ORDERS);
Note
LOAD DATA LOCAL INFILE
does not support globbing.
LOAD DATA INFILE
supports globbing in filenames, but not in directory names.
CREATE PIPELINE
contains a LOAD DATA
clause.LOAD DATA
supports globbing, both in directory names and filenames.
Using the TRAILING NULLCOLS
Clause
The following example demonstrates how to use the TRAILING NULLCOLS
clause using the file numbers.
, with the following content:
1,2,3
4,5
6
Run the following commands:
CREATE TABLE foo(a INT, b INT, c INT);LOAD DATA INFILE 'numbers.csv' INTO TABLE foo COLUMNS TERMINATED BY ',' TRAILING NULLCOLS;SELECT * FROM foo;
+------+------+------+
| a | b | c |
+------+------+------+
| 1 | 2 | 3 |
| 4 | 5 | NULL |
| 6 | NULL | NULL |
+------+------+------+
Using the NULL DEFINED BY
Clause
The following example demonstrates how to use the NULL DEFINED BY
clause using the data.
file.
cat data.csv
DTB,'',25
SPD,,40
SELECT * FROM stockN;
+------+-------------+-------+
| ID | City | Count |
+------+-------------+-------+
| XCN | new york | 45 |
| ZDF | washington | 20 |
| XCN | chicago | 32 |
+------+-------------+-------+
The following query inserts the un-enclosed empty field as a NULL
value and the enclosed empty field as an empty string.
LOAD DATA INFILE '/data.csv'INTO TABLE stockNCOLUMNS TERMINATED BY ','OPTIONALLY ENCLOSED BY "'"NULL DEFINED BY '';SELECT * FROM stockN;
+------+-------------+-------+
| ID | City | Count |
+------+-------------+-------+
| XCN | new york | 45 |
| ZDF | washington | 20 |
| XCN | chicago | 32 |
| DTB | | 25 |
| SPD | NULL | 40 |
+------+-------------+-------+
If you add the OPTIONALLY ENCLOSED
option to the NULL DEFINED BY
clause in the query above, and run the following query instead, both the empty fields are inserted as a NULL
value:
LOAD DATA INFILE '/data.csv'INTO TABLE stockNCOLUMNS TERMINATED BY ','OPTIONALLY ENCLOSED BY "'"NULL DEFINED BY '' OPTIONALLY ENCLOSED;SELECT * FROM stockN;
+------+-------------+-------+
| ID | City | Count |
+------+-------------+-------+
| XCN | new york | 45 |
| ZDF | washington | 20 |
| XCN | chicago | 32 |
| DTB | NULL | 25 |
| SPD | NULL | 40 |
+------+-------------+-------+
Using the IGNORE LINES
Clause
In the following example, the IGNORE LINES
clause is used to skip the header line that contains column names in the source file:
LOAD DATA INFILE '/tmp/data.txt' INTO City IGNORE 1 LINES;
Using the ESCAPED BY
Clause
The following example demonstrates how to load data into the loadEsc
table using the ESCAPED BY
clause from the file contacts.
, whose contents are shown below.
GALE\, ADAM, Brooklyn
FLETCHER\, RON, New York
WAKEFIELD\, CLARA, DC
DESC loadEsc;
+-------+-------------+------+------+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+-------------+------+------+---------+-------+
| Name | varchar(40) | YES | | NULL | |
| City | varchar(40) | YES | | NULL | |
+-------+-------------+------+------+---------+-------+
Execute the following query:
LOAD DATA INFILE '/contacts.csv'INTO TABLE loadEsc COLUMNS TERMINATED BY ',' ESCAPED BY '\\' ;SELECT * FROM loadEsc;
+-------------------+-----------+
| Name | City |
+-------------------+-----------+
| GALE, ADAM | Brooklyn |
| FLETCHER, RON | New York |
| WAKEFIELD, CLARA | DC |
+-------------------+-----------+
In this query, the \
character escapes the comma (,
) between the first two fields of the contacts.\
(backslash) is the default escape character in a SQL query.\\
(double backslash) is used escape the backslash itself inside the query.
Warning
If you (accidentally) escape the TERMINATED BY
character in a file, the SQL query may return an error.
GALE\, ADAM\, Brooklyn
FLETCHER\, RON, New York
WAKEFIELD\, CLARA, DC
and then execute the following query
LOAD DATA INFILE '/contacts.csv'INTO TABLE loadEsc COLUMNS TERMINATED BY ',' ESCAPED BY '\\' ;
it returns the following error: ERROR 1261 (01000): Row 1 does not contain data for all columns.\
(backslash) escapes both the commas and LOAD DATA
perceives the first row as a single column.
Using the STARTING BY
Clause
The following example demonstrates how to skip the prefix ###
in the stockUpd.
data file using the STARTING BY
clause.
cat stockUpd.txt
###1,"xcg",
3,"dfg"
new product###4,"rfk",5
LOAD DATA INFILE 'stockUpd.txt'INTO TABLE stockFIELDS TERMINATED BY ','LINES STARTING BY '###';SELECT * FROM stock;
+----+------+----------+
| ID | Code | Quantity |
+----+------+----------+
| 1 | xcg | 10 |
| 4 | rfk | 5 |
+----+------+----------+
In this example, the STARTING BY
clause skips the prefix ###
in the first and third lines and anything before it.###
.
Filtering out Rows from the Source File
You can also filter out unwanted rows using the WHERE
clause.
LOAD DATA INFILE 'foo.oddformat'INTO TABLE foo (bar, baz)WHERE bar = 5;
Filtering out and Transforming Rows From the Source File
Complex transformations can be performed in both the SET
and WHERE
clauses.EventDate
field and an EventId
field:
10-1-2016,1
4-15-2016,2
1-10-2017,3
4-10-2017,4
You want to only load the rows with a date that is within three months from a certain date, 10/15/2016, for instance.
CREATE TABLE foo (EventDate date, EventId int);LOAD DATA INFILE 'date_event.csv'INTO TABLE fooFIELDS TERMINATED BY ','(@EventDate, EventId)SET EventDate = STR_TO_DATE(@EventDate, '%m-%d-%Y')WHERE ABS(MONTHS_BETWEEN(EventDate, date('2016-10-15'))) < 3;SELECT * FROM t;
+------------+---------+
| EventDate | EventId |
+------------+---------+
| 2016-10-01 | 1 |
| 2017-01-10 | 3 |
+------------+---------+
While both column names and variables can be referenced in the WHERE
clause column names can only be assigned to in the SET
clause.SELECT
statements cannot be evaluated.
Using REPLACE
This example uses the cust
table, which is defined as a columnstore table as follows:
CREATE TABLE cust(name VARCHAR(32), id INT(11), orders INT(11), SORT KEY(id), UNIQUE KEY(id) USING HASH, SHARD KEY(id));
Assume the directory /order_
has one file orders.
, which contains the following data:
Chris,7214,6
Elen,8301,4
Adam,3412,5
Rachel,9125,2
Susan,8301,7
George,3412,9
Create a LOAD DATA
statement with a REPLACE
clause:
LOAD DATA INFILE '/order_files/orders.csv' REPLACE INTO TABLE cust FIELDS TERMINATED BY ',';
As LOAD DATA
ingests the data from orders.
into the cust
table, it encounters the fifth and sixth records in the file, which contain the duplicate keys 8301
and 3412
.cust
), are replaced with the fifth and second records.
SELECT * FROM cust ORDER BY name;
+--------+------+--------+
| name | id | orders |
+--------+------+--------+
| Chris | 7214 | 6 |
| George | 3412 | 9 |
| Rachel | 9125 | 2 |
| Susan | 8301 | 7 |
+--------+------+--------+
Note
If you want to see more examples of loading data with vectors, refer to How to Bulk Load Vectors.
Loading a Fixed Length File
This example demonstrates how to load the contents of the file fixed_
, whose contents are shown below.
APE602020-06-01
TR 252019-08-07
HSW8 2019-10-11
YTR122020-09-02
LOAD DATA
inserts each extracted row from fixed_
into the table foo
.
CREATE TABLE foo(a CHAR(3), b INT, c DATETIME);
Run the LOAD DATA
statement:
LOAD DATA INFILE '/fixed_length.csv'INTO TABLE foo (@current_row)SET a = TRIM(SUBSTR(@current_row,1,3)),b = TRIM(SUBSTR(@current_row,4,2)),c = TRIM(SUBSTR(@current_row,6,10));
SUBSTR()
extracts a substring from a string and TRIM()
removes the padding (spaces in this case) from the beginning and the ending of a string.LOAD DATA
statement extracts the line HSW8 2019-10-11
in fixed_
, it does the following to set b
: * It extracts, from HSW8 2019-10-11
, the substring starting at position 4 having a length of 2.8
.8
to yield 8
.
Retrieve the data from foo
:
SELECT * from foo ORDER BY a;
+------+------+---------------------+
| a | b | c |
+------+------+---------------------+
| APE | 60 | 2020-06-01 00:00:00 |
| HSW | 8 | 2019-10-11 00:00:00 |
| TR | 25 | 2019-08-07 00:00:00 |
| YTR | 12 | 2020-09-02 00:00:00 |
+------+------+---------------------+
Loading Data using Hex Field Terminator Syntax
Loading data into a table can be performed using a hexadecimal field terminator.
Syntax
CREATE TABLE <table name>(a int, b int);LOAD DATA infile "/tmp/<file name>.csv"INTO TABLE test (a, b) fields terminated by 0x2c;SELECT * FROM <table name>;
**
+------+------+
| a | b |
+------+------+
| 1 | 2 |
+------+------+
Contents of the hex_
was created via the CLI using the following commands:
echo 1. 2 > hex_test.csv
hexdump -C hex_test.csv
00000000 31 2e 20 32 0a |1. 2.|
00000005
Last modified: April 8, 2025