Clickhouse Data Import
Asked Answered
R

3

6

I created a table in Clickhouse:

CREATE TABLE stock
(
    plant Int32,
    code Int32,
    service_level Float32,
    qty Int32
) ENGINE = Log

there is a data file

:~$ head -n 10 /var/rs_mail/IN/qv_stock_20160620035119.csv
2010,646,1.00,13
2010,2486,1.00,19
2010,8178,1.00,10
2010,15707,1.00,4
2010,15708,1.00,10
2010,15718,1.00,4
2010,16951,1.00,8
2010,17615,1.00,13
2010,17616,1.00,4
2010,17617,1.00,8

I am trying to load data:

:~$ cat /var/rs_mail/IN/qv_stock_20160620035119.csv | clickhouse-client --query="INSERT INTO stock FORMAT CSV";

and I get an error

\n2010: 7615,1.00,13ion: Cannot parse input: expected , before: 2010,646,1.00,13

Row 1:
Column 0,   name: plant,         type: Int32,   ERROR: text "2010,64" is not like Int32

: (at row 1)

what am I doing wrong?

File: https://yadi.sk/d/ijJlmnBjsjBVc

Revocable answered 21/6, 2016 at 20:41 Comment(1)
I am not familiar with it, but I would look into what the FORMAT CSV uses for default delineation, maybe it is not a ,Beckner
R
8

Thank you uYSIZfoz:

Your file has BOM (EF BB BF bytes at begin).

In my case was a BOM in the header row of the original file. I simply excluded from loading the first line using the format CSVWithNames.

cat /tmp/qv_stock_20160623035104.csv | clickhouse-client --query="INSERT INTO stock FORMAT CSVWithNames";

Revocable answered 24/6, 2016 at 7:7 Comment(0)
S
3

Int8 type has range -128..127. 2010 (first value) is out of range of Int8.

If you change table definition, everything is Ok:

$ clickhouse-client 
ClickHouse client version 0.0.53720.
Connecting to localhost:9000.
Connected to ClickHouse server version 1.1.53981.

:) CREATE TABLE stock
:-] (
:-]     plant Int16,
:-]     code Int16,
:-]     service_level Float32,
:-]     qty Int8
:-] ) ENGINE = Log

CREATE TABLE stock
(
    plant Int16, 
    code Int16, 
    service_level Float32, 
    qty Int8
) ENGINE = Log

Ok.

0 rows in set. Elapsed: 0.002 sec. 

:) Bye.
$ mcedit qv_stock_20160620035119.csv

$ cat qv_stock_20160620035119.csv | clickhouse-client --query="INSERT INTO stock FORMAT CSV";
$ clickhouse-client 
ClickHouse client version 0.0.53720.
Connecting to localhost:9000.
Connected to ClickHouse server version 1.1.53981.

:) SELECT * FROM stock

SELECT *
FROM stock

┌─plant─┬──code─┬─service_level─┬─qty─┐
│  2010 │   646 │             1 │  13 │
│  2010 │  2486 │             1 │  19 │
│  2010 │  8178 │             1 │  10 │
│  2010 │ 15707 │             1 │   4 │
│  2010 │ 15708 │             1 │  10 │
│  2010 │ 15718 │             1 │   4 │
│  2010 │ 16951 │             1 │   8 │
│  2010 │ 17615 │             1 │  13 │
│  2010 │ 17616 │             1 │   4 │
│  2010 │ 17617 │             1 │   8 │
└───────┴───────┴───────────────┴─────┘

Please note, that Int16, which I specified, could be also not enough for your data. Then specify Int32, Int64...

Shambles answered 22/6, 2016 at 21:29 Comment(6)
I tried to change the type to Int16 to send the first post. The error persists. It is possible that something is wrong with the encoding of a file? I corrected my question, because the error is the same.Revocable
I added a file that is trying to loadRevocable
$ head -n1 test.csv | xxd 0000000: efbb bf32 3031 302c 3634 362c 312e 3030 ...2010,646,1.00 0000010: 2c31 330d 0a ,13..Shambles
Your file has BOM (EF BB BF bytes at begin).Shambles
Thanks for your answer. What if I use windows? When I run in CMD I get "'cat' is not recognized as an internal or external command, operable program or batch file."Andris
@Andris You should know that ClickHouse only runs on Linux Ubuntu and some others OS, excluding Windows.Bindweed
B
0

I think the commas in this ruining the format

2010,646,1.00,13

Try to remove all the commas, then try to insert it back in as an Int.

Bohrer answered 27/10, 2017 at 5:55 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.