Pickling Error running COPY command: CQLShell on Windows

We're running a copy command in CQLShell on Windows 7. At first, we ran into an "IMPROPER COPY COMMAND":

COPY ourdata(data_time, data_ID, dataBlob)
FROM 'TestData.csv'
WITH HEADER = true;

We later started receiving this error after running the same command:

Error starting import process:

Can't pickle <type 'thread.lock'>: it's not found as thread.lock
can only join a started process
cqlsh:testkeyspace> Traceback (most recent call last):
               File "<string>", line 1, in <module>
               File "C:\Program Files\DataStax Community\python\lib\multiprocessing\forking.py",
                      line 373, in main
               prepare(preparation_data)
               File "C:\Program Files\DataStax Community\python\lib\multiprocessing\forking.py",
                      line 482, in prepare
                      file, path_name, etc = imp.find_module(main_name, dirs)
ImportError: No module named cqlsh

We're not sure whether it's an issue with the path (no module named cqlsh) or with Python pickling objects while reading the CSV file.

Warhead answered 3/6, 2015 at 17:55 Comment(3)
Which version of Python are you using? - Tapis
Python version 2.7.10. - Warhead
Possible help for others arriving here: #44005712 - Haught

So I went and tested this out. I created two simple tables in Cassandra 2.1.5 (by the way, which version are you using?) on both Windows and Linux. I then tested COPY TO/FROM on each.

Linux (Ubuntu 14.04.2 LTS):

Connected to Test Cluster at dockingbay94:9042.
[cqlsh 5.0.1 | Cassandra 2.1.5 | CQL spec 3.2.0 | Native protocol v3]
Use HELP for help.
aploetz@cqlsh> use stackoverflow2;
aploetz@cqlsh:stackoverflow2> COPY dummy3(id,time) TO '/home/aploetz/dummy3.txt' 
    WITH HEADER=true AND DELIMITER='|';

4 rows exported in 0.071 seconds.
aploetz@cqlsh:stackoverflow2> COPY dummy4(id,time) FROM '/home/aploetz/dummy3.txt' 
    WITH HEADER=true AND DELIMITER='|';

4 rows imported in 0.427 seconds.

Windows 8.1:

Connected to Window$ Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 2.1.5 | CQL spec 3.2.0 | Native protocol v3]
Use HELP for help.
WARNING: pyreadline dependency missing.  Install to enable tab completion.
aploetz@cqlsh> use stackoverflow;
aploetz@cqlsh:stackoverflow> COPY dummy3(id,time) TO 'e:\dummy3.txt' 
    WITH HEADER=true AND DELIMITER='|';

4 rows exported in 0.020 seconds.
aploetz@cqlsh:stackoverflow> COPY dummy4(id,time) FROM 'e:\dummy3.txt' 
    WITH HEADER=true AND DELIMITER='|';

Error starting import process:

Can't pickle <type 'thread.lock'>: it's not found as thread.lock
can only join a started process
aploetz@cqlsh:stackoverflow> Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "E:\Program Files\DataStax Community\python\lib\multiprocessing\forking.py", line 373, in main
    prepare(preparation_data)
  File "E:\Program Files\DataStax Community\python\lib\multiprocessing\forking.py", line 482, in prepare
    file, path_name, etc = imp.find_module(main_name, dirs)
ImportError: No module named cqlsh

So the COPY TO (export) works fine, but the COPY FROM (import) fails on Windows.
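
For what it's worth, the error text itself comes from Python's multiprocessing module (note multiprocessing\forking.py in the traceback) rather than from Cassandra. Windows has no fork(), so multiprocessing starts each child as a fresh interpreter and pickles the state it hands over, and an object that holds a thread lock cannot be pickled. The traceback does not show which object cqlsh tries to send, so the Worker class below is a hypothetical stand-in; this is only a minimal sketch of the failure mode, not cqlsh's actual code.

```python
import pickle
import threading


class Worker(object):
    """Hypothetical stand-in for an object that owns a lock."""

    def __init__(self):
        self.lock = threading.Lock()


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, False if pickling is refused."""
    try:
        pickle.dumps(obj)
        return True
    except (TypeError, pickle.PicklingError):
        # Python 3 raises TypeError for locks; Python 2 raises PicklingError.
        return False


print(can_pickle({"id": 1}))   # plain data pickles fine
print(can_pickle(Worker()))    # an object holding a lock does not
```

On Linux the child is forked and inherits its state without pickling it, which would be consistent with the import succeeding there.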

Josh McKenzie of DataStax made a post back in December titled: Cassandra and Windows: Past, Present, and Future. In it, he details some of the longstanding issues that Cassandra has on Windows. Essentially, Windows NTFS prevents other processes from changing or deleting files which are in use (locked) by a different process. And these issues directly affect CQLSH's ability to COPY data into Cassandra.

There is a JIRA ticket (CASSANDRA-9670) which addresses a similar issue: running CQL scripts with CQLSH on Windows yields the same error message. I strongly suspect that these two issues are related. In any case, Cassandra is expected to be supported on Windows with version 3.0, which is currently "in development." I tried a few tricks to see if I could find a work-around for this on Windows, and I'll report back if I find one. But for the time being, you might just have to use Cassandra on Linux to benefit from its full functionality.
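
One avenue that avoids cqlsh's COPY FROM code path altogether is to load the file from a small single-process script using the DataStax Python driver. The following is an untested-against-a-cluster sketch under several assumptions: Python 3 with the cassandra-driver package installed, the pipe-delimited file exported above, and both columns of dummy4 being of type text (the real column types are not shown here, so values may need converting). The function names read_rows, load_rows and main are made up for illustration.

```python
import csv


def read_rows(path, delimiter="|"):
    """Yield one tuple per data line of a delimited file, skipping the header."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        next(reader, None)  # first line is the header written by COPY TO
        for row in reader:
            if row:  # ignore blank lines
                yield tuple(row)


def load_rows(session, rows):
    """Insert rows one at a time with a prepared statement; return the count."""
    # Assumes dummy4(id, time) with text columns; adjust for the real schema.
    insert = session.prepare("INSERT INTO dummy4 (id, time) VALUES (?, ?)")
    count = 0
    for row in rows:
        session.execute(insert, row)
        count += 1
    return count


def main():
    # Imported here so the helpers above work without the driver installed.
    from cassandra.cluster import Cluster

    cluster = Cluster(["127.0.0.1"])
    session = cluster.connect("stackoverflow")
    try:
        print(load_rows(session, read_rows(r"e:\dummy3.txt")), "rows imported")
    finally:
        cluster.shutdown()
```

Calling main() runs the import against the local node. Inserting row by row is slow for large files, so this is only meant for small data sets like the one in this test.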

Tapis answered 30/6, 2015 at 21:42 Comment(2)
Thanks, I was using version 2.0. - Warhead
@Warhead As an alternative, see if this is a viable option for you: github.com/brianmhess/cassandra-loader - Tapis

I ran into the same problem with Cassandra 2.1. After I upgraded Cassandra to 2.2, the error disappeared. Try upgrading your Cassandra.

Eisenstark answered 29/11, 2015 at 5:31 Comment(0)
