Insufficient space for shared memory file when I try to run nutch generate command

I have been running nutch crawling commands for the past 3 weeks, and now I get the error below when I try to run any nutch command:

Java HotSpot(TM) 64-Bit Server VM warning: Insufficient space for shared memory file: /tmp/hsperfdata_user/27050 Try using the -Djava.io.tmpdir= option to select an alternate temp location.

Error: Could not find or load main class ___.tmp.hsperfdata_user.27055
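
For reference, I read the warning's hint as pointing the JVM at an alternate temp directory, something like the sketch below (assuming bin/nutch passes NUTCH_OPTS through to the JVM; /mnt/tmp is just a placeholder path):

# Point the JVM temp directory at a partition with free space.
# /mnt/tmp is a placeholder; it must exist and be writable by this user.
mkdir -p /mnt/tmp
export NUTCH_OPTS="-Djava.io.tmpdir=/mnt/tmp"
bin/nutch generate   # plus the usual arguments for your crawl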

How do I solve this issue?

Victoir answered 12/1, 2013 at 5:19 Comment(3)
How much free space do you have left on your hard disk?Stereograph
The easiest way is to run the df command. Look under the "Use%" (sometimes "Capacity") column.Stereograph
/dev/xvda1 is 100% used, /dev/xvdb shows 1% used and 140 GB freeVictoir

I think that the temporary location being used has filled up. Try using some other location. Also, check the number of free inodes in each partition and clear up some space.

EDIT: There is no need to change /tmp at the OS level. We want nutch and hadoop to use some other location for storing temp files. Look at this to do that: What should be hadoop.tmp.dir?
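
A sketch of both checks plus the property the linked question is about (the /mnt/crawl value is just an example; which config file to edit depends on the setup):

# Check free space and free inodes on every mounted partition.
df -h
df -i

# The linked question boils down to setting this property; paste it
# inside the <configuration> element of conf/nutch-site.xml (or
# core-site.xml on a full hadoop setup). Shown as a heredoc so the
# snippet can be copied straight from a shell:
cat <<'EOF'
<property>
  <name>hadoop.tmp.dir</name>
  <value>/mnt/crawl/</value>
</property>
EOF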

Milker answered 12/1, 2013 at 5:40 Comment(13)
How do I change the temporary location? Also, I don't know how to check the number of free inodes and clear the space.Victoir
Don't worry. Just google for the command. If you are not geeky, the safest option is to clear off all files from /tmp which belong to the user running the nutch process and were created long back, like more than 24 hours ago (see the cleanup sketch after these comments).Milker
There is almost 3 GB of data inside the /tmp/hadoop-user/mapred/local/taskTracker/user/ folder. Can I safely delete the contents of this folder? It won't affect the nutch crawling, right? I am using nutch 2.1 with MySQL. Can I also delete the files inside the folder /tmp/hadoop-user/mapred/staging/?Victoir
If no nutch or hadoop process is running, then you can go ahead and delete those things.Milker
Nothing is crawling right now because of the space error, so I guess I can delete the folder contents? I want to make sure because we had a lot of trouble installing nutch in the first place.Victoir
I deleted the contents of /tmp/hadoop-user/mapred/local/taskTracker/user/, but when I run the nutch command again it shows the disk as 100% used, i.e. 7.5G of the 7.9G. So can you help me with the command to change the tmp directory? I tried searching on google, but I can't find the hadoop-site.xml file to change the location. I have almost 140G free on the other partition, so I want to move the tmp location over there.Victoir
It is not easy to move /tmp on a running system. I suggest you increase your swap space, delete as much as possible in /tmp, and set it to mount a tmpfs on a reboot (see the tmpfs sketch after these comments). This will use swap space for temporary storage, which is faster, and it can use space across multiple filesystems.Tripping
@TejasP Thanks a lot, I managed to change the directory by inserting the following in nutch-site.xml, as there was no hadoop-site.xml or mapred-site.xml file: <property> <name>hadoop.tmp.dir</name> <value>/mnt/crawl/</value> </property>Victoir
Now I am starting to get Error: Could not find or load main class ___.tmp.hsperfdata_user.6777. This looks like a hadoop error. I had added the mapred.system.dir property in nutch-site.xml and that moved the mapred temp files, but the hadoop.tmp.dir property did not move the hadoop temp files. I have tried setting the hadoop.job.history.user.location property and I still get the error. How do I move the hsperfdata_user folder to another location?Victoir
@Victoir: were you running in local mode or hadoop mode?Milker
@TejasP I am running nutch in hadoop mode.Victoir
Hello, I would like to ask if you solved the problem. I have exactly the same problem right now. When I type "df" into a console, I see that my /dev/mapper/ir-root 32060300 32060300 0 100% / is full... and I also get the Error: Could not find or load main class __.tmp.hsperfdata... error.Daphene
@JohnnyGreenwood do you mean that there is no more disk space left on the partition where the tmp files are generated by Nutch? If so, please clear some space off.Milker
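
A sketch of the cleanup suggested in the comments above, assuming the crawl runs as a user here called crawluser (a placeholder name); preview first, delete second:

# Files in /tmp owned by the crawl user and untouched for over a day
# (-mtime +0 means the file's age, in whole days, is greater than zero).
find /tmp -user crawluser -type f -mtime +0 -print

# When the listing looks safe, remove those files.
find /tmp -user crawluser -type f -mtime +0 -delete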
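
And a sketch of the tmpfs-on-reboot idea from the comments above (the size=2G cap is an arbitrary example; back it with enough swap space):

# Add an fstab entry so /tmp mounts as tmpfs at the next reboot;
# tmpfs pages can spill to swap, so the swap space must be large enough.
echo 'tmpfs  /tmp  tmpfs  defaults,size=2G  0  0' | sudo tee -a /etc/fstab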

Yeah, this is really an issue with the space available on the volume your /tmp is mounted on. If you are running this on EC2, or any cloud platform, attach a new volume and mount your /tmp on that (a sketch follows below). If running locally, there is no option other than cleaning up to make more room.
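
A rough sketch of the EC2 route, assuming the new volume shows up as /dev/xvdf (the device name, filesystem, and mount point are all placeholders to adapt):

# Format and mount the freshly attached volume, then move /tmp onto it.
sudo mkfs.ext4 /dev/xvdf            # /dev/xvdf: however the new volume appears
sudo mkdir -p /mnt/bigtmp
sudo mount /dev/xvdf /mnt/bigtmp
sudo chmod 1777 /mnt/bigtmp         # same sticky, world-writable bits as /tmp

# Stop anything still writing to /tmp, then bind the new space over it.
sudo mount --bind /mnt/bigtmp /tmp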

Try a command like df -h to see the percentage used and the available space on each volume mounted on your instance. You will see something like:

Filesystem            Size  Used Avail Use% Mounted on
/dev/xvda1            7.9G  7.9G     0 100% /
tmpfs                  30G     0   30G   0% /dev/shm
/dev/xvda3             35G  1.9G   31G   6% /var
/dev/xvda4             50G   44G  3.8G  92% /opt
/dev/xvdb             827G  116G  669G  15% /data/1
/dev/xvdc             827G  152G  634G  20% /data/2
/dev/xvdd             827G  149G  637G  19% /data/3
/dev/xvde             827G  150G  636G  20% /data/4
cm_processes           30G   22M   30G   1% /var/run/cloudera-scm-agent/process

You will begin to see this error when the disk holding /tmp fills up, as the root filesystem (/) has in the dump above.
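
If it is not obvious what is eating the space, a du pass narrows it down (the paths and the head count are arbitrary examples):

# Largest items under /tmp and under the root filesystem, biggest first.
sudo du -xsh /tmp/* 2>/dev/null | sort -rh | head -20
sudo du -xsh /*    2>/dev/null | sort -rh | head -20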

Foursome answered 20/9, 2014 at 21:22 Comment(0)
