Is it possible to handle a "no file" error in Pig?

I'm trying to load a simple file:

log = load 'file_1.gz' using TextLoader AS (line:chararray);
dump log;

And I get an error:

2014-04-08 11:46:19,471 [main] ERROR org.apache.pig.tools.pigstats.SimplePigStats - ERROR 2997: Unable to recreate exception from backend error: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Input Pattern hdfs://hadoop1:8020/pko/file*gz matches 0 files
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:288)
        at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1054)

Is it possible to handle this situation before the error appears?

Assai answered 8/4, 2014 at 9:54 Comment(3)
Pawel, did you find out how to handle this? I have the same scenario. Thanks. Bk
Same here. I also tried several regular expressions. None of them works as long as the pattern matches "0 files". Ecphonesis
You could create an empty blank file and load with a pattern like /pko/{blank,file*gz}. It will load 0 rows when no file*gz files exist. Excaudate

Input Pattern hdfs://hadoop1:8020/pko/file*gz matches 0 files

The error means that the input file doesn't exist at the given HDFS path.

In log = load 'file_1.gz' using TextLoader AS (line:chararray); you haven't given the absolute path of file_1.gz, so Pig resolves it against the HDFS home directory of the user running the Pig script.

Kirkuk answered 30/4, 2015 at 16:40 Comment(1)
I know the reason for the error. But my question is: is it possible to handle this kind of error in Pig, with something like try-catch? Assai

Unfortunately, in the current version of Pig (0.15.0) it is impossible to handle these errors without using UDFs.

I suggest creating a Java or Python script using try and catch to take care of this.
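
A wrapper along those lines could look roughly like the sketch below. It is not taken from any Pig documentation: it assumes the hdfs and pig command-line tools are on the PATH, and it relies on hdfs dfs -ls returning a non-zero exit status when the pattern matches nothing. The function names and the runner parameter are made up for illustration. Rather than catching the error afterwards, it checks the input pattern first and only starts Pig when something matches.

```python
import subprocess


def input_exists(pattern, runner=subprocess.run):
    """Return True if `hdfs dfs -ls` finds at least one path for the pattern."""
    # Assumption: `hdfs dfs -ls` exits non-zero when the glob matches nothing.
    # The pattern is passed without a shell, so HDFS expands the glob itself.
    result = runner(
        ["hdfs", "dfs", "-ls", pattern],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def run_pig_if_input(script, pattern, runner=subprocess.run):
    """Run the Pig script only when the input pattern matches something.

    Returns True if Pig ran and exited with status 0, and False if the run
    was skipped or Pig failed.
    """
    if not input_exists(pattern, runner):
        print(f"No input matches {pattern}; skipping {script}")
        return False
    return runner(["pig", "-f", script]).returncode == 0
```

You would call it as run_pig_if_input("load_logs.pig", "/pko/file*gz"), where load_logs.pig is a placeholder name for your own script. The runner argument is only there so the check can be tried without a cluster.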

Here's a good website that might be of some use to you: https://wiki.apache.org/pig/PigErrorHandlingInScripts

Good luck learning Pig!

Fran answered 11/2, 2016 at 22:30 Comment(1)
Can you provide the website you mentioned? Assai

I'm facing this issue as well. My load command is:

DATA = LOAD '${qurwf_folder_input}/data/*/' AS (...);

I want to load all files from the data subfolders, but the data folder was empty and I got the same error as you. What I did, in my particular case, was create an empty folder in the data directory, so that the LOAD returns an empty dataset and the script does not fail.

By the way, I'm using an Oozie workflow to run the scripts, and I create the empty folders in the prepare step.
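
If you are not running under Oozie, the same placeholder folder can be created from a small script before Pig starts. This is only a sketch: it assumes the hdfs command-line tool is available, and the folder name "empty" and the function name are arbitrary choices for illustration, not anything Pig or Oozie requires.

```python
import subprocess


def ensure_placeholder_dir(data_dir, name="empty", runner=subprocess.run):
    """Create <data_dir>/<name> so a glob like <data_dir>/*/ always matches.

    Returns the created path, or None if the hdfs command failed.
    """
    path = f"{data_dir.rstrip('/')}/{name}"
    # -p creates missing parent directories and does not fail if the
    # directory already exists.
    result = runner(["hdfs", "dfs", "-mkdir", "-p", path])
    return path if result.returncode == 0 else None
```

Calling ensure_placeholder_dir on the data directory before launching the Pig script has the same effect as the prepare step: the LOAD pattern always has at least one (empty) folder to match.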

Alessandraalessandria answered 19/1, 2017 at 11:46 Comment(0)
