Pig Batch mode: how to set logging level to hide INFO log messages?

I am using Apache Pig version 0.10.1.21 (rexported). When I execute a Pig script, there are a lot of INFO log lines that look like this:

2013-05-18 14:30:12,810 [Thread-28] INFO  org.apache.hadoop.mapred.Task - Task 'attempt_local_0005_r_000000_0' done.
2013-05-18 14:30:18,064 [main] WARN  org.apache.pig.tools.pigstats.PigStatsUtil - Failed to get RunningJob for job job_local_0005
2013-05-18 14:30:18,094 [Thread-31] WARN  org.apache.hadoop.mapred.JobClient - No job jar file set.  User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
2013-05-18 14:30:18,114 [Thread-31] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2013-05-18 14:30:18,254 [Thread-32] INFO  org.apache.hadoop.mapred.Task -  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@3fcb2dd1
2013-05-18 14:30:18,265 [Thread-32] INFO  org.apache.hadoop.mapred.MapTask - io.sort.mb = 10

Is there a SET command within the Pig script, or a command-line flag, to set the logging level? Basically, I would like to hide the [Thread-xx] INFO messages and show only WARN and ERROR. I have tried the command-line debug flag, but the INFO messages still show up:

pig -x local -d WARN MyScript.pig

Hope there is a solution. Thanks in advance for any help.

SOLVED: Following the answer by Lorand Bendig, use a custom log4j properties file. Summarized here for convenience:

Step 1: Copy the log4j config file to the folder where my Pig scripts are located.

cp /etc/pig/conf.dist/log4j.properties log4j_WARN

Step 2: Edit the log4j_WARN file and make sure these two lines are present:

log4j.logger.org.apache.pig=WARN, A
log4j.logger.org.apache.hadoop = WARN, A

Step 3: Run the Pig script and instruct it to use the custom log4j file:

pig -x local -4 log4j_WARN MyScript.pig
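
The three steps above can be sketched as a small shell helper. This is only an illustration: the function name make_warn_config is hypothetical and not part of Pig, and it assumes the source file already defines an appender named A, as the file copied in step 1 is expected to. Instead of editing by hand, it drops any existing settings for the two loggers and appends the WARN lines.

```shell
#!/bin/sh
# Sketch only: make_warn_config is a hypothetical helper, not a Pig command.
# Usage: make_warn_config SOURCE_PROPERTIES DEST_FILE
make_warn_config() {
    src=$1
    dest=$2
    [ -r "$src" ] || { echo "cannot read $src" >&2; return 1; }

    # Copy everything except existing settings for the two loggers.
    grep -v -E '^log4j\.logger\.org\.apache\.(pig|hadoop)[[:space:]]*=' "$src" > "$dest"

    # Append the two lines from step 2 (assumes appender A is defined in $src).
    printf '%s\n' \
        'log4j.logger.org.apache.pig=WARN, A' \
        'log4j.logger.org.apache.hadoop=WARN, A' >> "$dest"
}

# Example invocation, using the paths from the question:
#   make_warn_config /etc/pig/conf.dist/log4j.properties log4j_WARN
#   pig -x local -4 "$PWD/log4j_WARN" MyScript.pig
```

The example passes an absolute path to -4 because a comment on this question reports that a relative path may not be found.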
Dunleavy answered 18/5, 2013 at 18:45 Comment(4)
This seems to be a duplicate question: stackoverflow.com/a/16414020 - Torpedoman
@LorandBendig Agreed, that one has a better answer (by you), but this one has a much better title that will be easier for people to find. - Elixir
Possible duplicate of How do I suppress the bloat of useless information when using the DUMP command while using grunt via 'pig -x local'? - Metzgar
When setting the -4 log4j_WARN option, you need to pass the full path to the log4j file; otherwise Pig won't find it. - Relive

Another option is the following.

Create a file named nolog.conf with the following content:

log4j.rootLogger=fatal

Then run Pig as follows:

pig -x local -4 nolog.conf
Ornate answered 1/3, 2016 at 4:2 Comment(0)

You can override the default log configuration (which includes INFO messages) like this:

pig -4 log4j.properties MyScript.pig
Shabuoth answered 19/5, 2013 at 13:11 Comment(3)
NOT WORKING. I copied the original log4j.properties to the same folder as the Pig scripts, then executed the script with pig -x local -4 myLog4J Myscript.pig. There is the same number of INFO lines; only the [Thread-XX] part has been removed from the log lines. This occurred with both log4j.logger.org.apache.pig=WARN, A and log4j.logger.org.apache.pig=ERROR, A. - Dunleavy
@Lorand Bendig's answer is a bit more concise. You'll need to do a bit more configuration for this to work; look at his link above. - Shabuoth
I did read Lorand's post but failed to notice the similarity in the syntax (.hadoop vs .pig). Actually, the line "log4j.logger.org.apache.hadoop = error, A" must be added. For the sake of clarity, I am going to edit the original post to include the answer. Thanks. - Dunleavy

You need to set rootLogger too:

log4j.rootLogger=ERROR, A
log4j.logger.org.apache.pig=ERROR, A
log4j.logger.org.apache.hadoop = ERROR, A
Udela answered 27/6, 2015 at 23:2 Comment(1)
What does the , A do? - Amitosis
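
On the ", A" part: in log4j 1.x properties syntax, the value of a logger line is a level optionally followed by appender names, so A is the name of an appender that must be defined elsewhere in the same file. The sketch below writes a self-contained config to show this. The file name log4j_ERROR and the conversion pattern are arbitrary choices, and defining A as a ConsoleAppender is an assumption about what the stock Pig config does. Since child loggers also send their output to the appenders of their parents, A is attached only to the root logger here; attaching it to both the root and the child loggers can print each message twice.

```shell
#!/bin/sh
# Sketch only: writes a minimal log4j 1.x config; the file name is arbitrary.
cat > log4j_ERROR <<'EOF'
# Root logger: level ERROR, output goes to the appender named A.
log4j.rootLogger=ERROR, A

# Child loggers set only a level here and inherit appender A from the root.
log4j.logger.org.apache.pig=ERROR
log4j.logger.org.apache.hadoop=ERROR

# "A" is just a name; these lines define it as a console appender.
log4j.appender.A=org.apache.log4j.ConsoleAppender
log4j.appender.A.layout=org.apache.log4j.PatternLayout
log4j.appender.A.layout.ConversionPattern=%d [%t] %-5p %c - %m%n
EOF

# Example invocation (script name taken from the question):
#   pig -x local -4 "$PWD/log4j_ERROR" MyScript.pig
```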
