apache-pig - 4

3

Solved

I'm trying to write a Pig Latin script to get the count of a dataset that I've filtered. Here's the script so far: /* scans by title */ scans = LOAD '/hive/scans/*' USING PigStorage(',') AS (t...

apache-pig

Neutrality asked 22/3, 2012 at 16:19

4

Solved

How to call a Pig script within another Pig script

I have a file in HDFS with 100 columns, which I want to process using Pig. I want to load this file into a tuple with column names in a separate Pig script, and reuse this script from other pig scr...

hadoop apache-pig

Siren asked 26/9, 2011 at 15:33

1

Solved

Check if an element is present in a bag?

How can I check in Pig Latin whether a bag contains an element? Example: in a bag of chararray, how can I check if a token is present?

apache-pig

Thriller asked 15/10, 2014 at 19:09

6

Solved

What is the best Pig plugin for Eclipse?

I'm about to start playing around with Pig Latin, and I was hoping to get some text highlighting and such for it in Eclipse. Doing a quick Google search, I saw a couple of Eclipse plugins for it. A...

eclipse eclipse-plugin editor apache-pig

Artery asked 25/8, 2011 at 16:59

1

org.apache.hadoop.mapred.LocalClientProtocolProvider not found

I wrote a program to execute an embedded Pig statement in Java. I executed the Java statement registerQuery. But when I try to store the result, I get an error of org.apache.hadoop.mapred.localClien...

java hadoop apache-pig

Compost asked 7/6, 2014 at 11:48

2

Solved

Pig - How to cast datetime to chararray

I'm using CurrentTime(), which is a datetime data type. However, I need it as a chararray. I have the following: A = LOAD ... B = FOREACH A GENERATE CurrentTime() AS todaysDate; I've tried vario...

apache-pig

Orthodontist asked 29/5, 2013 at 16:37

3

How do I make Hadoop find imported Python modules when using Python UDFs in Pig?

I am using Pig (0.9.1) with UDFs written in Python. The Python scripts import modules from the standard Python library. I have been able to run the Pig scripts that call the Python UDFs successfully...

python hadoop jython apache-pig

Vickievicksburg asked 20/10, 2011 at 5:47

1

Solved

How to remove duplicate columns after a JOIN in Pig?

Let's say I JOIN two relations like: -- part looks like: -- 1,5.3 -- 2,4.9 -- 3,4.9 -- original looks like: -- 1,Anju,3.6,IT,A,1.6,0.3 -- 2,Remya,3.3,EEE,B,1.6,0.3 -- 3,akhila,3.3,IT,C,1.3,0.3 j...

java hadoop join apache-pig

Hypoglycemia asked 20/4, 2014 at 5:13

3

Solved

Splitting a tuple into multiple tuples in Pig

I'd like to generate multiple tuples from a single tuple. What I mean is: I have a file with the following data in it. >> cat data ID | ColumnName1:Value1 | ColumnName2:Value2 so I load it by the...

hadoop apache-pig

Field asked 2/7, 2012 at 3:01

3

Can I pass parameters to UDFs in Pig script?

I am relatively new to Pig scripts. I would like to know if there is a way of passing parameters to Java UDFs in Pig. Here is the scenario: I have a log file which has different columns (each repre...

apache-pig

Piracy asked 31/10, 2012 at 17:38

1

Solved

What is the difference between GROUP and COGROUP in PIG?

I understood Group didn't work with multiple tuples and hence we had COGROUP in PIG. However, while checking today the GROUP command works for me. I am using PIG-0.12.0. My commands and outputs are...

hadoop apache-pig

Allbee asked 30/7, 2014 at 4:09

3

Solved

StrSplit in Pig functions

Can someone explain how to get the output below in a Pig script? My input file, a.txt, is below: aaa.kyl,data,data bbb.kkk,data,data cccccc.hj,data,data qa.dff,data,data I am writing the pig sc...

apache-pig

Ebberta asked 27/7, 2014 at 13:26

2

Solved

Pig: Control number of mappers

I can control the number of reducers by using PARALLEL clause in the statements which result in reducers. I want to control the number of mappers. The data source is already created, and I can not...

hadoop apache-pig

Magalymagan asked 16/6, 2014 at 7:13

1

Error 1121 importing external library in Pig UDF in Jython

I'm having a problem using the Python library simplejson in Jython to write a Pig UDF. I need it because jython-standalone-2.5.2.jar doesn't come with a JSON library. I'm using Apache Pig version 0.11...

python apache-pig jython user-defined-functions

Deutzia asked 1/2, 2014 at 21:42

5

Can I generate nested bags using nested FOREACH statements in Pig Latin?

Let's say I have a data set of restaurant reviews: User,City,Restaurant,Rating Jim,New York,Mecurials,3 Jim,New York,Whapme,4.5 Jim,London,Pint Size,2 Lisa,London,Pint Size,4 Lisa,London,Rabbit Wh...

apache-pig

Swedenborgian asked 8/2, 2011 at 11:53

3

Solved

How to flatten a group into a single tuple in Pig?

From this: (1, {(1,2), (1,3), (1,4)} ) (2, {(2,5), (2,6), (2,7)} ) ...How could we generate this? ((1,2),(1,3),(1,4)) ((2,5),(2,6),(2,7)) ...And how could we generate this? (1, 2, 3, 4) (2, ...

hadoop apache-pig

Interstitial asked 31/8, 2013 at 4:48

2

Define tuple data in the Pig script

I am currently debugging a Pig script. I'd like to define a tuple in the Pig file directly (instead of using the basic "Load" function). Is there a way to do it? I am looking for something like this: ...

hadoop apache-pig

Manual asked 14/9, 2012 at 11:14

2

Solved

Pig: Get top n values per group

I have data that's already grouped and aggregated, it looks like so: user value count ---- -------- ------ Alice third 5 Alice first 11 Alice second 10 Alice fourth 2 ... Bob second 20 Bob third 1...

hadoop hdfs apache-pig

Clamp asked 15/7, 2013 at 13:56

2

Solved

Usage of Apache Pig rank function

I am using the Pig 0.11.0 rank function to generate ranks for every id in my data. I need my data ranked in a particular way: I want the rank to reset and start from 1 for every new ID. Is it pos...

apache-pig

Standardbearer asked 10/4, 2014 at 11:42

2

Solved

Pig: loading a data file using an external schema file

I have a data file and a corresponding schema file stored in separate locations. I would like to load the data using the schema in the schema-file. I tried using A= LOAD '<file path>' USING ...

load schema gruntjs apache-pig

Gallinule asked 24/11, 2013 at 10:06

1

Solved

Properly loading datetime in Pig

I'm loading a TSV file with a datetime column and a long column with: A = LOAD 'tweets-clean.txt' USING PigStorage('\t') AS (date:datetime, userid:long); DUMP A; An example line of input: Tue Feb...

hadoop apache-pig

Ainslee asked 26/2, 2014 at 20:31

1

Editing a multi-million-row file on a Hadoop cluster

I am trying to edit a large file on a Hadoop cluster and trim white spaces and special characters like ¦,*,@," etc. from the file. I don't want to copyToLocal and use sed, as I have 1000's of such fil...

hadoop apache-pig

Fortunetelling asked 20/2, 2014 at 19:28

6

Installing Pig on a single node

I installed Hadoop (1.0.2) for a single node on Windows 7 with Cygwin, and it is working. However, I cannot get Pig (0.10.0) to see Hadoop. 1) "Error: JAVA_HOME is not set." I added this lin...

hadoop apache-pig

Propeller asked 13/7, 2012 at 11:46

1

Solved

Hadoop, Hive, Pig, HBase, Cassandra - when to use what? [closed]

First of all I am relatively new to Big Data and the Hadoop world and I have just started to experiment a little with the Hortonworks Sandbox (Pig and Hive so far). I was wondering in which c...

hadoop cassandra hive apache-pig

Ambush asked 29/1, 2014 at 18:02

0

Unable to find region for hello_world

Versions: Hadoop 2.2, Hbase 0.96.1, Pig 0.12 Whenever I run this pig script raw_data = LOAD 'sample_data.csv' USING PigStorage( ',' ) AS ( listing_id: chararray, fname: chararray, lname: chara...

hadoop hbase apache-pig

Bindweed asked 15/1, 2014 at 13:23
