Cloudera: Apache Pig tutorial

> hdfs dfs -put /etc/passwd /user/cloudera

> pig -x mapreduce

> clear

grunt> A = load '/user/cloudera/passwd' using PigStorage(':');

grunt> dump A;   -- runs a MapReduce job and prints the tuples

grunt> B = foreach A generate $0, $4, $5;   -- user name, full name (GECOS), home directory

grunt> dump B;   -- runs a MapReduce job and prints the tuples
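
To preview locally what the projection in B selects, without running a MapReduce job, something like the following can help. It is only a rough sketch: the helper name project_fields and the sample line are made up for illustration, and it imitates the field selection, not Pig itself. Pig counts fields from 0 and awk counts from 1, so $0, $4, $5 in Pig correspond to $1, $5, $6 in awk.

```shell
# Hypothetical helper (not part of Pig or Hadoop): imitates
# `foreach A generate $0, $4, $5` on colon-delimited input.
# Pig positions are 0-based, awk fields are 1-based.
project_fields() {
  # Split on ':' and emit the selected fields separated by tabs,
  # which is the default delimiter Pig uses when storing a relation.
  awk -F: -v OFS='\t' '{ print $1, $5, $6 }'
}

# Made-up sample line in /etc/passwd format.
printf 'cloudera:x:501:501:Cloudera User:/home/cloudera:/bin/bash\n' | project_fields
```

Running `project_fields < /etc/passwd` on the local file should give roughly the same rows that the Pig job produces from the copy in HDFS.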

grunt> store B into 'file.out';   -- writes to HDFS, relative to the user's HDFS home directory
