(1) Install: extract the archive with tar -xzf Pig-0.9.2-tar.gz, and rename the extracted directory to pig;
(2) Add the following to the /etc/profile file:
vim /etc/profile
# set pig path
export PIG_HOME=/usr/pig
export PATH=${PIG_HOME}/bin:${PATH}
Then run source /etc/profile so that the change takes effect;
(3) In pig.properties, add the path that points to the logs directory: pig.logfile=/usr/pig/logs
(4) Then test:
[hadoop@Masterpc ~]$ pig
2015-03-25 18:56:41,348 [main] INFO org.apache.pig.Main - Logging error messages to: /usr/pig/logs/pig_1427281001347.log
2015-03-25 18:56:41,551 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://172.16.2.42:9000
2015-03-25 18:56:41,786 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: :9001
grunt>
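(5) Next comes a hands-on exercise. Problem description: since telecom companies generally take their call records and ...
(6) Suppose the relational database has a ...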
1312444562132	1312444580006	13712341234	13712341235	1
1312444159800	1312444170000	13712345678	10086	2
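Assume the query task is to find the records whose start time is XXX and whose called number is ... Once cdr.txt has been stored, its contents look like this:
grunt> cat /user/hadoop/cdr.txt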
1312444562132	1312444580006	13712341234	13712341235	1
1312444159800	1312444170000	13712345679	10086	2
1312544159800	1312554170000	13712345677	10086	2
1312644159800	1312654170000	13712345676	10086	2
1312744159800	1312764170000	13712345675	10086	2
1312844159800	1312854170000	13712345674	13552132453	2
1312944159800	1312974170000	13712345673	10086	2
1313444159800	1313454170000	13712345672	10086	2
1314444159800	1314544170000	13712345678	10086	2
grunt>
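To make the record layout concrete, here is a minimal Python sketch that splits one line of cdr.txt into the five fields used in this walkthrough (start_time, end_time, calling_number, called_number, cdr_type). It assumes the fields are tab-separated and that the 13-digit times are epoch milliseconds; both are readings of the sample data, not something the source states. The names Cdr and parse_cdr are hypothetical helpers, not part of Pig.

```python
from typing import NamedTuple


class Cdr(NamedTuple):
    """One call detail record; field names follow the walkthrough's schema."""
    start_time: int       # assumed to be epoch milliseconds (13 digits in the sample)
    end_time: int         # assumed to be epoch milliseconds
    calling_number: int
    called_number: int
    cdr_type: int


def parse_cdr(line: str) -> Cdr:
    """Split one tab-separated line into five integer fields."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}")
    return Cdr(*(int(f) for f in fields))


# First sample row, with the tab boundaries assumed as described above.
record = parse_cdr("1312444562132\t1312444580006\t13712341234\t13712341235\t1")
duration_ms = record.end_time - record.start_time
print(record.called_number, duration_ms)
```

Under the millisecond assumption, the difference between end_time and start_time gives the call duration for that row.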
grunt> A = LOAD 'hdfs://172.16.2.42:9000/user/hadoop/cdr.txt' USING PigStorage('\t') AS (start_time:long, end_time:long, calling_number:long, called_number:long, cdr_type:int);
2015-03-25 19:27:58,624 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. name
Details at logfile: /usr/pig/logs/pig_1427282702115.log
grunt>
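The error above appeared. After a round of debugging I finally tracked down the cause: it comes from the environment that was set up earlier ..., and the fix is simply to run the following command in hadoop/lib: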
[hadoop@Masterpc lib]$ rm -r antlr-runtime-3.0.1.jar
After deleting it, run the pig program again:
grunt> A = LOAD 'hdfs://172.16.2.42:9000/user/hadoop/cdr.txt' USING PigStorage('\t') AS (start_time:long, end_time:long, calling_number:long, called_number:long, cdr_type:int);
grunt> B = FILTER A BY start_time > 1300835802 AND called_number == 10086 AND cdr_type == 2;
2015-03-25 20:13:00,021 [main] WARN org.apache.pig.PigServer - Encountered Warning IMPLICIT_CAST_TO_LONG 2 time(s).
grunt>
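As a cross-check on what the FILTER condition should select, the same predicate can be sketched in plain Python over the nine sample rows. This is an illustration only, not how Pig evaluates the query; the tab boundaries in ROWS are an assumed reading of the sample data, and ROWS and matches are hypothetical names.

```python
# Sample rows from cdr.txt; the tab boundaries are an assumption about the data.
ROWS = """\
1312444562132\t1312444580006\t13712341234\t13712341235\t1
1312444159800\t1312444170000\t13712345679\t10086\t2
1312544159800\t1312554170000\t13712345677\t10086\t2
1312644159800\t1312654170000\t13712345676\t10086\t2
1312744159800\t1312764170000\t13712345675\t10086\t2
1312844159800\t1312854170000\t13712345674\t13552132453\t2
1312944159800\t1312974170000\t13712345673\t10086\t2
1313444159800\t1313454170000\t13712345672\t10086\t2
1314444159800\t1314544170000\t13712345678\t10086\t2
"""


def matches(row: tuple) -> bool:
    """Same three conditions as the FILTER statement."""
    start_time, _end_time, _calling_number, called_number, cdr_type = row
    return start_time > 1300835802 and called_number == 10086 and cdr_type == 2


# Parse every line into a tuple of five integers, then apply the predicate.
rows = [tuple(int(field) for field in line.split("\t")) for line in ROWS.splitlines()]
selected = [row for row in rows if matches(row)]
print(len(selected), "of", len(rows), "rows match")
```

With this reading of the data, every start_time has 13 digits while the constant 1300835802 has 10, so the start_time condition holds for all rows and only the called_number and cdr_type conditions actually narrow the result.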
2015-03-25 20:14:42,788 [main] WARN org.apache.pig.PigServer - Encountered Warning IMPLICIT_CAST_TO_LONG 2 time(s).
......
2015-03-25 20:14:45,026 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
****hdfs://172.16.2.42:9000/user/hadoop/cdr.txt
......
2015-03-25 20:15:10,767 [main] INFO org.apache.pig.tools.pigstats.SimplePigStats - Script Statistics: