蝈蝈的编程世界

This article explains how to verify a Hadoop installation and its configuration after installing; for the installation itself, see the separate post on installing hadoop-1.2.1.

1. To make hadoop commands convenient to run, first configure Hadoop's environment variables. Open the .bashrc file in the hadoop user's home directory and add or modify the following:

export HADOOP_HOME=/home/hadoop/hadoop-1.2.1
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin

Note that after editing, you must log in again or run `source .bashrc` for the configuration to take effect. Once HADOOP_HOME is set, commands will print the warning "Warning: $HADOOP_HOME is deprecated." — see the separate post on fixing that warning in hadoop-1.2.1.

2. Create test files locally. To make uploading convenient, first create an input directory, then create test1.txt and test2.txt inside it:

[hadoop@mdw temp]$ mkdir input
[hadoop@mdw temp]$ cd input/
[hadoop@mdw input]$ echo "hello world hello hadoop" > test1.txt
[hadoop@mdw input]$ echo "hello hadoop" > test2.txt
[hadoop@mdw input]$ ll
total 8
-rw-rw-r-- 1 hadoop hadoop 25 May 29 01:19 test1.txt
-rw-rw-r-- 1 hadoop hadoop 13 May 29 01:20 test2.txt
[hadoop@mdw input]$ cat test1.txt
hello world hello hadoop
[hadoop@mdw input]$ cat test2.txt
hello hadoop

The cat output above shows the contents of the two text files; we will count the words in them with a MapReduce job.

3. Upload the two text files to HDFS (here, into a folder named in):

[hadoop@mdw input]$ hadoop dfs -put ../input/ in

After uploading, inspect the HDFS file system:

[hadoop@mdw input]$ hadoop dfs -ls
Found 1 items
drwxr-xr-x - hadoop supergroup 0 2015-05-29 01:31 /user/hadoop/in

We can see that a new in folder has appeared under /user/hadoop. List its contents:

[hadoop@mdw input]$ hadoop dfs -ls ./in
Found 2 items
-rw-r--r-- 2 hadoop supergroup 25 2015-05-29 01:31 /user/hadoop/in/test1.txt
-rw-r--r-- 2 hadoop supergroup 13 2015-05-29 01:31 /user/hadoop/in/test2.txt

If the upload fails with the following error:

put: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot create directory /user/hadoop/in. Name node is in safe mode.

leave safe mode by running:

[hadoop@mdw input]$ hadoop dfsadmin -safemode leave
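Before running the job, it can help to know what result to expect. A minimal local sketch of the same word counting, in plain Python with no Hadoop involved (the file contents are taken from the test files created above):

```python
# Local sanity check: compute the word counts the MapReduce job should
# produce for the two test files created above (no Hadoop required).
from collections import Counter

files = {
    "test1.txt": "hello world hello hadoop",
    "test2.txt": "hello hadoop",
}

counts = Counter()
for contents in files.values():
    # Split on whitespace, like the wordcount example's tokenizer.
    counts.update(contents.split())

for word in sorted(counts):
    print(word, counts[word])
# prints:
# hadoop 2
# hello 3
# world 1
```

These are exactly the counts the MapReduce job should write to its output file.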

4. Run Hadoop's bundled word-count program to count the words:

[hadoop@mdw input]$ hadoop jar ~/hadoop-1.2.1/hadoop-examples-1.2.1.jar wordcount in out
15/05/29 01:41:13 INFO input.FileInputFormat: Total input paths to process : 2
15/05/29 01:41:13 INFO util.NativeCodeLoader: Loaded the native-hadoop library
15/05/29 01:41:13 WARN snappy.LoadSnappy: Snappy native library not loaded
15/05/29 01:41:14 INFO mapred.JobClient: Running job: job_201505290130_0001
15/05/29 01:41:15 INFO mapred.JobClient:  map 0% reduce 0%
15/05/29 01:41:19 INFO mapred.JobClient:  map 50% reduce 0%
15/05/29 01:41:20 INFO mapred.JobClient:  map 100% reduce 0%
15/05/29 01:41:26 INFO mapred.JobClient:  map 100% reduce 33%
15/05/29 01:41:27 INFO mapred.JobClient:  map 100% reduce 100%
15/05/29 01:41:27 INFO mapred.JobClient: Job complete: job_201505290130_0001
15/05/29 01:41:27 INFO mapred.JobClient: Counters: 29
15/05/29 01:41:27 INFO mapred.JobClient:   Job Counters
15/05/29 01:41:27 INFO mapred.JobClient:     Launched reduce tasks=1
15/05/29 01:41:27 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=5374
15/05/29 01:41:27 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
15/05/29 01:41:27 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
15/05/29 01:41:27 INFO mapred.JobClient:     Launched map tasks=2
15/05/29 01:41:27 INFO mapred.JobClient:     Data-local map tasks=2
15/05/29 01:41:27 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=8142
15/05/29 01:41:27 INFO mapred.JobClient:   File Output Format Counters
15/05/29 01:41:27 INFO mapred.JobClient:     Bytes Written=25
15/05/29 01:41:27 INFO mapred.JobClient:   FileSystemCounters
15/05/29 01:41:27 INFO mapred.JobClient:     FILE_BYTES_READ=68
15/05/29 01:41:27 INFO mapred.JobClient:     HDFS_BYTES_READ=254
15/05/29 01:41:27 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=165604
15/05/29 01:41:27 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=25
15/05/29 01:41:27 INFO mapred.JobClient:   File Input Format Counters
15/05/29 01:41:27 INFO mapred.JobClient:     Bytes Read=38
15/05/29 01:41:27 INFO mapred.JobClient:   Map-Reduce Framework
15/05/29 01:41:27 INFO mapred.JobClient:     Map output materialized bytes=74
15/05/29 01:41:27 INFO mapred.JobClient:     Map input records=2
15/05/29 01:41:27 INFO mapred.JobClient:     Reduce shuffle bytes=74
15/05/29 01:41:27 INFO mapred.JobClient:     Spilled Records=10
15/05/29 01:41:27 INFO mapred.JobClient:     Map output bytes=62
15/05/29 01:41:27 INFO mapred.JobClient:     CPU time spent (ms)=1960
15/05/29 01:41:27 INFO mapred.JobClient:     Total committed heap usage (bytes)=337780736
15/05/29 01:41:27 INFO mapred.JobClient:     Combine input records=6
15/05/29 01:41:27 INFO mapred.JobClient:     SPLIT_RAW_BYTES=216
15/05/29 01:41:27 INFO mapred.JobClient:     Reduce input records=5
15/05/29 01:41:27 INFO mapred.JobClient:     Reduce input groups=3
15/05/29 01:41:27 INFO mapred.JobClient:     Combine output records=5
15/05/29 01:41:27 INFO mapred.JobClient:     Physical memory (bytes) snapshot=324222976
15/05/29 01:41:27 INFO mapred.JobClient:     Reduce output records=3
15/05/29 01:41:27 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=1128259584
15/05/29 01:41:27 INFO mapred.JobClient:     Map output records=6

Note: the bundled hadoop-examples-1.2.1.jar file lives in the Hadoop installation directory.

From the output above we can see that the MapReduce job ran successfully. Listing the HDFS file system again shows a new out folder:

[hadoop@mdw input]$ hadoop dfs -ls
Found 2 items
drwxr-xr-x - hadoop supergroup 0 2015-05-29 01:31 /user/hadoop/in
drwxr-xr-x - hadoop supergroup 0 2015-05-29 01:41 /user/hadoop/out

Run ls again to see what is inside the out folder:

[hadoop@mdw input]$ hadoop dfs -ls ./out
Found 3 items
-rw-r--r-- 2 hadoop supergroup 0 2015-05-29 01:41 /user/hadoop/out/_SUCCESS
drwxr-xr-x - hadoop supergroup 0 2015-05-29 01:41 /user/hadoop/out/_logs
-rw-r--r-- 2 hadoop supergroup 25 2015-05-29 01:41 /user/hadoop/out/part-r-00000

The job's output data is stored mainly in the part-r-00000 file; we can view the contents of these files with:

[hadoop@mdw input]$ hadoop dfs -cat ./out/*
hadoop	2
hello	3
world	1
cat: File does not exist: /user/hadoop/out/_logs

The three count lines are the MapReduce result: the words in the two text files have been counted correctly (hello 3 times, hadoop 2, world 1). The "File does not exist" message is just -cat failing on the _logs directory and can be ignored. This is enough to confirm that Hadoop has been installed and configured successfully.
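The framework counters in the job log can also be reproduced by hand: Map output records=6 is the total number of words across both files; the combiner aggregates within each map task, so Combine output records=5 (3 distinct words from test1.txt plus 2 from test2.txt); the reducer then merges those 5 partial counts into Reduce output records=3 distinct words. A small sketch of that record flow, assuming one map task per file and simple whitespace tokenization (illustrative only, no Hadoop involved):

```python
# Sketch of the map -> combine -> reduce record flow behind the
# counter values in the job log (illustrative, not Hadoop itself).
from collections import Counter

# One map task per input file (Launched map tasks=2).
splits = ["hello world hello hadoop", "hello hadoop"]

map_output_records = 0
combine_output = []  # (word, count) pairs emitted by each map task's combiner
for split in splits:
    words = split.split()
    map_output_records += len(words)               # Map output records=6
    combine_output.extend(Counter(words).items())  # combiner aggregates per map task

reduce_input_records = len(combine_output)         # Reduce input records=5
final = Counter()
for word, count in combine_output:
    final[word] += count                           # reducer sums the partial counts
reduce_output_records = len(final)                 # Reduce output records=3

print(map_output_records, reduce_input_records, reduce_output_records)  # prints: 6 5 3
```

The final counts also match part-r-00000: hadoop 2, hello 3, world 1.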

Copyright notice: this is the blogger's original article; please include a link to this article when reposting.
