A few shell commands for working with files: deleting the last column, deleting the first line, diff, and more

Delete the first line of a file: sed '1d' filename (the result goes to standard output; the file itself is not modified)
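A minimal sketch to check the command on a throwaway file. The file name sample.txt and its contents are made up for illustration, and tail -n +2 is shown only as an equivalent alternative, not something the original script uses.

```shell
# Create a temporary sample file (hypothetical contents).
tmpdir=$(mktemp -d)
printf 'header\nrow1\nrow2\n' > "$tmpdir/sample.txt"

# sed '1d' deletes line 1 and prints the remaining lines.
without_first_sed=$(sed '1d' "$tmpdir/sample.txt")

# tail -n +2 starts printing at line 2, which gives the same result.
without_first_tail=$(tail -n +2 "$tmpdir/sample.txt")

echo "$without_first_sed"
rm -rf "$tmpdir"
```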

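Keep only the last column of a file: awk '{print $NF}' filename (this prints the last field of each line; to delete the last column instead, GNU awk accepts awk 'NF{NF--}1' filename)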
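Since awk '{print $NF}' prints the last field rather than removing it, the following sketch contrasts the two operations. The file cols.txt and its contents are hypothetical, and the loop is just one portable way to drop the last field; it is not taken from the original post.

```shell
# Create a temporary sample file with three whitespace-separated columns.
tmpdir=$(mktemp -d)
printf 'a b c\nd e f\n' > "$tmpdir/cols.txt"

# Print only the last field of each line.
last_only=$(awk '{print $NF}' "$tmpdir/cols.txt")

# Drop the last field: rebuild each line from fields 1..NF-1,
# joined with the output field separator (a single space by default).
without_last=$(awk '{
    out = ""
    for (i = 1; i < NF; i++) out = out (i > 1 ? OFS : "") $i
    print out
}' "$tmpdir/cols.txt")

echo "$last_only"
echo "$without_last"
rm -rf "$tmpdir"
```

Note that rebuilding the line this way replaces any run of spaces or tabs between fields with a single space.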

Two ways to compare files:

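1) comm -3 --nocheck-order file1 file2: prints the lines that appear in only one of the two files (--nocheck-order is a GNU option that skips the check that both inputs are sorted)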

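2) grep -v -f file1 file2: prints the lines of file2 that match none of the lines of file1. The lines of file1 are read as patterns, so add -F -x to compare whole lines literally.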

And of course there is diff file1 file2.
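To make the difference between the three tools concrete, here is a small sketch on two hypothetical, already sorted files. It adds -F -x to grep so that lines are compared literally and in full, which is an assumption about the intent rather than what the original command does.

```shell
# Two small sorted sample files (hypothetical contents).
tmpdir=$(mktemp -d)
printf 'apple\nbanana\ncherry\n' > "$tmpdir/file1"
printf 'banana\ncherry\ndate\n'  > "$tmpdir/file2"

# comm -3 hides lines common to both files: lines only in file1 are
# printed unindented, lines only in file2 are indented by one tab.
comm_out=$(comm -3 "$tmpdir/file1" "$tmpdir/file2")

# grep -v -f prints the lines of file2 that match nothing in file1;
# -F treats each pattern as a fixed string, -x requires a whole-line match.
only_in_file2=$(grep -v -x -F -f "$tmpdir/file1" "$tmpdir/file2")

# diff exits with status 1 when the files differ, 0 when they are identical.
diff_status=0
diff "$tmpdir/file1" "$tmpdir/file2" > "$tmpdir/diff.out" || diff_status=$?

echo "$comm_out"
echo "$only_in_file2"
echo "diff exit status: $diff_status"
rm -rf "$tmpdir"
```

comm reports differences in both directions, while the grep form only reports what is new in the second file, which is what a "not yet processed" list needs.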

Here is a shell script I wrote yesterday:

#!/bin/bash
# The right-hand sides of the next four assignments (date commands) were lost from the original post.
date_time=...
yesterday=...
today=...
date_day_time=...
mkdir /home/spamdetect/changxiaojia/workspace/finance/same_similar_news_mining/same_similiar_log/$today

# begin to get input files which have not been dealt with yet
today_input=/home/crawler/petabyte/crawllog/news_data/$today
yesterday_input=/home/crawler/petabyte/crawllog/news_data/$yesterday
/opt/hadoop/program/bin/hadoop fs -ls $yesterday_input/ > /home/spamdetect/changxiaojia/workspace/finance/same_similar_news_mining/mid_files/all_get
/opt/hadoop/program/bin/hadoop fs -ls $today_input/ >> /home/spamdetect/changxiaojia/workspace/finance/same_similar_news_mining/mid_files/all_get
# The command names on the next two lines were lost; sed '1d' and awk '{print $NF}' are presumed from the file names and the commands listed above.
sed '1d' /home/spamdetect/changxiaojia/workspace/finance/same_similar_news_mining/mid_files/all_get > /home/spamdetect/changxiaojia/workspace/finance/same_similar_news_mining/mid_files/all_get_without_first_line
awk '{print $NF}' /home/spamdetect/changxiaojia/workspace/finance/same_similar_news_mining/mid_files/all_get_without_first_line > /home/spamdetect/changxiaojia/workspace/finance/same_similar_news_mining/mid_files/all_input
#comm -3 --nocheck-order /home/spamdetect/changxiaojia/workspace/finance/same_similar_news_mining/mid_files/all_input /home/spamdetect/changxiaojia/workspace/finance/same_similar_news_mining/mid_files/input_done > /home/spamdetect/changxiaojia/workspace/finance/same_similar_news_mining/mid_files/today_diff
grep -v -f /home/spamdetect/changxiaojia/workspace/finance/same_similar_news_mining/mid_files/input_done /home/spamdetect/changxiaojia/workspace/finance/same_similar_news_mining/mid_files/all_input > /home/spamdetect/changxiaojia/workspace/finance/same_similar_news_mining/mid_files/today_diff
# The command name on the next line was lost from the original post.
... /home/spamdetect/changxiaojia/workspace/finance/same_similar_news_mining/mid_files/today_diff > /home/spamdetect/changxiaojia/workspace/finance/same_similar_news_mining/mid_files/today_new_input
mv /home/spamdetect/changxiaojia/workspace/finance/same_similar_news_mining/mid_files/all_input /home/spamdetect/changxiaojia/workspace/finance/same_similar_news_mining/mid_files/input_done

# begin to compute same/similar news
inputfile1=""
while read line
do
    inputfile1=$inputfile1,${line}
done < /home/spamdetect/changxiaojia/workspace/finance/same_similar_news_mining/mid_files/input_done
echo "$inputfile1"
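The loop at the end of the script joins every line of input_done into one comma-separated string. A minimal sketch of that step on a hypothetical three-line file follows; the paste variant is an alternative I am suggesting, not part of the original script.

```shell
# A stand-in for input_done with made-up paths.
tmpdir=$(mktemp -d)
printf 'path/a\npath/b\npath/c\n' > "$tmpdir/input_done"

# Same approach as the script: prepend a comma before each line.
# The result therefore starts with a comma.
joined_loop=""
while read -r line
do
    joined_loop="$joined_loop,$line"
done < "$tmpdir/input_done"

# paste -s joins all lines of the file into one; -d, uses a comma
# as the separator, so there is no leading comma.
joined_paste=$(paste -s -d, "$tmpdir/input_done")

echo "$joined_loop"
echo "$joined_paste"
rm -rf "$tmpdir"
```

If the consumer of the string does not expect an empty first entry, the paste form (or stripping the first character with ${joined_loop#,}) avoids the leading comma.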
