awk: A Powerful Text Tool for Linux Bash Scripting

awk is genuinely complex, and in everyday work I use only a small part of it. I look things up as I go, so these notes collect the parts I use most, mainly to make them easier for me to find again.

* Invocation

    awk [-F field-separator] 'commands' input-file(s)

By default, whitespace (runs of spaces and tabs) is the field separator.

* Program structure

    awk 'BEGIN{} {command} END{}' input.txt

* Regular expressions

    \ ^ $ . [] | () * + ?

Note that grep and sed do not treat + (one or more occurrences) and ? (zero or one occurrence) as operators in their default (basic) regular expression mode; they need extended mode, for example grep -E.

* Matching and not matching

    awk '{ if ($3 ~ /pattern/) actions }' input.txt
    awk '{ if ($3 !~ /pattern/) actions }' input.txt
    awk '{ if ($3 == "abc") actions }' input.txt

An if statement has to sit inside an action block, so the braces are required. The same test can also be written as a pattern: $3 ~ /pattern/ { actions }.

* awk built-in variables

----------------------------------------------------------
ARGC       number of command-line arguments
ARGV       array of command-line arguments
ENVIRON    associative array of the system environment variables
FILENAME   name of the file awk is currently reading
FNR        record number within the current file
FS         input field separator, equivalent to the -F command-line option
NF         number of fields in the current record
NR         number of records read so far
OFS        output field separator
ORS        output record separator
RS         input record separator
----------------------------------------------------------

* awk built-in string functions

----------------------------------------------------------
gsub(r, s)        replace every match of r in $0 with s
gsub(r, s, t)     replace every match of r in t with s
index(s, t)       return the position of the first occurrence of string t in s
length(s)         return the length of s
match(s, r)       test whether s contains a substring matching r
split(s, a, fs)   split s into array a on separator fs
sprintf(fmt, exp) return exp formatted according to fmt
sub(r, s)         replace the leftmost longest substring of $0 that matches r with s
substr(s, p)      return the suffix of s starting at position p
substr(s, p, n)   return the substring of s of length n starting at position p
----------------------------------------------------------

$1, $2, ... are automatic variables that refer to the first field, the second field, and so on; $0 is the whole record. The BEGIN block runs first; once awk has read all input lines, it runs the END block (if there is one).

And now for a grand example:

    # This awk program collects statistics on two
    # "random variables" and the relationships
    # between them. It looks only at fields 1 and
    # 2 by default. Define the variables F and G
    # on the command line to force it to look at
    # different fields. For example:
    #   awk -f stat_2o1.awk F=2 G=3 stuff.dat \
    #       F=3 G=5 otherstuff.dat
    # or, from standard input:
    #   awk -f stat_2o1.awk F=1 G=3
    # It ignores blank lines, lines where either
    # one of the requested fields is empty, and
    # lines whose first field begins with a number
    # sign. It requires only one pass through the
    # data. This script works with vanilla awk
    # under SunOS 4.1.3.

    BEGIN {
      F = 1;
      G = 2;
    }

    length($F) > 0 && \
    length($G) > 0 && \
    $1 !~ /^#/ {
      # Running sums for the mean, variance and covariance.
      sx1 += $F; sx2 += $F*$F;
      sy1 += $G; sy2 += $G*$G;
      sxy1 += $F*$G;
      # Track the minimum and maximum of each variable.
      if (N == 0) xmax = xmin = $F;
      if (xmin > $F) xmin = $F;
      if (xmax < $F) xmax = $F;
      if (N == 0) ymax = ymin = $G;
      if (ymin > $G) ymin = $G;
      if (ymax < $G) ymax = $G;
      N++;
    }

    END {
      printf("%d # N\n", N);
      if (N <= 1) {
        printf("What's the point?\n");
        exit 1;
      }
      printf("%g # xmin\n", xmin);
      printf("%g # xmax\n", xmax);
      printf("%g # xmean\n", xmean = sx1/N);
      xSigma = sx2 - 2 * xmean * sx1 + N*xmean*xmean;
      printf("%g # xvar\n", xvar = xSigma/N);
      printf("%g # xvar unbiased\n", xvaru = xSigma/(N-1));
      printf("%g # xstddev\n", sqrt(xvar));
      printf("%g # xstddev unbiased\n", sqrt(xvaru));
      printf("%g # ymin\n", ymin);
      printf("%g # ymax\n", ymax);
      printf("%g # ymean\n", ymean = sy1/N);
      ySigma = sy2 - 2 * ymean * sy1 + N*ymean*ymean;
      printf("%g # yvar\n", yvar = ySigma/N);
      printf("%g # yvar unbiased\n", yvaru = ySigma/(N-1));
      printf("%g # ystddev\n", sqrt(yvar));
      printf("%g # ystddev unbiased\n", sqrt(yvaru));
      if (xSigma * ySigma <= 0)
        r = 0;
      else
        r = (sxy1 - xmean*sy1 - ymean*sx1 + N*xmean*ymean) / sqrt(xSigma * ySigma);
      printf("%g # correlation coefficient\n", r);
      # Braces keep both halves of each message inside its branch.
      if (r > 1 || r < -1) {
        printf("SERIOUS ERROR! CORRELATION COEFFICIENT");
        printf(" OUTSIDE RANGE -1..1\n");
      }
      if (1 - r*r != 0) {
        printf("%g # Student's T (use with N-2 degfreed)\n", \
               t = r*sqrt((N-2)/(1 - r*r)));
      } else {
        printf("0 # Correlation is perfect,");
        printf(" Student's T is plus infinity\n");
      }
      b = (sxy1 - ymean * sx1)/(sx2 - xmean * sx1);
      a = ymean - b * xmean;
      ss = sy2 - 2*a*sy1 - 2*b*sxy1 + N*a*a + 2*a*b*sx1 + b*b*sx2;
      # Note: dividing by N-2 assumes N > 2.
      ss /= N-2;
      printf("%g # a = y-intercept\n", a);
      printf("%g # b = slope\n", b);
      printf("%g # s^2 = unbiased estimator for sigsq\n", ss);
      printf("%g + %g * x # equation ready for cut-and-paste\n", a, b);
      ra = sqrt(ss * sx2 / (N * xSigma));
      rb = sqrt(ss / xSigma);
      printf("%g # radius of confidence interval for a, multiply by t\n", ra);
      printf("%g # radius of confidence interval for b, multiply by t\n", rb);
    }
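Before tackling a script of that size, it may help to see the basics from these notes in a few one-liners. The following is only a minimal sketch: the sample data and the shell variable names (data, admins, others, summary, strings) are invented for illustration and are not part of the original notes.

```shell
# Hypothetical sample data, invented for this sketch.
data='alice:30:admin
bob:25:dev
carol:41:admin'

# -F sets the input field separator. The pattern keeps rows whose
# third field matches /admin/; NR is the current record number.
admins=$(printf '%s\n' "$data" | awk -F: '$3 ~ /admin/ { print NR, $1 }')

# !~ keeps the rows whose third field does NOT match.
others=$(printf '%s\n' "$data" | awk -F: '$3 !~ /admin/ { print $1 }')

# BEGIN runs before any input is read, END after the last record.
# OFS is the separator print puts between comma-separated items.
summary=$(printf '%s\n' "$data" | awk -F: '
    BEGIN { OFS = "," }
    { total += $2 }
    END { print NR, total }')

# String functions: gsub returns the number of replacements it made
# in $0; substr, length and index work on the modified record.
strings=$(echo 'hello world' | awk '{
    n = gsub(/o/, "0")
    print $0, n, substr($0, 1, 5), length($0), index($0, "w")
}')

printf '%s\n' "$admins" "$others" "$summary" "$strings"
```

Capturing each result in a shell variable is just a convenience for inspecting the output; in practice the awk commands would usually read a file directly, as in the invocation form shown earlier.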
