Using Map-Reduce to Compute Statistics from a Web Server access.log File

1.6. Map-Reduce
1.6.1. Using Map-Reduce to compute statistics from a web server access.log file

First, import the web server's access.log into MongoDB; see http://netkiller.github.io/article/log.html for how to do this. Each imported document has the following format:

{
    "_id" : ObjectId("51553efcd8616be7e5395c0d"),
    "remote_addr" : "192.168.2.76",
    "remote_user" : "-",
    "time_local" : "29/Mar/2013:09:20:31 +0800",
    "request" : "GET /tw/ad.jpg HTTP/1.1",
    "status" : "200",
    "body_bytes_sent" : "5557",
    "http_referer" : "http://www.example.com/tw/",
    "http_user_agent" : "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1312.57 Safari/537.17",
    "http_x_forwarded_for" : "-"
}
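The field names above match nginx log variables, so a log line can be turned into such a document with one regular expression. The following is only a sketch: it assumes the log was written with the format `$remote_addr - $remote_user [$time_local] "$request" $status $body_bytes_sent "$http_referer" "$http_user_agent" "$http_x_forwarded_for"`, and the function name parseAccessLogLine is hypothetical. It is not the import method described in the linked article.

```javascript
// Assumed log format (not confirmed by the article):
// $remote_addr - $remote_user [$time_local] "$request" $status
// $body_bytes_sent "$http_referer" "$http_user_agent" "$http_x_forwarded_for"
const LOG_PATTERN =
  /^(\S+) - (\S+) \[([^\]]+)\] "([^"]*)" (\d{3}) (\d+) "([^"]*)" "([^"]*)" "([^"]*)"$/;

// Hypothetical helper: returns a document shaped like the one above,
// or null when the line does not match the assumed format.
function parseAccessLogLine(line) {
  const m = LOG_PATTERN.exec(line.trim());
  if (m === null) {
    return null;
  }
  // Values are kept as strings, as in the sample document.
  return {
    remote_addr: m[1],
    remote_user: m[2],
    time_local: m[3],
    request: m[4],
    status: m[5],
    body_bytes_sent: m[6],
    http_referer: m[7],
    http_user_agent: m[8],
    http_x_forwarded_for: m[9],
  };
}

const sampleLine =
  '192.168.2.76 - - [29/Mar/2013:09:20:31 +0800] "GET /tw/ad.jpg HTTP/1.1" ' +
  '200 5557 "http://www.example.com/tw/" "Mozilla/5.0 (Windows NT 6.1; WOW64) ' +
  'AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1312.57 Safari/537.17" "-"';

const parsedDoc = parseAccessLogLine(sampleLine);
console.log(parsedDoc);
```

In the mongo shell, a document produced this way could then be stored with db.access.insert(parsedDoc).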

Create the map function

var mapFunction1 = function() {
    // Emit one count per request, keyed by the client IP address.
    emit(this.remote_addr, {count: 1});
};

Create the reduce function

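var reduceFunction1 = function(key, values) {
    var total = 0;
    values.forEach(function (value) {
        total += value.count;
    });
    // Return the same shape that map emits, so the result can be
    // passed back into reduce; the key is already stored as _id.
    return {count: total};
};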
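MongoDB may call reduce more than once for the same key, feeding earlier reduce results back in as input, and it does not call reduce at all for a key with a single value. The value returned by reduce should therefore have the same shape as the value emitted by map. The sketch below imitates this in plain JavaScript so it can be run outside the shell; simulateMapReduce, its batchSize parameter, and the sample documents are hypothetical stand-ins, not MongoDB's actual implementation.

```javascript
// Stand-in for the shell's global emit(): collects values per key.
let emitted = new Map();
function emit(key, value) {
  if (!emitted.has(key)) {
    emitted.set(key, []);
  }
  emitted.get(key).push(value);
}

const mapFunction1 = function () {
  emit(this.remote_addr, { count: 1 });
};

const reduceFunction1 = function (key, values) {
  let total = 0;
  values.forEach(function (value) {
    total += value.count;
  });
  return { count: total }; // same shape as the emitted value
};

// Hypothetical simulator: reduces each key in small batches, so that
// earlier reduce results are fed back into reduce (re-reduce).
function simulateMapReduce(docs, map, reduce, batchSize) {
  const size = Math.max(2, batchSize); // batches of 1 would never shrink
  emitted = new Map();
  docs.forEach(function (doc) {
    map.call(doc); // `this` is the current document, as in the shell
  });
  const result = {};
  for (const [key, values] of emitted) {
    let pending = values;
    while (pending.length > 1) {
      const next = [];
      for (let i = 0; i < pending.length; i += size) {
        const batch = pending.slice(i, i + size);
        // A lone value is passed through without calling reduce.
        next.push(batch.length > 1 ? reduce(key, batch) : batch[0]);
      }
      pending = next;
    }
    result[key] = pending[0];
  }
  return result;
}

const docs = [
  { remote_addr: "192.168.2.76" },
  { remote_addr: "192.168.2.76" },
  { remote_addr: "192.168.2.76" },
  { remote_addr: "192.168.6.2" },
];
const counts = simulateMapReduce(docs, mapFunction1, reduceFunction1, 2);
console.log(counts);
```

With a batch size of 2, the three values for 192.168.2.76 are reduced in two passes, and the second pass receives the output of the first. A reduce function that returned a different shape, or relied on fields that map never emits, would produce inconsistent results under this kind of re-reduce.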

Run the analysis

db.access.mapReduce(mapFunction1, reduceFunction1, {out : "resultCollection"});
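The same per-address count can also be expressed as an aggregation pipeline, which MongoDB recommends over mapReduce in recent releases (mapReduce is deprecated as of MongoDB 5.0). This is a sketch rather than a drop-in replacement: the output collection name resultCollectionAgg is an assumption, and the resulting documents have the form { _id, count } instead of { _id, value: { count } }.

```javascript
// Group requests by client address and count the documents in each group.
const pipeline = [
  { $group: { _id: "$remote_addr", count: { $sum: 1 } } },
  { $sort: { count: -1 } },          // busiest clients first
  { $out: "resultCollectionAgg" },   // hypothetical output collection name
];

// `db` only exists inside the mongo shell; the guard lets this file
// load in other JavaScript environments without error.
if (typeof db !== "undefined") {
  // toArray() forces the pipeline to run; with $out it returns no documents.
  db.access.aggregate(pipeline).toArray();
}
```

After running it in the shell, db.resultCollectionAgg.find() lists one document per client address.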

View the results

db.resultCollection.find();

Output from the demo database

> db.resultCollection.find();
{ "_id" : "192.168.2.109", "value" : { "count" : 554 } }
{ "_id" : "192.168.2.38", "value" : { "count" : 26 } }
{ "_id" : "192.168.2.39", "value" : { "count" : 72 } }
{ "_id" : "192.168.2.40", "value" : { "count" : 3564 } }
{ "_id" : "192.168.2.49", "value" : { "count" : 955 } }
{ "_id" : "192.168.2.5", "value" : { "count" : 2 } }
{ "_id" : "192.168.2.76", "value" : { "count" : 60537 } }
{ "_id" : "192.168.3.12", "value" : { "count" : 9577 } }
{ "_id" : "192.168.3.14", "value" : { "count" : 343 } }
{ "_id" : "192.168.3.18", "value" : { "count" : 1006 } }
{ "_id" : "192.168.3.26", "value" : { "count" : 2714 } }
{ "_id" : "192.168.6.19", "value" : { "count" : 668 } }
{ "_id" : "192.168.6.2", "value" : { "count" : 123760 } }
{ "_id" : "192.168.6.30", "value" : { "count" : 1196 } }
{ "_id" : "192.168.6.35", "value" : { "count" : 1050 } }
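The output above is ordered by _id, so the busiest clients are not obvious at a glance. In the shell, db.resultCollection.find().sort({ "value.count": -1 }).limit(3) returns the three largest counts. The sketch below shows the same ordering in plain JavaScript on a few rows copied from the output above; the variable names are illustrative only.

```javascript
// A few rows copied from the demo output.
const rows = [
  { _id: "192.168.2.109", value: { count: 554 } },
  { _id: "192.168.2.76", value: { count: 60537 } },
  { _id: "192.168.3.12", value: { count: 9577 } },
  { _id: "192.168.6.2", value: { count: 123760 } },
];

// Sort a copy by count, descending, and keep the top three.
const topClients = rows
  .slice()
  .sort(function (a, b) {
    return b.value.count - a.value.count;
  })
  .slice(0, 3)
  .map(function (row) {
    return row._id + " " + row.value.count;
  });

console.log(topClients.join("\n"));
// 192.168.6.2 123760
// 192.168.2.76 60537
// 192.168.3.12 9577
```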
