panguoyuan's Column

I. Setting up a Mahout development environment with Maven

package com.panguoyuan.mahout.itemcf;

import java.io.File;
import java.io.IOException;
import java.util.List;

import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.impl.common.LongPrimitiveIterator;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.EuclideanDistanceSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.recommender.RecommendedItem;
import org.apache.mahout.cf.taste.recommender.Recommender;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;

public class UserCF {

    final static int NEIGHBORHOOD_NUM = 2; // number of nearest neighbors per user
    final static int RECOMMENDER_NUM = 3;  // number of items to recommend per user

    public static void main(String[] args) throws IOException, TasteException {
        // Load the ratings file into a data model
        String file = "inputdata/item.csv";
        DataModel model = new FileDataModel(new File(file));

        // User-to-user similarity based on Euclidean distance
        UserSimilarity user = new EuclideanDistanceSimilarity(model);
        NearestNUserNeighborhood neighbor = new NearestNUserNeighborhood(NEIGHBORHOOD_NUM, user, model);
        Recommender r = new GenericUserBasedRecommender(model, neighbor, user);

        // Print the recommendations for every user in the model
        LongPrimitiveIterator iter = model.getUserIDs();
        while (iter.hasNext()) {
            long uid = iter.nextLong();
            List<RecommendedItem> list = r.recommend(uid, RECOMMENDER_NUM);
            System.out.printf("uid:%s", uid);
            for (RecommendedItem ritem : list) {
                System.out.printf("(%s,%f)", ritem.getItemID(), ritem.getValue());
            }
            System.out.println();
        }
    }
}
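To make the similarity and neighborhood steps in UserCF more concrete, here is a minimal, Mahout-free sketch. It assumes a textbook Euclidean similarity of 1 / (1 + distance) over co-rated items, which is not necessarily the exact formula Mahout's EuclideanDistanceSimilarity uses. The class name, the method names, and the sample ratings are all made up for illustration, and Map.of requires Java 9 or later.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: textbook formulas and made-up data, not Mahout's internal code.
class EuclideanNeighborSketch {

    // Similarity in (0, 1]: 1 / (1 + Euclidean distance over co-rated items).
    // Returns 0 when the two users share no rated item.
    static double similarity(Map<Long, Double> a, Map<Long, Double> b) {
        double sumSq = 0.0;
        int shared = 0;
        for (Map.Entry<Long, Double> e : a.entrySet()) {
            Double other = b.get(e.getKey());
            if (other != null) {
                double diff = e.getValue() - other;
                sumSq += diff * diff;
                shared++;
            }
        }
        return shared == 0 ? 0.0 : 1.0 / (1.0 + Math.sqrt(sumSq));
    }

    // Returns the IDs of the n users most similar to the target user, best first.
    static List<Long> nearest(long target, int n, Map<Long, Map<Long, Double>> ratings) {
        Map<Long, Double> mine = ratings.get(target);
        List<Long> others = new ArrayList<>();
        for (Long uid : ratings.keySet()) {
            if (uid != target) {
                others.add(uid);
            }
        }
        others.sort(Comparator
                .comparingDouble((Long uid) -> similarity(mine, ratings.get(uid)))
                .reversed());
        return new ArrayList<>(others.subList(0, Math.min(n, others.size())));
    }

    // Hypothetical ratings (userId -> itemId -> grade), invented for this example.
    static Map<Long, Map<Long, Double>> sampleRatings() {
        return Map.of(
                1L, Map.of(101L, 5.0, 102L, 3.0, 103L, 4.0),
                2L, Map.of(101L, 5.0, 102L, 3.0, 103L, 5.0),
                3L, Map.of(101L, 1.0, 102L, 1.0, 103L, 1.0),
                4L, Map.of(101L, 3.0, 102L, 3.0));
    }

    public static void main(String[] args) {
        Map<Long, Map<Long, Double>> ratings = sampleRatings();
        // Two nearest neighbors of user 1, mirroring NEIGHBORHOOD_NUM = 2 above
        System.out.println(nearest(1L, 2, ratings));
    }
}
```

With this toy data, user 2 differs from user 1 by one point on a single item and user 4 by two points on a single item, so they rank ahead of user 3, who disagrees on every item.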

(8) The result of running UserCF in Eclipse is as follows

II. Using the case-study dataset and Mahout, pick any one algorithm, produce collaborative-filtering recommendations for any one female user, and explain whether the recommendations are reasonable. The explanation may be written up as a document.

1. Algorithm chosen: user-based collaborative filtering (UserCF)

2. Algorithm model: DataModel + UserSimilarity + UserNeighborhood + UserBasedRecommender

3. Code

package com.panguoyuan.mahout.itemcf;

import java.io.File;
import java.util.List;

import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
import org.apache.mahout.cf.taste.recommender.RecommendedItem;
import org.apache.mahout.cf.taste.recommender.UserBasedRecommender;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;

public class BasedUserBookRecommender2 {

    public static void main(String[] args) throws Exception {
        long userId = 188;

        // Build the data model
        DataModel model = new FileDataModel(new File("inputdata/rating.csv"));

        // Create the user similarity measure
        UserSimilarity userSimilarity = new PearsonCorrelationSimilarity(model);
        //UserSimilarity userSimilarity = new EuclideanDistanceSimilarity(model);
        //GenericUserSimilarity genericUserSimilarity = new GenericUserSimilarity(userSimilarity, model);

        // Build the nearest-neighbor neighborhood (3 neighbors)
        UserNeighborhood neighborhood = new NearestNUserNeighborhood(3, userSimilarity, model);

        // Build the recommender
        UserBasedRecommender userBasedRecommender = new GenericUserBasedRecommender(model, neighborhood, userSimilarity);

        // Compute and return up to 5 book recommendations for the user
        List<RecommendedItem> recommendations = userBasedRecommender.recommend(userId, 5);

        // Print the recommendations
        showItems(userId, recommendations, true);
    }

    // Prints one line per user; when skip is false, users with no recommendations are omitted
    public static void showItems(long uid, List<RecommendedItem> recommendations, boolean skip) {
        if (skip || recommendations.size() > 0) {
            System.out.printf("userId:%s,", uid);
            for (RecommendedItem r : recommendations) {
                System.out.printf("(%s,%f)", r.getItemID(), r.getValue());
            }
            System.out.println();
        }
    }
}
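The DataModel + UserSimilarity + UserNeighborhood + UserBasedRecommender pipeline can be illustrated without Mahout. The sketch below computes a Pearson correlation over co-rated items and then predicts a rating as the similarity-weighted average of the neighbors' ratings. This is a common textbook scheme and only an approximation of what a user-based recommender does; Mahout's own handling of edge cases may differ. All names and ratings here are hypothetical, and Map.of and List.of require Java 9 or later.

```java
import java.util.List;
import java.util.Map;

// Illustrative sketch only: textbook Pearson similarity and weighted-average prediction.
class PearsonPredictionSketch {

    // Pearson correlation over the items both users rated; NaN if it is undefined.
    static double pearson(Map<Long, Double> a, Map<Long, Double> b) {
        double sumA = 0, sumB = 0, sumAA = 0, sumBB = 0, sumAB = 0;
        int n = 0;
        for (Map.Entry<Long, Double> e : a.entrySet()) {
            Double other = b.get(e.getKey());
            if (other == null) {
                continue; // item not rated by both users
            }
            double x = e.getValue();
            double y = other;
            sumA += x;
            sumB += y;
            sumAA += x * x;
            sumBB += y * y;
            sumAB += x * y;
            n++;
        }
        if (n < 2) {
            return Double.NaN;
        }
        double cov = sumAB - sumA * sumB / n;
        double varA = sumAA - sumA * sumA / n;
        double varB = sumBB - sumB * sumB / n;
        if (varA <= 0 || varB <= 0) {
            return Double.NaN; // a user with constant ratings has no defined correlation
        }
        return cov / Math.sqrt(varA * varB);
    }

    // Similarity-weighted average of the neighbors' ratings for itemId.
    // Neighbors with undefined or non-positive similarity are ignored (an assumption of this sketch).
    static double predict(Map<Long, Double> target, List<Map<Long, Double>> neighbors, long itemId) {
        double weighted = 0, total = 0;
        for (Map<Long, Double> neighbor : neighbors) {
            Double rating = neighbor.get(itemId);
            double sim = pearson(target, neighbor);
            if (rating != null && !Double.isNaN(sim) && sim > 0) {
                weighted += sim * rating;
                total += sim;
            }
        }
        return total == 0 ? Double.NaN : weighted / total;
    }

    public static void main(String[] args) {
        // Hypothetical ratings (itemId -> grade), invented for this example
        Map<Long, Double> target = Map.of(1L, 2.0, 2L, 4.0, 3L, 6.0);
        Map<Long, Double> n1 = Map.of(1L, 1.0, 2L, 2.0, 3L, 3.0, 9L, 8.0);
        Map<Long, Double> n2 = Map.of(1L, 3.0, 2L, 5.0, 3L, 7.0, 9L, 6.0);
        System.out.println(predict(target, List.of(n1, n2), 9L));
    }
}
```

Both toy neighbors are perfectly correlated with the target on the three shared items, so the predicted grade for item 9 is simply the mean of their two grades.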

4. Output

userId:188,(885,9.500000)(396,7.000000)(688,6.000000)

5. Manually analyzing the recommendations in R

(1) Import the data to analyze (rating.csv holds the ratings, user.csv the user information)

ratings=read.csv("F:/workspace1/mahout/inputdata/rating.csv",header=FALSE)
users=read.csv("F:/workspace1/mahout/inputdata/user.csv",header=FALSE)

(2) Rename the columns

ratings=data.frame('userid'=ratings$V1,'bookid'=ratings$V2,'grade'=ratings$V3)
users=data.frame('userid'=users$V1,'sex'=users$V2,'age'=users$V3)

(3) See which books user 188 has rated

> ratings[c(ratings$userid==188),]
     userid bookid grade
3760    188    798     6
3761    188    653     3
3762    188    426     6
3763    188    742     7
3764    188    549     2
3765    188    520     8
3766    188    312     2
3767    188    213    10
3768    188    954     5
3769    188    121    10
3770    188    204     9
3771    188    684     3
3772    188    493     4
3773    188    452     1
3774    188    622     3
3775    188    298     8

(4) Book 885 has the highest recommendation score. Next, check which users have rated this book

> ratings[c(ratings$bookid==885),]
     userid bookid grade
182       9    885     8
1225     60    885    10
3691    184    885     9

(5) Look up the profiles of users 9, 60, 184, and 188

> users[c(9,60,184,188),]
    userid sex age
9        9   M  50
60      60   F  49
184    184   M  27
188    188   F  24

(6) Check which books users 9, 60, and 184 have each rated in common with user 188
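Finding the books two users have both rated comes down to a set intersection over their book IDs. The sketch below shows one way to do it in plain Java, assuming rows shaped like rating.csv (userid, bookid, grade). The class name, the method names, and the sample rows are hypothetical; the rows are invented and are not taken from the real dataset.

```java
import java.util.Set;
import java.util.TreeSet;

// Illustrative sketch only: co-rated books as a set intersection, using made-up rows.
class CommonBooksSketch {

    // Hypothetical rows in rating.csv column order: {userid, bookid, grade}
    static final long[][] SAMPLE = {
        {1, 10, 8}, {1, 11, 6}, {1, 12, 9},
        {2, 11, 7}, {2, 12, 3}, {2, 13, 10}
    };

    // Collects the IDs of all books rated by the given user, in ascending order.
    static Set<Long> booksOf(long userId, long[][] rows) {
        Set<Long> books = new TreeSet<>();
        for (long[] row : rows) {
            if (row[0] == userId) {
                books.add(row[1]);
            }
        }
        return books;
    }

    // Returns the books rated by both users.
    static Set<Long> common(long userA, long userB, long[][] rows) {
        Set<Long> shared = booksOf(userA, rows);
        shared.retainAll(booksOf(userB, rows));
        return shared;
    }

    public static void main(String[] args) {
        System.out.println(common(1, 2, SAMPLE)); // prints [11, 12]
    }
}
```

Applied to the real data, the same call would be repeated for each of users 9, 60, and 184 against user 188. A larger overlap with similar grades is what would make a neighbor's high rating of book 885 a reasonable basis for the recommendation.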
