首页 文章 精选 留言 我的

精选列表

搜索[hadoop],共8437篇文章
优秀的个人博客,低调大师

mahout demo——本质上是基于Hadoop的分步式算法实现,比如多节点的数据合并,数据排序,网路通信的效率,节点宕机重算,数据分步式...

摘自:http://blog.fens.me/mahout-recommendation-api/ 测试程序:RecommenderTest.java 测试数据集:item.csv 1,101,5.0 1,102,3.0 1,103,2.5 2,101,2.0 2,102,2.5 2,103,5.0 2,104,2.0 3,101,2.5 3,104,4.0 3,105,4.5 测试程序:org.conan.mymahout.recommendation.job.RecommenderTest.java package org.conan.mymahout.recommendation.job; import java.io.IOException; import java.util.List; import org.apache.mahout.cf.taste.common.TasteException; import org.apache.mahout.cf.taste.eval.RecommenderBuilder; import org.apache.mahout.cf.taste.impl.common.LongPrimitiveIterator; import org.apache.mahout.cf.taste.model.DataModel; import org.apache.mahout.cf.taste.recommender.RecommendedItem; import org.apache.mahout.common.RandomUtils; public class RecommenderTest { final static int NEIGHBORHOOD_NUM = 2; final static int RECOMMENDER_NUM = 3; public static void main(String[] args) throws TasteException, IOException { RandomUtils.useTestSeed(); String file = "datafile/item.csv"; DataModel dataModel = RecommendFactory.buildDataModel(file); slopeOne(dataModel); } public static void userCF(DataModel dataModel) throws TasteException{} public static void itemCF(DataModel dataModel) throws TasteException{} public static void slopeOne(DataModel dataModel) throws TasteException{} ... 每种算法都一个单独的方法进行算法测试,如userCF(),itemCF(),slopeOne()…. 5. 基于用户的协同过滤算法UserCF 基于用户的协同过滤,通过不同用户对物品的评分来评测用户之间的相似性,基于用户之间的相似性做出推荐。简单来讲就是:给用户推荐和他兴趣相似的其他用户喜欢的物品。 举例说明: 基于用户的 CF 的基本思想相当简单,基于用户对物品的偏好找到相邻邻居用户,然后将邻居用户喜欢的推荐给当前用户。计算上,就是将一个用户对所有物品的偏好作为一个向量来计算用户之间的相似度,找到 K 邻居后,根据邻居的相似度权重以及他们对物品的偏好,预测当前用户没有偏好的未涉及物品,计算得到一个排序的物品列表作为推荐。图 2 给出了一个例子,对于用户 A,根据用户的历史偏好,这里只计算得到一个邻居 – 用户 C,然后将用户 C 喜欢的物品 D 推荐给用户 A。 上文中图片和解释文字,摘自:https://www.ibm.com/developerworks/cn/web/1103_zhaoct_recommstudy2/ 算法API: org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender @Override public float estimatePreference(long userID, long itemID) throws TasteException { DataModel model = getDataModel(); Float actualPref = model.getPreferenceValue(userID, itemID); if (actualPref != null) { return actualPref; } long[] theNeighborhood = neighborhood.getUserNeighborhood(userID); return doEstimatePreference(userID, theNeighborhood, itemID); } protected float doEstimatePreference(long theUserID, long[] theNeighborhood, long itemID) throws TasteException { if (theNeighborhood.length == 0) { return Float.NaN; } DataModel dataModel = getDataModel(); double preference = 0.0; double totalSimilarity = 0.0; int count = 0; for (long userID : theNeighborhood) { if (userID != theUserID) { // See GenericItemBasedRecommender.doEstimatePreference() too Float pref = dataModel.getPreferenceValue(userID, itemID); if (pref != null) { double theSimilarity = similarity.userSimilarity(theUserID, userID); if (!Double.isNaN(theSimilarity)) { preference += theSimilarity * pref; totalSimilarity += theSimilarity; count++; } } } } // Throw out the estimate if it was based on no data points, of course, but also if based on // just one. This is a bit of a band-aid on the 'stock' item-based algorithm for the moment. // The reason is that in this case the estimate is, simply, the user's rating for one item // that happened to have a defined similarity. The similarity score doesn't matter, and that // seems like a bad situation. if (count <= 1) { return Float.NaN; } float estimate = (float) (preference / totalSimilarity); if (capper != null) { estimate = capper.capEstimate(estimate); } return estimate; } 测试程序: public static void userCF(DataModel dataModel) throws TasteException { UserSimilarity userSimilarity = RecommendFactory.userSimilarity(RecommendFactory.SIMILARITY.EUCLIDEAN, dataModel); UserNeighborhood userNeighborhood = RecommendFactory.userNeighborhood(RecommendFactory.NEIGHBORHOOD.NEAREST, userSimilarity, dataModel, NEIGHBORHOOD_NUM); RecommenderBuilder recommenderBuilder = RecommendFactory.userRecommender(userSimilarity, userNeighborhood, true); RecommendFactory.evaluate(RecommendFactory.EVALUATOR.AVERAGE_ABSOLUTE_DIFFERENCE, recommenderBuilder, null, dataModel, 0.7); RecommendFactory.statsEvaluator(recommenderBuilder, null, dataModel, 2); LongPrimitiveIterator iter = dataModel.getUserIDs(); while (iter.hasNext()) { long uid = iter.nextLong(); List list = recommenderBuilder.buildRecommender(dataModel).recommend(uid, RECOMMENDER_NUM); RecommendFactory.showItems(uid, list, true); } } 程序输出: AVERAGE_ABSOLUTE_DIFFERENCE Evaluater Score:1.0 Recommender IR Evaluator: [Precision:0.5,Recall:0.5] uid:1,(104,4.333333)(106,4.000000) uid:2,(105,4.049678) uid:3,(103,3.512787)(102,2.747869) uid:4,(102,3.000000) 本文转自张昺华-sky博客园博客,原文链接:http://www.cnblogs.com/bonelee/p/7243282.html,如需转载请自行联系原作者

资源下载

更多资源
优质分享App

优质分享App

近一个月的开发和优化,本站点的第一个app全新上线。该app采用极致压缩,本体才4.36MB。系统里面做了大量数据访问、缓存优化。方便用户在手机上查看文章。后续会推出HarmonyOS的适配版本。

腾讯云软件源

腾讯云软件源

为解决软件依赖安装时官方源访问速度慢的问题,腾讯云为一些软件搭建了缓存服务。您可以通过使用腾讯云软件源站来提升依赖包的安装速度。为了方便用户自由搭建服务架构,目前腾讯云软件源站支持公网访问和内网访问。

Spring

Spring

Spring框架(Spring Framework)是由Rod Johnson于2002年提出的开源Java企业级应用框架,旨在通过使用JavaBean替代传统EJB实现方式降低企业级编程开发的复杂性。该框架基于简单性、可测试性和松耦合性设计理念,提供核心容器、应用上下文、数据访问集成等模块,支持整合Hibernate、Struts等第三方框架,其适用范围不仅限于服务器端开发,绝大多数Java应用均可从中受益。

Sublime Text

Sublime Text

Sublime Text具有漂亮的用户界面和强大的功能,例如代码缩略图,Python的插件,代码段等。还可自定义键绑定,菜单和工具栏。Sublime Text 的主要功能包括:拼写检查,书签,完整的 Python API , Goto 功能,即时项目切换,多选择,多窗口等等。Sublime Text 是一个跨平台的编辑器,同时支持Windows、Linux、Mac OS X等操作系统。

用户登录
用户注册