User-based and Item-based Collaborative Filtering algorithms written in Python
- Language: Python3
- IDE: Eclipse PyDev
- Prerequisite libraries: NumPy
- If you pass a prebuilt model, the recommender considers only the nearest neighbors stored in that model. Otherwise, it finds the K most similar neighbors for each target user, using the given similarity measure and the number of nearest neighbors (K).
- For unary data, the predicted score of an item is the average similarity of the nearest neighbors who rated that item.
- User similarity excludes neighbors whose similarity is zero or lower.
- The cosine similarity considers only co-rated items. (Other measures, such as the basic cosine similarity and the Pearson correlation coefficient, are also applicable.)
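The notes above can be illustrated with a minimal sketch. It is not the repository's implementation: the function names `cosine_co_rated`, `nearest_neighbors`, and `predict_unary` are hypothetical, and the data layout `{user: {item: rating}}` is an assumption.

```python
import math


def cosine_co_rated(ratings_a, ratings_b):
    """Cosine similarity computed only over items that both users rated."""
    common = set(ratings_a) & set(ratings_b)
    if not common:
        return 0.0
    dot = sum(ratings_a[i] * ratings_b[i] for i in common)
    norm_a = math.sqrt(sum(ratings_a[i] ** 2 for i in common))
    norm_b = math.sqrt(sum(ratings_b[i] ** 2 for i in common))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_neighbors(data, target, k):
    """Return up to k (similarity, user) pairs, most similar first.

    Neighbors with zero or negative similarity are dropped.
    """
    scored = []
    for other, ratings in data.items():
        if other == target:
            continue
        sim = cosine_co_rated(data[target], ratings)
        if sim > 0:
            scored.append((sim, other))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def predict_unary(data, neighbors, item):
    """Average similarity of the neighbors who rated `item` (0.0 if none did)."""
    sims = [sim for sim, user in neighbors if item in data[user]]
    return sum(sims) / len(sims) if sims else 0.0
```

With this sketch, a neighbor that shares no items with the target user gets similarity 0.0 and is therefore never used for prediction.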
UserID \t ItemID \t Rating \n
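As a rough sketch of how lines in this format might be read, the hypothetical helper below parses tab-separated records into a nested dictionary. It is not the repository's `tool.loadData`; the name `parse_ratings`, the `{user: {item: rating}}` layout, and the choice to ignore extra columns are all assumptions.

```python
from collections import defaultdict


def parse_ratings(lines):
    """Parse tab-separated UserID, ItemID, Rating lines into {user: {item: rating}}."""
    data = defaultdict(dict)
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        # Assumption: any columns after the third (e.g. a timestamp) are ignored.
        user, item, rating = line.rstrip("\n").split("\t")[:3]
        data[user.strip()][item.strip()] = float(rating)
    return dict(data)
```

Because an open text file iterates line by line, a file object can be passed directly, for example `parse_ratings(open(path))` for some path of your own.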
>>> import tool
>>> data = tool.loadData("/home/changuk/data/MovieLens/movielens.dat")
>>> from recommender import UserBased
>>> ubcf = UserBased()
>>> ubcf.loadData(data)
>>> import similarity
>>> simMeasure = similarity.cosine_intersection
>>> for user in data.keys():
...     recommendation = ubcf.Recommendation(user, simMeasure=simMeasure, nNeighbors=30)
>>> import tool
>>> data = tool.loadData("/home/changuk/data/MovieLens/movielens.dat")
>>> from recommender import ItemBased
>>> ibcf = ItemBased()
>>> ibcf.loadData(data)
>>> model = ibcf.buildModel(nNeighbors=20)
>>> for user in data.keys():
...     recommendation = ibcf.Recommendation(user, model=model)
>>> import tool
>>> trainSet = tool.loadData("/home/changuk/data/MovieLens/u1.base")
>>> testSet = tool.loadData("/home/changuk/data/MovieLens/u1.test")
>>> from recommender import UserBased
>>> ubcf = UserBased()
>>> ubcf.loadData(trainSet)
>>> model = ubcf.buildModel(nNeighbors=30)
>>> import validation
>>> result = validation.evaluateRecommender(testSet, ubcf, model=model, topN=10)
>>> print(result)
{'Precision': 0.050980392156862, 'Recall': 0.009698538130460, 'Hit-rate': 0.5098039215686}
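For readers unfamiliar with these metrics, here is a minimal sketch of one way top-N results could be scored. It is not the repository's `validation` module: the function name `evaluate_top_n` and its inputs are hypothetical, and the definitions are assumptions (Precision as hits per recommended item, Recall as hits per test item, Hit-rate as hits per test user). With a fixed list length of 10, these definitions make Hit-rate equal to 10 times Precision, which matches the ratio in the sample output, but the actual implementation may differ.

```python
def evaluate_top_n(recommendations, test_set):
    """Score top-N lists against held-out ratings.

    recommendations: {user: [item, ...]}, already truncated to N items.
    test_set: {user: {item: rating}} of held-out ratings.
    """
    hits = 0
    n_recommended = 0
    n_relevant = 0
    n_users = 0
    for user, relevant in test_set.items():
        recs = recommendations.get(user, [])
        n_users += 1
        hits += sum(1 for item in recs if item in relevant)
        n_recommended += len(recs)
        n_relevant += len(relevant)
    return {
        "Precision": hits / n_recommended if n_recommended else 0.0,
        "Recall": hits / n_relevant if n_relevant else 0.0,
        # Assumption: hit-rate is the number of hits per test user.
        "Hit-rate": hits / n_users if n_users else 0.0,
    }
```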
- Support binary data
- Implement similarity normalization in Item-based CF