How to obtain the evaluation metrics commonly used in the recommender systems field #9

Open
guchi-ac opened this issue Dec 28, 2024 · 2 comments

@guchi-ac

Thank you for open-sourcing this code, which is very concise and easy to configure, but I have a small question. I read your paper EasyRL4Rec: An Easy-to-use Library for Reinforcement Learning Based Recommender Systems, and Section 5.2 states: "Furthermore, EasyRL4Rec provides abundant evaluation metrics that are commonly used in the field of Reinforcement Learning and Recommender Systems, which are summarised in Table 3." I would like to know how to obtain metrics such as MAE, MSE, and RMSE. The command I am currently running is:

```
python examples/policy/run_DQN.py --env KuaiEnv-v0 --seed 2023 --cuda 0 --which_tracker avg --reward_handle "cat" --window_size 3 --target-update-freq 80 --explore_eps 0.001 --read_message "pointneg" --message "DQN"
```

However, the output log file contains only reinforcement-learning-related metrics and nothing about MAE or MSE. I would be very grateful if you could tell me how to obtain these metrics.

@yuyq18
Collaborator

yuyq18 commented Dec 31, 2024

Thank you for your kind words and for using EasyRL4Rec!

Regarding your question on how to obtain metrics like MAE, MSE, and RMSE, I'd like to clarify that in Section 5.2 of our paper we wrote: "We also provide commonly used metrics in traditional RSs for evaluating user models or other recommendation models, such as Normalized Discounted Cumulative Gain (NDCG), HitRate, etc." For RL scenarios, however, due to the nature of multi-step interactions, metrics like $R_{cumu}$ (cumulative reward), interaction length, and others that track the progress of the agent are generally more appropriate. These metrics assess the learning and decision-making process over time rather than per-prediction errors like MAE or RMSE.
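For concreteness, here is a minimal sketch (not code from EasyRL4Rec; the function and variable names are purely illustrative) of how such trajectory-level metrics are typically computed from a single evaluation episode:

```python
def trajectory_metrics(rewards):
    # rewards: per-step rewards collected over one user-interaction episode
    r_cumu = sum(rewards)   # cumulative reward R_cumu over the whole trajectory
    length = len(rewards)   # interaction length: number of steps before the user exits
    return {"R_cumu": r_cumu, "Length": length}

# Example: a 4-step episode
print(trajectory_metrics([1.0, 0.5, 0.0, 2.0]))  # {'R_cumu': 3.5, 'Length': 4}
```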

That being said, our repository does support adding custom metrics. The code for implementing NDCG, MRR, and other metrics can be found in src/core/evaluation/metrics.py. If you'd like to integrate these traditional metrics into the RL policy evaluation, you can refer to the implementation of Evaluator_User_Experience() in src/core/evaluation/evaluator.py, specifically the on_epoch_end callback function. This function is called at the end of each epoch during training and allows you to compute additional metrics. You can modify this to include MAE, MSE, or any other metric you require.
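If it helps, below is a rough sketch of how one might wire MAE, MSE, and RMSE into such a callback. This is not the library's actual API: the class, the attribute names (y_true, y_pred), and the results-dict convention are assumptions that would need to be adapted to the buffers actually collected in src/core/evaluation/evaluator.py.

```python
import numpy as np

def compute_error_metrics(y_true, y_pred):
    # MAE, MSE, and RMSE between predicted and ground-truth rewards/ratings
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    mae = float(np.abs(err).mean())
    mse = float((err ** 2).mean())
    return {"MAE": mae, "MSE": mse, "RMSE": float(np.sqrt(mse))}

class ErrorMetricEvaluator:
    # Hypothetical evaluator mirroring the on_epoch_end hook of
    # Evaluator_User_Experience(): collect predictions during evaluation,
    # then merge the error metrics into the logged results at epoch end.
    def __init__(self):
        self.y_true, self.y_pred = [], []

    def on_epoch_end(self, results: dict) -> dict:
        results.update(compute_error_metrics(self.y_true, self.y_pred))
        return results
```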

@guchi-ac
Author

guchi-ac commented Jan 3, 2025

Thank you for the reply, that clears it up for me.
