
[Parallel corpus source exploration] Lyrics corpus #92

Open
voidf opened this issue Nov 6, 2024 · 2 comments

voidf (Member) commented Nov 6, 2024

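A common way people learn a foreign language is by watching films and listening to songs. For films we already have the subtitle corpus work item, so I am wondering whether collecting lyrics could be a work item of its own.

We may need to assess whether the major music platforms allow us to collect their lyrics (or whether we would have to collect them on the quiet).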

voidf (Member, Author) commented Nov 23, 2024

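Here are the articles that need to be surveyed:

NetEase Cloud Music lyrics download:

Team members responsible for this area: please read through these articles, try the web links they give, try searching for lyrics, and then compile the lyrics sites that carry at least Chinese and English lyrics into a format like the following:

  • <site link 1>
  • <site link 2>
  • <site link 3>

This makes it easier for the crawler team members to collect and download the lyrics from these pages.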
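When trying out a candidate site, one quick way to check whether a lyrics page carries both Chinese and English is to classify each line by script. The following is only a rough sketch: the function names `classify_line` and `has_zh_and_en` and the `min_lines` threshold are hypothetical, and the heuristic (any CJK ideograph means Chinese, otherwise any Latin letter means English) is an assumption that will also count Japanese kanji lines as Chinese.

```python
import re

# Assumption: a line with at least one CJK Unified Ideograph is "zh";
# a line with Latin letters and no ideographs is "en".
# Japanese lines containing kanji would also be labelled "zh".
CJK = re.compile(r"[\u4e00-\u9fff]")
LATIN = re.compile(r"[A-Za-z]")


def classify_line(line: str) -> str:
    """Label one lyric line as 'zh', 'en', or 'other'."""
    if CJK.search(line):
        return "zh"
    if LATIN.search(line):
        return "en"
    return "other"


def has_zh_and_en(lyrics: str, min_lines: int = 2) -> bool:
    """Return True if the text has at least min_lines lines of each language."""
    counts = {"zh": 0, "en": 0, "other": 0}
    for line in lyrics.splitlines():
        line = line.strip()
        if line:  # skip blank lines
            counts[classify_line(line)] += 1
    return counts["zh"] >= min_lines and counts["en"] >= min_lines
```

A page passing this check only shows that both scripts are present; whether the lines are actually aligned translations still has to be judged by hand.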

Feasible sites:

voidf (Member, Author) commented Feb 15, 2025

Confirmed that, as of now, none of the lyrics sites above has data suitable for building a parallel corpus.
