🎼 JamendoMaxCaps: A Large-Scale Music-Caption Dataset with Imputed Metadata

📌 Overview

JamendoMaxCaps is a large-scale dataset of 200,000+ instrumental tracks sourced from the Jamendo platform. It includes generated music captions and enhanced imputed metadata. We also introduce a retrieval system that leverages both musical features and metadata to identify similar songs, which are then used to fill in missing metadata using a local large language model (LLLM). This dataset supports research in music-language understanding, retrieval, representation learning, and AI-generated music tasks.

✨ Features

✅ 200,000+ Instrumental Tracks from Jamendo
✅ Music Captions generated with a state-of-the-art captioning model
✅ Metadata Imputation using a retrieval-enhanced LLM (Llama-2)
✅ Comprehensive Musical and Metadata Features:

🎵 MERT-based audio embeddings
📝 Flan-T5 metadata embeddings
🔍 Imputed metadata fields (genre, tempo, mood, instrumentation)

⚡ Installation Guide

git clone https://github.com/AMAAI-Lab/JamendoMaxCaps.git
cd JamendoMaxCaps
conda create -n jamendomaxcaps python=3.10
conda activate jamendomaxcaps
pip install -r requirements.txt

🚀 Usage

🎼 Extract MERT Features

python extract_mert.py

Ensure input and output folders are correctly configured.

📝 Get Metadata Features

python process_metadata.py

Adjust input and output folder paths accordingly.
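
Flan-T5 is a text model, so a metadata record has to be turned into a string before it can be embedded. How process_metadata.py does this is not shown here; the sketch below is one plausible approach, and the function name `metadata_to_text` and the field names are hypothetical.

```python
def metadata_to_text(record: dict) -> str:
    """Flatten a metadata record into one line of text, skipping empty fields."""
    parts = []
    for key, value in record.items():
        if value is None or value == "" or value == []:
            continue  # missing fields are left out rather than written as "None"
        if isinstance(value, (list, tuple)):
            value = ", ".join(str(v) for v in value)
        parts.append(f"{key}: {value}")
    return "; ".join(parts)

# Hypothetical record; real Jamendo metadata may use different field names.
example = {"genre": "ambient", "tags": ["calm", "piano"], "mood": None}
print(metadata_to_text(example))  # genre: ambient; tags: calm, piano
```

Leaving missing fields out of the string keeps placeholder tokens from influencing the embedding.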

🔍 Build Unified Retrieval System

python build_retrieval_system.py --weight_audio <weight_audio> --weight_metadata <weight_metadata>
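
The two weights control how much each modality contributes to similarity. The script's exact formula is not shown here; one common way to realize such a weighting, sketched below, is to L2-normalize each embedding, scale it by its weight, and concatenate the results. The function names and toy vectors are assumptions for illustration.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length; zero vectors are returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return list(vec) if norm == 0 else [x / norm for x in vec]

def fuse_embeddings(audio_vec, metadata_vec, weight_audio, weight_metadata):
    """Concatenate weighted, unit-length audio and metadata embeddings."""
    audio = [weight_audio * x for x in l2_normalize(audio_vec)]
    metadata = [weight_metadata * x for x in l2_normalize(metadata_vec)]
    return audio + metadata

# Toy 2-D vectors standing in for MERT and Flan-T5 embeddings.
fused = fuse_embeddings([3.0, 4.0], [0.0, 2.0], weight_audio=0.7, weight_metadata=0.3)
print(fused)
```

Under this scheme, the dot product of two fused vectors equals weight_audio squared times the audio cosine similarity plus weight_metadata squared times the metadata cosine similarity, so a single nearest-neighbor search covers both modalities.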

🎶 Find Top Similar Songs

python retrieve_similar_entries.py --config <config_file_path>
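
Retrieval of this kind usually amounts to ranking all indexed tracks by similarity to the query and keeping the best k. The sketch below uses a brute-force cosine-similarity scan in plain Python; the function names, track IDs, and vectors are hypothetical, and the actual script may rely on an approximate nearest-neighbor index instead.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 0.0 if norm_a == 0 or norm_b == 0 else dot / (norm_a * norm_b)

def top_k_similar(query_id, index, k=2):
    """Return the k track IDs most similar to query_id, excluding the query itself."""
    query = index[query_id]
    scored = [
        (cosine_similarity(query, vec), track_id)
        for track_id, vec in index.items()
        if track_id != query_id
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [track_id for _, track_id in scored[:k]]

# Toy index mapping hypothetical track IDs to embedding vectors.
index = {
    "track_a": [1.0, 0.0],
    "track_b": [0.9, 0.1],
    "track_c": [0.0, 1.0],
    "track_d": [0.5, 0.5],
}
print(top_k_similar("track_a", index, k=2))
```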

🛠️ Run Metadata Imputation

python metadata_imputation.py
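
Conceptually, imputation hands the LLM the known fields of a track together with the metadata of its retrieved neighbors and asks for the missing field. The prompt wording, the function name `build_imputation_prompt`, and the field names below are assumptions for illustration, not the prompt used by metadata_imputation.py; the model call itself is omitted.

```python
def build_imputation_prompt(target, neighbors, missing_field):
    """Build a text prompt asking an LLM to fill one missing metadata field."""
    lines = ["Metadata of similar tracks:"]
    for i, neighbor in enumerate(neighbors, start=1):
        fields = ", ".join(f"{k}={v}" for k, v in neighbor.items())
        lines.append(f"{i}. {fields}")
    # Only the fields that are actually known are shown for the target track.
    known = ", ".join(f"{k}={v}" for k, v in target.items() if v is not None)
    lines.append(f"Target track: {known}")
    lines.append(f"Predict the missing field '{missing_field}' for the target track.")
    return "\n".join(lines)

# Hypothetical records; the target is missing its genre.
target = {"title": "Morning Light", "mood": "calm", "genre": None}
neighbors = [
    {"genre": "ambient", "mood": "calm"},
    {"genre": "ambient", "mood": "dreamy"},
]
prompt = build_imputation_prompt(target, neighbors, "genre")
print(prompt)
```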

📖 Citation

If you use JamendoMaxCaps, please cite:

@article{royjamendomaxcaps2025,
  author    = {Abhinaba Roy and Renhang Liu and Tongyu Lu and Dorien Herremans},
  title     = {JamendoMaxCaps: A Large-Scale Music-Caption Dataset with Imputed Metadata},
  year      = {2025},
  journal   = {arXiv:xxxxx}
}

🤝 Acknowledgments

JamendoMaxCaps is built upon Creative Commons-licensed music from the Jamendo platform and leverages advanced AI models, including MERT, Flan-T5, and Llama-2. Special thanks to the research community for their invaluable contributions to open-source AI development!

📜 Read the Paper | 🎵 Explore the Dataset
