🎼 JamendoMaxCaps: A Large-Scale Music-Caption Dataset with Imputed Metadata

📌 Overview

JamendoMaxCaps is a large-scale dataset of more than 200,000 instrumental tracks sourced from the Jamendo platform. It includes generated music captions and metadata whose missing fields have been imputed. We also introduce a retrieval system that uses both musical features and metadata to identify similar songs; the metadata of those songs is then used to fill in missing fields with a local large language model (LLLM). The dataset supports research in music-language understanding, retrieval, representation learning, and AI-generated music tasks.

✨ Features

• 200,000+ instrumental tracks from Jamendo
• Music captions generated by a state-of-the-art captioning model
• Metadata imputation using a retrieval-enhanced LLM (Llama-2)
• Comprehensive musical and metadata features:

  • 🎵 MERT-based audio embeddings
  • 📝 Flan-T5 metadata embeddings
  • 🔍 Imputed metadata fields (genre, tempo, mood, instrumentation)

⚡ Installation Guide

git clone https://github.com/AMAAI-Lab/JamendoMaxCaps.git
cd JamendoMaxCaps
conda create -n jamendomaxcaps python=3.10
conda activate jamendomaxcaps
pip install -r requirements.txt

🚀 Usage

🎼 Extract MERT Features

python extract_mert.py

Ensure input and output folders are correctly configured.

📝 Get Metadata Features

python process_metadata.py

Adjust input and output folder paths accordingly.
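A text encoder such as Flan-T5 needs each metadata record as a string before it can embed it. The sketch below shows one possible serialization, assuming the fields listed under Features (genre, tempo, mood, instrumentation). The function name `metadata_to_text` and the record layout are hypothetical and are not taken from `process_metadata.py`.

```python
def metadata_to_text(record, fields=("genre", "tempo", "mood", "instrumentation")):
    """Flatten a metadata dict into one string, skipping missing fields.

    Hypothetical helper: the field names follow the README's feature list,
    but the actual record format used by the repository is an assumption.
    """
    parts = []
    for field in fields:
        value = record.get(field)
        # Treat None, empty strings, and empty lists as missing.
        if value is None or value == "" or value == []:
            continue
        # Join multi-valued fields (e.g. several instruments) with commas.
        if isinstance(value, (list, tuple)):
            value = ", ".join(str(v) for v in value)
        parts.append(f"{field}: {value}")
    return "; ".join(parts)


# Toy record with a missing mood field.
example = {
    "genre": "jazz",
    "tempo": 120,
    "mood": None,
    "instrumentation": ["piano", "double bass"],
}
text = metadata_to_text(example)
print(text)
```

Skipping missing fields keeps the text free of placeholder tokens, so the embedding reflects only the metadata that is actually known.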

🔍 Build Unified Retrieval System

python build_retrival_system.py --weight_audio <weight_audio> --weight_metadata <weight_metadata>
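One way to interpret the two weights is sketched here. It is an assumption, not a description of what `build_retrival_system.py` actually does. Each embedding is L2-normalized, scaled by the square root of its weight, and the two are concatenated. The dot product of two fused vectors then equals `weight_audio * cos_audio + weight_metadata * cos_metadata`. All function names and toy vectors are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; an all-zero vector stays all-zero."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return [0.0 for _ in vec]
    return [x / norm for x in vec]


def fuse(audio_vec, meta_vec, weight_audio, weight_metadata):
    """Build one fused vector from an audio and a metadata embedding.

    Scaling each normalized part by sqrt(weight) makes the dot product of
    two fused vectors a weighted sum of the per-modality cosine similarities.
    """
    if weight_audio < 0 or weight_metadata < 0:
        raise ValueError("weights must be non-negative")
    scale_audio = math.sqrt(weight_audio)
    scale_meta = math.sqrt(weight_metadata)
    audio_part = [scale_audio * x for x in l2_normalize(audio_vec)]
    meta_part = [scale_meta * x for x in l2_normalize(meta_vec)]
    return audio_part + meta_part


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


# Toy example: identical audio embeddings, orthogonal metadata embeddings.
track_a = fuse([1.0, 0.0], [0.0, 2.0], weight_audio=0.7, weight_metadata=0.3)
track_b = fuse([1.0, 0.0], [3.0, 0.0], weight_audio=0.7, weight_metadata=0.3)
similarity = dot(track_a, track_b)  # 0.7 * 1.0 + 0.3 * 0.0
print(round(similarity, 6))
```

With this construction a single nearest-neighbor index over the fused vectors serves both modalities, and changing the weights only requires rebuilding the fused vectors.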

🎶 Find Top Similar Songs

python retrieve_similar_entries.py --config <config_file_path>
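Conceptually, retrieval ranks every other track by its similarity to the query and keeps the best few. The sketch below does this with a brute-force cosine search over a toy in-memory index. The index layout, track IDs, and the function `top_k_similar` are hypothetical, and the format of the config file expected by `retrieve_similar_entries.py` is not shown here. A dataset of this size would typically call for an approximate nearest-neighbor index instead of a linear scan.

```python
import heapq
import math


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_k_similar(query_id, index, k=2):
    """Return the k most similar tracks as (track_id, score), best first."""
    query = index[query_id]
    scored = (
        (cosine(query, vec), track_id)
        for track_id, vec in index.items()
        if track_id != query_id  # never return the query itself
    )
    return [(track_id, score) for score, track_id in heapq.nlargest(k, scored)]


# Toy index mapping hypothetical track IDs to embedding vectors.
index = {
    "track_1": [1.0, 0.0],
    "track_2": [0.9, 0.1],
    "track_3": [0.0, 1.0],
    "track_4": [0.5, 0.5],
}
neighbours = top_k_similar("track_1", index, k=2)
for track_id, score in neighbours:
    print(track_id, round(score, 3))
```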

🛠️ Run Metadata Imputation

python metadata_imputation.py
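Retrieval-enhanced imputation amounts to showing the LLM the metadata of similar tracks and asking it to fill in the target's missing field. The sketch below builds such a prompt as plain text. The prompt wording, the function `build_imputation_prompt`, and the record layout are assumptions made for illustration; the prompt actually used by `metadata_imputation.py` may differ.

```python
def build_imputation_prompt(target, neighbours, missing_field):
    """Assemble a prompt that asks an LLM for one missing metadata field.

    target:        dict of the target track's metadata (missing values are None)
    neighbours:    list of metadata dicts for the retrieved similar tracks
    missing_field: name of the field to impute, e.g. "mood"
    """
    lines = ["The following tracks are similar to the target track.", ""]
    for i, meta in enumerate(neighbours, start=1):
        # Only show fields that have a value for this neighbour.
        known = ", ".join(f"{k}: {v}" for k, v in meta.items() if v)
        lines.append(f"Similar track {i}: {known}")
    lines.append("")
    # Show what is already known about the target, excluding the missing field.
    target_known = ", ".join(
        f"{k}: {v}" for k, v in target.items() if v and k != missing_field
    )
    lines.append(f"Target track: {target_known}")
    lines.append(
        f"Based on the similar tracks, what is the most likely {missing_field} "
        "of the target track? Answer with the value only."
    )
    return "\n".join(lines)


# Toy example: the target track has no mood label.
target = {"genre": "ambient", "mood": None, "tempo": "slow"}
neighbours = [
    {"genre": "ambient", "mood": "calm"},
    {"genre": "downtempo", "mood": "relaxed"},
]
prompt = build_imputation_prompt(target, neighbours, "mood")
print(prompt)
```

The resulting string can be passed to whichever local model is in use; asking for the value only keeps the answer easy to parse back into the metadata record.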

📖 Citation

If you use JamendoMaxCaps, please cite:

@article{royjamendomaxcaps2025,
  author    = {Abhinaba Roy and Renhang Liu and Tongyu Lu and Dorien Herremans},
  title     = {JamendoMaxCaps: A Large-Scale Music-Caption Dataset with Imputed Metadata},
  year      = {2025},
  journal   = {arXiv:xxxxx}
}

🤝 Acknowledgments

JamendoMaxCaps is built upon Creative Commons-licensed music from the Jamendo platform and leverages advanced AI models, including MERT, Flan-T5, and Llama-2. Special thanks to the research community for their invaluable contributions to open-source AI development!


📜 Read the Paper | 🎵 Explore the Dataset
