AI Audio Datasets (AI-ADS) 🎵, including Speech, Music, and Sound Effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development, and audio a…

597 45 Updated Jan 15, 2025

csteinmetz1 / ai-audio-startups

Community list of startups working with AI in audio and music technology

1,603 142 Updated Aug 9, 2024

emo-box / EmoBox

[INTERSPEECH 2024] EmoBox: Multilingual Multi-corpus Speech Emotion Recognition Toolkit and Benchmark

Python 189 8 Updated Jun 17, 2024

john852517791 / awesome-fake-audio-detection

A list of tools, papers and code related to Fake Audio Detection.

52 Updated Jan 20, 2025

nethermanpro / transvip

Python 153 13 Updated Nov 29, 2024

naginoa / LLMs_interview_notes

Forked from jackaduma/awesome_LLMs_interview_notes

LLMs interview notes and answers: this repository mainly collects interview questions and reference answers for large language model (LLM) algorithm engineers

418 110 Updated Oct 16, 2023

jackaduma / awesome_LLMs_interview_notes

LLMs interview notes and answers: this repository mainly collects interview questions and reference answers for large language model (LLM) algorithm engineers

1,182 290 Updated Dec 14, 2023

unilight / sheet

Speech Human Evaluation Estimation Toolkit (SHEET)

Python 49 6 Updated Nov 13, 2024

openaudiolab / LLaST

LLaST: Improved End-to-end Speech Translation System Leveraged by Large Language Models

Python 23 1 Updated Aug 11, 2024

yangdongchao / RSTnet

Real-time Speech-Text Foundation Model Toolkit (wip)

Python 126 11 Updated Oct 14, 2024

janhq / ichigo

Local realtime voice AI

Python 2,170 117 Updated Jan 17, 2025

yzGuu830 / efficient-speech-codec

[EMNLP 2024] ESC: Efficient Speech Coding with Cross-Scale Residual Vector Quantized Transformers

Jupyter Notebook 99 4 Updated Nov 28, 2024

hazukieq / rime-hakka

Hakka input scheme (Gaofeng Township, Guangxi)

1 1 Updated May 12, 2021

e1tts / e1tts.github.io

Python 6 Updated Sep 16, 2024

feizc / FluxMusic

Text-to-Music Generation with Rectified Flow Transformers

Python 1,656 128 Updated Dec 10, 2024

CanCLID / cantonese_orthography

Cantonese orthography

13 6 Updated Jul 22, 2020

gpt-omni / mini-omni

Open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.

Python 3,092 270 Updated Nov 5, 2024

tangyixuan / 2ndPassContextASR

Python 3 Updated Aug 8, 2024

zjysteven / lmms-finetune

A minimal codebase for finetuning large multimodal models, supporting llava-1.5/1.6, llava-interleave, llava-next-video, llava-onevision, llama-3.2-vision, qwen-vl, qwen2-vl, phi3-v etc.

Python 225 27 Updated Dec 23, 2024

0nutation / SpeechGPT

SpeechGPT Series: Speech Large Language Models

Python 1,326 89 Updated Jul 22, 2024

mct10 / RepCodec

Models and code for RepCodec: A Speech Representation Codec for Speech Tokenization

Python 165 11 Updated Jul 12, 2024

just-ai / speechflow

Python 15 3 Updated Jan 19, 2025

cyanbx / Prompt-Singer

Implementation of Prompt-Singer: Controllable Singing-Voice-Synthesis with Natural Language Prompt (NAACL'24).

Python 96 13 Updated Jan 17, 2025

facebookresearch / audioseal

Localized watermarking for AI-generated speech audio, with state-of-the-art robustness and a very fast detector

Python 508 65 Updated Oct 26, 2024

6drf21e / ChatTTS_Speaker

ChatTTS 2K Speaker Stability Score 🥇 & Categorized by Gender and Age 👧 & Audio Preview 🔈

Python 585 32 Updated Jul 2, 2024

xzm2004 xzm2004260

Stars

Yuan-ManX / MusicLLM-PyTorch

innnky / emotional-vits

FunAudioLLM / InspireMusic

X-LANCE / BER

CjangCjengh / TTSModels

Yuan-ManX / ai-audio-datasets