Rethinking CNN Models for Audio Classification

This repository contains the PyTorch code for our paper Rethinking CNN Models for Audio Classification. The experiments are conducted on three publicly available datasets: ESC-50, UrbanSound8K, and GTZAN.

Preprocessing

Preprocessing is run once, ahead of training, so that the spectrograms are stored on disk and do not have to be recomputed while the models train.

For ESC-50:

python preprocessing/preprocessingESC.py --csv_file /path/to/file.csv --data_dir /path/to/audio_data/ --store_dir /path/to/store_spectrograms/ --sampling_rate 44100

For UrbanSound8K:

python preprocessing/preprocessingUSC.py --csv_file /path/to/csv_file/ --data_dir /path/to/audio_data/ --store_dir /path/to/store_spectrograms/

For GTZAN:

python preprocessing/preprocessingGTZAN.py --data_dir /path/to/audio_data/ --store_dir /path/to/store_spectrograms/ --sampling_rate 22050
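
The commands above share one pattern: read a list of audio files, compute features once, and write them to a store directory for training to read later. The sketch below illustrates that pattern only and is not the repository's actual script. The flag names are taken from the commands above; the default sampling rate, the `filename` CSV column, the `.pkl` output format, and the `extract_features` placeholder are all assumptions made for illustration.

```python
import argparse
import csv
import pickle
from pathlib import Path


def build_parser():
    # Flag names mirror the commands above; types and defaults are assumptions.
    parser = argparse.ArgumentParser(description="Precompute features before training")
    parser.add_argument("--csv_file", type=Path, required=True)
    parser.add_argument("--data_dir", type=Path, required=True)
    parser.add_argument("--store_dir", type=Path, required=True)
    parser.add_argument("--sampling_rate", type=int, default=44100)
    return parser


def extract_features(audio_path, sampling_rate):
    # Hypothetical placeholder: a real script would load the audio here and
    # compute a spectrogram. This version only records what it was asked to do.
    return {"source": audio_path.name, "sampling_rate": sampling_rate}


def preprocess(csv_file, data_dir, store_dir, sampling_rate, filename_column="filename"):
    # filename_column is an assumed column name; check it against the dataset's CSV.
    store_dir.mkdir(parents=True, exist_ok=True)
    outputs = []
    with open(csv_file, newline="") as handle:
        for row in csv.DictReader(handle):
            audio_path = data_dir / row[filename_column]
            out_path = store_dir / (audio_path.stem + ".pkl")
            if not out_path.exists():
                # Only compute features that are not already cached on disk.
                with open(out_path, "wb") as out:
                    pickle.dump(extract_features(audio_path, sampling_rate), out)
            outputs.append(out_path)
    return outputs

# Possible entry point:
#   args = build_parser().parse_args()
#   preprocess(args.csv_file, args.data_dir, args.store_dir, args.sampling_rate)
```

Because existing output files are skipped, rerunning the sketch after an interruption only processes the files that are still missing.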

Training the Models

The training configurations are provided in the config folder, and sample_config.json documents every variable they use. To train a model, run:

python train.py --config_path /config/your_config.json
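
As a rough illustration of how a JSON config passed via --config_path might be read and checked before training starts, consider the sketch below. It does not reproduce train.py. The key names in `REQUIRED_KEYS` and the `load_config` helper are hypothetical; the variables the repository actually expects are the ones described in sample_config.json.

```python
import json

# Hypothetical key names, chosen only for illustration.
# The real variables are documented in sample_config.json.
REQUIRED_KEYS = {"model", "data_dir", "batch_size", "epochs", "lr"}


def load_config(config_path):
    # Parse the JSON file into a plain dictionary.
    with open(config_path) as handle:
        config = json.load(handle)
    # Fail early, before any training work, if a required key is absent.
    missing = REQUIRED_KEYS - config.keys()
    if missing:
        raise KeyError(f"config is missing keys: {sorted(missing)}")
    return config
```

Validating the config up front means a misspelled or missing key surfaces immediately rather than partway through a training run.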
