This repository contains the PyTorch implementation of *DEGramNet: Effective audio analysis based on a fully learnable time-frequency representation*.
DEGramNet is a convolutional architecture for audio analysis tasks that addresses the limitations of current state-of-the-art algorithms. Unlike traditional hand-crafted, spectrogram-like representations, DEGramNet relies on a compact, fully trainable time-frequency representation called DEGram. Rather than fixing the filter parameters in advance, DEGram dynamically learns the frequencies of interest for the audio analysis task at hand. A custom time-frequency attention module within the DEGram representation denoises the audio signal in both the time and frequency domains: by amplifying the relevant frequency and time components of the sound, DEGramNet improves generalization, especially when training data is limited. This flexibility also allows the representation to adapt easily to different audio analysis problems, such as emphasizing voice frequencies for speaker recognition.
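As a rough intuition of how such a time-frequency attention can denoise a spectrogram-like input, the sketch below rescales frequency bins and time frames independently with squeeze-and-excitation-style gates. It is only an illustration of the idea: the class name, pooling, and layer sizes are assumptions, not the repository's actual TF-DELayer.

```python
import torch
import torch.nn as nn

class TimeFreqAttentionSketch(nn.Module):
    """Illustrative attention that reweights frequency bins and time
    frames independently. Simplified sketch, NOT the actual TF-DELayer."""

    def __init__(self, freq_bins: int, time_steps: int, reduction_ratio: float = 1 / 16):
        super().__init__()
        freq_hidden = max(1, int(freq_bins * reduction_ratio))
        time_hidden = max(1, int(time_steps * reduction_ratio))
        self.freq_gate = nn.Sequential(
            nn.Linear(freq_bins, freq_hidden), nn.ReLU(),
            nn.Linear(freq_hidden, freq_bins), nn.Sigmoid(),
        )
        self.time_gate = nn.Sequential(
            nn.Linear(time_steps, time_hidden), nn.ReLU(),
            nn.Linear(time_hidden, time_steps), nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 1, freq, time)
        freq_w = self.freq_gate(x.mean(dim=3).squeeze(1))  # (batch, freq)
        time_w = self.time_gate(x.mean(dim=2).squeeze(1))  # (batch, time)
        x = x * freq_w.unsqueeze(1).unsqueeze(3)           # rescale frequency bins
        x = x * time_w.unsqueeze(1).unsqueeze(2)           # rescale time frames
        return x
```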
To install the requirements:

```bash
git clone https://github.com/robertanto/DEGramNet-torch.git
cd DEGramNet-torch
pip install -r requirements.txt
```
Try it on Google Colab here!
```python
model = DEGramBasedModel(
    backbone,
    sincgram=True,
    attention=True,
    filters=64,
    order=4,
    num_spectrogram_bins=1025,
    sample_rate=16000,
    reduction_ratio=1/16,
    normalization=True,
    device='cpu'
)
```
- `backbone`: neural model taking as input a tensor of shape `(batch, 1, freq, time)`
- `sincgram`: `True` to use the SincGram layer, `False` otherwise
- `attention`: `True` to apply the TF-DELayer, `False` otherwise
- `filters`: number of filters in the SincGram layer
- `order`: Butterworth window order
- `num_spectrogram_bins`: number of input spectrogram bins (`n_fft // 2 + 1`; see the snippet after this list)
- `sample_rate`: input sample rate in Hz
- `reduction_ratio`: compression ratio used in the TF-DELayer
- `normalization`: `True` to normalize the output representation, `False` otherwise
- `device`: torch module device
- other: see the source code for more details
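The spectrogram-related parameters must stay consistent with the STFT settings used to compute the model input; in particular, `num_spectrogram_bins` is tied to `n_fft`:

```python
# num_spectrogram_bins must equal n_fft // 2 + 1 (one-sided spectrum)
n_fft = 2048
num_spectrogram_bins = n_fft // 2 + 1  # -> 1025, as in the example below
```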
```python
import torch

from degramnet_torch.audio_utils import waveform_to_Spectrogram
from degramnet_torch.degram import DEGramBasedModel
from degramnet_torch.resnet import resnet18

# CNN backbone
backbone = resnet18(num_classes=10)

# Get the DEGram-based model
model = DEGramBasedModel(
    backbone,
    sincgram=True,
    attention=True,
    num_spectrogram_bins=1025,
    sample_rate=16000,
    device='cpu'
)

# 10 seconds of audio sampled at 16 kHz
batch_size = 1
time_size = 160000

# Spectrogram processing
spectTransform = waveform_to_Spectrogram(
    sr=16000,
    win_sec=0.032,
    hop_sec=0.010,
    n_fft=2048
)

# Random signal transformation
x = torch.randn(batch_size, time_size)
spect = spectTransform(x).unsqueeze(1)  # add channel dim

# Model output
out = model(spect)
out[0].shape
```
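For completeness, here is a minimal sketch of a single training step built on the example above. It assumes the model returns class logits of shape `(batch, num_classes)`; the labels, optimizer, and loss are illustrative and not prescribed by this repository.

```python
import torch.nn as nn

# Hypothetical integer labels for the random batch above
labels = torch.randint(0, 10, (batch_size,))

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One illustrative optimization step
optimizer.zero_grad()
logits = model(spect)  # assumed shape: (batch, num_classes)
loss = criterion(logits, labels)
loss.backward()
optimizer.step()
```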
If you use this code in your research, please cite the following papers:
```bibtex
@article{foggia2023degramnet,
  title={Degramnet: effective audio analysis based on a fully learnable time-frequency representation},
  author={Foggia, Pasquale and Greco, Antonio and Roberto, Antonio and Saggese, Alessia and Vento, Mario},
  journal={Neural Computing and Applications},
  pages={1--13},
  year={2023},
  publisher={Springer}
}

@article{greco2021denet,
  title={DENet: a deep architecture for audio surveillance applications},
  author={Greco, Antonio and Roberto, Antonio and Saggese, Alessia and Vento, Mario},
  journal={Neural Computing and Applications},
  doi={10.1007/s00521-020-05572-5},
  pages={1--12},
  year={2021},
  publisher={Springer}
}
```