This repository hosts a working Python 3 implementation of the reassignment algorithm for generating hyper-resolution spectrograms from audio signals, described in Gardner & Magnasco, 2006. The resulting histograms are sparse time-frequency representations of the original signal, with much higher resolution than conventional spectrograms (usually generated with the short-time Fourier transform). The algorithm can be applied to a wide range of signals, from click-like to tonal. It has been used, for example, in projects that examine dolphin vocal communication and automatic disease detection through voice recordings.
There are several implementations within the reassignment folder; reassignment_linear.py contains the implementation closest to the original Matlab code provided by Dr. Magnasco. For more detailed information, please read the specifications within each file.
An example routine is provided in the example.py file. You can run
$ python3 example.py --data_dir your_dir
to batch-process all audio files in the directory of your choice. The routine picks up files ending in .wav (this can easily be changed to any other extension) and, for each one, saves the histogram matrix (in .npy format) and an image representation.
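Once the batch run has finished, the saved matrices can be read back with NumPy for further analysis. The following is only a minimal sketch: the helper names `load_histograms` and `nonzero_fraction` and the `output_dir` argument are hypothetical and not part of this repository, and the sketch assumes the .npy files sit directly inside the directory you pass in.

```python
from pathlib import Path

import numpy as np


def load_histograms(output_dir):
    """Load every .npy file found directly in output_dir.

    Returns a dict mapping each file's stem (name without extension)
    to the array stored in it. `output_dir` is a hypothetical location;
    point it at wherever the example routine wrote its results.
    """
    return {p.stem: np.load(p) for p in sorted(Path(output_dir).glob("*.npy"))}


def nonzero_fraction(hist):
    """Fraction of non-zero bins, a rough indicator of how sparse a histogram is."""
    return np.count_nonzero(hist) / hist.size
```

Calling `load_histograms("your_dir")` and then `nonzero_fraction` on each entry gives a quick check that the reassigned histograms are as sparse as expected.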
The current implementation is based on the original one and on the Matlab code. The previous implementation by Radek Osmulski is deprecated due to an update in the PyTorch package. The author of this repository:
- adapted the original implementation to the newer PyTorch methods
- simplified the code (removed some unnecessary tensor allocations)
- changed the functions to take in mono (single-channel) signals, instead of the two-channel signals used in the original implementation [1]
All of this work was completed in summer 2021 under the supervision of Dr. Magnasco (co-author of the method) and with support from Mr. Osmulski. The author would like to thank Mr. Osmulski and Dr. Magnasco for their code and patient explanations.
Footnotes
[1] This is to make the method more generic. To process multi-channel signals, simply create a loop and call the method to analyze each channel separately.
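The per-channel loop mentioned in the footnote could look roughly like the sketch below. It assumes the signal is a NumPy array of shape (n_samples, n_channels). `analyze_mono` is a hypothetical placeholder for whichever mono analysis function you pick from the reassignment folder, and is not a name defined in this repository.

```python
import numpy as np


def analyze_each_channel(signal, analyze_mono):
    """Run a mono analysis function on every channel of a signal.

    signal       -- array of shape (n_samples,) or (n_samples, n_channels)
    analyze_mono -- hypothetical callable that accepts a 1-D array
                    (one channel) and returns its analysis result
    Returns a list with one result per channel, in channel order.
    """
    signal = np.asarray(signal)
    if signal.ndim == 1:
        # Treat a mono signal as a single-column multi-channel signal.
        signal = signal[:, np.newaxis]
    return [analyze_mono(signal[:, ch]) for ch in range(signal.shape[1])]
```

Keeping the analysis function as a parameter means the same loop works with any of the implementations, and a mono input simply yields a one-element list.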