Simple audio processing with MATLAB
The human ear and brain together perform complex sound processing, many aspects of which are imitated and reproduced in technology. One such capability is focusing on the sound of a specific source and separating it from varied background noise. This ability is an example of Auditory Scene Analysis (ASA) and is used in technology as a preprocessing step for speech recognition in unconstrained auditory environments. In this project, a binaural recording of multiple speakers is provided. Using the theory of Head-Related Transfer Functions (HRTF) [2], the voice of one given speaker is extracted from the mixed recording. The database used in this project is the ShATR Multiple Simultaneous Speaker Corpus.
- A. Bregman, Auditory Scene Analysis, Cambridge, MA: MIT Press, 1990.
- R. Duda, “Modeling Head Related Transfer Functions,” IEEE Proceedings of ASILOMAR 1993.
- E. Tessier and F. Berthommier, “Speech Enhancement and Segregation Based on the Localisation Cue for Cocktail-Party Processing,”
- This project is a computer assignment for the Digital Signal Processing course by Professor M. Akhaee in the spring of 2020 at the University of Tehran.
- I did not design the project; however, the solution provided on this page was written by me.