Add C++ runtime for speech enhancement GTCRN models #1977

csukuangfj · 2025-03-10T09:37:19Z

See also
https://github.com/Xiaobin-Rong/gtcrn

CC @yuyun2000 @Xiaobin-Rong

Usage

Build sherpa-onnx from source

cd /path/to/sherpa-onnx

mkdir build
cd build
cmake ..
make

Download models and test files

cd /path/to/sherpa-onnx
curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/gtcrn_simple.onnx
curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/speech_with_noise.wav

ls -lh gtcrn_simple.onnx
-rw-r--r--  1 fangjun  staff   523K Mar 10 12:55 gtcrn_simple.onnx
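
Without --fail, curl saves whatever body the server returns, so a mistyped URL can leave an HTML error page on disk under the expected file name. The following is a minimal sketch of a sanity check for the downloaded test file, assuming it is a standard RIFF/WAVE file. The helper name looks_like_wav is hypothetical and not part of sherpa-onnx.

```shell
# Hypothetical helper (not part of sherpa-onnx): succeed only if the file
# is non-empty and starts with the RIFF/WAVE magic bytes of a standard WAV.
looks_like_wav() {
  [ -s "$1" ] || return 1                 # missing or empty file
  riff=$(head -c 4 "$1")                  # bytes 0-3 should be "RIFF"
  wave=$(head -c 12 "$1" | tail -c 4)     # bytes 8-11 should be "WAVE"
  [ "$riff" = "RIFF" ] && [ "$wave" = "WAVE" ]
}
```

For example, running looks_like_wav ./speech_with_noise.wav || echo "download looks broken" >&2 would flag a bad download before the denoiser is run.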

Run it

cd /path/to/sherpa-onnx

./build/bin/sherpa-onnx-offline-denoiser \
  --debug=1 \
  --speech-denoiser-gtcrn-model=./gtcrn_simple.onnx \
  --input-wav=./speech_with_noise.wav \
  --output-wav=./enhanced_speech_16k.wav

speech_with_noise.mov

enhanced_16k.mov
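
The file names above end in 16k, which suggests 16 kHz audio. As a rough sketch for checking the sample rate of an input or output file, the helper below reads it straight from the WAV header. It assumes a canonical header (the "fmt " chunk directly follows the 12-byte RIFF header, so the rate sits at byte offset 24) and a little-endian host. The name wav_sample_rate is hypothetical.

```shell
# Hypothetical helper: print the sample rate stored in a canonical WAV header.
# Assumptions: the "fmt " chunk starts at byte 12 (rate at offset 24) and the
# host is little-endian, so od decodes the 4-byte field correctly.
wav_sample_rate() {
  od -An -tu4 -j24 -N4 "$1" | tr -d '[:space:]'
}
```

Calling wav_sample_rate ./enhanced_speech_16k.wav prints the rate in Hz. Files with extra chunks before "fmt " would need a real WAV parser instead.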

Test 2

wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav

./build/bin/sherpa-onnx-offline-denoiser \
  --debug=1 \
  --speech-denoiser-gtcrn-model=./gtcrn_simple.onnx  \
  --input-wav=./inp_16k.wav \
  --output-wav=./enhanced_16k-2.wav

inp_16k.mov

enhanced_16k-2.mov
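
The two runs above differ only in the input and output paths, so several files can be processed with a small loop. This is a sketch rather than a tool shipped with sherpa-onnx: denoise_dir, DENOISER, MODEL, and the enhanced-<name> output naming are all assumptions, and only the flags shown in this PR are passed to the binary.

```shell
# Hypothetical batch wrapper; only the flags shown in this PR are used.
# DENOISER and MODEL can be overridden from the environment.
DENOISER=${DENOISER:-./build/bin/sherpa-onnx-offline-denoiser}
MODEL=${MODEL:-./gtcrn_simple.onnx}

# Denoise every *.wav in directory $1, writing enhanced-<name> into $2.
# Stops and returns non-zero at the first file that fails.
denoise_dir() {
  in_dir=$1
  out_dir=$2
  mkdir -p "$out_dir" || return 1
  for wav in "$in_dir"/*.wav; do
    [ -e "$wav" ] || continue             # no match: the glob stays literal
    name=$(basename "$wav")
    "$DENOISER" \
      --speech-denoiser-gtcrn-model="$MODEL" \
      --input-wav="$wav" \
      --output-wav="$out_dir/enhanced-$name" || return 1
  done
}
```

A call such as denoise_dir ./noisy ./enhanced (both directory names are placeholders) would run the denoiser once per file.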

yuyun2000 · 2025-03-10T10:29:45Z

It is really powerful!

altunenes · 2025-03-10T14:50:32Z

Is there a method you recommend for handling parallel/overlapping speech? segmentation-3.0 is quite inadequate for parallel speech and causes problems, especially when identifying the speakers (speaker recognition).

csukuangfj · 2025-03-11T02:05:59Z

> Is there a method you recommend for handling parallel/overlapping speech? segmentation-3.0 is quite inadequate for parallel speech and causes problems, especially when identifying the speakers (speaker recognition).

I suggest that you ask this question in https://github.com/pyannote/pyannote-audio
@altunenes

altunenes · 2025-03-11T10:22:11Z

thanks

csukuangfj added 6 commits March 10, 2025 12:31

Begin to add C++ runtime for gtcrn speech denoiser models

edcdaab

Update kaldi-native-fbank

26c46b5

First working version

aee9290

Add CI

47464f4

Update README

950dd2a

Fix style issues

13a3ffa

csukuangfj merged commit 488a6e6 into k2-fsa:master Mar 10, 2025
170 of 214 checks passed

csukuangfj deleted the cpp-gtcrn branch March 10, 2025 10:11
