Add C++ runtime for speech enhancement GTCRN models #1977

Merged
merged 6 commits into from
Mar 10, 2025

csukuangfj
Collaborator

See also
https://github.com/Xiaobin-Rong/gtcrn

CC @yuyun2000 @Xiaobin-Rong

Usage

Build sherpa-onnx from source

cd /path/to/sherpa-onnx

mkdir build
cd build
cmake ..
make

Download models and test files

cd /path/to/sherpa-onnx
curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/gtcrn_simple.onnx
curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/speech_with_noise.wav
ls -lh gtcrn_simple.onnx
-rw-r--r--  1 fangjun  staff   523K Mar 10 12:55 gtcrn_simple.onnx
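Before running the denoiser, it may help to confirm that both downloads actually produced non-empty files, since a failed download can leave a missing or zero-byte file behind. The following is a minimal sketch; `require_file` is a hypothetical helper name introduced here for illustration and is not part of sherpa-onnx.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of sherpa-onnx): succeed only if the
# given path exists and is non-empty.
require_file() {
  local path="$1"
  if [ ! -s "$path" ]; then
    echo "missing or empty file: $path" >&2
    return 1
  fi
}

# Possible usage after the curl commands above:
#   require_file ./gtcrn_simple.onnx && require_file ./speech_with_noise.wav
```

This only checks existence and size. It does not verify that the file is a valid ONNX model or WAV file.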

Run it

cd /path/to/sherpa-onnx

./build/bin/sherpa-onnx-offline-denoiser \
  --debug=1 \
  --speech-denoiser-gtcrn-model=./gtcrn_simple.onnx \
  --input-wav=./speech_with_noise.wav \
  --output-wav=./enhanced_speech_16k.wav

Attachments: speech_with_noise.mov, enhanced_16k.mov, and two screenshots (2025-03-10 17:12:47 and 17:12:26).
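The output file name above suggests a 16 kHz result. One way to check the sample rate of a generated WAV file without extra tools is to read the four bytes at offset 24 of the header. This is a rough sketch that assumes a canonical RIFF/WAVE layout in which the `fmt ` chunk immediately follows the 12-byte RIFF header; `wav_sample_rate` is a hypothetical helper name, not something provided by sherpa-onnx.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the sample rate stored in a WAV header.
# Assumption: canonical layout, so the little-endian 32-bit sample rate
# sits at byte offset 24. Files with extra chunks before "fmt " will
# give a wrong answer.
wav_sample_rate() {
  local file="$1"
  local b
  # Read 4 bytes starting at offset 24 as unsigned decimal values.
  # shellcheck disable=SC2207
  b=($(od -An -t u1 -j 24 -N 4 "$file"))
  # Combine the bytes in little-endian order.
  echo $(( b[0] + (b[1] << 8) + (b[2] << 16) + (b[3] << 24) ))
}

# Possible usage after the run above:
#   wav_sample_rate ./enhanced_speech_16k.wav
```

Reading single bytes and combining them by hand keeps the result independent of the host byte order.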

Test 2

wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav
./build/bin/sherpa-onnx-offline-denoiser \
  --debug=1 \
  --speech-denoiser-gtcrn-model=./gtcrn_simple.onnx  \
  --input-wav=./inp_16k.wav \
  --output-wav=./enhanced_16k-2.wav

Attachments: inp_16k.mov, enhanced_16k-2.mov, and two screenshots (2025-03-10 17:31:51 and 17:32:34).
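Both tests above invoke the same binary with the same three flags, so processing several files can be sketched as a small loop. The flags are the ones shown in this PR. The `enhanced_<name>.wav` output naming, the `DENOISER` override variable, and the function names `enhanced_name` and `denoise_all` are assumptions made for illustration.

```shell
#!/usr/bin/env bash

# Hypothetical helper: map ./foo.wav to enhanced_foo.wav.
enhanced_name() {
  local base
  base="$(basename "$1" .wav)"
  printf 'enhanced_%s.wav\n' "$base"
}

# Hypothetical helper: denoise every WAV file given after the model path.
# DENOISER may be set to override the binary location (assumption).
denoise_all() {
  local model="$1"
  shift
  local bin="${DENOISER:-./build/bin/sherpa-onnx-offline-denoiser}"
  local wav
  for wav in "$@"; do
    "$bin" \
      --speech-denoiser-gtcrn-model="$model" \
      --input-wav="$wav" \
      --output-wav="./$(enhanced_name "$wav")" || return 1
  done
}

# Possible usage from /path/to/sherpa-onnx:
#   denoise_all ./gtcrn_simple.onnx ./speech_with_noise.wav ./inp_16k.wav
```

The loop stops at the first failing file, so a bad input is noticed immediately rather than being skipped silently.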

@csukuangfj csukuangfj merged commit 488a6e6 into k2-fsa:master Mar 10, 2025
170 of 214 checks passed
@csukuangfj csukuangfj deleted the cpp-gtcrn branch March 10, 2025 10:11
@yuyun2000

It is too powerful!

@altunenes

Is there a method you recommend for overcoming parallel/overlap speech scenarios? segmentation-3.0 is quite inadequate for parallel speech and causes problems especially for identifying the speakers (speaker recognition).

@csukuangfj
Collaborator Author

> Is there a method you recommend for overcoming parallel/overlap speech scenarios? segmentation-3.0 is quite inadequate for parallel speech and causes problems especially for identifying the speakers (speaker recognition).

I suggest that you ask this question in https://github.com/pyannote/pyannote-audio

@altunenes

thanks
