Index

eSpeak NG

Flite

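Matcha-icefall-zh-baker: Chinese-only model; fast synthesis with good naturalness.

Vits-melo-tts-zh_en: bilingual Chinese and English model; medium synthesis speed; supports mixing both languages within one utterance.

Kokoro-multi-lang-v1.1: multilingual model (Japanese, Korean, English, and others); slower synthesis, but the best audio quality of the three.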
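The trade-offs above can be written down as a small lookup, which is handy when a script has to choose a model automatically. This is only a sketch: `TTS_MODELS`, `SPEED_RANK`, and `pick_tts_model` are hypothetical names, the speed labels are the relative ones from these notes rather than benchmarks, and the Kokoro language set lists only the languages named here.

```python
# Relative trade-offs copied from the notes above (not measured benchmarks).
TTS_MODELS = {
    "Matcha-icefall-zh-baker": {"languages": {"zh"}, "speed": "fast"},
    "Vits-melo-tts-zh_en": {"languages": {"zh", "en"}, "speed": "medium"},
    # The notes say "Japanese, Korean, English, and others"; only the
    # explicitly named languages are listed here.
    "Kokoro-multi-lang-v1.1": {"languages": {"ja", "ko", "en"}, "speed": "slow"},
}

SPEED_RANK = {"fast": 0, "medium": 1, "slow": 2}


def pick_tts_model(required_languages):
    """Return the fastest model covering every required language, or None."""
    required = set(required_languages)
    candidates = [
        name for name, info in TTS_MODELS.items()
        if required <= info["languages"]
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda name: SPEED_RANK[TTS_MODELS[name]["speed"]])
```

For example, `pick_tts_model(["zh", "en"])` selects the bilingual VITS model, because it is the only entry that covers both languages.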

Python

pip install sounddevice sherpa_onnx

ASR

Real-time speech recognition from a microphone

python ./speech-recognition-from-microphone-with-endpoint-detection.py   --tokens=./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt   --encoder=./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/encoder-epoch-99-avg-1.onnx   --decoder=./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/decoder-epoch-99-avg-1.onnx   --joiner=./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/joiner-epoch-99-avg-1.onnx
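For reference, the core of what this script does can be sketched in a few lines of Python. This is a simplified sketch, not the script itself: `model_files` and `listen` are hypothetical helper names, the 48 kHz microphone rate and 100 ms chunk size are assumptions, and the model directory is taken to be laid out exactly as in the command above.

```python
from pathlib import Path

MODEL_DIR = "./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20"


def model_files(model_dir=MODEL_DIR):
    """Map the four model flags from the command line to file paths."""
    d = Path(model_dir)
    return {
        "tokens": str(d / "tokens.txt"),
        "encoder": str(d / "encoder-epoch-99-avg-1.onnx"),
        "decoder": str(d / "decoder-epoch-99-avg-1.onnx"),
        "joiner": str(d / "joiner-epoch-99-avg-1.onnx"),
    }


def listen(model_dir=MODEL_DIR, mic_sample_rate=48000):
    """Print one line per detected utterance until stopped with Ctrl+C."""
    # Imported lazily so model_files() works without the audio stack.
    import sounddevice as sd
    import sherpa_onnx

    recognizer = sherpa_onnx.OnlineRecognizer.from_transducer(
        **model_files(model_dir),
        num_threads=1,
        sample_rate=16000,  # rate the model expects
        feature_dim=80,
        enable_endpoint_detection=True,
    )
    stream = recognizer.create_stream()
    chunk = int(0.1 * mic_sample_rate)  # read 100 ms of audio at a time

    with sd.InputStream(channels=1, dtype="float32",
                        samplerate=mic_sample_rate) as mic:
        while True:
            samples, _ = mic.read(chunk)
            # The stream resamples from the microphone rate as needed.
            stream.accept_waveform(mic_sample_rate, samples.reshape(-1))
            while recognizer.is_ready(stream):
                recognizer.decode_stream(stream)
            # An endpoint means the speaker paused: emit the text and reset.
            if recognizer.is_endpoint(stream):
                text = recognizer.get_result(stream)
                if text:
                    print(text)
                recognizer.reset(stream)
```

Calling `listen()` starts recognition from the default input device; the imports are kept inside the function so that the path helper can be used on a machine without a microphone.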

Pinyin-based phrase matching and replacement

Windows

```
conda install -c conda-forge pynini
```

Download lexicon.txt from https://github.com/k2-fsa/sherpa-onnx/releases/tag/hr-files

Generate replace.fst:

import pynini
from pynini import cdrewrite
from pynini.lib import utf8

sigma = utf8.VALID_UTF8_CHAR.star

rule1 = pynini.cross("xiao4yuan2wei4shi4", "校园卫士")

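# Handles speakers who do not distinguish front and back nasal finals
# (here xing1 is matched where 芯 is actually xin1).
#
# Note: several rules may map to the same phrase.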
rule2 = pynini.cross("xuan2jie4xing1pian4", "玄戒芯片")

rule3 = pynini.cross("fu2nan2ren2", "湖南人")

rule4 = pynini.cross("gong1tou2an1zhuang1", "弓头安装")

rule5 = pynini.cross("ji1zai3chuan2gan3qi4", "机载传感器")

# Several rules can be given to cover alternative pronunciations.
rule6 = pynini.cross("ji1zai4chuan2gan3qi4", "机载传感器")

# This example has only 6 rules; you can add as many as you need.
rule = (rule1 | rule2 | rule3 | rule4 | rule5 | rule6).optimize()
rule = cdrewrite(rule, "", "", sigma)

rule.write("replace.fst")
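To make the effect of such rules concrete, here is a pure-Python sketch of the idea, assuming the replacer works roughly like this: the recognized text is mapped to toned pinyin, and any run of syllables that matches a rule is replaced by that rule's phrase. It does not use pynini and is not how sherpa-onnx implements the replacement internally. `TOY_LEXICON`, `TOY_RULES`, and `replace_homophones` are hypothetical names, and the toy lexicon covers only a handful of characters.

```python
import re

# Toy per-character lexicon (hypothetical; the real lexicon.txt is far larger).
TOY_LEXICON = {
    "玄": "xuan2", "戒": "jie4", "星": "xing1", "片": "pian4",
    "福": "fu2", "南": "nan2", "人": "ren2", "我": "wo3", "是": "shi4",
}

# Same shape as the pynini.cross() pairs: pinyin string -> phrase.
TOY_RULES = {
    "xuan2jie4xing1pian4": "玄戒芯片",
    "fu2nan2ren2": "湖南人",
}

# One syllable is a run of letters followed by a tone digit.
SYLLABLE = re.compile(r"[a-z]+[1-5]")


def replace_homophones(text, lexicon=TOY_LEXICON, rules=TOY_RULES):
    """Replace character runs whose pinyin matches a rule with its phrase."""
    # Split each rule into syllables so matching is syllable-aligned,
    # and try longer rules first.
    split_rules = [(SYLLABLE.findall(key), phrase) for key, phrase in rules.items()]
    split_rules.sort(key=lambda rule: len(rule[0]), reverse=True)

    pinyin = [lexicon.get(ch) for ch in text]  # None for unknown characters
    out, i = [], 0
    while i < len(text):
        for syllables, phrase in split_rules:
            n = len(syllables)
            if pinyin[i:i + n] == syllables:
                out.append(phrase)
                i += n
                break
        else:
            # No rule matched here: keep the original character.
            out.append(text[i])
            i += 1
    return "".join(out)
```

With the toy data, a misrecognized `我是福南人` comes back as `我是湖南人`, while text that matches no rule passes through unchanged.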

python ./speech-recognition-from-microphone-with-endpoint-detection.py   --tokens=./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt   --encoder=./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/encoder-epoch-99-avg-1.onnx   --decoder=./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/decoder-epoch-99-avg-1.onnx   --joiner=./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/joiner-epoch-99-avg-1.onnx  --hr-lexicon=./lexicon.txt  --hr-rule-fsts=./replace.fst

Alibaba Cloud

Speech synthesis: AliyunNLSSpeechServiceAccess permission

The voice cloning permission supports streaming calls only.

AliyunNLSFullAccess

Voice cloning reference
