speech_recognition - PythonIDE Docs

Speech-to-text (SFSpeechRecognizer): live microphone dictation, or transcription of audio files.

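Scope: this module is the opposite direction of speech (text → speech). Live mode requires both the speech recognition and microphone system permissions; calling start() without authorization raises an error with code denied. The unified permission module's request() does not yet support the speech/microphone prompts, so the permissions must be enabled in system Settings.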

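#Module overview

Item	Description
Import	`import speech_recognition as sr`
Good for	Voice input, dictated memos, voice search, audio-to-text
When to call	Put `start()` / `recognize_file()` in a button callback
Live mode	`start` → poll `text()` → `stop()` to get the final text
File mode	`recognize_file(path)` returns the full transcript in one call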
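
The live-mode row above (start, poll `text()`, then `stop()`) can be sketched as a small helper for plain scripts. This is only an illustration: `dictate` and its `recognizer` and `sleep` parameters are hypothetical names, and `recognizer` is assumed to be any object exposing `start`, `text`, and `stop` as listed in the table (for example the `sr` module itself). Because it blocks while polling, it is meant for scripts, not for AppUI callbacks.

```python
import time


def dictate(recognizer, seconds=5.0, interval=0.3, locale="zh-CN", sleep=time.sleep):
    """Start live dictation, poll text() for about `seconds`, then stop().

    Returns (partials, final): the distinct partial texts seen while
    polling, and whatever stop() returned.
    """
    partials = []
    recognizer.start(locale=locale)
    try:
        elapsed = 0.0
        while elapsed < seconds:
            sleep(interval)
            elapsed += interval
            current = recognizer.text()
            # Record a partial only when the text actually changed.
            if current and (not partials or partials[-1] != current):
                partials.append(current)
    finally:
        # Always stop the session, even if polling raised.
        final = recognizer.stop()
    return partials, final
```

With the real module this would presumably be called as `partials, final = dictate(sr, seconds=5)`; the `sleep` parameter exists only so the loop can be exercised without waiting.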

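#Quick start

The script below checks availability and starts live Chinese dictation (the permissions must already be granted in system Settings):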

python

import time

import speech_recognition as sr

print("Available:", sr.is_available())
print("Sample locales:", sr.supported_locales()[:5])

try:
    sr.start(locale="zh-CN")
    # Poll sr.text() while the user is speaking (about 3 seconds here).
    for _ in range(10):
        time.sleep(0.3)
        print("Partial result:", sr.text())
    final = sr.stop()
    print("Final result:", final)
except sr.SpeechRecognitionError as exc:
    print("Failed:", exc, f"code={exc.code}")

#AppUI example

Put start/stop in button callbacks; use a Timer to poll for partial results.

python

import appui
import speech_recognition as sr

state = appui.State(
    listening=False,
    transcript="Tap to start dictation",
    status="Idle",
)


def tick():
    # Timer callback: refresh the transcript with the latest partial text.
    if state.listening:
        state.transcript = sr.text() or "(listening…)"


poll_timer = appui.Timer(interval=0.3, repeats=True, action=tick)


def toggle_listening():
    if state.listening:
        # Stop polling first, then fetch the final text.
        poll_timer.stop()
        state.transcript = sr.stop() or "(nothing recognized)"
        state.batch_update(listening=False, status="Stopped")
        return

    try:
        sr.start(locale="zh-CN", partial=True)
        state.batch_update(
            listening=True,
            status="Listening",
            transcript="(listening…)",
        )
        poll_timer.start()
    except sr.SpeechRecognitionError as exc:
        state.transcript = (
            f"Could not start: {exc}\n"
            "Enable Speech Recognition and Microphone in system Settings"
        )
        state.status = str(exc.code or "error")


def cancel_listening():
    if state.listening:
        poll_timer.stop()
        sr.cancel()  # discard whatever was recognized so far
        state.batch_update(
            listening=False,
            status="Cancelled",
            transcript="Dictation cancelled",
        )


def check_locales():
    locales = sr.supported_locales()
    state.status = f"{len(locales)} locales supported"


def body():
    label = "Stop dictation" if state.listening else "Start dictation"
    return appui.NavigationStack(
        appui.Form([
            appui.Section("Dictation", [
                appui.Text(state.transcript).foreground_color("secondaryLabel"),
                appui.LabeledContent("Status", value=state.status),
            ]),
            appui.Section("Actions", [
                appui.Button(label, action=toggle_listening)
                .button_style("bordered_prominent"),
                appui.Button("Cancel", action=cancel_listening, role="destructive"),
                appui.Button("Show supported locales", action=check_locales),
            ], footer="Testing on a real device is most reliable; requires speech recognition and microphone permissions."),
        ]).navigation_title("Voice dictation")
    )


appui.run(body, state=state)

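#API reference

#Quick reference

API	Purpose
`is_available()`	Whether the device/language is supported
`supported_locales()`	List of supported BCP-47 locales
`start(locale, on_device, partial)`	Start live dictation
`text()` / `status()`	Current recognized text / full status
`stop()`	Stop and return the final text
`cancel()`	Stop immediately and discard the result
`recognize_file(path, locale)`	Transcribe an audio file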

#Live dictation

python

import time

import speech_recognition as sr

sr.start(locale="zh-CN", on_device=False, partial=True)
while sr.is_listening():
    print(sr.text())
    time.sleep(0.3)  # throttle polling instead of spinning the CPU
final = sr.stop()

Parameter	Description
`locale`	BCP-47 tag, e.g. `zh-CN`, `en-US`
`on_device`	Prefer offline recognition (when the device supports it)
`partial`	Whether to return partial results
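
When the preferred `locale` may not be supported on a given device, one option is to pick the closest tag before calling `start()`. The sketch below is a hypothetical helper, not part of the module; it assumes `supported_locales()` returns a list of hyphenated BCP-47 strings such as `zh-CN`, and the `en-US` fallback is an arbitrary choice.

```python
def pick_locale(preferred, supported, fallback="en-US"):
    """Return the best match for `preferred` among `supported` BCP-47 tags."""
    by_lower = {tag.lower(): tag for tag in supported}
    # 1. Exact match, ignoring case ("zh-cn" matches "zh-CN").
    exact = by_lower.get(preferred.lower())
    if exact:
        return exact
    # 2. Same primary language subtag ("zh-SG" falls back to the first "zh-*").
    language = preferred.split("-")[0].lower()
    for tag in supported:
        if tag.split("-")[0].lower() == language:
            return tag
    # 3. The fallback tag if it is supported, otherwise None.
    return by_lower.get(fallback.lower())
```

A possible use would be `locale = pick_locale("zh-CN", sr.supported_locales())`, skipping `start()` when the result is `None`.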

status() returns: listening, transcript, is_final, error.
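
As a rough illustration of how these four fields might be combined into a single display label, the sketch below assumes that `status()` returns a dict-like mapping keyed by the field names above; the source does not state the return type, so treat that as an assumption. `summarize_status` is a hypothetical helper.

```python
def summarize_status(status):
    """Turn a status mapping into a one-line label for display."""
    # An error takes priority over everything else.
    if status.get("error"):
        return f"error: {status['error']}"
    transcript = status.get("transcript") or ""
    if status.get("listening"):
        return f"listening: {transcript}" if transcript else "listening"
    if status.get("is_final"):
        return f"final: {transcript}"
    return "idle"
```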

Calling start() again while dictation is already running raises SpeechRecognitionError(code="already_listening").
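
If a repeated `start()` should be harmless in your script, the `already_listening` code can be treated as "nothing to do". The helper below is a hypothetical sketch: it catches `Exception` so that the block stays self-contained, whereas with the real module you would probably narrow this to `sr.SpeechRecognitionError`.

```python
def ensure_listening(recognizer, **start_kwargs):
    """Call start(), treating an "already_listening" error as success.

    Returns True if this call started a new session, False if one was
    already running. Any other error is re-raised unchanged.
    """
    try:
        recognizer.start(**start_kwargs)
        return True
    except Exception as exc:
        # getattr keeps this working for errors that carry no `code`.
        if getattr(exc, "code", None) == "already_listening":
            return False
        raise
```

A call such as `ensure_listening(sr, locale="zh-CN", partial=True)` would then forward the keyword arguments to `start()` untouched.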

#File transcription

Only the speech recognition permission is needed (no microphone):

python

import speech_recognition as sr

text = sr.recognize_file("/path/to/audio.m4a", locale="zh-CN")
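
For more than one recording, the same call can be wrapped in a loop that collects failures instead of stopping at the first one. This is a minimal sketch under assumptions: `transcribe_folder` is a hypothetical helper, `recognize` is any callable shaped like `recognize_file(path, locale=...)`, and the `.wav` suffix is included only for illustration (the source shows `.m4a` only).

```python
from pathlib import Path


def transcribe_folder(folder, recognize, locale="zh-CN", suffixes=(".m4a", ".wav")):
    """Transcribe every matching audio file in `folder`.

    Returns (texts, failures): two dicts keyed by file name. `failures`
    holds the error code when the exception has one, else its message.
    """
    texts, failures = {}, {}
    for path in sorted(Path(folder).iterdir()):
        if path.suffix.lower() not in suffixes:
            continue  # skip non-audio files
        try:
            texts[path.name] = recognize(str(path), locale=locale)
        except Exception as exc:
            failures[path.name] = getattr(exc, "code", None) or str(exc)
    return texts, failures
```

With the real module, `recognize` would presumably be `sr.recognize_file`; as with the other entry points, it is best triggered from a button callback.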

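#Exceptions

On failure, SpeechRecognitionError is raised; use exc.code to tell the cases apart:

`code`	Meaning
`denied`	Speech recognition not authorized
`already_listening`	Repeated `start()`
`unavailable`	Language or device not supported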
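
One way to use the codes in the table is a lookup that turns them into user-facing hints. The mapping and `describe_error` below are hypothetical; the wording of the hints is illustrative, and unknown or missing codes fall back to the exception's own message.

```python
ERROR_HINTS = {
    "denied": "Speech recognition is not authorized; enable it in system Settings.",
    "already_listening": "Dictation is already running; stop it before starting again.",
    "unavailable": "This language or device is not supported.",
}


def describe_error(exc):
    """Map an exception's `code` attribute to a readable hint."""
    code = getattr(exc, "code", None)
    return ERROR_HINTS.get(code, f"Speech recognition failed: {exc}")
```

In an `except sr.SpeechRecognitionError as exc:` branch, `describe_error(exc)` could then be shown in the UI in place of the raw exception text.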

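#Common mistakes

Mistake	Consequence	Fix
Calling `start()` inside `body()`	Restarts on every refresh	Move it into a button callback
Calling `start()` without authorization	`denied` exception	Enable the permissions in system Settings
Treating `permission.request()` as a bool	Wrong check, and the prompt is not yet supported	Handle `start()` with `try/except`
Polling `text()` at high frequency	Wastes CPU	Throttle with `Timer(interval=0.3)`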
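
Outside AppUI, where no `Timer` is available, the throttling idea in the last row can be approximated with a small cache around the read call. `ThrottledReader` is a hypothetical helper sketched here, not part of the module; the `clock` parameter is injectable only so the behavior can be checked without real delays.

```python
import time


class ThrottledReader:
    """Cache the result of `read()` and refresh it at most once per `interval`."""

    def __init__(self, read, interval=0.3, clock=time.monotonic):
        self._read = read
        self._interval = interval
        self._clock = clock
        self._last_time = None
        self._last_value = None

    def __call__(self):
        now = self._clock()
        # Call the underlying reader only when the cached value is stale.
        if self._last_time is None or now - self._last_time >= self._interval:
            self._last_value = self._read()
            self._last_time = now
        return self._last_value
```

Wrapping the module call as `read_text = ThrottledReader(sr.text)` would let a tight loop call `read_text()` freely while `sr.text()` itself runs roughly three times per second.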

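#Related docs

Doc	Purpose
speech	Text-to-speech (opposite direction)
audio_recorder	Record to a file, then transcribe
permission	Query permission status
Native capability entry points	Recipes for MiniApp scenarios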

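#Expected result

After running an example, the UI should show the result described in this document. If it does not, check the failure paths first (the exceptions and common mistakes listed above), then troubleshoot from the returned values, `exc.code`, or the logs.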