Transcribe Speech to Text


Transcribe speech to text with Python or the Web Speech API:

Python

Make sure you have Python installed:

$ python --version

Python version 3 is recommended.

Install the SpeechRecognition module:

$ pip install SpeechRecognition

Create a script, speech_to_text.py, that transcribes the audio file Hello World.wav to text:

# speech_to_text.py
import speech_recognition as sr

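r = sr.Recognizer()
filename = "Hello World.wav"

# Read the entire audio file into memory
with sr.AudioFile(filename) as source:
    audio = r.record(source)

# Send the audio to the Google Web Speech API and print the transcript
text = r.recognize_google(audio)
print(text)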

Run the script:

$ python speech_to_text.py
hello world
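
The script above assumes that every request succeeds. As a minimal sketch (the function name transcribe_file is made up for this example), the two exceptions that SpeechRecognition raises for unintelligible audio and for failed API requests can be caught like this:

```python
def transcribe_file(path):
    """Return the transcript of an audio file, or None if none could be produced."""
    # Imported inside the function so this file can still be loaded where
    # SpeechRecognition is missing (third-party: pip install SpeechRecognition).
    import speech_recognition as sr

    r = sr.Recognizer()
    with sr.AudioFile(path) as source:
        audio = r.record(source)  # read the entire file

    try:
        return r.recognize_google(audio)
    except sr.UnknownValueError:
        # The API responded but could not make out any speech
        print("Speech was unintelligible")
        return None
    except sr.RequestError as e:
        # The API could not be reached or rejected the request
        print(f"API request failed: {e}")
        return None
```

Calling transcribe_file("Hello World.wav") should give the same transcript as the script above, or None when the audio cannot be transcribed.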

Library

SpeechRecognition supports the following engines/APIs:

  • recognize_sphinx (works offline)
  • recognize_google
  • recognize_wit
  • recognize_bing
  • recognize_api
  • recognize_houndify
  • recognize_ibm
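
Since each engine is a method on the Recognizer object, one option is to try the offline engine first and fall back to an online one. The sketch below is only an illustration, not part of the library: recognize_with_fallback is a hypothetical helper, it only covers engines that can be called without credentials (recognize_sphinx and recognize_google), and recognize_sphinx needs the separate PocketSphinx package installed.

```python
def recognize_with_fallback(recognizer, audio,
                            engines=("recognize_sphinx", "recognize_google")):
    """Try each engine in order; return (engine_name, transcript) or (None, None)."""
    # Imported inside the function so this file can still be loaded where
    # SpeechRecognition is missing (third-party: pip install SpeechRecognition).
    import speech_recognition as sr

    for name in engines:
        recognize = getattr(recognizer, name, None)
        if recognize is None:
            # This engine does not exist in the installed version; skip it
            continue
        try:
            return name, recognize(audio)
        except (sr.UnknownValueError, sr.RequestError):
            # Unintelligible audio or engine unavailable; try the next engine
            continue
    return None, None
```

With a recognizer r and audio loaded as in speech_to_text.py, engine, text = recognize_with_fallback(r, audio) gives the name of the engine that succeeded along with its transcript.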

Pros:

  • Free
  • Fairly accurate

Cons:

  • API limitations (e.g., network timeout, file too big, rate limiting)
  • Transcript can be off
  • No punctuation marks
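
One way to work around the file-size limitation is to send a long recording in shorter pieces. The following is a rough sketch under a few assumptions: the input is a WAV file (its length is read with Python's built-in wave module), 30 seconds is an arbitrary chunk length, and chunk_offsets and transcribe_in_chunks are names invented for this example. Chunk boundaries are approximate, so a word that straddles two chunks may be cut off.

```python
import math
import wave


def chunk_offsets(total_seconds, chunk_seconds=30):
    """Return (offset, duration) pairs, in seconds, that cover the recording."""
    count = math.ceil(total_seconds / chunk_seconds)
    return [
        (i * chunk_seconds, min(chunk_seconds, total_seconds - i * chunk_seconds))
        for i in range(count)
    ]


def transcribe_in_chunks(path, chunk_seconds=30):
    """Transcribe a WAV file piece by piece and join the partial transcripts."""
    import speech_recognition as sr  # third-party: pip install SpeechRecognition

    # Length of the recording in seconds, read from the WAV header
    with wave.open(path, "rb") as wav:
        total_seconds = wav.getnframes() / wav.getframerate()

    r = sr.Recognizer()
    pieces = []
    for offset, duration in chunk_offsets(total_seconds, chunk_seconds):
        # Reopen the file and read only the current slice
        with sr.AudioFile(path) as source:
            audio = r.record(source, offset=offset, duration=duration)
        try:
            pieces.append(r.recognize_google(audio))
        except sr.UnknownValueError:
            # Nothing intelligible in this chunk (e.g., silence); skip it
            continue
    return " ".join(pieces)
```

For example, chunk_offsets(65, 30) returns [(0, 30), (30, 30), (60, 5)], so a 65-second file results in three requests.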

See guide for more details.

Demo

See GitHub repository for more details.

Web Speech API

If audio input can be directed to your microphone, you can use the JavaScript Web Speech API.

See Web Speech API Demonstration.


