How to read text aloud with Piper and Python

Piper is a neural text-to-speech system that runs locally and delivers great-sounding audio even on underpowered computers. It is optimized for the Raspberry Pi 4, and you can easily import it into your application as a library.

I stumbled upon Piper TTS while looking for a simple text-to-speech application that could read text aloud on Linux Mint. Coming from Microsoft's online voices in Microsoft Edge, I wanted something more natural than the robotic voices of programs like Festival or eSpeak, and I was impressed by how natural Piper sounds and how smoothly it runs on almost any modern computer.

You can listen to samples generated using Piper here.

Installing Piper on Linux

You can install Piper through pip with this command:

pip install piper-tts

Afterwards, you should be able to import piper into your program with import piper.

You can quickly test piper from the terminal by piping the output of another program into it, like this:

echo "Hello world! This is text to speech" | piper \
--model en_US-lessac-medium \
--output_file audio.wav

The resulting audio will be saved in audio.wav and can be played with any media player.

Adding models to Piper

The Piper repository includes a variety of pre-trained voice models, sorted by language, that you can use in your projects. These models determine how the synthesized speech will sound; in other words, each model is a different “voice” you can use with Piper (although a model occasionally contains multiple voices).

You can also create your own voice model using the Piper Recording Studio, a web application that you can run locally to generate a Piper dataset by recording clips with your voice. However, make sure your device has a decent graphics card, or training the model will be very slow. For more information on the process of creating a model for Piper, look at this article from Sam Howell.

To add models to Piper, you need both the model in .onnx format and its companion .onnx.json file. This JSON file contains important metadata about the model, such as its sample rate and phoneme set; it must have the same name as the .onnx file and be located in the same directory.

For example:

directory/
|-- ...
|-- es_MX-claude-high.onnx
|-- es_MX-claude-high.onnx.json
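As a quick sanity check before loading a model, a small helper (hypothetical, not part of Piper's API) can verify that the companion config actually sits next to the .onnx file:

```python
from pathlib import Path


def find_piper_config(model_path: str) -> Path:
    """Return the .onnx.json config file expected next to a Piper .onnx model."""
    model = Path(model_path)
    # e.g. es_MX-claude-high.onnx -> es_MX-claude-high.onnx.json
    config = model.with_name(model.name + ".json")
    if not config.is_file():
        raise FileNotFoundError(f"Missing Piper config file: {config}")
    return config
```

For example, `find_piper_config("directory/es_MX-claude-high.onnx")` returns the path to `directory/es_MX-claude-high.onnx.json`, or raises an error telling you exactly which file is missing.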

Generating audio files from text

To generate audio files programmatically using Piper, you’ll need to import the PiperVoice class and use the appropriate methods, like this (based on this answer):

import wave
from piper.voice import PiperVoice

model = "/path/to/model.onnx"
voice = PiperVoice.load(model)
text = "This is an example of text to speech"
# Write the synthesized audio into output.wav
with wave.open("output.wav", "wb") as wav_file:
    voice.synthesize(text, wav_file)

In this code, we:

  • Import the wave module to create WAV files, and PiperVoice to generate the audio from our text.
  • Specify the model file we want to use and load it into Piper.
  • Create a wav_file object where the program will write the synthesized audio data.
  • Define the text we want to convert to speech.
  • Call the synthesize method of PiperVoice to generate audio from the text and write it to the WAV file.
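Under the hood, synthesize writes 16-bit mono PCM frames into the WAV file at the model's sample rate. A minimal sketch using only the standard library shows the WAV parameters involved, writing one second of silence (assuming 22050 Hz, the rate used by medium-quality Piper models; check voice.config.sample_rate for your model):

```python
import wave

SAMPLE_RATE = 22050  # assumption: a medium-quality Piper model

with wave.open("silence.wav", "wb") as wav_file:
    wav_file.setnchannels(1)         # mono
    wav_file.setsampwidth(2)         # 16-bit samples (2 bytes each)
    wav_file.setframerate(SAMPLE_RATE)
    # One second of silence: SAMPLE_RATE frames of two zero bytes
    wav_file.writeframes(b"\x00\x00" * SAMPLE_RATE)
```

Piper sets these same parameters on the wave object for you, which is why the earlier example only has to open the file and pass it to synthesize.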

Streaming text to speech with Piper

It is possible to stream the audio directly to an audio device without having to save it to a file first, as seen in this answer.

import numpy as np
import sounddevice as sd
from piper.voice import PiperVoice

model = "/path/to/model.onnx"
voice = PiperVoice.load(model)
text = "This is an example of text to speech"

# Setup a sounddevice OutputStream with appropriate parameters
# The sample rate and channels should match the properties of the PCM data
stream = sd.OutputStream(samplerate=voice.config.sample_rate, channels=1, dtype='int16')
stream.start()

for audio_bytes in voice.synthesize_stream_raw(text):
    int_data = np.frombuffer(audio_bytes, dtype=np.int16)
    stream.write(int_data)

stream.stop()
stream.close()

If you get an error OSError: PortAudio library not found, you can fix it by installing the PortAudio library. On Ubuntu and Debian-based distributions, use this command:

sudo apt-get install libportaudio2

The previous code is similar to the one we used to create WAV files from text. This time, we:

  • Import sounddevice for audio streaming, PiperVoice to generate the audio from our text, and numpy to interpret the data as an array.
  • Define the model we want to use and load it into Piper.
  • Provide the text we want to convert to speech.
  • Set up a sounddevice OutputStream with parameters matching the properties of the PCM (Pulse Code Modulation) data produced by Piper. This stream will be used to play the audio generated by Piper.
  • Iterate over the raw audio data generated by voice.synthesize_stream_raw, convert it to an array of integers, and write it to the stream for real-time playback.
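The byte-to-array conversion inside the loop can be seen in isolation: each chunk of raw audio is just a sequence of 16-bit samples, which numpy reinterprets without copying. A toy example with hand-made bytes (no Piper required; assumes a little-endian machine, which matches the PCM byte order):

```python
import numpy as np

# Two 16-bit samples encoded as raw little-endian bytes: 1 and -1
audio_bytes = b"\x01\x00\xff\xff"
int_data = np.frombuffer(audio_bytes, dtype=np.int16)
print(int_data)  # [ 1 -1]
```

This is exactly what happens to every chunk yielded by synthesize_stream_raw before it is handed to the sounddevice stream.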

Conclusion

In summary, Piper offers a powerful solution for local text-to-speech synthesis.

Although the speech quality is not as high as that of tools like Coqui, the fact that Piper can generate audio quickly on devices with limited resources makes it, in my opinion, the best local text-to-speech tool currently available.

By importing Piper as a library with Python, you can easily integrate it into your programs and deliver natural-sounding voices while barely affecting performance. If you want to see an example, take a look at this simple read-aloud program that I wrote with Python and Tkinter.

Update: The original code mistakenly called PiperVoice(model), which was incorrect and didn’t work as intended because the .load() method was missing. The corrected code now calls PiperVoice.load(model). Thanks to everyone who pointed this out! 🙂️

15 comments

  1. this line
    voice = PiperVoice(model)

    results in positional character error “config”

    do you know how to resolve that? I tried looking at the __main__.py ___init__.py of .voice but cannot figure out how to resolve it. I’ve searched forums and found some hits but couldn’t apply what they’re talking about to this particular error for this particular module.

    1. Hello,
      I looked at the code but I could not replicate that error. Do you think you can share more information about your system like your version of Python, your OS, and the traceback output when the error happens?

      1. I’m getting the same error when I try to load a model as well. Any thoughts?
        TypeError: PiperVoice.__init__() missing 1 required positional argument: ‘config’

        code is below:

        import os
        import openai
        from dotenv import load_dotenv
        import time
        import speech_recognition as sr
        import pyttsx3
        from piper.voice import PiperVoice
        import numpy as np
        import sounddevice as sd
        load_dotenv()

        #tried with both .json or just .onnx
        #model = "/home/jedd/jarvis/en_US-lessac-medium.onnx.json"
        model = "/home/jedd/jarvis/en_US-lessac-medium.onnx"

        voice = PiperVoice(model)
        text = "This is an example of text to speech"
        wav_file = wave.open("output.wav", "w")
        audio = voice.synthesize(text, wav_file)

      2. Raspberry Pi 5 (raspberry 64-bit os)
        Python 3.11.2

        Traceback (most recent call last):
        File “/home/jedd/jarvis/jv.py”, line 19, in
        voice = PiperVoice(model)
        ^^^^^^^^^^^^^^^^^
        TypeError: PiperVoice.__init__() missing 1 required positional argument: ‘config’

      3. There’s an omission in the PiperVoice(model) statement. It needs the .load method to work.

        PiperVoice.load(model) is correct.

  2. Also got positional character error “config”. Resolved by replacing voice = PiperVoice(model) with voice = PiperVoice.load(model)

  3. At least for the latest Piper, this line in your example:
    voice = PiperVoice(model)
    should be changed to:
    voice = PiperVoice.load(model)
    This will eliminate the error that AA was describing.

  4. thanks for the tutorial the github repo doesn’t explain how to use the python piper API at all! works perfect on ubuntu 20.09 python 3.9

  5. the code throws the error “Illegal instruction” when running the line from piper.voice import PiperVoice
    i am using raspberry pi. i don’t seem to understand why this is happening! in the article above it says Piper should run fine on raspberry pi 4

  6. I am getting an AttributeError: ‘PiperVoice’ object has no attribute ‘synthesize_stream_raw’. I have upgraded to the latest version 1.3.0 and this issue persists
