Piper is a text-to-speech engine based on local neural nets. At the time of writing, it produces significantly higher-quality speech than traditional local systems such as Festival, eSpeak, and Pico.
However, I couldn’t find instructions for integrating Piper with the speech dispatcher on Linux, so that it can serve as the TTS engine in Firefox. After some research, I figured out the required steps, which are documented below.
Piper is distributed as a Python package that can be installed using
the standard package manager pip. The preferred approach is
to install it in a virtual environment, which keeps it separate from
other installed Python packages. I’ve installed it in the directory
/opt:
mkdir /opt/piper-tts
cd /opt/piper-tts
python3 -m venv venv
source venv/bin/activate
pip install --require-virtualenv piper-tts
We still need to install voices. We can first list the available ones:
python3 -m piper.download_voices
We can then download the ones we’re interested in to a separate directory,
voices:
mkdir voices
cd voices
python3 -m piper.download_voices en_US-ryan-medium
python3 -m piper.download_voices en_US-john-medium
python3 -m piper.download_voices en_US-amy-medium
We can install different voices, for different languages. In its simplest form, Piper runs as a command that reads a snippet of text aloud.
We can already test the Piper command:
/opt/piper-tts/venv/bin/piper \
-m en_US-ryan-medium \
--data-dir /opt/piper-tts/voices \
'This is a test.'
You should hear the speech through your speakers. Great first step!
Piper needs some time to start up, due to the size of its neural net. If you plan to have it synthesize a lot of speech, it’s better to run it as a web server. The server waits for text in the background, has it synthesized by the Piper engine, and sends the resulting audio stream back to the requester. The server requires the package’s optional http extra, which we install in the same virtual environment:
pip install 'piper-tts[http]'
The web server returns the synthesized speech as a WAV stream. We’ll
be playing it with paplay, a command to play audio streams
on the standard PulseAudio sound server. If you don’t have it, you can
install it with apt:
sudo apt install pulseaudio-utils
We can now start the Piper web server, listening on the default port 5000. We add the debug option to get some logging output in the terminal:
/opt/piper-tts/venv/bin/python3 \
-m piper.http_server \
-m en_US-ryan-medium \
--data-dir /opt/piper-tts/voices \
--debug
We can then check that the server works by sending a speech
request from a different terminal window. We use curl to
send an HTTP POST request with a short JSON payload:
curl -X POST \
-H 'Content-Type: application/json' \
-d '{ "voice": "en_US-ryan-medium", "length_scale": "1.0", "text": "This is a test." }' \
-o - \
localhost:5000 \
| paplay
You should hear the same speech as before, this time coming from the web server. Progress!
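The inline JSON in the curl command above works for fixed text, but a double quote or backslash in the text would break the payload. The following is a minimal sketch of helper functions that escape the text first. The names json_escape, piper_payload, and piper_say are hypothetical, and the endpoint and JSON fields are simply the ones used in the test above.

```shell
# Hypothetical helpers for illustration; they are not part of Piper.

# Escape backslashes and double quotes so that a string can be embedded
# in a JSON string literal. Control characters such as newlines are not
# handled in this sketch.
json_escape() {
    printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Print the JSON payload for a given voice ($1) and text ($2).
piper_payload() {
    printf '{ "voice": "%s", "text": "%s" }' \
        "$(json_escape "$1")" "$(json_escape "$2")"
}

# Send the payload to the Piper web server and play the returned audio.
# Assumption: the server listens on localhost:5000, as in the test above.
piper_say() {
    piper_payload "$1" "$2" \
    | curl -s -X POST -H 'Content-Type: application/json' -d @- localhost:5000 \
    | paplay
}
```

With the server running, a call such as piper_say en_US-ryan-medium 'He said "hello".' should then be spoken without breaking the JSON.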
The speech dispatcher is the standard shared interface for programs to synthesize speech. For example, Firefox calls the speech dispatcher when it reads web pages aloud.
The speech dispatcher supports various speech synthesizers. For Piper, we need to install a new configuration file piper-generic.conf:
sudo cp piper-generic.conf /etc/speech-dispatcher/modules
You should still edit this file to match the voices and languages that you have downloaded.
We then need to edit the main configuration file
/etc/speech-dispatcher/speechd.conf of the speech
dispatcher to use this Piper configuration file, by adding these
lines:
AddModule "piper" "sd_generic" "piper-generic.conf"
DefaultModule "piper"
This setup lets the speech dispatcher forward requests to the Piper web server.
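Before restarting anything, it may be worth checking that both lines really ended up in the main configuration file, uncommented. A small sketch, in which the function name piper_module_configured is hypothetical:

```shell
# Hypothetical helper: succeed if the given speechd.conf both registers
# and selects the "piper" module. Lines commented out with '#' are
# ignored, because the patterns are anchored at the start of the line.
piper_module_configured() {
    grep -Eq '^[[:space:]]*AddModule[[:space:]]+"piper"' "$1" &&
    grep -Eq '^[[:space:]]*DefaultModule[[:space:]]+"?piper"?' "$1"
}

# Usage:
#   if piper_module_configured /etc/speech-dispatcher/speechd.conf; then
#       echo "The piper module is configured."
#   fi
```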
The speech dispatcher is managed by systemd. We can
restart it with the new configuration:
systemctl --user restart speech-dispatcher
Assuming the Piper web server is still running from our previous test, we can now send our speech through the speech dispatcher, to the Piper web server, to Piper:
spd-say -y en_US-ryan-medium -r 0 'This is a test.'
You should again hear the same speech, this time through the speech dispatcher and the web server.
Should anything fail, the systemd logs may contain more
details:
journalctl --user -xeu speech-dispatcher.service
systemctl --user status speech-dispatcher.service
If that isn’t sufficient to resolve the problem, you can temporarily run the speech dispatcher in debug mode from the command line:
systemctl --user stop speech-dispatcher
systemctl --user disable speech-dispatcher
rm -fr /tmp/speechd-debug
/usr/bin/speech-dispatcher -s -D -t 0
The following log files then contain details about the speech commands:
/tmp/speechd-debug/piper.log
/tmp/speechd-debug/speech-dispatcher.log
/run/user/1000/speech-dispatcher/log/speech-dispatcher.log
/run/user/1000/speech-dispatcher/log/piper.log
After resolving any problems, you can let systemd manage
the speech dispatcher again:
killall -9 speech-dispatcher
systemctl --user enable speech-dispatcher
systemctl --user start speech-dispatcher
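During a debugging session like the one above, note that the paths under /run/user/1000 are specific to the user with ID 1000. The following sketch prints only the log files that exist for the current user. It assumes that the per-user runtime directory is given by XDG_RUNTIME_DIR, which is typically /run/user/<uid>; the function names existing_logs and speechd_logs are hypothetical.

```shell
# Hypothetical helper: print each given path that is a readable regular file.
existing_logs() {
    for log in "$@"; do
        if [ -f "$log" ] && [ -r "$log" ]; then
            printf '%s\n' "$log"
        fi
    done
}

# Assumption: the per-user runtime directory is $XDG_RUNTIME_DIR,
# falling back to /run/user/<uid> if the variable is not set.
runtime_dir="${XDG_RUNTIME_DIR:-/run/user/$(id -u)}"

# Candidate log locations, taken from the debugging notes above.
speechd_logs() {
    existing_logs \
        /tmp/speechd-debug/piper.log \
        /tmp/speechd-debug/speech-dispatcher.log \
        "$runtime_dir/speech-dispatcher/log/speech-dispatcher.log" \
        "$runtime_dir/speech-dispatcher/log/piper.log"
}

# Usage, while the debug session is running:
#   tail -f $(speechd_logs)
```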
So far, we’ve started the Piper web server manually. The proper
approach is to have systemd start it automatically. We need
to install two configuration files, piper-tts.service and piper-tts.socket:
sudo cp piper-tts.{service,socket} /etc/systemd/user
systemd will then start the server when speech synthesis
is invoked for the first time.
Make sure the Piper web server isn’t running anymore, since
systemd will be managing it now.
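One way to confirm that the manually started server is gone is to check that nothing is listening on its port anymore. A minimal sketch, assuming that the ss tool from iproute2 is installed and that the server used the default port 5000. The function name listening_on_port is hypothetical, and the parsing is kept separate from the ss call so that it can be tried on sample output.

```shell
# Hypothetical helper: read the output of `ss -Hltn` on stdin and succeed
# if any listening TCP socket is bound to the given port ($1).
listening_on_port() {
    awk -v port="$1" '
        # Column 4 of the ss output is "Local Address:Port".
        $4 ~ (":" port "$") { found = 1 }
        END { exit !found }
    '
}

# Usage (assumes ss from iproute2):
#   if ss -Hltn | listening_on_port 5000; then
#       echo "Something is still listening on port 5000."
#   fi
```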
We can let systemd verify the configuration files:
systemd-analyze --user verify /etc/systemd/user/piper-tts.{service,socket}
We can then enable and start the service:
systemctl --user daemon-reload
systemctl --user enable piper-tts
systemctl --user start piper-tts
Finally, we can test the whole chain, from the speech dispatcher (managed by
systemd) to the Piper web server (also managed by
systemd) to Piper:
spd-say -y en_US-ryan-medium -r 0 'This is a test.'
At this point, the speech dispatcher will delegate to Piper for all text-to-speech conversion.
Copyright © 2026 Eric Lafortune.