A text-to-speech (TTS) system converts input text into an output audio file with reasonable clarity. Such a solution lets users engage with a computerized environment without having to read through a text or documentation file themselves.
For instance, a text-to-speech tool is invaluable for users with reading or visual difficulties, which makes it a good fit for an e-learning project. It is also an alternative to hiring a voice-over artist, which saves on costs.
The benefits of text-to-speech solutions can be summarized as follows:
- Better user experience since a transcript of any video can be converted to a natural-sounding audio file.
- Improved accessibility and understanding of learning materials, especially for users with reading or visual difficulties.
- Improved reading and learning skills since most text-to-speech solutions support multiple languages.
gosling is a natural-sounding text-to-speech tool that runs in the Linux terminal. If you are familiar with Google’s Cloud Text-to-Speech API, gosling is essentially a command-line wrapper around it.
Prerequisites
- An up-to-date Linux operating system distribution.
- Sudoer/root user privileges.
- Familiarity with using the Linux command-line environment.
- A GCP account with billing enabled (you get 1 million characters free per month and are only charged if you exceed them).
- Once you have a GCP account, enable the TTS API and get a service account.
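Before moving on, it can help to confirm that a service account key is visible to the shell. The sketch below assumes gosling authenticates the way Google's client libraries usually do, through the GOOGLE_APPLICATION_CREDENTIALS environment variable; the article does not confirm this, and check_credentials is a hypothetical helper name.

```shell
# Pre-flight check for a Google Cloud service account key.
# Assumption: gosling reads the key path from GOOGLE_APPLICATION_CREDENTIALS,
# as Google's client libraries normally do.
check_credentials() {
    if [ -z "${GOOGLE_APPLICATION_CREDENTIALS:-}" ]; then
        echo "GOOGLE_APPLICATION_CREDENTIALS is not set" >&2
        return 1
    fi
    if [ ! -r "$GOOGLE_APPLICATION_CREDENTIALS" ]; then
        echo "Key file not readable: $GOOGLE_APPLICATION_CREDENTIALS" >&2
        return 1
    fi
    echo "Using service account key: $GOOGLE_APPLICATION_CREDENTIALS"
}
```

After exporting the variable with the path to your downloaded JSON key, running check_credentials should print the path back; any other result points to a missing or unreadable key file.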
This article walks through installing and testing gosling as a natural-sounding text-to-speech solution.
Install gosling text-to-speech in Linux
We will need a sample text file to demonstrate our text-to-speech solution. Create one with a text editor and type in a few sentences:
$ nano speech.txt
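If you prefer not to open an editor, the same file can be written in one step. This is a minimal sketch; the sample sentences are placeholders, not text from the gosling project.

```shell
# Write a short sample text to speech.txt without opening an editor.
cat > speech.txt <<'EOF'
Hello and welcome to LinuxShellTips.
This sentence will be converted to speech by gosling.
EOF

# Confirm the file has content before handing it to gosling.
wc -c speech.txt
```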
Next, go to the gosling releases page and download the tar.gz file that matches the system architecture of your Linux distribution.
On my end, I will execute the following wget command to download gosling version 0.1.1 (the latest release at the time of writing).
$ wget https://github.com/Samyak2/gosling/releases/download/v0.1.1/gosling-v0.1.1-linux-amd64.tar.gz
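To work out which archive matches your machine, the output of uname -m can be mapped to the label used in the release file names. The sketch below is hedged: only the amd64 file name is confirmed by the command above, the arm64 label is an assumption to verify against the releases page, and release_arch and archive_name are hypothetical helper names.

```shell
# Map `uname -m` output to the architecture label used in release file names.
# Only the amd64 label is confirmed by the article; arm64 is an assumption.
release_arch() {
    case "$1" in
        x86_64|amd64)  echo "amd64" ;;
        aarch64|arm64) echo "arm64" ;;   # assumed label, check the releases page
        *) echo "unsupported architecture: $1" >&2; return 1 ;;
    esac
}

# Build the archive name for a given version tag and machine type.
archive_name() {
    arch=$(release_arch "$2") || return 1
    echo "gosling-$1-linux-$arch.tar.gz"
}
```

Calling archive_name v0.1.1 "$(uname -m)" on an x86_64 machine prints gosling-v0.1.1-linux-amd64.tar.gz, which matches the file downloaded above.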
Proceed to decompress the archive using the tar command.
$ tar -xvzf gosling-v0.1.1-linux-amd64.tar.gz
The gosling file extracted from the tar.gz archive is an executable binary. To run it, call the binary with the input text file as the first argument and the output audio file as the second.
Also, make sure you are in the same directory as the extracted gosling binary when running the commands in this article, since they reference it as ./gosling.
Generating Audio from Text File in Linux
In our case, the command to convert speech.txt into speech.mp3 looks like the following:
$ ./gosling speech.txt speech.mp3
The output audio file should now exist and be playable in any media player.
$ ls -l speech.mp3
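The same file-to-file call extends naturally to a whole directory of text files. The sketch below loops over every .txt file, runs gosling on it, and checks that a non-empty .mp3 came out. convert_all and the GOSLING variable (the path to the binary, defaulting to ./gosling) are hypothetical names introduced for illustration.

```shell
# Convert every .txt file in a directory to an .mp3 with the same base name.
# GOSLING is a hypothetical variable holding the binary path (default ./gosling).
convert_all() {
    dir=${1:-.}
    for txt in "$dir"/*.txt; do
        [ -e "$txt" ] || continue            # no .txt files: the glob stays literal
        mp3="${txt%.txt}.mp3"
        "${GOSLING:-./gosling}" "$txt" "$mp3" || return 1
        if [ -s "$mp3" ]; then
            echo "ok: $mp3"
        else
            echo "empty or missing: $mp3" >&2
            return 1
        fi
    done
}
```

Running convert_all with no argument processes the current directory. Keep in mind that every converted character counts toward the monthly quota mentioned in the prerequisites.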
Generating Audio from Standard Input
We do not necessarily need a text file to generate audio. Passing - in place of the input file makes gosling read the text from standard input, as demonstrated below:
$ echo "Hello and welcome to linuxshelltips" | ./gosling - new.mp3
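Because gosling accepts text on standard input, it can sit at the end of any pipeline. As a sketch, the function below turns each non-empty line of a file into its own numbered MP3. lines_to_audio and the GOSLING variable (path to the binary, defaulting to ./gosling) are hypothetical names, and the line-N.mp3 naming scheme is just one possible choice.

```shell
# Turn each non-empty line of a text file into its own numbered MP3 via stdin.
# GOSLING is a hypothetical variable holding the binary path (default ./gosling).
lines_to_audio() {
    n=0
    # The `|| [ -n "$line" ]` part keeps a final line that lacks a newline.
    while IFS= read -r line || [ -n "$line" ]; do
        [ -n "$line" ] || continue           # skip blank lines
        n=$((n + 1))
        printf '%s\n' "$line" | "${GOSLING:-./gosling}" - "line-$n.mp3" || return 1
    done < "$1"
    echo "$n file(s) written"
}
```

For example, lines_to_audio speech.txt writes line-1.mp3, line-2.mp3, and so on into the current directory.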
Generating and Playing Audio Directly from Standard Input
To play the audio without saving a file, pass - for both the input and the output, then pipe the result to ffplay (part of ffmpeg) in the following manner:
$ echo "LinuxShellTips is awesome" | ./gosling - - | ffplay -nodisp -autoexit -
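The pipeline above only works when ffplay is installed. A small wrapper can check for it first and fall back to writing a file. This is a sketch under assumptions: speak and the GOSLING variable (path to the binary, defaulting to ./gosling) are hypothetical names, and the ffplay flags are the same ones used above.

```shell
# Speak text through ffplay when it is installed; otherwise save it to a file.
# GOSLING is a hypothetical variable holding the binary path (default ./gosling).
speak() {
    text=$1
    fallback=${2:-speech.mp3}
    if command -v ffplay >/dev/null 2>&1; then
        printf '%s\n' "$text" | "${GOSLING:-./gosling}" - - | ffplay -nodisp -autoexit -
    else
        printf '%s\n' "$text" | "${GOSLING:-./gosling}" - "$fallback" &&
            echo "ffplay not found; audio saved to $fallback" >&2
    fi
}
```

With the function defined, speak "LinuxShellTips is awesome" plays the sentence directly, or leaves it in speech.mp3 on systems without ffmpeg.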
More gosling options can be found by running the command:
$ ./gosling --help
If you have the Go programming language installed, you can install gosling directly with the following command:
$ go install github.com/Samyak2/gosling@latest
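Note that go install places the binary in the directory named by GOBIN, or in the bin folder under GOPATH when GOBIN is unset, and that directory may not be on your PATH. The helper below is a minimal sketch for adding it; add_to_path is a hypothetical name.

```shell
# Append a directory to PATH unless it is already listed.
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                         # already present, nothing to do
        *) PATH="$PATH:$1"; export PATH ;;
    esac
}
```

Calling add_to_path "$(go env GOPATH)/bin" makes the installed binary available as plain gosling, without the ./ prefix, for the rest of the shell session.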
All the best in your gosling text-to-speech projects. If you found this guide useful, feel free to leave a comment or feedback.