Belarusian and Russian text-to-speech synthesizer for stationary, mobile and web-based platforms


People normally communicate with one another through speech and hearing, and the same mode of interaction is desirable between humans and machines. Speech synthesis technology allows a mobile or stationary computer to convert individual words, sentences, and other text fragments in either of the two official languages of the Republic of Belarus, Belarusian or Russian, into speech.

TTS-based systems can take the form of multimedia products such as talking electronic answering machines (voicing of fault and warning messages, SMS, e-mails, and chat messages; TTS announcements in queue management systems); audiobooks (sequential reading, educational dialogues, audio guides); Internet radio (RSS readers and website readers); and multimedia “text-image-sound” presentations.

The intended audience covers almost everyone in the country: any attentive listener in a room, or passers-by of any age (children, adults, and senior citizens in the street, at a bank, school, office, or apartment, or in a vehicle). People with disabilities (visually impaired people, or those with weakened vocal cords or hearing) may also become users of a TTS-based system.

Text-to-Speech Synthesizer for Stationary Platforms

The interface for stationary platforms asks the user which language the text will be entered in. The entered text is then passed to a chain of specialized processors (linguistic, intonational (prosodic), phonetic, and acoustic). Finally, the processed text is converted into an audio signal.

The novelty and originality of the TTS design lie in the following: the synthesizer uses the same algorithms for both languages, with implementations altered only slightly to accommodate language-dependent linguistic resources. This significantly saves computer resources.
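The processing chain and the shared-algorithm design described above can be illustrated with a minimal Python sketch. It is not the actual implementation: the stage functions are placeholders, and the names `RESOURCES`, `PIPELINE`, and `synthesize`, as well as the language codes `be` and `ru`, are assumptions introduced only for illustration. The point is that one chain of stages serves both languages, and only the resource table passed to each stage changes.

```python
from typing import Callable, Dict, List

# Hypothetical per-language resources. The real system's linguistic
# resources are not described in the text; these are stand-ins.
RESOURCES: Dict[str, Dict[str, str]] = {
    "be": {"name": "Belarusian"},
    "ru": {"name": "Russian"},
}

Stage = Callable[[dict, dict], dict]


def linguistic(state: dict, res: dict) -> dict:
    # Placeholder: split the text into whitespace-separated tokens.
    return {**state, "tokens": state["text"].split()}


def prosodic(state: dict, res: dict) -> dict:
    # Placeholder: treat the whole input as one intonation phrase.
    return {**state, "phrases": [state["tokens"]]}


def phonetic(state: dict, res: dict) -> dict:
    # Placeholder: lower-cased characters stand in for phonemes.
    phonemes = [list(tok.lower()) for phrase in state["phrases"] for tok in phrase]
    return {**state, "phonemes": phonemes}


def acoustic(state: dict, res: dict) -> dict:
    # Placeholder: no real waveform, just one zero byte per "phoneme".
    count = sum(len(p) for p in state["phonemes"])
    return {**state, "audio": bytes(count)}


# The same stage list is used for every language.
PIPELINE: List[Stage] = [linguistic, prosodic, phonetic, acoustic]


def synthesize(text: str, language: str) -> bytes:
    if language not in RESOURCES:
        raise ValueError(f"unsupported language: {language}")
    res = RESOURCES[language]
    state = {"text": text, "language": language}
    for stage in PIPELINE:
        state = stage(state, res)
    return state["audio"]
```

Calling `synthesize("Добры дзень", "be")` runs the four stages in order and returns the placeholder audio buffer; an unsupported language code raises `ValueError`.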

The developed algorithms for converting numbers into ordinal or cardinal numerals, together with their modifications, make it possible to voice dates (e.g., 25.10.2011), times, temperatures (e.g., 212 °F), scales, technical names (e.g., the Yakovlev Yak-15), abbreviations, and acronyms (e.g., UNIX, Android 4.3). Unlike existing algorithms, they take into account the declension of ordinal numerals by gender, number, and case, and can therefore make synthesized speech sound more natural.
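To make the idea of gender- and case-aware ordinals concrete, here is a small Python sketch for Russian. It is an illustration under stated assumptions, not the synthesizer's algorithm: it covers only the numbers 1 to 31, the singular, and two cases (nominative and genitive), and the names `ordinal` and `voice_day_month` are hypothetical. The year part of a date is deliberately ignored to keep the example short.

```python
# Ordinal stems with their declension pattern (Russian, 1..20 and 30).
STEMS = {
    1: ("перв", "hard"), 2: ("втор", "stressed"), 3: ("трет", "soft"),
    4: ("четвёрт", "hard"), 5: ("пят", "hard"), 6: ("шест", "stressed"),
    7: ("седьм", "stressed"), 8: ("восьм", "stressed"), 9: ("девят", "hard"),
    10: ("десят", "hard"), 11: ("одиннадцат", "hard"),
    12: ("двенадцат", "hard"), 13: ("тринадцат", "hard"),
    14: ("четырнадцат", "hard"), 15: ("пятнадцат", "hard"),
    16: ("шестнадцат", "hard"), 17: ("семнадцат", "hard"),
    18: ("восемнадцат", "hard"), 19: ("девятнадцат", "hard"),
    20: ("двадцат", "hard"), 30: ("тридцат", "hard"),
}

# In compound ordinals only the last word is declined; tens stay cardinal.
TENS_CARDINAL = {20: "двадцать", 30: "тридцать"}

# Singular endings keyed by (case, gender).
_COMMON = {("nom", "f"): "ая", ("nom", "n"): "ое",
           ("gen", "m"): "ого", ("gen", "f"): "ой", ("gen", "n"): "ого"}
ENDINGS = {
    "hard": {("nom", "m"): "ый", **_COMMON},
    "stressed": {("nom", "m"): "ой", **_COMMON},
    "soft": {("nom", "m"): "ий", ("nom", "f"): "ья", ("nom", "n"): "ье",
             ("gen", "m"): "ьего", ("gen", "f"): "ьей", ("gen", "n"): "ьего"},
}

MONTHS_GENITIVE = ["января", "февраля", "марта", "апреля", "мая", "июня",
                   "июля", "августа", "сентября", "октября", "ноября", "декабря"]


def ordinal(n: int, gender: str = "m", case: str = "nom") -> str:
    """Return the Russian ordinal for 1..31 in the given gender and case."""
    if n in STEMS:
        stem, pattern = STEMS[n]
        return stem + ENDINGS[pattern][(case, gender)]
    tens, units = divmod(n, 10)
    if tens * 10 in TENS_CARDINAL and 1 <= units <= 9:
        return TENS_CARDINAL[tens * 10] + " " + ordinal(units, gender, case)
    raise ValueError(f"ordinal not supported in this sketch: {n}")


def voice_day_month(date: str) -> str:
    """Expand the day and month of a DD.MM.YYYY date; the year is ignored."""
    day, month, _year = date.split(".")
    # In dates the day is a neuter nominative ordinal, the month is genitive.
    return ordinal(int(day), "n", "nom") + " " + MONTHS_GENITIVE[int(month) - 1]
```

With these tables, `voice_day_month("25.10.2011")` yields "двадцать пятое октября", and `ordinal(3, "f", "gen")` yields "третьей". A table-driven design like this keeps the declension logic separate from the data, so a second language would need its own stems and endings but could reuse the same functions.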

Text-to-Speech Synthesizer for Mobile Platforms

The text-to-speech synthesis system has been implemented on the J2ME platform, which is used in a wide variety of mobile phones. The synthesizer places low demands on this platform, so it can run on the majority of mobile phones. Another distinctive feature of the TTS is its built-in technology for placing accents (word stress), which, unlike comparable systems, takes heuristic and statistical accent characteristics into account. As a result, the size of the grammatical dictionary is significantly reduced without loss of accent-placement accuracy.
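One way to picture how heuristics and statistics can shrink a stress dictionary is the following Python sketch. It is only an assumption about how such a scheme might be layered, not the synthesizer's actual method: the function `place_accent`, the tiny `EXCEPTIONS` dictionary, and the `SUFFIX_RULES` table are all hypothetical toy data. The only linguistic rule relied on is that the Russian letter "ё" is normally stressed.

```python
from typing import Optional

VOWELS = set("аеёиоуыэюя")

# Toy exception dictionary: word -> character index of the stressed vowel.
# Only words the rules below cannot handle need to be stored here.
EXCEPTIONS = {"молоко": 5}

# Toy "statistical" table: word ending -> which vowel is stressed,
# counted from the end of the word (1 = last vowel). Illustrative values.
SUFFIX_RULES = {"ация": 3, "ист": 1}


def place_accent(word: str) -> Optional[int]:
    """Return the character index of the stressed vowel, or None if unknown."""
    w = word.lower()

    # 1. Explicit dictionary entries win.
    if w in EXCEPTIONS:
        return EXCEPTIONS[w]

    # 2. Heuristic: the letter "ё" is normally stressed.
    if "ё" in w:
        return w.index("ё")

    vowel_positions = [i for i, ch in enumerate(w) if ch in VOWELS]

    # 3. Ending-based rule, longest matching ending first.
    for suffix in sorted(SUFFIX_RULES, key=len, reverse=True):
        if w.endswith(suffix):
            from_end = SUFFIX_RULES[suffix]
            if from_end <= len(vowel_positions):
                return vowel_positions[-from_end]

    # 4. A word with a single vowel has only one candidate.
    if len(vowel_positions) == 1:
        return vowel_positions[0]

    return None
```

Every word resolved by steps 2 to 4 is a word that does not need a dictionary entry, which is the sense in which rules of this kind can reduce dictionary size.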


Text-to-Speech Synthesizer for the Internet

The system has been implemented in PHP, a free scripting language that is widely regarded as the most popular one on the web. Users can visit the website at any time and synthesize speech for any text in the appropriate language. Once the speech has been generated, they can play the resulting sound file, download and save it, or share a link to it with friends via e-mail or a social network. For example, this makes it possible to quickly create language riddle tests.

The Text-to-Speech Synthesizer Is a Widely Used Tool

The text-to-speech synthesizer can voice texts in Belarusian and Russian for a wide range of users. Thanks to its language-independent architecture, it can speak with the same voice (male or female) in different languages. The ability to run the TTS on different platforms makes it usable almost anywhere.

