Speech Duration Predictor


The «Speech Duration Predictor» service lets the user estimate the approximate duration of a spoken presentation. The service takes as input an electronic text in Belarusian, English or Russian; the text can be typed in manually or pasted. As output, the user receives the approximate speech duration in HH:MM:SS format, together with the number of words and characters in the text.

 

Basic terms and concepts

An allophone is a realization of a phoneme: a variant determined by its specific phonetic environment.

A text-to-speech synthesizer (TTS) [1] is a system that generates speech from text. It consists of two blocks: a linguistic text-processing block, which converts the text into a phonemic representation marked with stress, intonation (prosody) and rhythm, and a speech signal processing block, which converts the resulting allophones into a speech audio signal [2].

 

Features of the service

Speech duration is calculated using a text-to-speech synthesizer: the input text is split into allophones (the smallest sound units), and the durations of all allophones in the text are then summed.
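
The summation step can be sketched as follows. This is a minimal illustration, not the service's implementation: the allophone labels, the duration values in `ALLOPHONE_DURATION_MS`, the fallback value and the function name `predict_duration_ms` are all hypothetical, and the sketch assumes the text has already been converted into an allophone sequence by the synthesizer.

```python
# Hypothetical per-allophone durations in milliseconds (illustrative values only,
# not taken from the service's acoustic database).
ALLOPHONE_DURATION_MS = {
    "h": 60,
    "e": 90,
    "l": 70,
    "o": 110,
    "#": 200,  # a pause, treated here as one more unit with its own duration
}


def predict_duration_ms(allophones, durations=ALLOPHONE_DURATION_MS, default_ms=80):
    """Sum the durations of all allophones in the sequence.

    Allophones missing from the table fall back to default_ms
    (an assumption made only for this sketch).
    """
    return sum(durations.get(a, default_ms) for a in allophones)


# Usage: an already-segmented sequence standing in for the synthesizer output.
sequence = ["h", "e", "l", "o", "#"]
print(predict_duration_ms(sequence))  # 530
```

In a real synthesizer the duration of an allophone would typically depend on its context and on prosody, so a flat lookup table like this one is only a simplification of the idea.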

 

Practical value

The service is useful for users who need to give a talk or report. Such events usually have a time limit that the speaker must keep to. The service shows in advance how long it will take to read the entered text aloud, so that the speaker does not exceed the time allotted for the speech. The service also helps in creating acoustic resources for developing and improving a text-to-speech synthesizer: it predicts the duration of an audio recording of a read text for the acoustic database.

 

User Interface Description

The graphical interface of the service includes the following parts, presented in Figure 1.

Figure 1. User interface of the service «Speech Duration Predictor»

 

Area A includes the following buttons:

  • «BE», «EN» and «RU»: localization selection buttons;
  • «?»: opens the service help (this page).

Area B includes the following buttons:

  • The arrow button restores the original example text in the input field;
  • A button with a cross clears the input field.

Area C is the text input field.

Area D is the corner handle used to resize the input field.

Area E is the «Predict speech duration» button, which produces the result.

Area F shows the result returned by the service.

 

User scenarios for working with the service

  1. Enter the text in Belarusian, English or Russian in the input field.
  2. Click on «Predict speech duration!» to get the result (Figure 1).

The result is the predicted duration of the synthesized input text in HH:MM:SS format, as well as the number of visible elements in the text: words and punctuation marks.
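
The shape of this result can be illustrated with a short sketch. It assumes the duration format means hours, minutes and seconds, and it uses simple counting rules (a word is a run of letters or digits, a punctuation mark is any character that is neither a word character nor whitespace). The function names `format_duration` and `count_words_and_punctuation` are hypothetical, and the service's actual counting rules may differ.

```python
import re


def format_duration(total_seconds):
    """Format a non-negative number of seconds as HH:MM:SS (whole seconds)."""
    seconds = int(round(total_seconds))
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def count_words_and_punctuation(text):
    """Return (word_count, punctuation_count) under the simplified rules above.

    Note: an apostrophe inside a word splits it into two words plus one
    punctuation mark, which is a known limitation of this sketch.
    """
    words = re.findall(r"\w+", text)
    punctuation = re.findall(r"[^\w\s]", text)
    return len(words), len(punctuation)


# Usage with made-up input values.
print(format_duration(3725))                                      # 01:02:05
print(count_words_and_punctuation("Hello, world! How are you?"))  # (5, 3)
```

Because `\w` matches Unicode letters in Python 3, the same counting function works for Belarusian and Russian text as well as English.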

 

Links to sources

Service page — https://corpus.by/SpeechDurationPredictor/?lang=en

Text-to-Speech Synthesizer — https://corpus.by/TextToSpeechSynthesizer/?lang=en

 

External links

  1. Text-to-Speech Synthesizer // Platform for processing text and audio information of various thematic domains [Electronic resource]. 2017. Access mode: http://corpus.by/TextToSpeechSynthesizer/?lang=be. Date of access: 30.03.2017.
  2. Algorithms of linguistic text processing for speech synthesis in Belarusian and Russian: dissertation for the degree of Candidate of Technical Sciences: specialty 05.13.01 System analysis, control and information processing / Hetsevich Yu. S.; scientific supervisor Lobanov B. M.; United Institute of Informatics Problems of the National Academy of Sciences of Belarus. Minsk, 2012. 184, [6] leaves: ill., tables, diagrams. Main text in Russian. Bibliography: leaves 153-164.
