Yuras Hetsevich. Interview for NLProc.by, part 1/2


Yuras Hetsevich, Head of the Speech Recognition and Synthesis Laboratory.

Good afternoon, Yuras. Today we’d like you to tell us about your experience in computational linguistics. When did you start, and at what level?

Hello! My name is Yuras Hetsevich. I have a PhD in Engineering and work as a senior lecturer at BSU. I also work at the United Institute of Informatics Problems of the Academy of Sciences, where I am the head of the Speech Recognition and Synthesis Laboratory.

As for computational linguistics, I came to this field quickly and slowly at the same time. It all began when, after graduating from the BSU Faculty of Applied Mathematics and Computer Science, I worked for an IT company. I was bored with the endless PHP sites and also worried about the question of army service. Then suddenly I got an opportunity to apply for a Master’s course. Thank God, I was able to get all the documents in one day and learn all the exam questions in one night, and I enrolled in Master’s studies at the Academy of Sciences. In the list of topics for the Master’s research paper I found one connected with the synthesis of Belarusian speech. I got interested. I’m a Christian and I believe that the Belarusian language was given to this land by God. I went to my research supervisor, Boris Labanov, DSc in Engineering, who dedicated all his life to speech synthesis. After he showed me the work of the laboratory, I decided to take this topic and develop a synthesizer for the Belarusian language.

Actually, in the beginning I didn’t understand that this was a field of computational linguistics. I just wanted to make a machine speak Belarusian.

As a result of my Master’s research we built a prototype text-to-speech synthesizer. It wasn’t fast, but it could utter one word per second. That was enough to get a Master’s degree.

After that I decided to get a PhD. There were two reasons: firstly, I really liked the field; secondly, army service. Although the salary was pretty small, it was better to stay in the Academy than to dig trenches in the army. That’s why I decided to continue doing science. I was given the research topic “Linguistic text processing algorithm for speech synthesis of the Belarusian and Russian languages”.

If you look at the general scheme, you may notice that the scheme of a speech synthesizer actually covers any computational-linguistic program: sentiment analysis, machine translation and other linguistic processors, and maybe even more. Unfortunately, very few people in NLProc work with phonetics and sound signals. Sometimes this area is even excluded from computational linguistics, but that is not right. It is actually broader: it involves work not only with text, but also with sound (speech) signals.

Time passed. I tried to defend my dissertation. It was difficult, but possible. Then I became the head of the laboratory. Over the years of teaching at Belarusian universities, by the Lord’s grace, we were able to assemble a staff of 17 people with different workloads and roles in the fields of speech synthesis and recognition. Now, as a team, we work through every part of the speech synthesizer for Belarusian, Russian, Yakut, and so on. We are also developing university programs covering the specific knowledge a person must have in order to get into our industry and, in the long term, become a broad specialist.

I’m more practice-oriented and less theoretical. It’s both an advantage and a disadvantage.


The advantage is that we often deal with problems that have multiple solutions. Sometimes a solution already exists, and we try to rewrite it completely, step by step, to make sure that we truly own the product.

And to know that everything was done properly?

Yes. Our idea is this: we do not have any serious investor, although one does turn up once every 2-3 years. For instance, we made a synthesizer for the Yakut language when an investor was found. But we also did a project absolutely free of charge for children with poor sight, installing an updated voice for Belarusian speech synthesis. So we make decisions slowly, but we open fragments of our work to others. This makes it easier for us to teach students, and if they get interested, they come to us. We open up 30-40% of our work to make it clear that this is not something closed and that it is useful for education.

Today we are trying to expand our work and do research in computational linguistics. By the way, we are holding a conference on computational linguistics very soon. Everyone is invited.

So that’s our history in short.

Very interesting. You’ve said you work mainly with speech synthesis. Is that so?

No, actually we also work with speech recognition. The results of our work can be found on our site.

Now it is becoming clearer that we need corpus linguistics and semantics as well, because speech synthesizers do not have enough intelligence. Texts are very complex, and they need to be parsed and made understandable. For example, people write with contractions, numbers and icons, and we have to handle all of that for people who are blind or cannot look at the text right now. Therefore it is necessary to perform a very deep linguistic analysis, with which you can solve almost any linguistic processing task.

For instance, people ask me: “What can you add to Machine Translation?”

It’s obvious. Abbreviations are difficult for Machine Translation to handle, and we interpret them. For example: “The plane was flying at a speed of 1000 km / h”. Our program turns the abbreviation “km / h” into words and only then sends the text to Machine Translation.
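The pre-processing step described in this answer can be illustrated with a minimal Python sketch. It is not the laboratory's actual program: the function name `expand_units`, the lookup table `UNIT_WORDS` and the regular expression are all assumptions made for this example, and a real system would need a much larger dictionary plus handling of the number itself.

```python
import re

# Hypothetical lookup table invented for this sketch; a real system
# would need a far larger dictionary of abbreviations.
UNIT_WORDS = {
    "km/h": "kilometers per hour",
    "m/s": "meters per second",
    "kg": "kilograms",
}

# A number followed by a unit abbreviation; spaces around "/" are tolerated,
# so "km / h" and "km/h" are treated the same way.
UNIT_PATTERN = re.compile(r"(\d+(?:[.,]\d+)?)\s*([A-Za-z]+(?:\s*/\s*[A-Za-z]+)?)")


def expand_units(text: str) -> str:
    """Replace known unit abbreviations after a number with full words."""

    def replace(match: re.Match) -> str:
        number, unit = match.group(1), match.group(2)
        key = re.sub(r"\s+", "", unit).lower()  # "km / h" -> "km/h"
        words = UNIT_WORDS.get(key)
        if words is None:
            return match.group(0)  # unknown token: leave the text untouched
        return f"{number} {words}"

    return UNIT_PATTERN.sub(replace, text)


if __name__ == "__main__":
    sentence = "The plane was flying at a speed of 1000 km / h"
    # The expanded sentence is what would then be passed to a translation system.
    print(expand_units(sentence))
```

In this sketch the number stays as digits and only the abbreviation is spelled out; words that follow a number but are not in the table are left unchanged.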

It turns out that you have added yet another step to the solution of the problem. But what do you use: dictionaries or something else?

There are two ways. The first one is to use dictionaries. But we found that a dictionary is not enough; you also need to apply a rule-based method. This is critical for Slavic languages.
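One way to see why a dictionary alone falls short for Slavic languages is number agreement: in Russian, the word form of a unit depends on the number before it. The Python sketch below is only an illustration under that assumption, not the laboratory's method. The names `plural_category`, `UNIT_FORMS` and `expand_counted_unit` are hypothetical, and the rule shown is the standard Russian one/few/many plural rule for counted nouns.

```python
# Hypothetical dictionary part: each unit needs several word forms,
# so a single "km" -> word entry would not be enough.
UNIT_FORMS = {
    "km": {"one": "километр", "few": "километра", "many": "километров"},
}


def plural_category(n: int) -> str:
    """Rule part: pick the Russian plural category for an integer."""
    n = abs(n)
    if n % 10 == 1 and n % 100 != 11:
        return "one"   # 1, 21, 101, ...
    if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14:
        return "few"   # 2-4, 22-24, ...
    return "many"      # 0, 5-20, 25-30, 1000, ...


def expand_counted_unit(n: int, unit: str) -> str:
    """Combine the dictionary and the rule to spell out a counted unit."""
    forms = UNIT_FORMS[unit]
    return f"{n} {forms[plural_category(n)]}"


if __name__ == "__main__":
    for value in (1, 22, 11, 1000):
        print(expand_counted_unit(value, "km"))
```

The dictionary supplies the candidate forms and the rule selects among them; grammatical case, which also changes the form in a full sentence, is ignored here to keep the sketch short.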

The second part is coming soon.

During the preparation of the material we used Spell Checker Service.

Source: http://nlproc.by/post/120011690090
