Yuras Hetsevich. Interview for NLProc.by, part 2/2

We start the second part of the conversation with a reference to some examples of a speech synthesizer.


In the previous part you said that your lab is open to anyone interested. How many people came to you last year? Whom should one contact to get into the lab? What level of knowledge is needed?

A lot of people come through our lab. We work with universities, and in 2014 there were 7 diploma theses, 2 term papers, and 13 students undergoing practical training.

On average, there are 30 people who get practical assignments from their universities and come to us to work on them. The laboratory staff is 15-18 people.

We attend various schools and conferences, and many people know us. The geography of conferences in 2014 and 2015 includes Kiev, Tallinn, Belarus, Ukraine, the Czech Republic, Italy, and Moscow. Some time ago a specialist from France visited us. He had meetings with students at BSU and BSUIR, where he had the opportunity to give lectures and conduct workshops.

We get more and more invitations. Sometimes it is hard to plan, because we cannot tell what will happen next week: invitations and meetings may come up at any time. We try to be open.

That is very good. Moreover, it aligns with the goal of NLProc.by: spreading knowledge and developing the industry in Belarus.

Who else in our country is engaged in computational linguistics?

There are such people. For example, our leader, Boris Lobanov, raised a whole school in Belarus. His students are A.B.Karnevskaya, B.V.Panchanka, A.S.Rylov, T.V.Levkovskaya, G.V.Losik, L.I.Tsyrulnik, A.G.Davydav, I.E.Heydarav, M.P.Degtsyarov, V.U.Kiselev and many others. We are the fifth generation of the school. Some of them are connected with the well-known company Sakrament, others with a branch of the speech information center. What distinguishes us is that we always keep our educational goal in mind as we work.

We work with the compilation method of synthesis, in contrast to the well-known unit selection method based on allophones. It is cheaper: the first version takes about 1-2 man-years to create, and it can be improved indefinitely. The unit selection method takes about 10 man-years to develop. Apart from synthesis, we also work on speech recognition.

Now we are adding it to mobile robots.

That is really interesting. Would you tell us more?

Together with the robotics sector, we are working on robots that speak Belarusian. One of them was demonstrated at TIBO a few years ago.


Where can one use them?

For example, in education, or as home robots that can recognize commands from people and other robots.

At what stage is the work?

Working together with the robotics sector, we have added the first version of the synthesis. It seems that the robot has since been dismantled, but there is a video of it in operation. We are trying to add both speech and hearing. We are working especially on the problem of electronic hearing, because while the robot moves it picks up sounds from the environment that are not commands, and it is important to tell them apart.

Thank you, that's very interesting. We look forward to a demonstration of a robot that speaks Belarusian at one of our community meetings. The next question: what, in your opinion, are the current trends in the development of computational linguistics in our country and in the world?

Yes, it is necessary to distinguish between the two, because a lot is done in Belarus for the world, but Belarusians do not know about it. Recently I learned that some companies here make named entity recognition products for Samsung, which is a good result.

Of course, our market is small.

For example, I can name IHS (formerly Invention Machine), Sakrament, several other offshore companies, and Yandex. All of them are engaged in search and information retrieval.

By the way, the latter company recently made its implementation of Tomita Parser open source. And the last question: how can those who are interested in this field start their careers?

We have accumulated a base of laboratory assignments, including ones on synthesis. Also, many materials on NooJ can be found on our website or on the website of the laboratory's prototypes. But practice is the most reliable way. I advise you to find an open source project, work on it, and get help from those who have experience. As for books: there are a lot of them, some worse, some better. But practice is the best!

Thank you for the interview. 

Thank you.

Original text: http://nlproc.by/post/120085283115
