{"id":2,"date":"2013-11-11T15:41:51","date_gmt":"2013-11-11T12:41:51","guid":{"rendered":"http:\/\/corpus1.by\/?page_id=2"},"modified":"2023-04-17T12:05:41","modified_gmt":"2023-04-17T09:05:41","slug":"navukovy-nakirunak","status":"publish","type":"page","link":"https:\/\/ssrlab.by\/en\/navukovy-nakirunak","title":{"rendered":"Fields of Research"},"content":{"rendered":"<p><span style=\"font-size: 14px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">The speech synthesis and recognition laboratory was created in 1974, <span id=\"result_box\" lang=\"en\"><span class=\"hps\">initially<\/span> <span class=\"hps\">as a department of<\/span> <span class=\"hps\">the <\/span><\/span><span class=\"st\">Central Scientific-Research Institute of Communications <\/span><span lang=\"en\"><span class=\"hps atn\">(<\/span>CSRIC), and <span class=\"hps\">since 1986<\/span> it has been<span class=\"hps\"> a laboratory<\/span> <span class=\"hps\">of the Institute<\/span> <span class=\"hps\">of Technical Cybernetics of NAS of Belarus<\/span><span class=\"hps\">.<\/span><\/span><\/span><\/span> The laboratory focuses on the following research areas: the theory of speech recognition and synthesis and the development of human-machine systems based on speech communication.<\/p>\n<h5 style=\"text-align: justify;\"><strong><span style=\"font-size: 14px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">The main research fields of the laboratory include:<\/span><\/span><\/strong><\/h5>\n<ul>\n<li style=\"text-align: justify;\"><span style=\"font-size: 14px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">High-quality text-to-speech synthesis;<\/span><\/span><\/li>\n<li style=\"text-align: justify;\"><span style=\"font-size: 14px;\"><span style=\"font-family: 
arial,helvetica,sans-serif;\">Computer-assisted personal voice cloning;<\/span><\/span><\/li>\n<li style=\"text-align: justify;\"><span style=\"font-size: 14px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">Multilingual speech synthesis;<\/span><\/span><\/li>\n<li style=\"text-align: justify;\"><span style=\"font-size: 14px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">Robust recognition of sequences of isolated and connected words;<\/span><\/span><\/li>\n<li style=\"text-align: justify;\"><span style=\"font-size: 14px;\"><span style=\"font-family: arial,helvetica,sans-serif;\"><span class=\"st\">Computer telephony integration<\/span>;<\/span><\/span><\/li>\n<li style=\"text-align: justify;\"><span style=\"font-size: 14px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">Computer-assisted rehabilitation systems for <span class=\"st\">people who are deaf, hard of hearing or visually impaired<\/span>;<\/span><\/span><\/li>\n<li style=\"text-align: justify;\"><span style=\"font-size: 14px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">Computational linguistics;<\/span><\/span><\/li>\n<li style=\"text-align: justify;\"><span style=\"font-size: 14px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">Natural language processing;<\/span><\/span><\/li>\n<li style=\"text-align: justify;\"><span style=\"font-size: 14px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">Text pre-processing.<\/span><\/span><\/li>\n<\/ul>\n<h5 style=\"text-align: justify;\"><strong><span style=\"font-size: 14px;\">Scientific approaches and research methodology<\/span><\/strong><\/h5>\n<p style=\"text-align: justify;\">High-quality multilingual, multi-voice text-to-speech synthesis is based on the use of natural speech allophone elements (around 1000 allophones) and on a high degree of imitation of specified male and female voices<span style=\"font-size: 14px;\">.\u00a0The task 
of\u00a0synthetic\u00a0speech\u00a0&#8220;personalization&#8221;\u00a0(computer\u00a0cloning)\u00a0has been successfully solved\u00a0by satisfying\u00a0the following conditions:<\/span><\/p>\n<ol>\n<li>\n<p style=\"text-align: justify;\"><span style=\"font-size: 14px;\"><span style=\"font-family: arial,helvetica,sans-serif;\">Maximally accurate<\/span> modelling\u00a0of\u00a0acoustic,\u00a0phonetic and\u00a0prosodic\u00a0features\u00a0of an individual\u00a0person&#8217;s voice and speech;<\/span><\/p>\n<\/li>\n<li>\n<p style=\"text-align: justify;\"><span style=\"font-size: 14px;\">Minimal distortion of compilation elements during their recording, playback and prosodic modification;<\/span><\/p>\n<\/li>\n<li>\n<p style=\"text-align: justify;\"><span style=\"font-size: 14px;\">No additional transformation of speech elements of the PSOLA (Pitch Synchronous Overlap and Add) or FFT (Fast Fourier Transform) type.<\/span><\/p>\n<\/li>\n<\/ol>\n<p style=\"text-align: justify;\"><span style=\"font-size: 14px;\">To pre-process natural language texts, the NooJ linguistic development environment is used<cite>\u00a0(<u><a href=\"http:\/\/www.nooj-association.org\/\">http:\/\/www.nooj-association.org\/<\/a><\/u>).<\/cite>\u00a0This software makes it possible to develop syntactic and morphological grammars (so-called finite-state automata) and to test them on a large number of texts. 
For that purpose, a Belarusian module for NooJ was developed, which includes several texts, demo versions of grammars and a set of dictionaries\u00a0(<u><a href=\"http:\/\/www.nooj4nlp.org\/resources\/be.zip\">http:\/\/www.nooj4nlp.org\/resources\/be.zip<\/a><\/u>).<\/span><\/p>\n<p><span style=\"font-size: 14px;\">The basic algorithms for speech recognition and for making verbal decisions are implemented on the basis of dynamic signal matching, a new method proposed by the laboratory and modified for word recognition in connected speech. <\/span>The method dynamically aligns the time scale of a word&#8217;s reference description with that of its realization in the speech flow when the beginning and the end of the word are not defined in advance.<\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 14px;\">The main advantage of this method is its ability to determine the probability of a word&#8217;s presence in the running speech flow and to estimate the word&#8217;s time position in the presence of various acoustic disturbances.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-size: 14px;\">The solution to the problem of robust speech recognition is based on two basic approaches:<\/span><\/p>\n<ol>\n<li>\n<p style=\"text-align: justify;\"><span style=\"font-size: 14px;\">The application of known techniques for robust estimation of statistical parameters to specific problems related to analysis, feature extraction, training and speech recognition;<\/span><\/p>\n<\/li>\n<li>\n<p style=\"text-align: justify;\"><span style=\"font-size: 14px;\">The application of\u00a0collective\u00a0recognition methods,\u00a0where the 
final decision combines the results of several recognition rules, each using a different set of speech signal characteristics.<\/span><\/p>\n<\/li>\n<\/ol>","protected":false},"excerpt":{"rendered":"<p>The speech synthesis and recognition laboratory was created in 1974, initially as a department of the Central Scientific-Research Institute of Communications (CSRIC), and since 1986 it has been a laboratory of the Institute of Technical Cybernetics of NAS of Belarus. The laboratory focuses on the following research areas: the theory of speech recognition and synthesis and the [&hellip;]<\/p>\n<a class=\"excerpt\" href=\"https:\/\/ssrlab.by\/en\/navukovy-nakirunak\">Read more...<\/a>","protected":false},"author":1,"featured_media":0,"parent":0,"menu_order":46,"comment_status":"closed","ping_status":"closed","template":"","meta":[],"_links":{"self":[{"href":"https:\/\/ssrlab.by\/en\/wp-json\/wp\/v2\/pages\/2"}],"collection":[{"href":"https:\/\/ssrlab.by\/en\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/ssrlab.by\/en\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/ssrlab.by\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/ssrlab.by\/en\/wp-json\/wp\/v2\/comments?post=2"}],"version-history":[{"count":66,"href":"https:\/\/ssrlab.by\/en\/wp-json\/wp\/v2\/pages\/2\/revisions"}],"predecessor-version":[{"id":9061,"href":"https:\/\/ssrlab.by\/en\/wp-json\/wp\/v2\/pages\/2\/revisions\/9061"}],"wp:attachment":[{"href":"https:\/\/ssrlab.by\/en\/wp-json\/wp\/v2\/media?parent=2"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}