Allophone Frequency Counter


The Allophone Frequency Counter is designed for a specific professional audience: users who work on improving the text-to-speech (TTS) system.

The service builds sorted lists (alphabetically or by frequency) of allophones from an input allophonic text. In particular, it is used to create a minimal set of words that covers all existing allophones of the Belarusian language, which helps avoid wasting time and money when creating new voices for the speech synthesizer.


To use the service, enter a text with allophones in the text input field. In the “stop-allophones” field, located to the right, you can list allophones to ignore during searching and sorting. The service also searches and sorts diphones.

 

Access to the service via the API

To access the «Allophone Frequency Counter» service via the API, send an AJAX request (type: POST) to https://corpus.by/AllophoneFrequencyCounter/api.php. The following parameters are passed in the data array:

  • text – the input text: allophonic, diphonic, or mixed allophonic/diphonic.
  • stopWords – a list of allophones/diphones to exclude from the frequency count, separated by spaces or newlines.
  • phonesType – the type of base units to count. There are three options:
    • allophones – count only allophones;
    • diphones – count only diphones;
    • all – count both allophones and diphones.
  • allophonesType – the form of allophones to count:
    • full – the full form of allophones;
    • short – the shortened form of allophones.
  • examplesNumber – the maximum number of contexts shown in the resulting table.
  • contextSize – the maximum number of characters in each resulting context.

Example of an AJAX request:

$.ajax({
   type: "POST",
   url: "https://corpus.by/AllophoneFrequencyCounter/api.php",
   data: {
      "text": "M004O113,J'013,/,R032O022,D001,N004Y322,/,K001,U032,T000",
      "stopWords": "K001 U032 T000",
      "phonesType": "all",
      "allophonesType": "full",
      "examplesNumber": 1,
      "contextSize": 30
   },
   success: function(msg){ /* handle the server reply here */ },
   error: function() { /* handle a failed request here */ }
});
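
The same request can also be sent without jQuery. The sketch below assumes the endpoint accepts the URL-encoded form body that $.ajax sends by default for a data object. The helper names buildRequestBody and requestAllophoneFrequencies, as well as the default values, are hypothetical and chosen only for illustration. It needs an environment that provides fetch and URLSearchParams (modern browsers, Node.js 18 or later).

```javascript
// Hypothetical helpers: only the URL and the parameter names come from the description above.
const API_URL = "https://corpus.by/AllophoneFrequencyCounter/api.php";

const PHONES_TYPES = ["allophones", "diphones", "all"];
const ALLOPHONES_TYPES = ["full", "short"];

// Validate the documented options and encode them as a form body.
// The defaults are illustrative assumptions, not documented server defaults.
function buildRequestBody({ text, stopWords = "", phonesType = "all",
                            allophonesType = "full", examplesNumber = 1,
                            contextSize = 30 }) {
  if (typeof text !== "string" || text.length === 0) {
    throw new Error("text must be a non-empty string");
  }
  if (!PHONES_TYPES.includes(phonesType)) {
    throw new Error("unknown phonesType: " + phonesType);
  }
  if (!ALLOPHONES_TYPES.includes(allophonesType)) {
    throw new Error("unknown allophonesType: " + allophonesType);
  }
  return new URLSearchParams({
    text,
    stopWords,
    phonesType,
    allophonesType,
    examplesNumber: String(examplesNumber),
    contextSize: String(contextSize),
  });
}

// Send the POST request and return the parsed JSON reply.
async function requestAllophoneFrequencies(params) {
  const response = await fetch(API_URL, {
    method: "POST",
    body: buildRequestBody(params),
  });
  if (!response.ok) {
    throw new Error("request failed with HTTP status " + response.status);
  }
  return response.json();
}
```

Keeping the body construction separate from the network call makes it possible to check the encoded parameters without contacting the server.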

The server returns a JSON array with the following parameters:

  • text — input text.
  • AllPhonesCnt — the number of all phonemes.
  • UniquePhonesCnt — the number of unique phonemes.
  • ResultTable — the resulting frequency table.
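
The difference between the two counters can be illustrated with a local sketch. It is not the service's implementation: it assumes that units are separated by commas, that "/" is a boundary mark that is not counted, and that stop words are dropped before counting. The function name countUnits is hypothetical. Under those assumptions, the text and stop list from the request example give five counted units, all of them distinct.

```javascript
// Local illustration only: the tokenization rules below are assumptions, not the service's code.
function countUnits(text, stopWords) {
  // Stop words are separated by spaces or newlines, as described for the stopWords parameter.
  const stop = new Set(stopWords.split(/\s+/).filter(Boolean));
  const frequencies = new Map();
  let total = 0;
  for (const unit of text.split(",")) {
    const token = unit.trim();
    // Assumed: empty tokens, the "/" boundary mark and stop words are not counted.
    if (token === "" || token === "/" || stop.has(token)) continue;
    frequencies.set(token, (frequencies.get(token) || 0) + 1);
    total += 1;
  }
  return { allPhonesCnt: total, uniquePhonesCnt: frequencies.size, frequencies };
}

const sample = countUnits(
  "M004O113,J'013,/,R032O022,D001,N004Y322,/,K001,U032,T000",
  "K001 U032 T000"
);
console.log(sample.allPhonesCnt, sample.uniquePhonesCnt); // prints: 5 5
```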

For example, the AJAX request shown above produces the following response:

[
   {
      "text": "M004O113,J'013,/,R032O022,D001,N004Y322,/,K001,U032,T000",
      "AllPhonesCnt": 5,
      "UniquePhonesCnt": 5,
      "ResultTable": "<table id=\"resultTableId\" class=\"sortable\"><thead><tr><td>Sound</td><td>Frequency</td><td>Contexts (max. 1)</td></tr></thead><tbody><tr valign=\"top\"><td width=\"5%\"><b>D001</b></td><td width=\"5%\" align=\"center\">1</td><td>M004O113,J'013,/,R032O022,<font color=\"red\">D001</font>,N004Y322,/,K001,U032,T000<br>\n</td></tr><tr valign=\"top\"><td width=\"5%\"><b>J'013</b></td><td width=\"5%\" align=\"center\">1</td><td>M004O113,<font color=\"red\">J'013</font>,/,R032O022,D001,N004Y322,/,K0<br>\n</td></tr><tr valign=\"top\"><td width=\"5%\"><b>M004O113</b></td><td width=\"5%\" align=\"center\">1</td><td><font color=\"red\">M004O113</font>,J'013,/,R032O022,D001,N004Y32<br>\n</td></tr><tr valign=\"top\"><td width=\"5%\"><b>N004Y322</b></td><td width=\"5%\" align=\"center\">1</td><td>004O113,J'013,/,R032O022,D001,<font color=\"red\">N004Y322</font>,/,K001,U032,T000<br>\n</td></tr><tr valign=\"top\"><td width=\"5%\"><b>R032O022</b></td><td width=\"5%\" align=\"center\">1</td><td>M004O113,J'013,/,<font color=\"red\">R032O022</font>,D001,N004Y322,/,K001,U032,T00<br>\n</td></tr></tbody><tfoot></tfoot></table>"
   }
]
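
ResultTable arrives as an HTML string, so a client that needs the numbers has to extract the rows itself. Below is a minimal sketch, assuming the markup keeps the shape shown in the example: the unit sits in a b element inside the first cell of each body row and the frequency in the second cell. The function name parseResultTable is hypothetical. In a browser, parsing the string with DOMParser would likely be sturdier than a regular expression.

```javascript
// Hypothetical helper: pulls (sound, frequency) pairs out of the ResultTable HTML string.
// Assumes each body row starts like <tr ...><td ...><b>SOUND</b></td><td ...>COUNT</td>.
function parseResultTable(html) {
  const rowPattern =
    /<tr[^>]*>\s*<td[^>]*>\s*<b>(.*?)<\/b>\s*<\/td>\s*<td[^>]*>\s*(\d+)\s*<\/td>/g;
  const rows = [];
  for (const match of html.matchAll(rowPattern)) {
    rows.push({ sound: match[1], frequency: Number(match[2]) });
  }
  return rows;
}

// Two rows in the shape of the sample reply; the context cells are replaced with a placeholder.
const sampleTable =
  '<table id="resultTableId" class="sortable"><thead><tr><td>Sound</td>' +
  '<td>Frequency</td><td>Contexts (max. 1)</td></tr></thead><tbody>' +
  '<tr valign="top"><td width="5%"><b>D001</b></td>' +
  '<td width="5%" align="center">1</td><td>context</td></tr>' +
  '<tr valign="top"><td width="5%"><b>J\'013</b></td>' +
  '<td width="5%" align="center">1</td><td>context</td></tr>' +
  '</tbody><tfoot></tfoot></table>';

console.log(parseResultTable(sampleTable));
```

The header row is skipped because its first cell contains no b element, so only rows with a unit and a frequency are returned.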

 

The webpage of the service – https://corpus.by/AllophoneFrequencyCounter/?lang=en
