Word Frequency Counter

The Word Frequency Counter service collects statistics on how often arbitrary character sequences occur in a text. Counting the frequency of words in a text is a particular case of this task.

To obtain the statistics, the user enters any text and chooses two sets of characters that tailor the service to a particular task. The first set contains the characters that can make up a word. Here the user can place an alphabet or any other set of characters to be used for recognizing words in the text.

Frequency of Words

By default, this field contains all alphabetic characters of the Windows-1251 encoding and the digits.

The second set contains the characters that can be part of a word but cannot appear at its beginning. By default, it holds the primary and secondary stress marks, the apostrophe, and the hyphen.
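
The interplay of the two character sets can be illustrated with a minimal Python sketch. It is not the service's actual implementation: the function `tokenize` and the constants `WORD_CHARS` and `INNER_CHARS` are hypothetical names, and ASCII letters and digits stand in for the default Windows-1251 alphabet. The sketch also assumes that a character from the second set may end a word, since the description above only forbids it at the beginning.

```python
import string


def tokenize(text, word_chars, inner_chars):
    """Split text into words using two character sets.

    word_chars:  characters that may appear anywhere in a word.
    inner_chars: characters that may appear in a word, but not at its start.
    """
    word_chars = set(word_chars)
    inner_chars = set(inner_chars)
    tokens = []
    current = []
    for ch in text:
        if ch in word_chars:
            current.append(ch)
        elif ch in inner_chars and current:
            # Accepted only because a word is already in progress.
            current.append(ch)
        elif current:
            # Any other character ends the current word.
            tokens.append("".join(current))
            current = []
    if current:
        tokens.append("".join(current))
    return tokens


# Assumption: ASCII stand-ins for the service's default character sets.
WORD_CHARS = string.ascii_letters + string.digits
INNER_CHARS = "'-"

tokens = tokenize("Don't re-enter -dash x2", WORD_CHARS, INNER_CHARS)
print(tokens)  # ["Don't", 're-enter', 'dash', 'x2']
```

The leading hyphen in `-dash` is dropped because no word has started yet, while the apostrophe and hyphen inside `Don't` and `re-enter` are kept.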

After clicking “Get frequency of Words!”, the user gets a list of words (or character sequences) with their frequencies. The list is presented in three forms: ordered by frequency, ordered alphabetically, and sorted in reverse order.
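
The three forms of the list can be sketched in Python as follows. This is an illustration under assumptions, not the service's code: `frequency_lists` is a hypothetical helper, "sorted in reverse order" is interpreted here as reverse alphabetical order, and Python's default string ordering (by code point) stands in for whatever collation the service uses.

```python
from collections import Counter


def frequency_lists(tokens):
    """Return the word counts in three orderings."""
    counts = Counter(tokens)
    # Most frequent first; ties are broken alphabetically.
    by_frequency = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    # Alphabetical by word (code-point order).
    alphabetical = sorted(counts.items())
    # Assumption: "reverse" means reverse alphabetical order.
    reverse = sorted(counts.items(), reverse=True)
    return by_frequency, alphabetical, reverse


by_frequency, alphabetical, reverse = frequency_lists(
    ["b", "a", "b", "c", "a", "b"]
)
print(by_frequency)  # [('b', 3), ('a', 2), ('c', 1)]
print(alphabetical)  # [('a', 2), ('b', 3), ('c', 1)]
print(reverse)       # [('c', 1), ('b', 3), ('a', 2)]
```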

For example, if we input the New Testament, select case-sensitive analysis, and put “0” in the “Number of examples” field, we will get the following results:


If we deselect case-sensitive analysis, we will get different results, because the same word in different letter cases will be counted as one. The analysis of the same text then gives the following results, ordered by frequency:

і    9155

ў    2536

што    2099

а    2033

ня    1640

на    1542

у    1489

не    1485

яго    1375 etc.
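
The effect of switching off case-sensitive analysis can be sketched in Python. The function `count_words` and its `case_sensitive` parameter are hypothetical, and lowercasing is only an assumption about how the service merges letter cases.

```python
from collections import Counter


def count_words(tokens, case_sensitive=True):
    """Count tokens, optionally merging different letter cases."""
    if not case_sensitive:
        # Assumption: case variants are merged by lowercasing.
        tokens = [t.lower() for t in tokens]
    return Counter(tokens)


tokens = ["The", "the", "THE", "word"]
sensitive = count_words(tokens, case_sensitive=True)
insensitive = count_words(tokens, case_sensitive=False)

print(sensitive["the"])    # 1
print(insensitive["the"])  # 3
```

With case-sensitive analysis, `The`, `the`, and `THE` are three separate entries; without it, they collapse into a single entry with the combined count.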

To get contexts of words (or character sequences), the user should specify the desired number in the “Number of examples” field. A context here means a text fragment of 7 words: 3 words before the found word, the word itself, and 3 words after it. For example, if the number of context examples is 2, the result of the analysis includes two such fragments for a word.
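
As an illustration separate from the service's own output, the 7-word context window can be sketched in Python. The function `contexts` and its parameters are hypothetical. The sketch assumes that the window is simply shortened near the start or end of the text and that the first matches found are the ones returned.

```python
def contexts(tokens, target, max_examples, window=3):
    """Return up to max_examples fragments around occurrences of target.

    Each fragment holds up to `window` words before the match, the match
    itself, and up to `window` words after it.
    """
    found = []
    for i, tok in enumerate(tokens):
        if len(found) >= max_examples:
            break
        if tok == target:
            start = max(0, i - window)  # do not run past the text start
            found.append(" ".join(tokens[start:i + window + 1]))
    return found


tokens = "a b c d e f g h d i j".split()
examples = contexts(tokens, "d", max_examples=2)
print(examples)  # ['a b c d e f g', 'f g h d i j']
```

The second fragment has only 6 words because the text ends two words after the match. With `max_examples=0` the function returns an empty list.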


The service webpage – http://corpus.by/WordFrequencyCounter/?lang=en

