UDC Code Finder


The «UDC Code Finder» service returns a list of Universal Decimal Classification (UDC) codes whose descriptions contain a given word. The input is the word to search for. The output gives the following information about each UDC class in whose description the word occurs:

  • class code;
  • class description in English;
  • class description in Belarusian.

 

Basic terms and concepts

UDC (Universal Decimal Classification) is a document indexing language: a classification system that covers all areas of human knowledge. UDC is organized as a holistic system in which all areas of knowledge are interconnected, and it is designed to describe and index the content of information resources regardless of carrier, form, format and language. UDC is an analytico-synthetic, faceted classification system with a detailed vocabulary and syntax. This makes indexing powerful and greatly simplifies searching in large volumes of information.

 

Practical value

The Belarusian-language edition of UDC 2016 was created from the database that the «UDC Code Finder» service uses. The publication contains a translation from English into Belarusian of the data for all UDC classes that had been used in worldwide document-indexing practice up to the release date.

The service is expected to be in demand among libraries, organizations that sell printed media, information centers that systematize documents, organizations that maintain document collections, and electronic information-retrieval services.

The service will also considerably ease the work of researchers preparing articles for periodicals that are included in the list of the Higher Attestation Commission or indexed in international abstract databases.

 

Service features

At the moment, the service has the following limitations:

  • The search is performed only in Belarusian and only for a single word. A word is any combination of characters entered in the corresponding field, so the word must be entered without leading or trailing spaces or other extra characters.
  • The number of results is limited to 30. If the service finds more entries in the database, the user is informed of this. Codes and their descriptions that are not included in the output can be obtained by contacting the developers directly, as part of the work on improving the Belarusian-language version of UDC.

 

Service operation algorithm

Algorithm input data:

  • User text input, UText;
  • User limit on the number of results, Limit;
  • Limit on the number of results actually applied by the program (default 30), ActualLimit;
  • A set of Cyrillic and Latin characters in upper and lower case, Letters;
  • A set of accent characters, Accents;
  • A set of apostrophe characters, Apostrophes;
  • A set of results, Result.

The beginning of the algorithm.

Step 1.1. Create the set LettersOfWord by merging the Letters, Accents, and Apostrophes sets.

Step 1.2. Convert the user's text input UText into a form suitable for programmatic processing and store the result in the variable SearchData. A valid SearchData is a sequence of characters, each of which belongs to the LettersOfWord set or is a space. This is done in steps 1.2.1–1.2.2.

Step 1.2.1. If SearchData contains apostrophe characters, replace them with the single unified character «’».

Step 1.2.2. Convert all SearchData characters to lowercase.
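Steps 1.1–1.2.2 can be sketched as follows. This is a minimal illustration, not the service's code: the contents of the Letters, Accents and Apostrophes sets are not listed in this description, so the character sets below are hypothetical, and silently dropping characters outside LettersOfWord is also an assumption.

```javascript
// Hypothetical character sets; the real Letters, Accents and Apostrophes
// sets used by the service are not given in this description.
const APOSTROPHES = "'\u2019\u02BC\u2018`";
const ACCENTS = "\u0301\u0300"; // combining acute and grave accents
const UNIFIED_APOSTROPHE = "\u2019"; // the character ’

// Letters: Latin and Cyrillic letters in either case (assumed ranges).
function isLetter(ch) {
  return /^[A-Za-z\u0400-\u04FF]$/.test(ch);
}

function normalizeSearchData(uText) {
  let out = "";
  for (const ch of uText) {
    if (APOSTROPHES.includes(ch)) {
      out += UNIFIED_APOSTROPHE; // step 1.2.1: unify apostrophes
    } else if (isLetter(ch) || ACCENTS.includes(ch) || ch === " ") {
      out += ch; // character belongs to LettersOfWord or is a space
    }
    // Assumption: any other character is dropped.
  }
  return out.toLowerCase(); // step 1.2.2: lowercase
}
```

For example, normalizeSearchData("Аб'ект") yields the string «аб’ект» with the unified apostrophe.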

Step 1.3. If 0 < Limit ⩽ ActualLimit, set ActualLimit to Limit; otherwise leave ActualLimit unchanged.
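The rule in step 1.3 can be written as a small function. The name resolveLimit is hypothetical, and treating empty or non-integer input as "no limit given" is an assumption about how the service handles such values.

```javascript
const DEFAULT_ACTUAL_LIMIT = 30; // default stated in the algorithm input data

// Returns the limit to apply: the user's Limit if 0 < Limit <= ActualLimit,
// otherwise ActualLimit unchanged.
function resolveLimit(limit, actualLimit = DEFAULT_ACTUAL_LIMIT) {
  const n = Number(limit); // "" and null become 0, undefined becomes NaN
  // Assumption: non-integer values are treated as invalid.
  return Number.isInteger(n) && n > 0 && n <= actualLimit ? n : actualLimit;
}
```

With the default of 30, resolveLimit("10") gives 10, while resolveLimit(0), resolveLimit(100) and resolveLimit("") all give 30.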

Step 2.1. Query the database: search the UDC table for all rows in which SearchData occurs at any position in the Belarusian field. Sort the results according to the UTF-8 character table. Store the rows found in the Found set, each element of which consists of the fields Notation (UDC code), English (data from the English version of UDC) and Belarusian (data from the Belarusian version of UDC). Store the total number of Found elements in the FoundCount variable.
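The query in step 2.1 can be imitated on an in-memory array. This sketch makes several assumptions: the real service queries a database whose schema is not described here, the match is taken to be case-insensitive, and the sort key is taken to be Notation compared as a plain string (which agrees with the order of the example API response later in this document). The three sample rows are copied from that response.

```javascript
// Sample rows taken from the example API response in this document.
const sampleUdcTable = [
  { notation: "523.31", english: "The Earth as an astronomical body", belarusian: "Зямля як астранамічны аб’ект" },
  { notation: "165.3", english: "Object, scope and limits of knowledge", belarusian: "Аб’ект, аб’ём і межы ведаў" },
  { notation: "316.1", english: "Object and scope of sociology", belarusian: "Аб’ект і сфера дзейнасці сацыялогіі" },
];

// searchData is expected to be already normalized (lowercase, unified apostrophe).
function findClasses(udcTable, searchData) {
  const found = udcTable
    // substring match at any position in the Belarusian field
    .filter((row) => row.belarusian.toLowerCase().includes(searchData))
    // plain string comparison of notations (assumed sort key)
    .sort((a, b) => (a.notation < b.notation ? -1 : a.notation > b.notation ? 1 : 0));
  return { found, foundCount: found.length };
}
```

Calling findClasses(sampleUdcTable, "аб’ект") returns all three rows ordered 165.3, 316.1, 523.31.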

Step 2.2. Highlight SearchData in color in the Belarusian field of each Found element, using the corresponding HTML tags and their attributes.
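One way to perform the highlighting in step 2.2 is a case-insensitive replacement that wraps each occurrence in markup. The tags used here are the ones that appear in the example API response in this document; the function names are hypothetical, and the sketch does not HTML-escape the surrounding text.

```javascript
// Escape regular-expression metacharacters so the word is matched literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Wrap every occurrence of searchData in bold blue markup, keeping the
// original letter case of the matched text.
function highlight(text, searchData) {
  if (searchData === "") return text; // nothing to highlight
  const pattern = new RegExp(escapeRegExp(searchData), "gi");
  return text.replace(pattern, (match) => `<b><font color="blue">${match}</font></b>`);
}
```

Because the match is case-insensitive, searching for «аб’ект» also highlights «Аб’ект» at the start of a description.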

Step 2.3. Create the variable Step = 0 for subsequently writing data to the Result set.

Step 2.4. If Step < ActualLimit and Step < FoundCount, create the triple <Found.Notation[Step], Found.English[Step], Found.Belarusian[Step]>, add it to the Result set, increment Step and repeat step 2.4; otherwise go to step 3.
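Steps 2.3–2.4 amount to copying at most ActualLimit elements from Found into Result. A minimal sketch with a hypothetical function name follows; the loop also stops once Found is exhausted, so that searches with fewer than ActualLimit matches are handled safely.

```javascript
// found: array of { notation, english, belarusian } objects.
// Returns an array of [notation, english, belarusian] triples.
function collectResults(found, actualLimit) {
  const result = [];
  let step = 0; // step 2.3
  while (step < actualLimit && step < found.length) { // step 2.4
    const { notation, english, belarusian } = found[step];
    result.push([notation, english, belarusian]);
    step += 1;
  }
  return result;
}
```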

Step 3. Display the values of the ActualLimit and FoundCount variables and the contents of the Result set in a special format.
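The "special format" is not specified in the algorithm itself. The sketch below reproduces the HTML layout seen in the example API response in this document; that the service builds its output this way is an assumption, and renderResults is a hypothetical name.

```javascript
// result: array of [notation, english, belarusian] triples, where the
// Belarusian text may already contain highlighting markup.
function renderResults(result) {
  return result
    .map(
      ([notation, english, belarusian]) =>
        `<p><b>Notation:</b> ${notation}<br><b>English:</b> ${english}` +
        `<br><b>Belarusian:</b> ${belarusian}</p><br>`
    )
    .join("");
}
```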

The end of the algorithm.

 

User interface description

The user interface of the service is shown in Figure 1.

 

Figure 1 – The graphical interface of the «UDC Code Finder» service

 

The interface contains the following areas:

  • a query input field;
  • a field for entering the limit on the number of results;
  • the «Search!» button, which starts processing and shows the results in the output field.

After clicking the «Search!» button, the user receives the following information for each result found:

  • class code;
  • class description in English;
  • class description in Belarusian.

 

User scenarios for working with the service

Scenario 1. Search with the default number of results.

  1. Enter the word in the query input field.
  2. Leave the «Number of results limit» field unchanged or clear it.
  3. Click the «Search!» button.
  4. Get information.

Scenario 2. Search with a specific number of results.

  1. Enter the word in the query input field.
  2. Enter the required number of results (maximum 30) in the «Number of results limit» field.
  3. Click the «Search!» button.
  4. Get information.

A possible result of the service's work is shown in Figure 2.

 

Figure 2 – A result produced by the «UDC Code Finder» service

 

Access to the service via the API

To access the «UDC Code Finder» service via the API, send a POST AJAX request to https://corpus.by/UdcCodeFinder/api.php. The following parameters are passed in the data object:

  • localization – localization of the result. The following options are available: be, en, ru.
  • text – the word to search for.
  • limit – the limit on the number of results.

Example of AJAX request:

$.ajax({
   type: "POST",
   url: "https://corpus.by/UdcCodeFinder/api.php",
   data: {
      "localization": "en",
      "text": "аб’ект",
      "limit": "30"
   },
   success: function(msg) { },   // msg holds the server response
   error: function() { }         // called if the request fails
});
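Where jQuery is not available, an equivalent request can be sketched with the standard fetch API. The body is sent form-encoded, which is how jQuery encodes the data object by default; whether the endpoint accepts requests from other origins or other encodings is not stated here, and the function names are hypothetical.

```javascript
const UDC_API_URL = "https://corpus.by/UdcCodeFinder/api.php";

// Build the URL and fetch options without sending anything.
function buildUdcRequest(text, limit = 30, localization = "en") {
  const body = new URLSearchParams({ localization, text, limit: String(limit) });
  return {
    url: UDC_API_URL,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/x-www-form-urlencoded; charset=UTF-8" },
      body: body.toString(),
    },
  };
}

// Send the request and return the parsed JSON response.
async function searchUdc(text, limit, localization) {
  const { url, options } = buildUdcRequest(text, limit, localization);
  const response = await fetch(url, options);
  if (!response.ok) throw new Error(`HTTP ${response.status}`);
  return response.json();
}
```

A call such as searchUdc("аб’ект", 30, "en") returns a promise that resolves to the parsed response.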

The server returns a JSON array containing one object with the following fields:

  • text – the input word that was searched for;
  • limit – the input limit on the number of results;
  • result – the search result in HTML format;
  • resultCnt – the number of results found.

For example, the AJAX request above produces the following response:

[
   {
      "text": "аб’ект",
      "limit": "30",
      "result": "<p><b>Notation:</b> 165.3<br><b>English:</b> Object, scope and limits of knowledge<br><b>Belarusian:</b> <b><font color=\"blue\">Аб’ект</font></b>, аб’ём і межы ведаў</p><br><p><b>Notation:</b> 2-13<br><b>English:</b> The Holy. The sacred. The supernatural. Object(s) of religion/worship<br><b>Belarusian:</b> Святое. Сакральнае. Звышнатуральнае. <b><font color=\"blue\">Аб’ект</font></b>(ы) рэлігіі/культу</p><br><p><b>Notation:</b> 316.1<br><b>English:</b> Object and scope of sociology<br><b>Belarusian:</b> <b><font color=\"blue\">Аб’ект</font></b> і сфера дзейнасці сацыялогіі</p><br><p><b>Notation:</b> 368.025.2<br><b>English:</b> Object of insurance: persons or things at risk of damage or injury<br><b>Belarusian:</b> <b><font color=\"blue\">Аб’ект</font></b> страхавання: асобы або рэчы, якім пагражаюць пашкоджанні або страты</p><br><p><b>Notation:</b> 368.025.3<br><b>English:</b> Object of insurance: property or item of property at risk of loss<br><b>Belarusian:</b> <b><font color=\"blue\">Аб’ект</font></b> страхавання: прадметы ўласнасці або маёмасць у цэлым, якія могуць быць страчаныя</p><br><p><b>Notation:</b> 523.31<br><b>English:</b> The Earth as an astronomical body<br><b>Belarusian:</b> Зямля як астранамічны <b><font color=\"blue\">аб’ект</font></b></p><br>",
      "resultCnt": "6"
   }
]
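Since the result field is an HTML string, a client that needs structured data has to parse it. The sketch below does so with a regular expression tailored to the layout of the example response; it is an illustration only and would need adjusting if the service changes its markup. The function name parseUdcResponse is hypothetical.

```javascript
// body: the raw JSON text returned by the service.
function parseUdcResponse(body) {
  const [payload] = JSON.parse(body); // the array holds a single object
  const entryPattern =
    /<p><b>Notation:<\/b> (.*?)<br><b>English:<\/b> (.*?)<br><b>Belarusian:<\/b> (.*?)<\/p>/g;
  const entries = [];
  for (const m of payload.result.matchAll(entryPattern)) {
    entries.push({
      notation: m[1],
      english: m[2],
      belarusian: m[3].replace(/<[^>]+>/g, ""), // strip highlighting tags
    });
  }
  return {
    text: payload.text,
    limit: Number(payload.limit),
    resultCnt: Number(payload.resultCnt),
    entries,
  };
}
```

Applied to the response above, this would give six entries, the first with notation 165.3.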

 

Source references

Service page: https://corpus.by/UdcCodeFinder/?lang=en

UDC Consortium website: https://www.udcc.org/

UDC tables in Belarusian: https://www.udcsummary.info/php/index.php?lang=be&pr=Y

Cross references

  1. Драгун, А. Я. Аўтаматызаваныя сродкі атрымання расшыфроўкі і спісаў кодаў беларускамоўнай Універсальнай дзесятковай класіфікацыі / А. Я. Драгун, Я. С. Зяноўка, М. С. Галаўчак, С. С. Маеўскі, Ю. С. Гецэвіч // Материалы VII Международного конгресса «Библиотека как феномен культуры» : Краеведение и страноведение в сохранении культурного разнообразия, Минск, 21–22 октября 2020 г. / Национальная библиотека Беларуси ; [сост.: Т. В. Кузьминич, А. А. Суша]. – Минск, 2020. – С. 300-306.
  2. Станіславенка, Г.Р. Выкарыстанне камп’ютарна-лінгвістычных сродкаў для перакладу ўніверсальнай дзесятковай класіфікацыі дамена “тэатр” з англійскай на беларускую мову і генерацыя алфавітна-прадметнага паказальніка / Г.Р. Станіславенка, Ю.С. Гецэвіч, С.І. Лысы // Актуальные вопросы германской филологии и лингводидактики: материалы XX Междунар. науч.-практ. конф. / Брест. гос. ун-т  имени А.С. Пушкина; редкол.: Е. Г. Сальникова [и др.]. — Брест : Альтернатива, 2016. — C. 264-266.
  3. Станиславенко, А.Г. Этапы подготовки первого издания УДК на беларусском языке / А.Г. Станиславенко, С.И. Лысы, Ю.С. Гецевич // Информация в современном мире : доклады Международной конференции, Москва, 25-26 октября 2017 г. / ВИНИТИ РАН. — Москва : ВИНИТИ РАН, 2017. — C. 297-303.
