New collections and resources available in the VLO CLARIN

With the Virtual Language Observatory (VLO), CLARIN provides a means of exploring language resources and tools. The content made available through the VLO is provided by CLARIN centres as well as a number of external sources. This content is far from static: three times a week, CLARIN “harvests” resource descriptions (metadata) from these sources to ensure an up-to-date reflection of the status of available language resources, services and tools. New sources are introduced on a regular basis as well. As a result, the number of resources and collections that can be discovered by searching and browsing through the VLO has been steadily expanding over the years.

This post provides an update on the content that can be discovered through the VLO by highlighting a number of interesting recent additions. First, we will illustrate newly selected content aggregated by and retrieved from Europeana. Then we will showcase four centres that have started providing metadata to the VLO in the past half year: the Lund University Humanities Lab, the ZIM Centre for Information Modelling in Graz, the CLARIN centre for Latvian language resources and tools (Riga) and the Speech Synthesis and Recognition Laboratory of UIIP NAS Belarus.

Note that many CLARIN centres and other providers are constantly adding new resources to their repositories and catalogues. So when visiting the VLO, make sure to also look for resources matching your interests beyond the collections highlighted below!


The Speech Synthesis and Recognition Laboratory of UIIP NAS Belarus works in the fields of text and speech processing on the basis of human-human, human-machine and machine-machine communication. The Lab has expertise in building systems for stationary, mobile and web-based platforms for the Belarusian, Russian and English languages.

A dedicated platform has been developed, and is continually being updated, to provide users with a set of 50+ tools (services) for processing text, voice and other data. The services are grouped into thematic domains for more convenient use in specific fields of application.

The approach taken for each service is simple: the user can run the service by clicking a single button, after which the sample input data is processed and the results are shown. The user is then invited to enter their own data and adjust the settings before running the tool again. This approach helps students and researchers to get up to speed on NLP and to test new hypotheses faster.

The platform provides tools for tokenization, morphological analysis, part-of-speech tagging, frequency counting, spell checking and text-to-speech, as well as a voiced electronic grammatical dictionary and many others.

The Lab has recently started creating CMDI metadata for all of its online resources, which means that some of the services are now available via the VLO. All services can also be accessed directly through the platform. More information is available on the website of the Speech Synthesis and Recognition Laboratory of UIIP NAS Belarus.

Yuras Hetsevich

