HR-CLARIN research infrastructure and open science repository

poster presentation × friday × 13.30-15.00

Daša Farkaš

University of Zagreb Faculty of Humanities and Social Sciences 
Zagreb, Croatia

Vanja Štefanec

University of Zagreb Faculty of Humanities and Social Sciences 
Zagreb, Croatia

Marko Tadić

University of Zagreb Faculty of Humanities and Social Sciences 
Zagreb, Croatia

One of the biggest challenges for any researcher is keeping up with the large amount of research data and results and separating relevant data from less relevant or completely useless data. Research infrastructures, thanks to their full commitment to open science, provide their communities with access to available resources, maximising the findability, retrievability, interoperability and reusability of data. Research infrastructures are considered to be centres of knowledge and innovation and are one of the fundamental pillars of the European Research Area. According to the Research Infrastructure Development Roadmap in the Republic of Croatia 2023–2027 published by the Ministry of Science and Education, research infrastructures “provide unique knowledge, expertise, comprehensive resources and services to research communities to conduct research and stimulate the development of innovation. They include scientific equipment or sets of instruments, knowledge-based resources such as collections, archives or scientific data infrastructures, computing systems, communication networks and other infrastructure of a unique nature.” (Roadmap, 2023:1)

They should also be open to external users, as they require, attract and retain high-quality researchers. According to the Ministry’s Plan, they are key to achieving excellence in research and innovation. Croatian research infrastructures should promote the use of the FAIR principles for research data, enable open and transparent access to infrastructure for all relevant stakeholders under equal conditions, strengthen international cooperation and the visibility of Croatian scientists and their success in international projects, encourage excellence in science, and consolidate research communities at the national level.

CLARIN ERIC, a research infrastructure for language resources and technology, is creating and maintaining an infrastructure to support the sharing, use and sustainability of language data and tools for researchers in the humanities and social sciences. It has grown into a network of 25 member and observer third-party countries, with 70 CLARIN centres, over 900,000 records in its repositories, and an immeasurable number of contributors, users, and trainers. One of the members of the CLARIN research infrastructure is HR-CLARIN, a Croatian research infrastructure that provides language resources, technologies and expertise, as well as knowledge transfer to researchers in the humanities and social sciences, with a focus on Croatian language resources and tools. It also develops and stores language resources for other languages, e.g. Latin and Old Church Slavonic.

Building a community of users and engaging with them mainly takes place through the activities of Croatina – CLARIN’s Knowledge Center (K-Center) for the Croatian language, which was founded in 2024 and involves two institutions that are both members of the national consortium HR-CLARIN: the Institute of Linguistics of the Faculty of Humanities and Social Sciences, University of Zagreb (FFZG) and the Institute of the Croatian Language (IHJ). Croatina provides relevant knowledge about the Croatian language, promotes the use of language technologies for the Croatian language, offers users support via a helpdesk that provides relevant information on topics related to the Croatian language, and advises users on building and storing their own language resources.

The backbone of HR-CLARIN is a repository for storing language resources whose structure strongly supports open science. Once established, the HR-CLARIN repository opened up access to the storage and sharing of language resources for Croatian scholars. In terms of software, HR-CLARIN is the first CLARIN repository launched on Lindat DSpace v7. Users can store their language resources in the repository, where each resource receives a unique persistent identifier (PID), which it is recommended to include in any citation of the resource. Since the citation of data sources is still an unresolved issue within the Croatian scientific community, CLARIN also offers best-practice recommendations for citation using persistent identifiers so that, on the one hand, the work of the data authors is adequately acknowledged and, on the other hand, the dataset used is uniquely identified, which helps ensure the reproducibility of research. Unlike publications, digital datasets are not fixed entities. They can change over time and go through several versions, for example when the authors decide to upgrade or expand their language resource. The persistent identifier system ensures that the version used will always be available, while at the same time drawing users’ attention to possible newer versions. Finally, if for some reason the location of the resource changes, persistent identifiers ensure that the user will always be directed to the current location of the resource. Users can also search for and access available resources in a manner that respects the licensing terms specified by the resource author. Croatian language resources stored in the HR-CLARIN repository are directly included in the European federation of CLARIN ERIC repositories and are visible and, depending on the chosen license, accessible to all authenticated researchers and/or the wider interested public.

keywords

academic publishing; journal evaluation; journal ranking; Open Access; predatory journals; research integrity

References

Danzin, A. (1992). Towards a European Language Infrastructure. Brussels: Commission of the European Communities, DG XIII.

de Jong, F., Fišer, D., Frontini, F., Van Uytvanck, D., Witt, A. (2022). Language matters. In D. Fišer, A. Witt (Eds.), CLARIN – The Infrastructure for Language Resources (pp. 31–42). Berlin/Boston: De Gruyter.

Krauwer, S., Maegaard, B. (2022). CLARIN – How It Started. In D. Fišer, A. Witt (Eds.), CLARIN – The Infrastructure for Language Resources (pp. 3–24). Berlin/Boston: De Gruyter.

Ministry of Science and Education. (n.d.). Research Infrastructure Development Roadmap of the Republic of Croatia 2023 – 2027.
