Tool – L2 Finnish model

The L2 Finnish model is a classification model trained on CEFR-annotated data containing fictional and non-fictional texts written by speakers of Finnish as a second language (L2). With the model you can classify texts into the following CEFR…

Tool – Finnish Forum Scrapers

This is an application for scraping comment data from Finnish resources with high user traffic. Access: Tutorial/Demo. Developed by: University of Helsinki. Contact: Matti Nelimarkka.

Tool – Document understanding

Document Understanding Tools: The archival data team, consisting of Venla and Ida, has produced tools for document understanding, which covers various kinds of document processing, such as named entity recognition and document type classification. Named entity…

Tool – Educational resource development

This document provides an updated report on educational resource development in DARIAH-FI for the 2024–2025 funding period. Developed by: Tampere University.

Tool – Educational material

This document includes information on the educational materials relevant to the DARIAH-FI research infrastructure and guidance on which courses may help users make more efficient use of its resources. The document also includes an overview of the state of…

Tool – Protocol for collecting workshop data

Guideline for collecting user experiences from workshops and training sessions: This document is intended to serve as an initial guide for collecting user experience data from workshops and training sessions related to the resources developed by the FIN-CLARIAH consortium. Developed…

Tool – Research Data Management handbooks

A collection of open-access digital handbooks on research data management for SSH fields, edited by the Helsinki Institute for Social Sciences and Humanities in spring/autumn 2024. The five guides cover: texts, register data, surveys, social…

Tool – DARIAH-FI Zotero library

A public directory of publications (research articles, conference proceedings, data publications) that point to, explain, or introduce use cases for the infrastructures developed by the DARIAH-FI partners for the FIN-CLARIAH project. Contact: Inés Matres.

Tool – Forensic Linguistics Corpus and Search Interface C.R.I.M.E.

This resource is a structured, searchable corpus comprising audio and ASR-generated transcripts from investigative interviews, courtroom interactions, and related media. Access the database: the static dataset: Additional information (user guide, proceedings…