Resources

This resource consists of two tools: one to classify toxic data in Finnish (e.g., insults, obscene language) from datasets retrieved from social media platforms; and another to identify registers (genres, e.g., reviews, interviews, news reports) from web content in diverse languages.

Developed by

This resource collects chat data from the livestream services Twitch and YouTube, enabling researchers to retrieve and analyze larger samples of chat data from both platforms.

The tools sidebar offers several ways to collect data, as well as sections for chat content classification based on machine learning and video clip analysis based on multimodal large language models.

Resource developed by the University of Jyväskylä in collaboration with Tampere University.
Guidance can be found on the website of the resource.

Tutorial/Demo

Named entity recognition (UI): https://arkkiivi.fi/

Named entity recognition (Hugging Face): https://huggingface.co/Kansallisarkisto/finbert-ner

Document type classification: https://huggingface.co/jyu-digihum/findoctype

Most of the tool development has been conducted in collaboration with the National Archives of Finland.
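
As a rough illustration of how the Hugging Face NER model linked above might be used from code, the sketch below loads it through the generic `transformers` token-classification pipeline and groups the recognized entities by label. It assumes the model works with that standard pipeline; the model card is the authoritative source for recommended usage. The helper names `load_ner` and `entities_by_type` are hypothetical, and the entity labels the model returns are not documented here.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping

# Model identifier taken from the Hugging Face URL listed above.
MODEL_ID = "Kansallisarkisto/finbert-ner"


def load_ner(model_id: str = MODEL_ID):
    """Build a token-classification pipeline for the given model.

    Assumption: the model is compatible with the generic pipeline.
    The import is done lazily so the helper below can be used without
    `transformers` installed; the model is downloaded on first use.
    """
    from transformers import pipeline

    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",  # merge word pieces into whole entities
    )


def entities_by_type(results: Iterable[Mapping]) -> Dict[str, List[str]]:
    """Group aggregated pipeline results by their entity label.

    Each result is expected to carry the keys "entity_group" and "word",
    as produced when aggregation_strategy is set.
    """
    grouped: Dict[str, List[str]] = defaultdict(list)
    for item in results:
        grouped[item["entity_group"]].append(item["word"])
    return dict(grouped)


# Example usage (requires `transformers`, a backend such as PyTorch,
# and network access to download the model):
#
#     ner = load_ner()
#     print(entities_by_type(ner("Urho Kekkonen syntyi Pielavedellä.")))
```

The grouping helper is kept separate from model loading so that it can be applied to stored pipeline output without reloading the model.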

The L2 Finnish model is a classification model trained on CEFR-annotated data containing fictional and non-fictional texts written by speakers of Finnish as a second language (L2). With the model, you can classify texts into the following CEFR classes: A1, A2, B1, B2, and C.

This resource provides a framework for building customizable and responsive user interfaces for semantic portals without requiring extensive coding skills.

Developed by

Contact

This resource provides tools based on network analysis for the analysis of political text. With these tools, researchers can, for example, analyze keyword embeddings of the FinParl corpus and identify how phrases or longer text passages are re-used over time in MPs' plenary debates in the Finnish parliament.

Resource developed by the University of Turku in partnership with Aalto University. Collaborators: the University of Jyväskylä.
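
To give a sense of what identifying re-used phrases can involve, the sketch below compares two texts by their shared word n-grams (shingles). This is a generic, simplified technique chosen for illustration only; it is not necessarily the method the resource uses, and the two example sentences are invented rather than taken from the FinParl corpus.

```python
import re
from typing import Set, Tuple

Shingle = Tuple[str, ...]


def shingles(text: str, n: int = 3) -> Set[Shingle]:
    """Return the set of lowercased word n-grams found in a text."""
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def shared_passages(a: str, b: str, n: int = 3) -> Set[Shingle]:
    """Word n-grams that occur in both texts (candidate re-used phrases)."""
    return shingles(a, n) & shingles(b, n)


def reuse_score(a: str, b: str, n: int = 3) -> float:
    """Jaccard similarity of the two shingle sets, between 0.0 and 1.0."""
    sa, sb = shingles(a, n), shingles(b, n)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


# Invented example sentences standing in for two speeches from different years.
speech_a = "We must secure the future of rural schools"
speech_b = "The government will secure the future of rural hospitals"

for phrase in sorted(shared_passages(speech_a, speech_b)):
    print(" ".join(phrase))
print(round(reuse_score(speech_a, speech_b), 2))
```

On a real corpus, the same comparison would be run between speeches with different dates, so that shared n-grams can be traced over time; larger values of `n` favour longer re-used passages over common short phrases.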

Developed by

Contact

Access: https://github.com/uh-dcm/finnish-forum-scrapers

Developed by

This resource allows users to download copyright-free materials from the National Library of Finland through CSC.

Contact

kk-tutkijapalvelut@helsinki.fi

Developed by

Contact

Tutorial/Demo

Developed by

Contact

Information and guidance can be found on the websites of the resources.

Tutorial/Demo

Developed by

Contact

A public directory of publications (research articles, conference proceedings, data publications) that point to, explain, or introduce use cases for the infrastructures developed by the DARIAH-FI partners for the FIN-CLARIAH project.

Contact

UX questionnaire developed within DARIAH-FI to test and evaluate tools, datasets, or workflows developed for the project. The questionnaire was created and updated in several phases between 2022 and 2023, based on a literature review, semi-structured interviews, and tests with end users.

Developed by