FIN-CLARIAH organised a roadshow event in Vaasa on March 14th, where some 20 local researchers, teachers and students learned about some of the central resources and tools for social science and humanities (SSH) research, with special emphasis on acquiring, processing and depositing born-digital data. From the University of Vaasa, the NATUREACH project joined the event with a presentation of AR/VR data and its analysis. Below you can read a summary of the presentations (watch the recording here), and at the bottom I provide a summary of the discussion.
The host of this event was Prof. Merja Koskela, the dean of the School of Marketing and Communication at the University of Vaasa.
After a warm welcome, Inés Matres, national coordinator of DARIAH-FI, briefly introduced FIN-CLARIAH, the premier Finnish digital research infrastructure (RI) for Social Sciences and Humanities (SSH). The presentation gave an overview of the support available and the digital humanities landscape in Finland, as well as the freely available resources that enable data-intensive research.
Mietta Lennes introduced Kielipankki – The Language Bank of Finland, with emphasis on how to find, use and deposit your research data and tools via Kielipankki.

Harri Kettunen & Tiina Onikki-Rantajääskö, from the University of Helsinki, introduced the collaborative terminology work at the Helsinki Term Bank for the Arts and Sciences (HTB), and how researchers can contribute to the reliability of information through their participation.
After a short break, Erik Henriksson, from the University of Turku’s TurkuNLP, introduced a set of three tools to make sense of noisy web data: a toxicity classifier for detecting harmful content, a question-answer extraction system for building conversational datasets, and a classifier for identifying registers (or genres) on the web.
The local audience had an interest in audiovisual and game stream data, so Prof. Raine Koskimaa and Jari Lindroos presented and demonstrated the new release of the Twitch collector and analysis tools, which will support analyzing the visual content of live-streaming videos and integrating audio and chat data to gain deeper insights into online video interactions and game participants.
Finally, a local infrastructure, Natural impact – VR in NATUREACH (Luonnon hyvinvointivaikutukset), was introduced by Rebekah Rousi alongside Martta Ylilauri and Joni-Roy Piispanen. The main goal of the NATUREACH project is to improve human health and well-being by developing new digital nature-based service models for social and health care. The presentation covered the tools used in the project and highlighted some of the results reached so far.
After these presentations, a discussion followed between the presenters and local researchers, most of whom represent the communication and marketing fields and thus had many questions about better access to, and support for using, social media data.
The discussion focused on getting data, such as from Finnish discussion forums or social media datasets. While The Language Bank of Finland holds large corpora of social media data (such as Suomi24 and NordicTweetStream), these are most helpful for teaching purposes. Researchers working “in the now” require frameworks for acquiring live data, that is, for following social media data streams as events unfold. For this, the TurkuNLP tools, the new infrastructure planned this year for scraping Finnish citizen forums, and guidelines on depositing social media datasets in The Language Bank of Finland will all be most welcome.
Local researchers criticised the lack of unified national ethical and legal frameworks, applicable to all universities, on how to access data and negotiate its acquisition with data owners. Researchers are paying to acquire social media content, which makes it very hard to conduct research. They also pointed to the need to consider visual and audiovisual data uploaded by people and increasingly generated by AI. The continuously changing technological and platform landscape makes it difficult to establish ad hoc and agile mechanisms across universities and for all types of data, especially on social media platforms, where data is hosted and owned by private, commercial companies. We took good note of these demands, which are surely shared by many researchers in the social sciences and humanities across Finland who work closely with communities such as media professionals and journalists, or who are interested in people’s everyday communicative practices and digital information environments.