DARIAH-FI Workshop: New datasets and tools for social media research
Organisers: Mikko Laitinen & Paula Rautionaho (University of Eastern Finland)
Date: Tuesday 9 December 2025, 9.00-12.00 (online)
The workshop is free and open to all, but registration is required: https://link.webropol.com/ep/dariahsome
This workshop introduces datasets, tools and related services recently developed for social media and web research. In recent years, the Finnish Network for Data-intensive Research in the Humanities and Social Sciences (DARIAH-FI), together with the Language Bank of Finland (Kielipankki), has developed tools and methods to access, analyze, and enrich large-scale data from social media outlets, gaming outlets and other web-based platforms.
During the last two years, the work has focused on facilitating research with new datasets and tools; this workshop showcases these infrastructures. For instance, DARIAH-FI affiliates have developed tools that enable data scraping from online forums and provide automatic summaries of video clips. In addition, large collections of online language have been compiled and enriched with metadata to allow more detailed analysis.
In this workshop, the developers of the new resources will demonstrate their tools and datasets. In addition, a representative of FinnARMA (Finnish Association of Research Managers and Administrators) will introduce the new ethical guidelines for the use of social media data in research, followed by a discussion on the topic.
Format: Pre-recorded videos + online session
- Pre-recorded videos showcasing each tool or dataset will be made available before the workshop so that participants can watch them in advance.
- Presenters are free to use their 20 minutes as they wish (e.g. a tour of the tool/dataset with examples of what can be done with it, or a Q&A session).
- Each presentation will also consider some of the ethical challenges linked to the dataset/tool (e.g. how the data was collected, or how the tool accesses data).
Program:
9.00 Welcome
9.10-9.30 Finnish online forum scrapers (Matti Nelimarkka, University of Helsinki)
9.30-9.50 Game streams: automatic video clip summaries (Raine Koskimaa, Jari Lindroos, University of Jyväskylä)
9.50-10.10 Nordic Tweet Stream and beyond (Masoud Fatemi, Mehrdad Salimi, Mikko Laitinen, University of Eastern Finland)
10.10-10.30 Break
10.30-10.50 Online corpora from social media sources / Jupyter notebook pipeline (Steven Coats, University of Oulu)
10.50-11.10 TurkuNLP web register resources (Erik Henriksson, University of Turku)
11.10-11.55 FinnARMA guidelines for social media research (Katja Laine, University of Vaasa)
11.55-12.00 Closing
The videos and the Zoom link will be provided only to registered participants. Registration is open to anyone interested until Monday 8 December 2025.
For questions, please contact Paula Rautionaho (firstname.lastname@uef.fi).
