
FIN-CLARIAH day: Annotating Social Data
This FIN-CLARIAH day brings together researchers, infrastructure developers, and social scientists to explore current practices and future needs in annotating social science data sets for further analysis.
The day opens with a keynote by Salla-Maaria Laaksonen on collecting, annotating, and analyzing social media data, followed by insight talks from Krista Lagus on theory-based annotation using large language models (LLMs) and Katja Valaskivi on the challenges of working with interview data on sensitive subjects. In the afternoon, parallel sessions offer practical perspectives on secure environments for handling sensitive data, hands-on demonstrations with the CSC Secure Desktop environment, and discussions on ethical agreements and algorithmic transparency.
The milestone event is hosted by the Centre for Social Data Science (CSDS). It supports the broader goals of FIN-CLARIAH in developing national infrastructure for digital humanities and social sciences, and welcomes researchers and students from across disciplines who are interested in qualitative data, narrative analysis, secure practices for sensitive data, and the responsible use of AI tools in the annotation pipeline.
When: November 28, 2025 (10:30-17:00)
Where: University of Helsinki (venue tbc)
Register for this event (registration closes on 20.11.): https://forms.office.com/e/tL5Nai4pYq (Please note that registration for FIN-CLARIAH partners starts on 1.10. Registration on an “until space is filled” basis starts in November.)
Preliminary schedule
10:30 Welcome coffee
11:00 Welcome words
11:10 Keynote by Salla-Maaria Laaksonen, University of Helsinki: “Dream infrastructures for a social scientist: experiences and hopes built on ten years of interdisciplinary computational hermeneutics”
12:00 Lunch break
13:00 Insight talks:
Katja Valaskivi, University of Helsinki: “Should sensitive interview data be opened?”
Krista Lagus, University of Helsinki: “Social Theory-based annotation of Text Data using LLMs”
13:30 Afternoon parallel sessions
1) CSC AITTA environment for working with LLMs: training a large language model to annotate data (Facilitator: Martin Mathiessen, CSC)
2) CSC SD environment and how to use it: hands-on session (BYO) (Facilitators: Francesca Morello, Kimmo Mattila, CSC)
3) Agreements for the reuse of social media and interview data (Facilitator: Mietta Lennes, The Language Bank of Finland)
16:00 Social mingling / Steering group meeting
17:00 Event ends