Analytical support for computational social sciences and humanities

Leader: Mikko Laitinen, University of Eastern Finland; Participants: JYU, UOULU, UHEL/SOC; Collaborator: UHEL/NLF

The foreseen impact of this work package (WP) is that it enables researchers to utilise large born-digital data effectively and to focus on analysis rather than on the technical details of handling data that is often high in volume and velocity. The activities not only produce user-generated textual material for large-scale language models but also result in representative benchmarks, which contribute to the replicability and reproducibility of SSH research, and they integrate text analysis with audio-visual cultural heritage (images, multimodal properties of speech, multimodal items in social media). Researchers currently have potential access to extremely large born-digital data sources, including but not limited to game streams, multimodal social media applications, and open-ended answers in surveys. Despite their potential benefits, these sources are underused in SSH for technical, ethical and practical reasons. Efficient analytical support tools are therefore needed in several fields:

  • In game studies, multimodal game stream analysis needs to utilise the interactions between video streams and stream chats and to generate data for developing AI-based solutions.
  • In computational sociolinguistics and dialectology, we need to develop tools for multimodal born-digital social media analysis and workflows for accessing and analysing dynamic social media data for computational research, including algorithmic tools to access social media interaction in digital networks, tools for the analysis of multimodal properties of naturalistic speech (e.g. phonetics, prosody, and facial expressions) in large online collections of multilingual data, and ways to further develop our understanding of regional language variation in the context of social media.
  • As for digital culture studies, we need to develop solutions for multimodal cultural heritage analysis.
  • Computational social science needs to enrich survey data by combining structured register data with unstructured textual data.

Mikko Laitinen, University of Eastern Finland

DARIAH-FI: GENERAL

DARIAH-FI OFFICE:

CSC – IT Centre for Science

Katri Tegel

DARIAH-FI OFFICE:

National Library of Finland

Johanna Lilja

DARIAH-FI OFFICE:

Tampere University

Sanna Kumpulainen

DARIAH-FI OFFICE:

Aalto University

Eero Hyvönen

DARIAH-FI OFFICE:

University of Oulu

Marika Rauhala

DARIAH-FI OFFICE:

University of Eastern Finland

Paula Rautionaho

DARIAH-FI OFFICE:

University of Jyväskylä

Tanja Välisalo

DARIAH-FI OFFICE:

University of Turku

Veronika Laippala