Subsetting Data, Detecting Bias and Noise

Leader: Eetu Mäkelä, University of Helsinki/ARTS; Partners: UHEL/SOC, UTU, UEF, CSC; Collaborator: UHEL/NLF

The foreseen impact is to provide social sciences and humanities (SSH) scholars with subsets of large datasets that are easier to manage and process. The large datasets created in other work packages are of interest to a wide community of researchers, but they were not originally created for research and therefore contain a range of biases, confounders and noise. Noise in online data streams evolves fast, quickly degrading the detection accuracy of static systems. We need to develop tools that let researchers robustly query and examine the large datasets to extract the subsets relevant to their particular interests. We will deliver a process/service for indexing large datasets and robustly querying them for subsets of interest; an environment for obtaining statistical overviews of (sub)datasets to uncover and evaluate biases and assess suitability; and intelligent noise reduction applications for real-time social media data capture that identify bots and trolls.

DARIAH-FI: GENERAL QUESTIONS

DARIAH-FI OFFICE:

University of Jyväskylä

Tanja Välisalo

DARIAH-FI OFFICE:

University of Eastern Finland

Paula Rautionaho

DARIAH-FI OFFICE:

University of Oulu

Marika Rauhala

DARIAH-FI OFFICE:

Aalto University

Eero Hyvönen

DARIAH-FI OFFICE:

Tampere University

Sanna Kumpulainen

DARIAH-FI OFFICE:

National Library of Finland

Johanna Lilja

DARIAH-FI OFFICE:

CSC – IT Centre for Science

Katri Tegel

DARIAH-FI OFFICE:

University of Helsinki

Risto Turunen

DARIAH-FI OFFICE:

University of Turku

Veronika Laippala