
FIN-CLARIAH Workshop “Tools to make sense of web data”
December 10, 2024 @ 10:00 am - 12:00 pm
Note that this workshop is open to participants in the Digital Research Data and Human Sciences (DRDHum) conference, which will take place as an onsite event at the University of Eastern Finland, Joensuu campus, 10-12 December 2024. To take part in the workshop, conference participants can opt in when registering for the conference.
This 2-hour workshop presents the results, services, and ongoing work produced within FIN-CLARIAH (https://www.kielipankki.fi/organization/fin-clariah/). In addition to introducing FIN-CLARIAH and its core services, the workshop focuses on a selection of tools and datasets for web data, offering an experimental, hands-on setting that complements the conference theme “Digital applications in the advent of ML and AI”. The practical section introduces four resources for researchers in the form of brief tutorials, with time for attendees to try the resources on their own laptops and to ask the presenters questions. The workshop is open to all conference participants.
Preliminary schedule December 10, 10:00-12:00
- Introduction to FIN-CLARIAH resources for SSH research (Risto Turunen, DARIAH-FI national coordinator and Post-doc researcher, University of Jyväskylä)
- Brief tutorial on services in the Language Bank of Finland for research on social media data (Mietta Lennes, The Language Bank of Finland, University of Helsinki)
- Brief tutorial on TurkuNLP tools: machine learning tools for annotating and identifying toxic language, genre, and interaction in web content (Erik Henriksson, Post-doc researcher, University of Turku)
- Brief tutorial on Elasticsearch for subsetting data from social media. Participants will learn how to explore large datasets that were not originally created for research and how to extract the subset they are interested in. (Ville Vaara, Doctoral researcher, University of Helsinki)
- Brief tutorial on Nordic Tweet Stream (NTS), a multilingual monitor corpus of geolocated tweets and associated metadata from the Nordic region, covering the period 2013-2023. The data was collected using the academic API, which is now closed. (Masoud Fatemi, Doctoral researcher, University of Eastern Finland)
- General discussion.
Information about the DRDHum conference: https://sites.uef.fi/drd-hum-2024/