What I’ve Published

My Research Publications

My research focuses on leveraging AI and Natural Language Processing to create innovative solutions.
Recent works include:

Empowering Teachers with Usability-Oriented LLM-Based Tools for Digital Pedagogy

We present our work on two LLM-based tools that utilize artificial intelligence and creative technology to improve education.

Predicting Sustainable Development Goals Using Course Descriptions: from LLMs to Conventional Foundation Models

We present our work on predicting United Nations Sustainable Development Goals (SDGs) for university courses.

Scaling Sustainable Development Goal Predictions across Languages: From English to Finnish

In this paper, we use an English-only dataset to train a range of multilingual classifiers and investigate how well they adapt to Finnish data.

Breaking Down Finnish Compounds: Creating a Dataset for NLP

This project provides a dataset that decomposes Finnish compound words into their component parts, creating a resource to support computational models and a range of NLP applications.

Finetuning and Improving Prediction Results of LLMs Using Synthetic Data

This thesis evaluates several open-source large language models — Llama 3 (8B), Gemma (2B and 7B), and Phi 2 (2.7B) — implementing a methodology that includes training dataset generation, automated evaluation, comparative analysis, and error analysis.

Proceedings of the 9th International Workshop on Computational Linguistics for Uralic Languages

Editor & Chairman of IWCLUL 2024

I served as editor of the proceedings and chairman of the 9th International Workshop on Computational Linguistics for Uralic Languages (IWCLUL 2024). The purpose of IWCLUL is to bring together researchers working on computational approaches to Uralic languages.

My Open-Source Contributions

As part of my research and thesis work, I contributed datasets and fine-tuned models to the Hugging Face Hub, supporting NLP applications in sustainability and Finnish language processing. These resources have been downloaded 100+ times by developers and researchers worldwide.

A dataset that decomposes Finnish compound words into their component parts to support computational models and NLP research.

A dataset generated using ‘Mistral-7B-Instruct-v0.2’ from 30 official sustainability documents, intended for fine-tuning LLMs on sustainability-related question-answering tasks.

These models were fine-tuned to answer sustainability-related questions, achieving measurable improvements in performance for specialized domains.

Read all my research

Take a peek at my latest research; you may find something that interests you.