About Me

I am an M.S. candidate at the Machine Learning and Intelligence Lab (MLILAB) at the Korea Advanced Institute of Science and Technology (KAIST), advised by Prof. Eunho Yang.

My research broadly focuses on large language models. Recently, I have concentrated on improving data efficiency in reinforcement learning and addressing the overthinking problem in LLMs through reinforcement learning-based approaches.

Experiences

Research Engineer

Jul. 2022 - Mar. 2024

While working at Ringle, an English-speaking education startup, I spearheaded the planning, development, and deployment of an AI-driven engine, CAF, which predicts IELTS and TOEFL speaking scores by analyzing conversations between tutors and learners.

  • Created a speech dataset labeled with IELTS speaking levels and addressed data imbalance through data augmentation.
  • Analyzed the causes of low automatic speech recognition (ASR) performance for non-native English learners and improved it via supervised fine-tuning.
  • Defined features related to English proficiency and developed algorithms to extract these features from both speech and transcribed text.
  • Developed and deployed a model to predict IELTS and TOEFL speaking levels based on the extracted features.

Visiting Researcher

Mar. 2023 - Mar. 2024

I was dispatched to the KAIST Interaction Lab (KIXLAB) as a visiting researcher affiliated with Ringle, under the supervision of Prof. Juho Kim. As first author, I published the paper “LearnerVoice: A Dataset of Non-Native English Learners’ Spontaneous Speech”, which was accepted to INTERSPEECH 2024. Additionally, I filed two patents: “System for Learning English Speaking and Method Thereof” and “System for Diagnosing and Learning Pronunciation and Method Thereof”.

  • Constructed a dataset of L2 learners’ spontaneous speech with full transcriptions.
  • Defined new features related to ASR errors in L2 speech through linguistic analysis of transcriptions and human annotation.
  • Improved the performance of state-of-the-art ASR models by fine-tuning them on the constructed dataset.
  • Explored synthesizing non-native speaker-like audio that can mimic the distributions of defined features.

Publications

LearnerVoice: A Dataset of Non-Native English Learners’ Spontaneous Speech
Haechan Kim, Junho Myung, Seoyoung Kim, Sungpah Lee, Dongyeop Kang, Juho Kim
INTERSPEECH 2024, Shortlisted for ISCA Best Student Paper Award 2024

Projects

<KAIST CS479: Machine Learning for 3D Data> (Best Poster Award)

This project addresses the problem of synthesizing meshes from single-view images containing multiple objects. We inferred the relative distances between objects using depth maps, image segmentation, and Stable Diffusion inpainting, and used these estimates to reassemble the synthesized meshes in a 3D scene.

<KAIST CS475: Machine Learning for Natural Language Processing>

This project proposes a new benchmark task for evaluating large language models (LLMs) by demonstrating the correlation between the level of knowledge in LLMs and their degree of bias. To measure bias, we use the concept of markedness together with a statistical method, the Fightin’ Words method.

<KAIST CS471: Graph Machine Learning and Mining>

This project demonstrates, both theoretically and experimentally, that message-passing GNN models, especially GCN, lose cyclic information. We propose a novel approach using cycle nodes and a cycle-size dimension, which performs significantly better than prior work.

<KAIST CS470: Introduction to Artificial Intelligence>

This project addresses the low performance of automatic speech recognition (ASR) on long-form spontaneous speech. Using forced alignment, we identified the positions of silences and filler words in the original audio and either removed them or replaced them with shorter silences, resulting in a 56% improvement in ASR performance.

<KAIST CS376: Machine Learning>

This project applies StyleGAN to a practical task: taking real soccer players’ images as input and transforming them into images in the style of FIFA online game graphics. Through this project, I gained an understanding of GANs and learned techniques for efficient training by freezing layers of the GAN.

<KAIST CS330: Operating Systems and Lab>

Skills & Proficiency

Python & PyTorch

C & C++

Java & Scala & MATLAB