Haechan Kim

Researcher / Engineer

Education

B.S. in School of Computing

Korea Advanced Institute of Science and Technology (KAIST)

Mar. 2018 - Feb. 2025

B.S. in School of Electrical Engineering

Korea Advanced Institute of Science and Technology (KAIST)

Mar. 2018 - Feb. 2025

M.S. in Kim Jaechul Graduate School of AI

Korea Advanced Institute of Science and Technology (KAIST)

Feb. 2025 - Present

Languages

Korean (Native)

Career Profile

I am an M.S. student at KAIST, advised by Eunho Yang.

My research and experience have focused on large language models and automatic speech recognition (ASR), with topics broadly divided into (1) AI-driven education and (2) AI fairness. While working at Ringle, an English-speaking education startup, I spearheaded the planning and development of an AI-driven engine that predicts IELTS/TOEFL speaking scores by analyzing conversations between tutors and learners. Recognizing the low performance of ASR for second language learners, I was dispatched to KAIST Interaction Lab (KIXLAB) as a visiting researcher affiliated with Ringle to conduct related research. I constructed a dataset of non-native English learners’ spontaneous speech and analyzed the reasons behind the low ASR performance for second language learners. The resulting paper was accepted to INTERSPEECH 2024. Furthermore, I conducted a coursework project proposing a new evaluation benchmark task that measures the correlation between the knowledge level of large language models and fairness. I have also worked on projects such as generating 3D meshes from single-view multi-object images, which won a best poster award.

Currently, I am interested in a more mathematical treatment of artificial intelligence, especially minority group data augmentation and convex optimization.

Experiences

Natural Language Processing Engineer

Jul. 2022 - Mar. 2024

Ringle, Seoul

While working at Ringle, an English-speaking education startup, I spearheaded the planning and development of an AI-driven engine, named CAF, that predicts IELTS/TOEFL speaking scores by analyzing conversations between tutors and learners.

Defined features related to the English proficiency of learners.
Conducted analysis on both spoken and written language.
Analyzed the performance of ASR for non-native English learners.
Developed sentence structure classification algorithms through part-of-speech (POS) and dependency analysis.
Collected ground truth data for IELTS speaking levels and addressed data imbalance issues through data augmentation.
Conducted feature engineering based on the defined features.
Developed a final prediction model by using the augmented data and the extracted features.

Visiting Researcher

Mar. 2023 - Mar. 2024

KAIST Interaction Lab (KIXLAB), Daejeon

Recognizing the low performance of ASR for second language (L2) learners, I was dispatched to KAIST Interaction Lab (KIXLAB) as a visiting researcher affiliated with Ringle. I published a paper on ASR for L2 spontaneous speech as the first author, which was accepted to INTERSPEECH 2024. Additionally, I filed two patents under the names <L2 Speaker Annotation System> and <Pronunciation Diagnosis and Learning System>.

Constructed a dataset consisting of 50.04 hours of audio with full transcriptions.
Defined new features attributed to ASR errors from L2 speech through linguistic analysis of transcriptions.
Designed a human annotation experiment to measure the newly defined features.
Developed an automatic system to detect these features.
Improved the performance of state-of-the-art models by fine-tuning with the constructed dataset.
Attempted to synthesize non-native speaker-like audio that can mimic the defined features.

Publications

LearnerVoice: A Dataset of Non-Native English Learners’ Spontaneous Speech

Haechan Kim, Junho Myung, Seoyoung Kim, Dongyeop Kang, Juho Kim

INTERSPEECH 2024, Shortlisted for ISCA Best Student Paper Award 2024

Projects

MultiDreamer: Generating 3D mesh from a Single-view Multi-object Image

<KAIST CS479: Machine Learning for 3D Data> (Best Poster Awards)

This project addresses the issue of synthesizing meshes from single-view images containing multiple objects. We inferred the relative distances between objects using depth maps, image segmentation, and Stable Diffusion inpainting, and based on this, we chose an approach that reassembles each synthesized mesh in a 3D scene.

What is the best LLM Chatbot in South Korea?

The project proposes a new benchmark task for evaluating large language models (LLMs) by demonstrating the correlation between the level of knowledge in LLMs and the degree of bias. We utilize the concept of markedness and a statistical method, the Fightin’ Words method, to measure it.

Augmentation of Graph Representations with Cycles

This project demonstrates, both theoretically and experimentally, that message-passing GNN models, especially GCN, lose cyclic information. We propose a novel approach using cycle nodes and a cycle size dimension, and it significantly outperforms prior work.

Accurate long-form audio transcription via voice activity detection and forced alignment

This project addresses the issue of low performance in automatic speech recognition (ASR) for long-form spontaneous speech. By employing forced alignment, we identified the positions of silence and filler words in the original audio and either removed them or replaced them with shorter silences, resulting in a 56% improvement in ASR performance.

Turn Over a New Leaf: Face Image Style Transition for FIFA Player

This project involves applying StyleGAN at the application level. We took real soccer players’ images as inputs and carried out the task of transforming them into images in the style of FIFA online game graphics. Throughout this project, I acquired an understanding of GANs and learned techniques for efficient training by freezing each layer of the GAN.

PointNet

NeRF

Diffusion

Multi-Layer Perceptron (MLP)

Convolutional Neural Networks (CNN)

Markov Decision Process & Dynamic Programming

A* algorithm

PintOS

Skills & Proficiency

Python & PyTorch

C & C++

Java & Scala & MATLAB