Youngmin Kim

Hi, I'm Youngmin Kim. I'm a researcher at MIRLAB (Multimodal Intelligence Research Lab) at Yonsei University, advised by Youngjae Yu. I received my bachelor's degree in Economics and Computer Science, and I am currently pursuing an integrated MS/Ph.D. program in Artificial Intelligence.

My research question is: "How can we enable AI systems to deeply understand and extract meaningful information from videos, and how can this understanding be effectively connected to human perception and communication?" I'm deeply interested in how AI can comprehend the complex and rich information embedded in videos, and how this understanding can facilitate natural interactions between humans and AI. Ultimately, I see AI as a tool designed to improve people's lives, and I believe human-AI interaction should focus on making AI more accessible and effective for users. With this perspective, I'm particularly interested in the understanding and generation of nonverbal expressions, as well as the recognition and production of sign language.

Publications

Speaking Beyond Language: A Large-Scale Multimodal Dataset for Learning Nonverbal Cues from Video-Grounded Dialogues

arXiv

Youngmin Kim*, Jiwan Chung*, Jisoo Kim, Sunghyun Lee, Sangkyu Lee, Junhyeok Kim, Cheoljong Yang, Youngjae Yu

ACL 2025 Main

TLDR; We introduce VENUS, a large-scale video dataset for generating and understanding nonverbal expressions, along with MARS, a model designed to leverage it.

Scalp Diagnostic System With Label-Free Segmentation and Training-Free Image Translation

arXiv

Youngmin Kim*, Saejin Kim*, Hoyeon Moon, Youngjae Yu, Junhyug Noh

MICCAI 2025

TLDR; We introduce ScalpVision, an AI system for comprehensive scalp disease and alopecia diagnosis that combines label-free hair segmentation with DiffuseIT-M, a generative model for dataset augmentation, to improve severity assessment and prediction accuracy.

MAVL: A Multilingual Audio-Video Lyrics Dataset for Animated Song Translation

arXiv

Woohyun Cho, Youngmin Kim, Sunghyun Lee, Youngjae Yu

Under Review

TLDR; We present MAVL, a multimodal benchmark for singable lyrics translation, and SylAVL-CoT, a model using audio-video cues and syllable constraints for natural, accurate results.

Preprocessing for Keypoint-Based Sign Language Translation without Glosses

arXiv

Youngmin Kim, Hyeongboo Baek

Sensors (IF: 3.847)

TLDR; We introduce an effective preprocessing pipeline for gloss-free sign language translation that combines skeleton-based motion features, keypoint normalization, and stochastic frame selection to enhance model performance.

A 2-Stage Model for Vehicle Class and Orientation Detection with Photo-Realistic Image Generation

arXiv

Youngmin Kim*, Donghwa Kang*, Hyeongboo Baek

IEEE BigData 2022

TLDR; We introduce a two-stage vehicle class and orientation detection model using synthetic-to-real image translation and meta-table fusion to improve real-world prediction accuracy.

A Study of Tram-Pedestrian Collision Prediction Method Using YOLOv5 and Motion Vector

KCI Article

Youngmin Kim, Hyeonuk Ahn, Heegyun Jeon, Jinpyung Kim, Gyujin Jang, Hyeonchyeol Hwang

Korea Information Processing Society (KIPS)

TLDR; (Korean) We introduce a real-time tram collision prediction system that combines fast YOLOv5 object detection with a modified local dense optical flow to estimate object speed and predict collision time and probability using a single camera image.

Pedestrian Accident Prevention Model Using Deep Learning and Optical Flow

KCI Article

Youngmin Kim, Gyujin Jang, Hyunjai Bae, Youngnam Kim, Jinpyung Kim

Korea Computer Congress 2021 (🥇Best Paper Award)

TLDR; (Korean) We introduce a real-time pedestrian collision prediction system that uses YOLOv5 for fast object detection and a Local Dense Optical Flow method to quickly estimate pedestrian direction and speed, enabling accurate prediction of collision time and location.

Optical Flow Estimation Techniques and Recent Research Trends Survey

KCI Article

Youngmin Kim, Hyeonuk Ahn, Jinpyung Kim

Korea Information Processing Society (KIPS) Special Session

TLDR; (Korean) We survey recent advances in optical flow estimation, comparing traditional and deep learning-based methods, and highlight their applications in autonomous driving, medical imaging, and surveillance systems.