I am an Assistant Professor at KAIST Graduate School of AI, where I direct the Structured and Probabilistic Machine Learning (SPML) Lab.
My research focuses on developing machine learning algorithms for molecules, with applications to drug discovery and materials design. I enjoy bringing a machine learning perspective, especially a probabilistic one, to scientific problems, and diving deep into the underlying physics, chemistry, and biology.
Members
Yunhui Jang, Hyosoon Jang, Hyomin Kim, Seonghyun Park, Seongsu Kim, Kiyoung Seong, Minkyu Kim, Dongyeop Woo, Nayoung Kim, Minsu Kim (Postdoc), Taewon Kim, Hyunjin Seo, Yinhua Piao (Postdoc), Yoonho Kim, Honghui Kim (Postdoc)
Alumni: Haeji Ko, Juwon Hwang
Publications
- Learning Adaptive Perturbation-Conditioned Contexts for Robust Transcriptional Response Prediction[arxiv]
- Boltz is a Strong Baseline for Atom-level Representation Learning[arxiv]
- Riemannian MeanFlow[arxiv]
- Progressive Multi-Agent Reasoning for Biological Perturbation Prediction[arxiv]
- AtomMOF: All-Atom Flow Matching for MOF-Adsorbate Structure Prediction[arxiv]
- CatFlow: Co-generation of Slab-Adsorbate Systems via Flow Matching[arxiv]
- INDIBATOR: Diverse and Fact-Grounded Individuality for Multi-Agent Debate in Molecular Discovery[arxiv]
- Latent Veracity Inference for Identifying Errors in Stepwise Reasoning (ICLR 2026)[arxiv]
- Learning Collective Variables from BioEmu with Time-Lagged Generation (ICLR 2026)[arxiv]
- DNACHUNKER: Learnable Tokenization for DNA Language Models[arxiv]
- Self-Training Large Language Models with Confident Reasoning (EMNLP 2025)[arxiv]
- MT-Mol: Multi Agent System with Tool-based Reasoning for Molecular Optimization (EMNLP 2025)[arxiv]
- Generative Flows on Synthetic Pathway for Drug Design (ICLR 2025)[arxiv]
- Adaptive Teachers for Amortized Samplers (ICLR 2025)[arxiv]
- Can LLMs Generate Diverse Molecules? Towards Alignment with Structural Diversity[arxiv]
- Iterated Energy-based Flow Matching for Sampling from Boltzmann Densities[arxiv]
- Multi-resolution Spectral Coherence for Graph Generation with Score-based Diffusion (NeurIPS 2023)[paper]
- Disentangling Sources of Risk for Distributional Multi-Agent Reinforcement Learning (ICML 2022)[paper]
- Spanning Tree-based Graph Generation for Molecules (ICLR 2022)[paper]
- Maximum Weight Matching using Odd-sized Cycles: Max-Product Belief Propagation and Half-Integrality (IEEE TIT 2018)[paper]
- Learning Collective Variables from Time-lagged Generation (W 2025)
- Generative Flows on Synthetic Pathway for Drug Design (W 2024)
- MOFFlow: Flow Matching for Structure Prediction of Metal-Organic Frameworks (W 2024)
- Chain-of-Thoughts for Molecular Understanding (W 2024)
- Transition Path Sampling with Improved Off-Policy Training of Diffusion Path Samplers (W 2024)
- Non-backtracking Graph Neural Networks (W 2023)
- A Simple and Scalable Representation for Graph Generation (W 2023)
- Symmetric Exploration in Combinatorial Optimization is Free! (W 2023)
- Removing Multiple Biases through the Lens of Multi-task Learning (W 2023)
- EPIC: Graph Augmentation with Edit Path Interpolation via Learnable Cost (W 2023)
- Bootstrapped Training of Score-Conditioned Generator for Offline Design of Biological Sequences (W 2023)
- Hierarchical Graph Generation with K2 Trees (W 2023)
- Diffusion Probabilistic Models for Structured Node Classification (W 2023)
- Contrastive Learning of Molecular Representation with Fragmented Views (W 2023)
- Learning Debiased Classifier with Biased Committee (W 2022)
- Substructure-Atom Cross Attention for Molecular Representation Learning (W 2022)
- A Closer Look at the Intervention Procedure of Concept Bottleneck Models (W 2022)
- Visual Abstract Reasoning via Logic-Guided Generation (W 2021)
- RetCL: A Selection-based Approach for Retrosynthesis via Contrastive Learning (W 2020)
- Variational Mutual Information Distillation for Transfer Learning (W 2018)