Hi, I’m Minsung Hwang, a PhD student in the School of Computing at KAIST. I’m a member of the Big Data Intelligence Lab led by Professor Joyce Jiyoung Whang. My current research interests are knowledge graphs, graph neural networks, and large language models (LLMs).
Knowledge Graphs · Graph Neural Networks · Representation Learning · Machine Learning Theory · Large Language Models
Selected Publications
Stability and Generalization Capability of Subgraph Reasoning Models for Inductive Knowledge Graph Completion
Minsung Hwang, Jaejun Lee, Joyce Jiyoung Whang. International Conference on Machine Learning (ICML), 2025.
Abstract
Inductive knowledge graph completion aims to predict missing triplets in an incomplete knowledge graph that differs from the one observed during training. While subgraph reasoning models have demonstrated empirical success in this task, their theoretical properties, such as stability and generalization capability, remain unexplored. In this work, we present the first theoretical analysis of the relationship between the stability and the generalization capability for subgraph reasoning models. Specifically, we define stability as the degree of consistency in a subgraph reasoning model's outputs in response to differences in input subgraphs and introduce the Relational Tree Mover’s Distance as a metric to quantify the differences between the subgraphs. We then show that the generalization capability of subgraph reasoning models, defined as the discrepancy between the performance on training data and test data, is proportional to their stability. Furthermore, we empirically analyze the impact of stability on generalization capability using real-world datasets, validating our theoretical findings.
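A schematic way to read the relationship described above, in my own notation (the constant β, the symbol RTMD, and the exact form of the bound are illustrative assumptions, not the paper's precise statement):

```latex
% Illustrative sketch only; notation is assumed, not taken from the paper.
% Stability: the model's outputs on two input subgraphs differ by at most a
% constant multiple of their Relational Tree Mover's Distance (RTMD).
\[
  \bigl\| f_\theta(\mathcal{G}_1) - f_\theta(\mathcal{G}_2) \bigr\|
  \;\le\; \beta \cdot \mathrm{RTMD}(\mathcal{G}_1, \mathcal{G}_2)
\]
% Generalization: the gap between test and training performance scales
% with the stability constant beta.
\[
  \mathcal{R}_{\mathrm{test}}(f_\theta) - \widehat{\mathcal{R}}_{\mathrm{train}}(f_\theta)
  \;\lesssim\; C \cdot \beta
\]
```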
PAC-Bayesian Generalization Bounds for Knowledge Graph Representation Learning
Jaejun Lee, Minsung Hwang, Joyce Jiyoung Whang. International Conference on Machine Learning (ICML), 2024.
Abstract
While a number of knowledge graph representation learning (KGRL) methods have been proposed over the past decade, very few theoretical analyses have been conducted on them. In this paper, we present the first PAC-Bayesian generalization bounds for KGRL methods. To analyze a broad class of KGRL models, we propose a generic framework named ReED (Relation-aware Encoder-Decoder), which consists of a relation-aware message passing encoder and a triplet classification decoder. Our ReED framework can express at least 15 different existing KGRL models, including not only graph neural network-based models such as R-GCN and CompGCN but also shallow-architecture models such as RotatE and ANALOGY. Our generalization bounds for the ReED framework provide theoretical grounds for the commonly used tricks in KGRL, e.g., parameter-sharing and weight normalization schemes, and guide desirable design choices for practical KGRL methods. We empirically show that the critical factors in our generalization bounds can explain actual generalization errors on three real-world knowledge graphs.
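As a rough sketch of the encoder-decoder structure described above (a relation-aware message passing encoder followed by a triplet classification decoder), here is a minimal toy skeleton. This is my own illustration under assumed design choices, not the ReED implementation; all class and variable names are hypothetical.

```python
import torch
import torch.nn as nn


class RelationAwareEncoder(nn.Module):
    """Toy relation-aware message passing layer: each relation type has its own
    transformation matrix, in the spirit of R-GCN-style encoders (illustrative only)."""

    def __init__(self, num_relations: int, dim: int):
        super().__init__()
        self.rel_weights = nn.Parameter(0.01 * torch.randn(num_relations, dim, dim))
        self.self_loop = nn.Linear(dim, dim)

    def forward(self, x, edge_index, edge_type):
        # x: (num_entities, dim), edge_index: (2, num_edges), edge_type: (num_edges,)
        src, dst = edge_index
        # Transform each source entity with the weight matrix of its edge's relation.
        messages = torch.einsum("ed,edk->ek", x[src], self.rel_weights[edge_type])
        # Aggregate incoming messages at the destination entities.
        out = self.self_loop(x).index_add(0, dst, messages)
        return torch.relu(out)


class TripletDecoder(nn.Module):
    """Toy triplet classification decoder: scores (head, relation, tail) triplets."""

    def __init__(self, num_relations: int, dim: int):
        super().__init__()
        self.rel_emb = nn.Embedding(num_relations, dim)

    def forward(self, h, heads, rels, tails):
        # DistMult-style bilinear score mapped to a probability that the triplet holds.
        return torch.sigmoid((h[heads] * self.rel_emb(rels) * h[tails]).sum(dim=-1))


if __name__ == "__main__":
    # Example usage on a tiny random graph (shapes only; no real data).
    x = torch.randn(5, 16)                          # 5 entities, 16-dim features
    edge_index = torch.tensor([[0, 1, 2], [1, 2, 3]])
    edge_type = torch.tensor([0, 1, 0])
    enc = RelationAwareEncoder(num_relations=2, dim=16)
    dec = TripletDecoder(num_relations=2, dim=16)
    h = enc(x, edge_index, edge_type)
    score = dec(h, torch.tensor([0]), torch.tensor([1]), torch.tensor([3]))
```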