
Qixin Yang: Visualizing Dynamics of Representation in Diffusion Models via the Lens of Graph-based Geometrical Analysis

Presentation of Master's theses in Mathematical statistics

Time: Wed 2026-06-03, 08.45 - 09.30

Location: Albano, Mittag-Leffler room, floor 3, house 1

Respondent: Qixin Yang

Supervisor: Chun-Biu Li


Abstract: Denoising Diffusion Probabilistic Models (DDPMs) are state-of-the-art generative models that synthesize images by progressively denoising isotropic Gaussian noise. Despite their empirical success, the geometric organization of the intermediate high-dimensional states {𝑥_𝑡} along the diffusion schedule remains only partially understood: existing theoretical accounts operate at the population level of the score function, leaving a sample-level geometric picture of representation dynamics largely open.
  This thesis examines how class structure evolves through the forward and reverse processes of a U-Net DDPM trained on MNIST. We introduce a comprehensive Graph-based framework combining adaptive 𝑘-Nearest Neighbor graphs, the Biharmonic Distance (derived from the graph Laplacian), Graph-based Silhouette for class validation, and Shape-Aware Stochastic Neighbor Embedding for visual confirmation. Tracking 5000 samples across 15 diffusion checkpoints, we systematically map the geometric transformations of the data manifold.
  Our central finding is that the global Graph-based Silhouette 𝑆(𝑡) exhibits a distinct three-regime structure shared by both trajectories: (1) a class-dominated regime (𝑆 > 0) at small 𝑡; (2) a transient regime (200 ≲ 𝑡 ≲ 500) featuring a pronounced negative minimum near 𝑡 ≈ 300, indicating severe class entanglement; and (3) a noise-dominated regime characterizing 𝑡 ≥ 500. Notably, the reverse generation process exhibits a deeper transient minimum, a weaker initial class structure, and an inflated effective dimensionality compared to the forward process. A class-wise analysis further reveals dataset-specific topological anomalies, such as class 9 maintaining a positive Silhouette throughout the reverse transient regime, while class 4 remains structurally entangled at every checkpoint.
  Consistent with this picture, the approach to the noise-dominated plateau falls within a timestep window that contains the speciation time 𝑡_𝑆 ≈ 543 predicted by the mean-field analysis of Biroli et al. [1], resolved to within the 50-step checkpoint spacing in that region. These findings indicate that Graph-based Silhouette serves as a diagnostic tool for diffusion-model representation dynamics, providing a sample-level geometric perspective alongside score-function analyses, and that the sample-level geometry of DDPMs admits a compact three-regime description that invites further theoretical and empirical investigation.
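  For readers unfamiliar with the ingredients named in the abstract, the following is a minimal sketch of how a Silhouette score could be computed from a biharmonic distance on a nearest-neighbour graph. It is not the thesis implementation: it assumes a fixed 𝑘 rather than the adaptive 𝑘 used in the thesis, an unweighted symmetric graph, and the standard Silhouette definition applied to a precomputed distance matrix. All function names are hypothetical, and the biharmonic distance is only meaningful when the graph is connected.

```python
import numpy as np

def knn_adjacency(X, k):
    """Symmetric, unweighted k-nearest-neighbour adjacency (fixed k; an assumption)."""
    n = X.shape[0]
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T   # squared Euclidean distances
    np.fill_diagonal(d2, np.inf)                     # exclude self-neighbours
    idx = np.argsort(d2, axis=1)[:, :k]              # k closest points per row
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    return np.maximum(A, A.T)                        # keep an edge if either side chose it

def biharmonic_distance(A):
    """Biharmonic distance from the pseudo-inverse of the graph Laplacian L = D - A."""
    L = np.diag(A.sum(axis=1)) - A
    Lp = np.linalg.pinv(L, rcond=1e-10, hermitian=True)
    G = Lp @ Lp                                      # discrete Green's function of L^2
    g = np.diag(G)
    d2 = g[:, None] + g[None, :] - 2.0 * G           # d(i,j)^2 = G_ii + G_jj - 2 G_ij
    return np.sqrt(np.clip(d2, 0.0, None))           # clip tiny negative round-off

def silhouette_from_distances(D, labels):
    """Per-sample Silhouette s = (b - a) / max(a, b) from a precomputed distance matrix."""
    labels = np.asarray(labels)
    classes = np.unique(labels)
    s = np.zeros(len(labels))
    for i in range(len(labels)):
        same = labels == labels[i]
        same[i] = False
        if not same.any():
            continue                                 # singleton class: score left at 0
        a = D[i, same].mean()                        # mean distance to own class
        b = min(D[i, labels == c].mean()             # nearest other class, on average
                for c in classes if c != labels[i])
        denom = max(a, b)
        s[i] = (b - a) / denom if denom > 0 else 0.0
    return s
```

  Averaging the per-sample scores over all samples at one checkpoint would give a single global value of the kind denoted 𝑆(𝑡) in the abstract; repeating this per checkpoint yields a curve over 𝑡. The dense pseudo-inverse used here scales cubically in the number of samples, so a sparse or spectral approximation would likely be needed at larger sample counts.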