Deep Learning Models for Robust Latent Structure in Single-Cell and Methylation Data
Epigenetic clocks commit to linearity: fit a line between methylation and age, keep the high-slope sites, and discard the rest. But aging is not a smooth slope. With a hybrid VAE and contrastive encoder applied to the Hannum cohort (656 samples, 470k CpGs), two GMM clusters in the learned latent space capture 53.8% of SNITCH-classified nonlinear CpGs, at odds ratios of 4.97 and 2.74. The CTCF/NF1 transcription factor axis identified in Module 9 replicates independently in Grolaux et al. (2026), who used EPICv2 data from a different cohort and an entirely different methodological framework. The central claim: nonlinear epigenetic aging structure is encoded in the geometry of trajectory shape, not in the statistical relationship between methylation and age.
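The cluster enrichments above are reported as odds ratios, but the summary does not say how they were computed. As a minimal sketch, assuming a standard 2x2 contingency table (CpG in the cluster or not, SNITCH-nonlinear or not) with a Wald confidence interval, the calculation could look like the following. The function name `enrichment_odds_ratio` and the counts are hypothetical illustrations, not values from the Hannum analysis.

```python
from math import exp, log, sqrt


def enrichment_odds_ratio(in_nonlinear, in_other, out_nonlinear, out_other):
    """Odds ratio and 95% Wald interval for a 2x2 enrichment table.

    in_nonlinear  : CpGs inside the cluster classified as nonlinear
    in_other      : CpGs inside the cluster not classified as nonlinear
    out_nonlinear : CpGs outside the cluster classified as nonlinear
    out_other     : CpGs outside the cluster not classified as nonlinear
    """
    cells = (in_nonlinear, in_other, out_nonlinear, out_other)
    if min(cells) <= 0:
        # The Wald interval is undefined when any cell is empty.
        raise ValueError("all four cells must be positive")

    odds_ratio = (in_nonlinear * out_other) / (in_other * out_nonlinear)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log_or = sqrt(sum(1.0 / c for c in cells))
    lower = exp(log(odds_ratio) - 1.96 * se_log_or)
    upper = exp(log(odds_ratio) + 1.96 * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, chosen only to exercise the function.
estimate, lower, upper = enrichment_odds_ratio(30, 70, 10, 90)
print(f"OR = {estimate:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

An odds ratio above 1 whose interval excludes 1 would indicate that nonlinear CpGs are over-represented in the cluster relative to the rest of the array. An exact test (for example Fisher's) may be preferable when cell counts are small.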