A Euclidean Embedding for Computing Persistent Homology with Gaussian Kernels
- Speaker(s)
- Kunal Dutta
- Language of the talk
- English
- Date
- Nov. 22, 2024, 2:15 p.m.
- Room
- room 5060
- Seminar
- Seminar Algorithms
Computing persistent homology of large datasets using Gaussian kernels is useful in the domains of topological data analysis and machine learning as shown by Phillips, Wang and Zheng [SoCG 2015]. However, compared to other typical distances, using Gaussian kernels involves significantly higher overhead, as all distance computations are in terms of the Gaussian kernel distance which is computationally more expensive. Further, most algorithmic implementations (e.g. Gudhi, Ripser, etc.) are based on Euclidean distances, so the question of finding a Euclidean embedding – preferably low-dimensional – that preserves the persistent homology computed with Gaussian kernels, is quite important.
We consider the Gaussian kernel power distance (GKPD) given by Phillips, Wang and Zheng. Given an n-point dataset and a relative error parameter {ε} ∈ (0,1], we show that the persistent homology of the Čech filtration of the dataset computed using the GKPD can be approximately preserved using O({ε}^{-2}log n) dimensions, under a high stable rank condition. Our results also extend to the Delaunay filtration and the (simpler) case of the weighted Rips filtrations constructed using the GKPD. Our embedding may also be viewed as dimensionality reduction – reducing the dimensionality from n to ∼ log n dimensions.
The proof utilizes the embedding of Chen and Phillips [ALT 2017], based on the Random Fourier Functions of Rahimi and Recht [NeurIPS 2007], together with two novel ingredients. The first one is a new decomposition of the squared radii of Čech simplices computed using the GKPD, in terms of the pairwise GKPDs between the vertices, which we state and prove. The second is a new concentration inequality for sums of cosine functions of Gaussian random vectors, which we call Gaussian cosine chaoses. We believe these are of independent interest and will find other applications in future.
Joint work with Jean-Daniel Boissonnat, ESA 2024