Dimensionality reduction: PCA vs Isomap vs t-SNE
What you are seeing: a topologically non-trivial dataset (top-left panel, slowly rotating so you can see its 3D structure) and three different ways of flattening it into 2D. Points are colored by a hidden label (toroidal angle, ring index, or cluster index). A method is doing a "good job" when same-colored points stay near each other in the 2D embedding.
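One way to put a number on "same-colored points stay near each other" is to count how often a point's nearest 2D neighbors share its label. The sketch below is an illustration, not the score the demo itself reports: the function name `neighbor_label_agreement` and the default `k=10` are assumptions. It only makes sense for discrete labels (ring index, cluster index), not for a continuous label such as the toroidal angle.

```python
import numpy as np

def neighbor_label_agreement(Y, labels, k=10):
    """Mean fraction of each point's k nearest 2D neighbors that share its label.

    Hypothetical helper for illustration; assumes discrete labels.
    """
    Y = np.asarray(Y, dtype=float)
    labels = np.asarray(labels)
    # Pairwise Euclidean distances in the embedding.
    D = np.sqrt(((Y[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1))
    np.fill_diagonal(D, np.inf)  # a point is not its own neighbor
    nbrs = np.argsort(D, axis=1)[:, :k]
    return float((labels[nbrs] == labels[:, None]).mean())
```

A score near 1 means neighborhoods are label-pure. A score near the share of points carrying each label means the embedding has mixed the colors together.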
PCA (linear): keep the two directions of largest variance. Fast and deterministic, but rigid: it cannot unwrap a torus or unlink the Hopf rings because it has no notion of curvature or connectivity. Isomap: build a k-nearest-neighbor graph, measure distances along the graph (not through the air), then run classical MDS on those geodesic distances. It unwraps the torus into a roughly periodic strip. t-SNE: place a Student-t kernel in 2D and minimize the KL divergence between the Gaussian affinity matrix of the original space and the Student-t affinities of the embedding. Local neighborhoods are preserved well; global geometry is lost. The catalog slug names UMAP; we show classical PCA in its place as the linear baseline.
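The PCA and Isomap recipes above can be sketched in a few lines of NumPy. This is a minimal illustration, not the code behind the panels: the function names `pca_2d` and `isomap_2d` and the default `k=8` are assumptions. Shortest paths use plain Floyd-Warshall, which is cubic in the number of points and only suitable for a few hundred of them.

```python
import numpy as np

def pca_2d(X):
    """Project onto the two directions of largest variance."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:2].T

def isomap_2d(X, k=8):
    """Isomap sketch: kNN graph -> graph geodesics -> classical MDS."""
    n = len(X)
    # Straight-line ("through the air") distances between all pairs.
    D = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1))
    # kNN graph: keep edges to each point's k nearest neighbors, symmetrized.
    G = np.full((n, n), np.inf)
    nbrs = np.argsort(D, axis=1)[:, 1:k + 1]  # column 0 is the point itself
    rows = np.repeat(np.arange(n), k)
    G[rows, nbrs.ravel()] = D[rows, nbrs.ravel()]
    G = np.minimum(G, G.T)
    np.fill_diagonal(G, 0.0)
    # Floyd-Warshall: shortest path lengths along the graph.
    for m in range(n):
        G = np.minimum(G, G[:, m:m + 1] + G[m:m + 1, :])
    if not np.isfinite(G).all():
        raise ValueError("kNN graph is disconnected; try a larger k")
    # Classical MDS: double-center the squared geodesic distances.
    J = np.eye(n) - 1.0 / n
    B = -0.5 * J @ (G ** 2) @ J
    w, V = np.linalg.eigh(B)          # eigenvalues in ascending order
    top = np.argsort(w)[::-1][:2]     # indices of the two largest
    return V[:, top] * np.sqrt(np.maximum(w[top], 0.0))
```

On a curve such as three quarters of a circle, the first Isomap coordinate tracks arc length almost linearly, while PCA keeps the curve bent. The choice of `k` matters: too small and the graph falls apart, too large and edges shortcut across the hole.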
Datasets: Torus (embedded in 3D) is intrinsically 2D but topologically has a hole, so no linear projection can flatten it without folding it onto itself. Hopf link is two interlinked circles: each is intrinsically 1D, but the pair cannot be pulled apart in 3-space. Five clusters in 5D has five Gaussian blobs arranged on a ring in dimensions 0 and 1, with dimensions 2 through 4 pure noise. A good DR method ignores the noise.
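For readers who want to reproduce similar data offline, the three datasets can be generated roughly as follows. All function names and numeric defaults here (radii `R` and `r`, ring `radius`, `spread`, `noise`, sample counts) are assumptions chosen for illustration; the demo's actual parameters are not stated.

```python
import numpy as np

def make_torus(n=500, R=2.0, r=0.7, seed=0):
    """Points on a torus in 3D; the label is the toroidal angle."""
    rng = np.random.default_rng(seed)
    theta = rng.uniform(0.0, 2.0 * np.pi, n)  # toroidal angle (around the hole)
    phi = rng.uniform(0.0, 2.0 * np.pi, n)    # poloidal angle (around the tube)
    x = (R + r * np.cos(phi)) * np.cos(theta)
    y = (R + r * np.cos(phi)) * np.sin(theta)
    z = r * np.sin(phi)
    return np.column_stack([x, y, z]), theta

def make_hopf_link(n=500, seed=0):
    """Two linked unit circles in 3D; the label is the ring index (0 or 1)."""
    rng = np.random.default_rng(seed)
    t = rng.uniform(0.0, 2.0 * np.pi, n)
    ring = np.arange(n) % 2
    zeros = np.zeros(n)
    # Ring 0 lies in the xy-plane around the origin.
    ring0 = np.column_stack([np.cos(t), np.sin(t), zeros])
    # Ring 1 lies in the xz-plane around (1, 0, 0), so it threads ring 0.
    ring1 = np.column_stack([1.0 + np.cos(t), zeros, np.sin(t)])
    return np.where(ring[:, None] == 0, ring0, ring1), ring

def make_five_clusters(n=500, radius=4.0, spread=0.4, noise=1.0, seed=0):
    """Five Gaussian blobs on a ring in dims 0-1; dims 2-4 are pure noise."""
    rng = np.random.default_rng(seed)
    label = rng.integers(0, 5, n)
    angle = 2.0 * np.pi * label / 5.0
    X = np.empty((n, 5))
    X[:, 0] = radius * np.cos(angle) + spread * rng.normal(size=n)
    X[:, 1] = radius * np.sin(angle) + spread * rng.normal(size=n)
    X[:, 2:] = noise * rng.normal(size=(n, 3))
    return X, label
```

Each generator returns the point array together with the hidden label used for coloring, so the output can be passed straight to an embedding method and then plotted by label.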
WHAT TO TRY
- Vary each control and watch the rail readouts respond.
- Compare the diagnostic plot against the live scene.