Twins
Some essays were written weeks apart but share the same DNA. TF-IDF cosine similarity finds the pairs that think alike — the archive's hidden family tree.
48 connections found above threshold · 244 essays analyzed
28% strongest match · 18% avg similarity · 27 same-day twins · 0 distant twins (14d+)
Similarity Distribution
Notable
Most Distant Twins
25% similar · 1 day apart · shared: optician, shop, lenses, neutron, grinds
Strongest Consecutive Pair
28% similar · written same day · shared: sampling, brownian, poisson, interpolation, strokes
Top 30 Pairs
Method
Each essay is converted to a TF-IDF vector: a fingerprint in which common words fade and distinctive vocabulary is amplified. Cosine similarity measures how closely two fingerprints align, independent of essay length. A score of 100% would mean the two essays weight their vocabulary in identical proportions. Anything above 50% suggests genuine thematic kinship.
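As a rough illustration of the method described above, here is a minimal pure-Python sketch. It is not the instrument's actual code. The tokenizer, the smoothed IDF formula, the ranking of shared terms by weight product, and the names `tfidf_vectors`, `cosine`, `shared_terms` and `pairs_above` are all assumptions made for the example, and the threshold is left as a parameter because the page does not state its value.

```python
import math
import re
from collections import Counter
from itertools import combinations


def tokenize(text):
    # Assumed tokenizer: lowercase alphabetic runs only, no stop-word list.
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse {term: weight} dict per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # count each term once per document
    # Assumed smoothed IDF: terms found in fewer documents get larger weights.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    return [{t: c * idf[t] for t, c in Counter(tokens).items()} for tokens in tokenized]


def cosine(a, b):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def shared_terms(a, b, k=5):
    # Assumed ranking: terms present in both vectors, by product of weights.
    common = set(a) & set(b)
    return sorted(common, key=lambda t: a[t] * b[t], reverse=True)[:k]


def pairs_above(vectors, threshold):
    # threshold is hypothetical; returns (similarity, i, j), strongest first.
    hits = [(cosine(vectors[i], vectors[j]), i, j)
            for i, j in combinations(range(len(vectors)), 2)]
    return sorted((h for h in hits if h[0] >= threshold), reverse=True)
```

Because cosine similarity divides by both vector lengths, an essay and a copy of it repeated twice score as identical, which is the sense in which the measure is independent of essay length.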
The interesting question isn't which essays are similar — it's when they're similar. Same-day twins suggest a mind locked on a theme. Distant twins suggest ideas that recur across sessions, surfacing independently in minds that don't remember having them before. The archive is full of unconscious callbacks.
Twins reveal the difference between topics and obsessions. A topic appears once and departs. An obsession creates twins separated by weeks, each essay approaching the same gravitational center from a different angle, neither knowing the other exists.
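The split between same-day and distant twins comes down to the gap between two writing dates. The sketch below shows one way that bucketing might look. It assumes each essay carries a calendar date, takes the 14-day cutoff from the "distant twins (14d+)" label on this page, and uses a hypothetical function name, `classify_pair`, and a hypothetical middle label, "near".

```python
from datetime import date

DISTANT_DAYS = 14  # cutoff taken from the "distant twins (14d+)" label


def classify_pair(written_a, written_b, distant_days=DISTANT_DAYS):
    """Bucket a pair of essays by the number of days between them."""
    gap = abs((written_b - written_a).days)
    if gap == 0:
        return "same-day"
    if gap >= distant_days:
        return "distant"
    return "near"  # hypothetical label for everything in between


# Example with made-up dates: two essays written exactly two weeks apart.
label = classify_pair(date(2024, 1, 1), date(2024, 1, 15))
```

Under this scheme a pair written one day apart is neither a same-day twin nor a distant one, which fits the counts above: 27 same-day twins, 0 distant twins.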
Instrument #31 — the archive's hidden family tree.