In 384-dimensional embedding space, two tweets that never share any words can end up closer than two tweets that are almost identical in wording.
This counterintuitive geometry lies at the heart of building a self-evolving LLM knowledge wiki inspired by Andrej Karpathy’s tweets and papers. When your system ingests 50 recent tweets about LLM reasoning, the raw text lives in an impossibly sparse 50,000-dimensional bag-of-words space. Embedding vectors collapse that space into a dense 384- or 1536-dimensional manifold where semantic similarity becomes measurable distance. A self-evolving wiki must constantly add new nodes, detect contradictions, and cluster related ideas: tasks that only become tractable once you can visualize and navigate these high-dimensional spaces.
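A toy sketch makes the opening claim concrete. The bag-of-words vectors below are real one-hot counts for two short phrases with zero word overlap; the dense vectors are hypothetical embedding values chosen for illustration, standing in for what a model like `all-MiniLM-L6-v2` might produce:

```python
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Bag-of-words over a tiny vocabulary: the two tweets share no words,
# so their sparse vectors are exactly orthogonal.
# vocab = ["chain", "thought", "fails", "reasoning", "traces", "break"]
bow_a = np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0])  # "chain of thought fails"
bow_b = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])  # "reasoning traces break"
print(cosine(bow_a, bow_b))  # 0.0 — no overlap, maximally dissimilar

# Dense embeddings (hypothetical values): the same two tweets can point
# in nearly the same direction because the model encodes their shared meaning.
emb_a = np.array([0.90, 0.10, -0.30])
emb_b = np.array([0.85, 0.15, -0.25])
print(round(cosine(emb_a, emb_b), 3))  # 0.997 — nearly parallel
```

The wiki relies on exactly this property: similarity of meaning, not of surface wording, determines distance.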
Problem
The real task is to automatically discover emergent structure inside an unstructured stream of Karpathy’s writing without any hand-crafted labels or rules. You want the wiki to notice that tweets about “chain-of-thought failures” naturally group together even though they use completely different phrases, while tweets about “attention tricks” form a separate tight cluster. This emergent semantic clustering lets the reflective agent critique new incoming knowledge against existing clusters, flag contradictions, and propose merges. Doing this at production scale means projecting the 1536-dimensional vectors down to something a human (or a 3D visualization) can inspect, without destroying the very semantic neighborhoods that make the clustering useful.
Concept
Embedding vectors are simply fixed-length arrays of floating-point numbers that locate each piece of text at a specific point in a high-dimensional space. A 1536-dimensional embedding lives in a space so vast that most of its volume is empty; since embeddings are usually compared by cosine similarity, distances are dominated by direction rather than magnitude. High-dimensional spaces exhibit surprising concentration-of-measure phenomena: most pairs of random vectors are roughly orthogonal, yet semantically related texts produced by the same model end up pointing in nearly identical directions.
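The near-orthogonality of random high-dimensional vectors is easy to verify numerically. This sketch draws random unit vectors in 1536 dimensions and measures their pairwise cosine similarities:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 1536, 200

# Draw n random unit vectors in d dimensions.
X = rng.standard_normal((n, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)

# Pairwise cosine similarities; drop the diagonal (self-similarity = 1).
sims = X @ X.T
off_diag = sims[~np.eye(n, dtype=bool)]

# Concentration of measure: similarities cluster tightly around 0,
# with standard deviation ~ 1/sqrt(d) ≈ 0.0255.
print(off_diag.mean(), off_diag.std())
```

Semantically related embeddings with cosine similarity of 0.7 or 0.8 are therefore wildly non-random: tens of standard deviations away from what chance alignment would produce.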
PCA projection finds the orthogonal axes (principal components) that successively maximize variance in the data. It is a linear map: each new dimension is a weighted sum of the original dimensions, chosen to preserve as much global spread as possible. Geometrically, it is like casting the shadow of a 3D object onto a flat wall from the most informative angle. The price of this global focus is that fine-grained local neighborhoods can be crushed together.
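The linear map described above can be written in a few lines of NumPy via the SVD of the centered data; the input here is synthetic random data standing in for real 1536-dimensional tweet embeddings:

```python
import numpy as np

def pca_project(X, k=3):
    """Project rows of X onto the top-k principal components."""
    Xc = X - X.mean(axis=0)                      # center the data
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T                         # linear map: weighted sums of dims

# Synthetic stand-in for fifty 1536-d tweet embeddings (hypothetical data).
rng = np.random.default_rng(1)
X = rng.standard_normal((50, 1536))
Y = pca_project(X, k=3)
print(Y.shape)  # (50, 3)
```

By construction, the variance of the first projected coordinate is at least that of the second, and so on, which is exactly the "successively maximize variance" property.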
t-SNE projection instead optimizes a non-linear map that preserves local neighborhoods. It models each point’s similarity to its neighbors with a Gaussian in high dimensions and a Student-t distribution in low dimensions, then minimizes the Kullback-Leibler divergence between these two probability distributions. The result is that tweets that were close in 1536-d stay close in 3-d, even at the cost of distorting global distances.
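A minimal sketch using scikit-learn's `TSNE`, with two synthetic, well-separated clusters standing in for real embedding data; the cluster geometry and parameter choices here are illustrative assumptions, not values from the original system:

```python
import numpy as np
from sklearn.manifold import TSNE

# Two tight synthetic clusters in 1536-d (hypothetical stand-in for tweets).
rng = np.random.default_rng(2)
cluster_a = rng.standard_normal(1536) + 0.05 * rng.standard_normal((25, 1536))
cluster_b = rng.standard_normal(1536) + 0.05 * rng.standard_normal((25, 1536))
X = np.vstack([cluster_a, cluster_b])

# Non-linear projection to 3-d that preserves local neighborhoods.
# perplexity sets the effective neighborhood size and must be < n_samples.
tsne = TSNE(n_components=3, perplexity=10, init="pca", random_state=0)
Y = tsne.fit_transform(X)
print(Y.shape)  # (50, 3)
```

Points that were neighbors in 1536-d remain neighbors in the 3-d output, which is what makes the result suitable for the wiki's 3D visualization; distances between far-apart clusters, by contrast, should not be read quantitatively.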
Semantic clustering in embedding space means that points that land near one another share latent meaning even though no explicit label was provided. In the context of Karpathy’s tweets we often observe spontaneous formation of clusters around topics like “reasoning traces vs. direct answers,” “scaling laws,” and “attention geometry tricks.” Because the wiki is self-evolving, these clusters become dynamic knowledge modules that the agent can query, critique, and merge.
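The unsupervised discovery of such topic clusters can be sketched with k-means on the embedding vectors. The three "topics" below are synthetic Gaussian blobs in 384-d, a hypothetical stand-in for real tweet embeddings:

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical embeddings: three well-separated topic clusters in 384-d.
rng = np.random.default_rng(3)
centers = 5.0 * rng.standard_normal((3, 384))
X = np.vstack([c + rng.standard_normal((20, 384)) for c in centers])

# Discover the latent topics without any labels.
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print(np.bincount(km.labels_))  # cluster sizes; here each topic has 20 tweets
```

In the wiki, each discovered cluster becomes a knowledge module: new embeddings are assigned to the nearest centroid, and outliers far from every centroid are flagged for the reflective agent to review as potential contradictions or new topics.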