How I think about LLM prompt engineering
Summary
In 2013, Google researchers Mikolov et al. developed Word2Vec, a model that embedded words into a vector space and revealed emergent "word arithmetic": vector operations such as V(king) - V(man) + V(woman) ≈ V(queen) captured semantic relationships. Correlations between words became distance relationships in the embedding space, a capability that emerged from training rather than being explicitly trained for. Modern Large Language Models (LLMs) share fundamental principles with Word2Vec: both embed tokens in a vector space and compare them using cosine distance. LLMs add self-attention, the core mechanism of the Transformer architecture, to refine these embeddings. While Word2Vec was a shallow model, LLMs, with their deep Transformer layers and billions of parameters, extend the concept to complex "vector programs" that enable sophisticated transformations, such as rewriting text in a specific style. LLMs can be conceptualized as continuous, interpolative databases of both data and these vector programs, with prompts acting as queries that retrieve and execute them.
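The word arithmetic described above can be illustrated with a minimal sketch. The three-dimensional vectors below are hand-picked toy values, not real Word2Vec output, and the names `TOY_VECTORS`, `cosine_similarity`, and `analogy` are hypothetical helpers introduced only for this example.

```python
import math

# Hypothetical toy embeddings (hand-picked for illustration, NOT real Word2Vec output).
# The three dimensions loosely stand for: royalty, maleness, femaleness.
TOY_VECTORS = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.0, 0.2, 0.2],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def analogy(a, b, c, vectors):
    """Return the word nearest to V(a) - V(b) + V(c), excluding the three inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine_similarity(target, vectors[w]))


# V(king) - V(man) + V(woman) lands closest to V(queen) in this toy space.
result = analogy("king", "man", "woman", TOY_VECTORS)
```

With real embeddings the result is only approximately equal to the target word's vector, which is why the lookup is a nearest-neighbor search rather than an exact match.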
Key takeaway
For NLP engineers developing or fine-tuning LLMs, understanding the underlying principle of emergent vector programs is crucial. Your prompts are not just instructions but queries into a continuous database of functions. You should approach prompt engineering as a search process to locate the most effective vector program for your specific task, rather than assuming the LLM inherently "understands" your intent from a single phrasing.
Key insights
LLMs extend Word2Vec's emergent word arithmetic into complex "vector programs" via self-attention and deep architectures.
Principles
- Correlation becomes proximity in embedding spaces.
- Self-attention refines token embedding spaces.
- LLMs are continuous, interpolative program databases.
Method
Both Word2Vec and LLMs embed tokens in a vector space in which tokens that co-occur end up close together. LLMs additionally use self-attention to iteratively refine these embeddings.
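As a rough sketch of how self-attention refines embeddings, the example below implements one pass of scaled dot-product self-attention in plain Python. It is a deliberate simplification: real Transformers apply learned query, key, and value projections and use multiple heads, whereas here the raw embeddings play all three roles. The function names are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(embeddings):
    """One pass of scaled dot-product self-attention without learned projections.

    Each token's new vector is a weighted average of all token vectors,
    weighted by how similar (dot product) the token is to each of them.
    """
    dim = len(embeddings[0])
    refined = []
    for query in embeddings:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in embeddings
        ]
        weights = softmax(scores)
        refined.append(
            [sum(w * vec[i] for w, vec in zip(weights, embeddings)) for i in range(dim)]
        )
    return refined


# Tokens 0 and 2 are identical, so they attend mostly to each other.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
refined_tokens = self_attention(tokens)
```

Stacking many such passes, each with its own learned projections, is what lets a deep model move each token's vector toward a context-dependent meaning.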
In practice
- Use vector arithmetic for semantic transformations.
- Explore prompt variations to find optimal "vector programs".
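Treating prompt engineering as a search can be sketched as scoring several candidate phrasings against a small set of labeled examples and keeping the best one. Everything named here is an assumption for illustration: `best_prompt` is a hypothetical helper, and `fake_model` is a deterministic stand-in for a real LLM call so the example runs without any API.

```python
def best_prompt(templates, examples, model):
    """Score each prompt template on labeled examples and return the winner.

    templates: strings containing a "{text}" placeholder
    examples:  (input_text, expected_output) pairs
    model:     any callable mapping a prompt string to an output string
    """
    def accuracy(template):
        hits = sum(
            model(template.format(text=text)).strip() == expected
            for text, expected in examples
        )
        return hits / len(examples)

    scores = {template: accuracy(template) for template in templates}
    winner = max(scores, key=scores.get)
    return winner, scores


def fake_model(prompt):
    """Stand-in for an LLM (hypothetical): only the 'Shout' phrasing
    triggers the uppercase behavior; other phrasings echo the text back."""
    head, _, text = prompt.partition(": ")
    return text.upper() if head == "Shout" else text


candidate_templates = ["Please make this loud: {text}", "Shout: {text}"]
labeled_examples = [("hi", "HI"), ("ok", "OK")]
winner, scores = best_prompt(candidate_templates, labeled_examples, fake_model)
```

In real use, `model` would wrap an actual LLM call and exact-match accuracy would likely be replaced by a task-appropriate metric; the point is only that phrasing is chosen empirically rather than assumed.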
Topics
- Word2Vec
- Large Language Models
- Word Embeddings
- Self-Attention
- Prompt Engineering
Best for: NLP Engineer, Machine Learning Engineer, AI Engineer, Prompt Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Sparks in the Wind.