How I use LLMs

Source: Andrej Karpathy · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Software Development & Engineering, Data Science & Analytics) · Depth: Intermediate, extended

Summary

This guide covers practical use of Large Language Models (LLMs) such as ChatGPT, Gemini, Claude, and Grok, along with the mechanisms behind them. It surveys the LLM ecosystem, including major providers and model tiers, and describes models as essentially "lossy, probabilistic zip files" of internet data. It covers basic text interaction, tokenization, and the context window as working memory. It then moves to more advanced functionality: "thinking models" for complex problem-solving, tool use for internet search and deep research, and Python interpreter integration for data analysis and code generation. Multimodal interaction is also covered, including audio input and output, advanced voice modes, podcast generation, and image and video input and output. Finally, it highlights quality-of-life features such as memory, custom instructions, and custom GPTs, with examples of their use in daily tasks and professional work.

Key takeaway

Data scientists and software engineers get the most from LLMs by first learning what a given model can do and which tools it has access to. Always verify outputs, especially for critical tasks, because models can hallucinate or make implicit assumptions. Experiment with different LLM providers and their tiered offerings to find the best fit for your work, weighing reasoning ability, tool integration, and multimodal support.

Key insights

LLMs are versatile tools, but understanding their underlying mechanisms and available features is key to effective use.

Method

Use LLMs effectively by understanding how text is tokenized, managing the context window, selecting the model and tools (search, code interpreter) that fit the task, and using multimodal input and output where it helps.
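
To make the idea of the context window as a limited working memory concrete, here is a minimal Python sketch of trimming a conversation to a token budget. It is an illustration only, not how any provider actually manages context: `count_tokens` and `fit_to_window` are hypothetical helpers, the whitespace split is a crude stand-in for a real subword tokenizer, and the budget of 16 is arbitrary.

```python
from collections import deque


def count_tokens(text: str) -> int:
    # Assumption: a whitespace split stands in for a real subword tokenizer.
    # Actual token counts from a production tokenizer will differ.
    return len(text.split())


def fit_to_window(messages: list[str], budget: int) -> list[str]:
    """Keep the most recent messages whose combined token count fits the budget."""
    kept: deque[str] = deque()
    used = 0
    # Walk backwards so the newest messages are kept first.
    for msg in reversed(messages):
        cost = count_tokens(msg)
        if used + cost > budget:
            break  # Everything older than this message is dropped.
        kept.appendleft(msg)
        used += cost
    return list(kept)


history = [
    "user: explain tokenization",
    "assistant: text is split into tokens before the model sees it",
    "user: and the context window?",
]

# With a budget of 16 "tokens", the oldest message no longer fits.
trimmed = fit_to_window(history, budget=16)
print(trimmed)
```

The sketch drops the oldest messages first, which is one simple policy among several; starting a fresh chat when the topic changes achieves a similar effect by hand.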

Best for: Prompt Engineer, Software Engineer, Data Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Andrej Karpathy.