What is an LLM?
An LLM (Large Language Model) is an AI model trained on vast amounts of text data to understand and generate human language. GPT-4, Claude, Llama, and Gemini are all LLMs.
How they work: LLMs predict the next token (word or word fragment) in a sequence based on the preceding context. They're trained on trillions of tokens from books, websites, code, and other text. Through this training, they develop an internal representation of language, facts, reasoning patterns, and even coding ability.
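The prediction step above can be sketched in miniature. This toy example (hypothetical four-word vocabulary and made-up logit values, not a real model) shows the core mechanic: the model scores every token in its vocabulary, a softmax turns those scores into probabilities, and the next token is chosen from that distribution.

```python
import math

# Hypothetical tiny vocabulary and model scores (logits) for the
# context "The cat sat on the". A real LLM scores ~100K tokens.
vocab = ["mat", "dog", "moon", "car"]
logits = [3.2, 1.1, 0.4, -0.5]

def softmax(xs):
    # Convert raw scores into probabilities that sum to 1.
    m = max(xs)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax(logits)
next_token = vocab[probs.index(max(probs))]  # greedy choice
print(next_token)  # → mat
```

Greedy selection (always take the most likely token) is the simplest decoding strategy; real systems usually sample from the distribution instead, which is where temperature comes in.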
What LLMs can do: answer questions, write content, summarize documents, translate languages, analyze data, generate code, extract structured information from unstructured text, and follow complex multi-step instructions. They're general-purpose reasoning engines.
What LLMs can't do: they don't have real-time information (knowledge cutoff), they can hallucinate (generate confident but wrong information), they can't perform mathematical calculations reliably (use code execution instead), and they don't truly "understand" — they're pattern matching at an incredibly sophisticated level.
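The "use code execution instead" point can be made concrete. One common pattern, sketched below with a hypothetical model output, is to have the model emit an arithmetic expression and then evaluate that expression deterministically rather than trusting the model's own arithmetic. The `safe_eval` helper here is illustrative, restricted to numeric literals and basic operators.

```python
import ast
import operator

# Map AST operator nodes to their Python implementations.
OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}

def safe_eval(expr):
    """Evaluate a plain arithmetic expression; reject names, calls, etc."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError("unsupported expression")
    return walk(ast.parse(expr, mode="eval"))

model_output = "1234 * 5678"  # hypothetical expression emitted by an LLM
print(safe_eval(model_output))  # → 7006652
```

The model does what it is good at (translating a word problem into an expression); the interpreter does what it is good at (exact arithmetic).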
Key parameters: context window (how much text the model can process at once; as of 2026, 128K-200K tokens is common and some models support 1M or more), temperature (randomness of outputs; lower values are more deterministic), and model size (larger models are generally more capable but slower and more expensive).
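Temperature has a precise meaning worth seeing: logits are divided by the temperature before the softmax, so low temperatures sharpen the distribution toward the top token and high temperatures flatten it. A minimal sketch with made-up logit values:

```python
import math

def softmax(xs):
    m = max(xs)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def apply_temperature(logits, temperature):
    # T < 1 sharpens the distribution (more deterministic);
    # T > 1 flattens it (more random).
    return softmax([x / temperature for x in logits])

logits = [2.0, 1.0, 0.5]          # hypothetical scores for three tokens
cold = apply_temperature(logits, 0.2)
hot = apply_temperature(logits, 2.0)
print(cold[0] > hot[0])  # → True: top token dominates more at low temperature
```

At temperature 0.2 the top token takes nearly all the probability mass; at 2.0 the three options are close to even, which is why high-temperature outputs feel more varied.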
For business applications: don't build on raw LLMs. Use structured prompts, RAG for knowledge grounding, guardrails for safety, and evaluation frameworks for quality. The model is the engine — your application logic is what makes it useful.
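The RAG-for-knowledge-grounding idea can be sketched end to end. Everything here is hypothetical (the documents, the questions, and the toy word-overlap retriever; production systems use embedding similarity and a vector store), but the shape is the real one: retrieve relevant context, then build a prompt that instructs the model to answer only from that context.

```python
# Hypothetical knowledge base for a support assistant.
documents = [
    "Refund requests must be filed within 30 days of purchase.",
    "Support is available Monday through Friday, 9am to 5pm.",
]

def retrieve(question, docs):
    # Toy relevance score: count of shared lowercase words.
    # Real systems rank by embedding similarity instead.
    q_words = set(question.lower().split())
    return max(docs, key=lambda d: len(q_words & set(d.lower().split())))

def build_prompt(question, docs):
    context = retrieve(question, docs)
    return (
        "Answer using only the context below. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context: {context}\n\nQuestion: {question}"
    )

print(build_prompt("Can I get a refund after 30 days?", documents))
```

The grounding instruction ("answer using only the context") is a simple guardrail against hallucination: the model is steered toward your documents rather than its training-data memory.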