
AI & models

Model — The trained system that predicts text (or images, etc.). “GPT-4”, “Claude”, and “Gemini” are product/model families; capabilities differ by version.

Prompt — What you send the model (instructions + context). Quality of prompt usually matters more than minor model choice.

Token — The unit of text the model processes and bills by — roughly subword pieces, not always whole words. Long prompts and outputs cost more tokens.
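Exact token counts depend on each model's tokenizer, but a commonly cited rule of thumb for English text is roughly 4 characters per token. A minimal sketch of that heuristic (the function name and the heuristic itself are illustrative, not any provider's API):

```python
def estimate_tokens(text: str) -> int:
    """Very rough token estimate: ~4 characters per token for English text.

    Real counts depend on the model's tokenizer (subword pieces vary by
    model); use the provider's own tokenizer for billing-accurate numbers.
    """
    return max(1, len(text) // 4)

prompt = "Summarize the attached report in three bullet points."
print(estimate_tokens(prompt))  # a ballpark figure, not an exact count
```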

Context / context window — How much text the model can consider at once (input + output). Big windows help with long documents.

Hallucination — Confident-sounding but false content. Always verify facts that matter (money, health, law, names).

RAG (retrieval-augmented generation) — Pulling relevant documents into the prompt before answering so answers stay grounded in your sources.
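The core RAG loop is: retrieve relevant chunks, then prepend them to the prompt. A toy sketch using word overlap as the relevance score (real systems use embeddings and a vector index; the documents and function names here are made up for illustration):

```python
def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Toy retriever: rank documents by how many query words they share.

    Stands in for embedding-based similarity search in a real RAG system.
    """
    query_words = set(query.lower().split())
    scored = sorted(
        docs,
        key=lambda d: len(query_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]

docs = [
    "Refund policy: refunds are issued within 14 days of purchase.",
    "Shipping: orders ship within 2 business days.",
    "Careers: we are hiring engineers in Berlin.",
]

question = "How long do refunds take?"
context = retrieve(question, docs, k=1)

# The grounded prompt sent to the model: sources first, question last.
prompt = "Answer using only this context:\n" + "\n".join(context) + f"\n\nQ: {question}"
```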

Agent — Loosely: a loop that plans, calls tools (search, APIs, files), and iterates — not just a single chat reply.

Tool calling / function calling — The model asks to run a predefined function (e.g. getWeather) with arguments; your code runs it and returns results.
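The mechanics can be sketched in a few lines: the model emits a structured request naming a function and its arguments, and your code looks the function up, runs it, and sends the result back. Everything below (the `get_weather` tool, the message shape) is a hypothetical sketch — the exact JSON format varies by provider:

```python
import json

def get_weather(city: str) -> dict:
    # Hypothetical tool: a real version would call a weather API.
    return {"city": city, "temp_c": 18, "condition": "cloudy"}

# Registry of predefined functions the model is allowed to request.
TOOLS = {"get_weather": get_weather}

# Imagine the model replied with this tool-call request:
model_message = {"tool": "get_weather", "arguments": {"city": "Oslo"}}

# Your code (not the model) executes the function...
fn = TOOLS[model_message["tool"]]
result = fn(**model_message["arguments"])

# ...and returns the result to the model as the tool's response.
tool_reply = json.dumps(result)
```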

Fine-tuning — Extra training on specific data — powerful but usually not your first step; most people use prompts + RAG.

Embedding — A numeric vector representing text meaning; used for search (“similar chunks”) in many AI apps.
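“Similar chunks” search boils down to comparing vectors, typically with cosine similarity. A self-contained sketch with made-up 3-dimensional vectors (real embeddings have hundreds of dimensions and come from an embedding model):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: 1.0 means same direction, near 0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy embeddings: semantically similar texts get nearby vectors.
vectors = {
    "refund policy": [0.9, 0.1, 0.0],
    "money back rules": [0.8, 0.2, 0.1],
    "hiring engineers": [0.0, 0.1, 0.9],
}

# A query embedding close to the "refund" region of the space.
query = [0.9, 0.1, 0.0]
best = max(vectors, key=lambda name: cosine(query, vectors[name]))
```

Note that “money back rules” also scores high despite sharing no words with the query — that is the point of embeddings over keyword search.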

Inference — Running the model to get a response (what you pay for per token on APIs).

System prompt — Hidden instructions that define the assistant's behavior; product designers set it, and user prompts can partly override it.
