LLM Beginner Guide: Prompts, Tokens & Context

A language model does not look up one fixed answer. It predicts a useful continuation from your input and its training patterns.

A strong prompt usually combines role, context, task, constraints, examples and output format.

What is an LLM?

An LLM, or large language model, is software trained on huge amounts of text to predict and generate useful language. The practical way to think about it: you give it a task, it turns your text into tokens, estimates likely next tokens and returns a draft. It can write, summarize and classify because many language patterns are compressed into its parameters. It is not a database and it does not automatically know whether a fresh fact is true.

Example: A SaaS support lead asks: “Summarize this complaint and propose a calm refund reply.” The model does not “feel” the complaint; it recognizes the pattern of complaint, refund policy and professional tone.

What is a prompt?

A prompt is the instruction package you send to an AI system. Good prompts usually include the role, the situation, the exact task, constraints, examples and the desired output format. A weak prompt asks for “ideas”. A strong prompt says who the ideas are for, what they must achieve and how they should be judged.

Example: Weak: “Write a product description.” Better: “Act as a DTC ecommerce copywriter. Write a 90-word product description for a reusable coffee cup. Audience: commuters. Tone: practical, not hype. Include one headline and three bullets.”

What actually happens when you send a prompt?

The normal flow is not: “the AI understands the request, writes a Python program, then starts working.” A large language model usually receives text, turns it into tokens, runs those tokens through neural-network layers and generates the next token again and again. Python or other tools only enter the process when the product around the model explicitly gives it tool access, for example a calculator, code interpreter, browser, database connector or function call.

Flow chart showing prompt input, preprocessing, tokens, model inference, optional tool call, decoding and final answer

1. Input

You type a prompt. The application may add hidden system instructions, safety policies, chat history or selected documents before the model sees the final request.

2. Tokens

The text is split into token IDs. The model does not receive words as a person reads them; it receives numerical token identifiers.

3. Inference

The model computes probabilities for possible next tokens based on the full context. This is where attention, embeddings and model weights matter.

4. Optional tool

If the application supports tools, the model may request a function such as search, calculator or code execution. The external result is then inserted back into the conversation as more context.

5. Answer

The final text is decoded from generated tokens. Better prompts help because they change the context the model conditions on before choosing each next token.

Prompting lesson: the model is highly sensitive to the information it receives before generation. Clear context, examples, constraints and source material improve the probability that the next tokens follow your intent.

Better than a mini test: build a prompt before-and-after gallery

Instead of running a generic test, collect five real prompts from your own work and save the weak version, the improved version and the final edited result. This becomes a reusable prompt gallery for your team.

Pick one recurring task, such as support replies, product descriptions or lesson summaries.
Save the original prompt and output.
Add context, format rules and one example.
Compare how much editing the second output needs.
Turn the winner into a template.

What are tokens?

Tokens are the small text units a model reads and writes. A token can be a whole word, part of a word, punctuation or a character, depending on the tokenizer and language. Token limits matter because input plus output must fit inside the model context window.

Example: “PromptingEasy helps teams” might become tokens similar to “Prompt”, “ing”, “Easy”, “helps”, “teams”. This is why long documents and many examples increase cost and may crowd out important instructions.

Why does AI sometimes invent things?

AI can hallucinate when it generates a plausible-sounding continuation without enough reliable grounding. The model is optimized to produce likely text, not to guarantee truth by default. Hallucinations become more likely when the question asks for obscure facts, fresh information, hidden data or citations that were not provided.

Example: A user asks for “the exact 2026 price of a niche API plan” without browsing or source text. The model may produce a confident-looking price because that shape of answer is common, even if the number is wrong.

Why does context help?

Context narrows the search space. If the model knows the audience, goal, constraints, examples and source material, it can produce an answer that fits your situation instead of a generic answer. More context is not always better: irrelevant context can distract the model and increase cost.

Example: A marketing team gets better ad copy when it includes the product positioning, target customer, banned claims and two successful past ads instead of only saying “write ads”.

Search engine vs. LLM

A search engine retrieves pages and ranks links. An LLM generates an answer from patterns and provided context. Search is better when you need current sources, official pages or multiple perspectives. LLMs are better when you need synthesis, rewriting, reasoning over supplied information or structured drafts. Many strong AI products combine both.

Example: For “latest tax deadline in Zurich”, use search or official sources. For “turn these notes into a polite customer email”, an LLM is the better interface.

What does “AI understands language” mean?

In everyday language, “understands” means the model can respond appropriately to meaning, tone and structure. Technically, it learned statistical representations that map text patterns to useful outputs. It does not understand like a person with lived experience, intentions or common-sense accountability.

Example: If you write “make this less salesy,” the model can often adjust tone because it has learned patterns of salesy vs. neutral language. That is useful linguistic competence, not human understanding.

LLM Beginner Guide prompts, tokens & context