AI (Artificial Intelligence) — Computer systems that perform tasks typically requiring human intelligence — language, vision, reasoning.
AI is an umbrella term for software that reproduces human cognitive abilities. In practice today, most AI work refers to LLM-based systems — ChatGPT, Claude, Gemini. Enterprise value typically comes from automation, customer support, and decision support.
LLM (Large Language Model) — Neural network trained on massive text corpora to generate natural-language responses.
An LLM is a multi-billion-parameter neural network trained on trillions of tokens. Examples: GPT-4, Claude, Llama. Not a knowledge database but a pattern generator — must be combined with RAG or fine-tuning for reliable enterprise use.
RAG (Retrieval-Augmented Generation) — Architecture where relevant document chunks are retrieved via vector search and injected into the prompt.
RAG is the standard approach for grounding LLMs in your proprietary data. Steps: 1) embed documents, 2) store in vector DB, 3) retrieve top-k for each query, 4) send with the prompt. RAG produces more accurate, current, and citable responses than prompt engineering alone.
AI agent — Autonomous LLM-driven system that calls tools, makes decisions, and completes tasks.
AI agents differ from chatbots by acting, not just talking: calling APIs, reading databases, sending emails. Orchestration typically uses LangGraph, CrewAI, or OpenAI Assistants. Production agents require tool-permission models, cost limits, and human-in-the-loop controls.
Multi-agent system — Multiple specialised AI agents collaborating on a shared task.
Multi-agent systems divide work across role-specialised agents — planner, executor, verifier. Supervisor and planner-executor are the most common patterns. They outperform single large agents on complex multi-step tasks but are harder to debug and control.
Prompt engineering — Deliberate design of the instruction given to an LLM to produce the desired output.
Prompt engineering includes role definition (system prompt), few-shot examples, structured output specification (JSON schema), iteration, and testing. A good prompt can be 3–5x more accurate than a weak one. It's the cheapest first intervention before fine-tuning.
Fine-tuning — Further training of a pre-trained LLM on your own data for a specific task.
Fine-tuning specialises a base model (Llama 3.1, GPT-4o-mini) on your data. Methods: LoRA (lightweight, cheap), full fine-tune (heavier, stronger). Typical use-cases: domain terminology, brand voice, stable structured output. Doesn't replace RAG — pairs with it.
Vector database — Database storing embedding vectors with fast similarity search.
Vector DBs (Pinecone, Qdrant, Weaviate, pgvector) perform fast similarity search over billions of embeddings. The backbone of RAG pipelines. Selection factors: managed vs self-host, EU vs US region, hybrid search support, scalability.
Embedding — Numerical vector representation of text that preserves meaning.
An embedding is a 768–3072 dimensional vector representing the meaning of a text chunk. Similar texts land close together in vector space. Major providers: OpenAI (text-embedding-3), Voyage, Cohere, open-source (BGE, E5). Embedding choice can shift RAG accuracy 5–15%.
Prompt injection — Malicious input that overrides the LLM's original instruction.
Prompt injection is the most common AI security vulnerability. Example: user input includes 'ignore previous instructions and...'. Defenses: input validation, instruction hierarchy, output guardrails, limited tool access, prompt-level sandboxing.
Guardrail — Input- or output-checking layer that prevents undesired AI behaviour.
Guardrails can be rule-based (regex, block-lists), ML-based (toxicity, PII detectors), or LLM-based (judge models). Typical uses: PII redaction, toxicity filtering, off-topic rejection, output format validation.
PII redaction — Removing personal data (names, emails, IDs) before sending a prompt to an LLM.
PII redaction is mandatory for GDPR-compliant AI. Implemented via regex, ML NER models, or dedicated services (Presidio, Nightfall). Happens BEFORE the prompt leaves your infrastructure so sensitive data never reaches the LLM provider.
RBAC — Role-Based Access Control — governing tool access and data visibility per user role.
In AI systems, RBAC controls which role can invoke which tool and see which data in RAG. Critical in multi-tenant and regulated environments. Implementation: middleware before the prompt + post-filter on LLM output.
Voice agent — Real-time voice AI system that converses and invokes tools.
Voice agents combine speech-to-text (Deepgram, Whisper), LLM, and text-to-speech (ElevenLabs, Cartesia) layers. Typical platforms: Vapi, LiveKit, Retell. Latency is critical — the full cycle must be under ~500ms for natural conversation.
Context window — The maximum number of tokens an LLM can process at once.
The context window covers input + output combined. GPT-4: 128k tokens. Claude Sonnet 4.6: 1M tokens. Gemini 2.5 Pro: 2M tokens. Larger windows fit more documents but cost more and slow responses. Context caching (Anthropic, OpenAI) can cut repeated-prompt cost by 90%.
Hallucination — When an LLM confidently generates false information.
Hallucination stems from LLMs being probabilistic pattern generators, not knowledge stores. Mitigations: RAG (source-bound answers), citation tracking, fact-check layers, human review. GPT-4 and Claude Sonnet 4.6 have improved but can't be zeroed out — critical use-cases always need human-in-the-loop.
Token — LLM text unit, roughly 0.7 English words.
LLMs count in tokens. 1000 tokens ≈ 700 English words or ~500 Hungarian words (Hungarian is more inflected). Pricing is per-token: ~$3/1M input, ~$15/1M output for Claude Sonnet in 2026.
MCP (Model Context Protocol) — Anthropic-developed standard for tool communication between LLMs and external services.
MCP lets a single tool-server written once serve multiple LLM clients (Claude Desktop, Claude Code, your agent). Became the industry standard in 2025. Alternative to bespoke function calling.
Context engineering — Deliberate design of the LLM's context — not just prompt, but the whole input stack.
Context engineering is the evolution of prompt engineering: systematically assembling what goes into the LLM context (system prompt, few-shot, RAG chunks, tool defs, prior conversation). Especially important with long-context models.
AI security — Protecting AI systems from prompt injection, data leakage, and other attacks.
AI security has four layers: input validation (prompt injection), output guardrails (PII, toxicity), access control (RBAC, tool permissions), and audit (logging, monitoring). Regulated sectors require additional compliance (DORA, MDR, GDPR).
AI automation — AI-driven automation of business processes — support, document processing, email.
AI automation goes beyond classic RPA: the LLM can make context-aware decisions, not just run scripts. Common use-cases: multilingual customer support, product description generation, email triage, financial reporting.
DORA — EU Digital Operational Resilience Act governing financial firms' IT and AI systems.
DORA is mandatory EU-wide from 2025: banking AI systems must have incident reporting, risk management, and vendor-management processes. Budapest AI firms can serve such clients given full documentation and audit trails.
GDPR — EU General Data Protection Regulation governing personal data processing.
GDPR is the foundational EU privacy law. For AI: lawful basis for processing, data subject rights, DPIA for high-risk processing, and cross-border data transfer rules. Hungarian enforcement body: NAIH.
Generative AI — AI that creates new content — text, image, audio, code.
Generative AI generates new output, not just classification or prediction. Main families: LLMs (text), diffusion (image, video), TTS (audio), code models. Enterprise adoption has grown exponentially since 2023.
Model distillation — Transferring a large model's 'knowledge' to a smaller, faster model.
Distillation trains a smaller student model on the outputs of a larger teacher model. Result: 80–90% quality at 10% cost and 5x faster response. OpenAI, Anthropic, and Google all offer distillation workflows.
AI evaluation — Measuring AI system performance — accuracy, speed, cost, toxicity.
AI eval requires a custom suite: not just loss or generic accuracy, but real business metrics. Tools: LangSmith, Langfuse, Promptfoo, Ragas. Always A/B test against the base model before production.
Few-shot prompting — Including a few examples in the prompt to guide the pattern the LLM follows.
Few-shot prompting shows 1–5 input-output examples so the LLM copies the style. Often more effective than fine-tuning, especially for stable formats (JSON, XML) or specific tones (brand voice, legal style).
Vibe coding — LLM-driven iterative coding where the developer describes intent and AI generates code.
Vibe coding refers to AI-assisted development with Cursor, Claude Code, or similar — often 30–70% of production developer time in 2026. The question isn't whether to adopt, but which workflow to use.
AI compliance — AI systems meeting legal, privacy, and ethical requirements.
EU has three main layers: GDPR (personal data), DORA (financial resilience), EU AI Act (fully enforceable in 2026 — high-risk AI requirements). Hungary adds NAIH and MNB vendor-management rules.