AI Glossary A–Z
Key AI and machine learning concepts explained in plain language. A reference for business leaders, developers, and anyone navigating the AI landscape.
A
A/B Testing for AI
A/B testing for AI is the practice of comparing two or more variants of an AI system (different models, prompts, or configurations) by serving them to different user groups and measuring which performs better.
Agentic AI
Agentic AI refers to AI systems that can autonomously pursue goals by planning actions, using tools, making decisions, and adapting their approach based on results — moving beyond passive response generation to active task completion.
Agentic Loop
An agentic loop is the core execution cycle of an AI agent — observe the current state, reason about what to do next, take an action, and repeat until the goal is achieved or a stopping condition is met.
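The cycle above can be sketched in a few lines. This is a minimal illustration, not a production pattern: `decide` stands in for a language model call, and the toy tools and goal are invented for the example.

```python
# Minimal agentic loop sketch: observe -> reason -> act, repeating until
# the goal is reached or a step budget runs out. `decide` is a hypothetical
# placeholder for a language model call.
def run_agent(goal, tools, decide, max_steps=10):
    state = {"goal": goal, "history": []}
    for _ in range(max_steps):
        action, args = decide(state)          # reason about what to do next
        if action == "finish":                # stopping condition met
            return args
        result = tools[action](*args)         # take an action via a tool
        state["history"].append((action, args, result))  # observe the result
    return None                               # step budget exhausted

# Toy usage: an 'agent' that keeps adding 5 until the running total
# reaches the goal, then stops.
def toy_decide(state):
    total = state["history"][-1][2] if state["history"] else 0
    if total >= state["goal"]:
        return "finish", total
    return "add", (total, 5)

result = run_agent(15, {"add": lambda a, b: a + b}, toy_decide)
```

A real agent would replace `toy_decide` with a model call and `tools` with search, code execution, or API integrations, but the observe-reason-act skeleton is the same.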
AI Agent
An AI agent is an autonomous system that uses a language model to reason about goals, plan actions, use tools, and execute multi-step tasks with minimal human intervention.
AI Alignment
AI alignment is the field of research and practice focused on ensuring that AI systems behave in accordance with human values, intentions, and goals — doing what we actually want rather than what we literally specify.
AI Audit
An AI audit is a systematic evaluation of an AI system's performance, fairness, safety, compliance, and governance, providing assurance that the system operates as intended and meets regulatory requirements.
AI Bias
AI bias refers to systematic errors in AI systems that produce unfair outcomes, typically arising from biased training data, flawed model design, or inappropriate application of AI to sensitive decisions.
AI Centre of Excellence (CoE)
An AI Centre of Excellence is a centralised team or organisational unit that provides AI expertise, best practices, governance, and support to enable consistent, effective AI adoption across an organisation.
AI Champion
An AI champion is an individual within an organisation who advocates for AI adoption, drives awareness and experimentation, and helps bridge the gap between AI potential and practical business application.
AI Fairness
AI fairness refers to the design, evaluation, and deployment of AI systems that treat all individuals and groups equitably, avoiding discrimination and ensuring that benefits and harms are distributed justly.
AI Literacy
AI literacy is the ability to understand, evaluate, and effectively interact with AI systems, encompassing knowledge of how AI works, its capabilities and limitations, and how to use it responsibly.
AI Maturity Model
An AI maturity model is a framework that assesses an organisation's current AI capabilities across multiple dimensions and defines progressive stages of AI adoption, from initial experimentation to enterprise-wide integration.
AI Observability
AI observability is the practice of monitoring, tracing, and understanding the behaviour of AI systems in production, providing visibility into performance, quality, costs, and potential issues.
AI Procurement
AI procurement is the process of evaluating, selecting, and purchasing AI solutions, requiring specialised assessment criteria beyond traditional software procurement.
AI Readiness
AI readiness is an organisation's preparedness to successfully adopt and benefit from AI, encompassing data infrastructure, technical capabilities, talent, leadership support, and governance frameworks.
AI Regulation
AI regulation refers to the laws, standards, and governance frameworks established by governments and international bodies to ensure that AI systems are developed and used safely, fairly, and transparently.
AI ROI
AI ROI measures the return on investment from AI initiatives, comparing the business value generated (cost savings, revenue growth, efficiency gains) against the total investment (technology, talent, data, change management).
AI Transparency
AI transparency is the practice of being open and clear about how AI systems work, what data they use, how decisions are made, and what limitations they have, building trust and enabling accountability.
AI Vendor Lock-in
AI vendor lock-in occurs when an organisation becomes so dependent on a specific AI provider's technology, APIs, or data formats that switching to an alternative becomes prohibitively costly or disruptive.
API Gateway
An API gateway is an infrastructure component that sits between clients and AI services, managing authentication, rate limiting, routing, load balancing, and monitoring for AI API traffic.
Artificial General Intelligence (AGI)
Artificial General Intelligence (AGI) refers to a hypothetical AI system capable of understanding, learning, and applying knowledge across any intellectual task at or above human level, rather than being limited to a single domain.
Artificial Narrow Intelligence (ANI)
Artificial Narrow Intelligence (ANI) describes AI systems designed and trained to perform a specific task or narrow set of tasks, such as language translation, image recognition, or recommendation engines, without general-purpose reasoning ability.
Attention Mechanism
The attention mechanism is a neural network technique that allows AI models to dynamically focus on the most relevant parts of their input when producing each output, enabling them to capture relationships across long sequences of text.
Auto-scaling
Auto-scaling automatically adjusts the number of AI model instances or compute resources based on real-time demand, scaling up during peak traffic and down during quiet periods to optimise cost and performance.
B
Backpropagation
Backpropagation is the core algorithm used to train neural networks, calculating how much each weight in the network contributes to the overall error and adjusting weights to improve predictions.
Batch Size
Batch size is a training hyperparameter that determines how many data samples are processed before the model's weights are updated, affecting training speed, memory usage, and model quality.
Beam Search
Beam search is a text generation strategy that explores multiple candidate sequences simultaneously, keeping the top-scoring options at each step to find a globally better output than greedy decoding.
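A toy version makes the idea concrete. `toy_model` below is an invented stand-in for a language model's next-token distribution, chosen so that greedy decoding picks a worse overall sequence than beam search.

```python
import heapq

# Toy beam search: track the top `beam_width` candidate sequences, scored
# by the product of their token probabilities. `next_probs` is a
# hypothetical stand-in for querying a language model.
def beam_search(next_probs, beam_width=2, max_len=3):
    beams = [(1.0, [])]
    for _ in range(max_len):
        candidates = []
        for score, seq in beams:
            for token, p in next_probs(seq):
                candidates.append((score * p, seq + [token]))
        # Keep only the top-scoring beam_width sequences.
        beams = heapq.nlargest(beam_width, candidates, key=lambda c: c[0])
    return beams[0][1]

# A fixed toy distribution where the greedy first choice ("a", p=0.6)
# leads only to low-probability continuations, while "b" leads to a
# high-probability one.
def toy_model(seq):
    if not seq:
        return [("a", 0.6), ("b", 0.4)]
    if seq[0] == "a":
        return [("x", 0.3), ("y", 0.3)]
    return [("x", 0.9), ("y", 0.1)]

best = beam_search(toy_model, beam_width=2, max_len=2)
# "b" then "x" scores 0.4 * 0.9 = 0.36, beating any greedy path (0.18).
```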
Blue-Green Deployment
Blue-green deployment is a release strategy that maintains two identical production environments (blue and green), switching traffic between them to enable zero-downtime updates and instant rollback.
C
Canary Deployment
Canary deployment is a release strategy that gradually rolls out changes to a small subset of users first, monitoring for issues before expanding to the full user base, significantly reducing the risk of AI system updates.
Chain-of-Thought Prompting
Chain-of-thought (CoT) prompting is a technique that instructs AI models to reason through problems step by step before providing a final answer, significantly improving accuracy on complex tasks.
Chat Completion
Chat completion is the API pattern used to interact with language models in a conversational format, where messages are sent as a sequence of roles (system, user, assistant) and the model generates the next response.
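The message shape is easiest to see in code. The sketch below shows the conversational structure only; `call_model` is a hypothetical placeholder, not a real API client.

```python
# Chat-completion message format: a list of role-tagged messages is sent
# to the model, which replies with the next "assistant" turn.
messages = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "What is a context window?"},
]

def call_model(messages):
    # A real implementation would send `messages` to a chat-completion
    # API; here we return a canned reply to show the shape of the exchange.
    return {"role": "assistant", "content": "It is the model's input limit."}

reply = call_model(messages)
messages.append(reply)  # the conversation grows turn by turn
```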
Chunking
Chunking is the process of splitting documents into smaller, meaningful segments for embedding and retrieval in AI systems, with the chunking strategy significantly affecting search quality and response accuracy.
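A common baseline is fixed-size chunking with overlap, sketched below; production systems often split on sentence or heading boundaries instead, and the sizes here are illustrative.

```python
# Fixed-size chunking with overlap: adjacent chunks share `overlap`
# characters so that context straddling a boundary is not lost.
def chunk_text(text, chunk_size=50, overlap=10):
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks

doc = "x" * 120
chunks = chunk_text(doc, chunk_size=50, overlap=10)
# A 120-character document with a 40-character step yields three chunks,
# starting at positions 0, 40, and 80.
```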
Citizen AI
Citizen AI refers to the practice of empowering non-technical business users to build and deploy AI solutions using low-code/no-code tools, under appropriate governance and support structures.
Cloud AI
Cloud AI refers to artificial intelligence services and infrastructure delivered through cloud computing platforms, enabling businesses to access AI capabilities without managing their own hardware or model training.
Code Generation
Code generation is the use of AI models to automatically write, complete, debug, and refactor computer code, significantly accelerating software development workflows.
Computer Vision
Computer vision is a field of AI that enables machines to interpret and understand visual information from images and video, powering applications from quality inspection and medical imaging to autonomous vehicles and document processing.
Constitutional AI
Constitutional AI (CAI) is an alignment approach developed by Anthropic where AI models are trained to follow a set of explicit principles (a 'constitution'), enabling the model to self-critique and revise its outputs for safety and helpfulness.
Containerised AI
Containerised AI packages AI models, their dependencies, and serving infrastructure into portable containers that run consistently across any environment, simplifying deployment and scaling.
Context Window
A context window is the maximum amount of text (measured in tokens) that a language model can process in a single interaction, encompassing both the input prompt and the generated response.
D
Data Labelling
Data labelling is the process of annotating raw data (text, images, audio, or video) with meaningful tags or categories that AI models use to learn patterns during supervised training.
Data Lakehouse
A data lakehouse is a data architecture that combines the flexibility and cost-effectiveness of data lakes with the reliability and performance of data warehouses, providing a unified platform for analytics and AI workloads.
Data Mesh
Data mesh is a decentralised data architecture that treats data as a product owned by domain teams, with a self-serve data platform and federated governance, enabling scalable data management for AI and analytics.
Data Strategy
A data strategy is the organisational plan for collecting, managing, governing, and leveraging data assets to support business objectives and AI initiatives, providing the essential foundation for effective AI adoption.
Deep Learning
Deep learning is a subset of machine learning that uses neural networks with many layers (hence 'deep') to learn complex patterns from large amounts of data, enabling capabilities such as image recognition, language understanding, and speech processing.
Dense Retrieval
Dense retrieval is an information retrieval approach that uses learned dense vector representations (embeddings) to find semantically relevant documents, as opposed to sparse methods that rely on exact keyword matching.
Diffusion Models
Diffusion models are a class of generative AI that create images, video, and other media by learning to gradually remove noise from random data, producing high-quality outputs through an iterative refinement process.
Direct Preference Optimisation (DPO)
DPO is an AI alignment technique that trains models directly on human preference data without needing a separate reward model, offering a simpler and more stable alternative to RLHF.
Document Parsing
Document parsing is the process of extracting structured text, tables, images, and metadata from documents in various formats (PDF, Word, HTML, scanned images), making content accessible for AI processing.
E
Edge AI
Edge AI runs artificial intelligence models directly on local devices or edge servers rather than in the cloud, enabling real-time processing, data privacy, and operation without internet connectivity.
Embeddings
Embeddings are numerical representations of data (text, images, or other content) in a high-dimensional vector space, where similar items are positioned closer together, enabling machines to understand meaning and similarity.
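Similarity between embeddings is usually measured with cosine similarity. The three-dimensional vectors below are invented for illustration; real embeddings have hundreds or thousands of dimensions.

```python
import math

# Cosine similarity between two embedding vectors: values near 1 mean the
# vectors point the same way, i.e. the items are semantically similar.
def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 'embeddings': related concepts sit close together in the space.
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
invoice = [0.0, 0.1, 0.9]

sim_close = cosine_similarity(cat, kitten)
sim_far = cosine_similarity(cat, invoice)
# sim_close is much higher than sim_far, reflecting semantic similarity.
```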
Entity Extraction
Entity extraction is the process of automatically identifying and classifying key pieces of information (such as names, dates, amounts, and locations) from unstructured text.
Explainable AI (XAI)
Explainable AI (XAI) encompasses techniques and methods that make AI system outputs understandable to humans, providing insight into why a model made a particular prediction or decision.
F
Feature Store
A feature store is a centralised platform for storing, managing, and serving machine learning features — the processed data inputs that models use for predictions — ensuring consistency between training and production.
Few-Shot Learning
Few-shot learning is a technique where an AI model is given a small number of examples (typically 2-10) within the prompt to demonstrate the desired task, significantly improving accuracy and consistency without requiring full fine-tuning.
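Few-shot prompts are typically assembled by prefixing worked examples to the new input. The sentiment examples below are toy placeholders; a real prompt would use domain-specific cases.

```python
# Build a few-shot prompt: a handful of labelled examples precede the new
# input so the model can infer the task and output format.
examples = [
    ("The delivery was fast and the staff were helpful.", "positive"),
    ("The product broke after two days.", "negative"),
]

def build_few_shot_prompt(examples, new_input):
    lines = ["Classify the sentiment of each review."]
    for text, label in examples:
        lines.append(f"Review: {text}\nSentiment: {label}")
    # The final entry leaves the label blank for the model to complete.
    lines.append(f"Review: {new_input}\nSentiment:")
    return "\n\n".join(lines)

prompt = build_few_shot_prompt(examples, "Great value for money.")
```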
Fine-Tuning
Fine-tuning is the process of further training a pre-trained AI model on a smaller, task-specific dataset to adapt its behaviour, style, or knowledge for a particular use case.
Foundation Model
A foundation model is a large AI model trained on broad data that can be adapted to a wide range of downstream tasks through fine-tuning or prompting, serving as the base layer for many AI applications.
Function Calling
Function calling is a capability that allows AI models to generate structured requests to external tools, APIs, and services, enabling them to take actions and retrieve real-time information beyond their training data.
G
GPU Computing
GPU computing uses graphics processing units — originally designed for rendering images — to accelerate AI workloads, providing the massive parallel processing power needed for training and running AI models.
Grounding
Grounding is the practice of anchoring AI model outputs to verified, authoritative data sources, ensuring responses are factually accurate and traceable rather than generated from the model's training data alone.
Guardrails
Guardrails are safety mechanisms and constraints applied to AI systems to prevent harmful, inaccurate, or off-topic outputs, ensuring models behave reliably and within defined boundaries.
H
Hallucination
AI hallucination occurs when a language model generates plausible-sounding but factually incorrect, fabricated, or unsupported information, presenting it with the same confidence as accurate responses.
Human-in-the-Loop (HITL)
Human-in-the-loop is a design pattern where human judgment is integrated into AI-powered workflows at critical decision points, combining AI speed and scale with human expertise and accountability.
Hybrid Search
Hybrid search combines traditional keyword-based search with vector-based semantic search, leveraging the strengths of both approaches to deliver more accurate and comprehensive results.
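One widely used way to merge the two result lists is reciprocal rank fusion (RRF), sketched below with invented document IDs: each document earns a score of 1/(k + rank) in each list, and the scores are summed.

```python
# Reciprocal rank fusion: merge a keyword result list and a vector result
# list by summing per-list rank-based scores. k=60 is a conventional
# damping constant.
def reciprocal_rank_fusion(keyword_results, vector_results, k=60):
    scores = {}
    for results in (keyword_results, vector_results):
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

keyword_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_b", "doc_d", "doc_a"]
fused = reciprocal_rank_fusion(keyword_hits, vector_hits)
# doc_b and doc_a appear in both lists, so they rise to the top.
```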
I
Inference
Inference is the process of using a trained AI model to generate predictions or outputs from new input data — it is the production phase where the model does its actual work for end users.
Inference Optimisation
Inference optimisation encompasses techniques that make AI model predictions faster, more memory-efficient, and less expensive to compute, enabling cost-effective production deployment.
Information Retrieval
Information retrieval (IR) is the field of study and practice concerned with searching for and finding relevant documents, data, or information from large collections based on user queries.
Instruction Tuning
Instruction tuning is a fine-tuning process where a pre-trained language model is trained on instruction-response pairs, teaching it to follow human directions and produce helpful, structured outputs.
K
Knowledge Distillation
Knowledge distillation is a technique that transfers the knowledge of a large, powerful AI model (the teacher) to a smaller, faster model (the student), enabling efficient deployment without a proportional loss in quality.
Knowledge Graph
A knowledge graph is a structured representation of information that organises entities (people, places, concepts) and the relationships between them, enabling AI systems to reason about complex, interconnected data.
L
Large Language Model (LLM)
A large language model is a type of AI trained on vast amounts of text data that can understand, generate, and reason about human language with remarkable fluency and versatility.
LLMOps
LLMOps is the set of practices, tools, and processes for managing large language model applications in production, covering prompt management, evaluation, monitoring, cost control, and continuous improvement.
Load Balancing
Load balancing distributes incoming AI requests across multiple model instances or servers to optimise resource utilisation, minimise latency, and ensure high availability.
LoRA (Low-Rank Adaptation)
LoRA is a parameter-efficient fine-tuning technique that trains a small set of adapter weights instead of modifying all model parameters, making it possible to customise large AI models quickly and affordably.
M
Machine Learning
Machine learning is a branch of artificial intelligence in which algorithms learn patterns from data and improve their performance over time without being explicitly programmed for each specific task.
Memory (AI)
Memory in AI refers to mechanisms that allow agents and models to retain and recall information across interactions, enabling personalisation, context awareness, and learning from past experiences.
Metadata Filtering
Metadata filtering is a retrieval technique that narrows search results by applying structured attribute filters (such as date, category, or source) alongside semantic search, improving precision and relevance.
Mixture of Experts (MoE)
Mixture of Experts is a neural network architecture that divides the model into specialised sub-networks (experts) and uses a routing mechanism to activate only the most relevant experts for each input, achieving high capability with lower computational cost.
ML Pipeline
An ML pipeline is an automated workflow that orchestrates the steps of machine learning — from data ingestion and processing through model training, evaluation, and deployment — ensuring reproducibility and operational reliability.
MLOps
MLOps (Machine Learning Operations) is a set of practices that combines machine learning, DevOps, and data engineering to reliably deploy, monitor, and maintain AI models in production environments.
Model Card
A model card is a standardised document that describes an AI model's intended uses, performance characteristics, limitations, ethical considerations, and training data, providing transparency for stakeholders.
Model Registry
A model registry is a centralised repository that stores, versions, and tracks AI models along with their metadata, enabling teams to manage the full lifecycle of models from development to production.
Model Serving
Model serving is the process of deploying trained AI models to production infrastructure where they can receive requests and return predictions in real time, handling concerns like scaling, latency, and reliability.
Multi-Agent Systems
Multi-agent systems are architectures where multiple specialised AI agents collaborate, communicate, and coordinate to solve complex tasks that would be difficult or impossible for a single agent to handle alone.
Multi-modal AI
Multi-modal AI refers to artificial intelligence systems that can process, understand, and generate multiple types of data — such as text, images, audio, and video — within a single model.
N
Named Entity Recognition (NER)
Named Entity Recognition (NER) is an NLP technique that automatically identifies and classifies named entities in text — such as people, organisations, locations, dates, and quantities — into predefined categories.
Natural Language Generation (NLG)
Natural Language Generation (NLG) is the subfield of NLP concerned with producing fluent, human-readable text from structured data or other inputs, enabling AI systems to write reports, summaries, responses, and creative content.
Natural Language Processing (NLP)
Natural language processing is a branch of AI that enables computers to understand, interpret, and generate human language, powering applications from chatbots and search engines to document analysis and translation.
Natural Language Understanding (NLU)
Natural Language Understanding (NLU) is a subfield of NLP focused on enabling machines to comprehend the meaning, intent, and context of human language, going beyond surface-level text processing to grasp what users actually mean.
Neural Network
A neural network is a computing system inspired by the human brain, composed of layers of interconnected nodes (neurons) that learn patterns from data, forming the foundation of modern AI including language models, image recognition, and more.
O
On-premise AI
On-premise AI refers to deploying and running AI systems on an organisation's own infrastructure rather than using cloud services, providing maximum control over data, security, and performance.
ONNX Runtime
ONNX Runtime is an open-source inference engine that runs AI models in the ONNX (Open Neural Network Exchange) format, enabling optimised, cross-platform model deployment across CPUs, GPUs, and specialised hardware.
Ontology
An ontology is a formal representation of knowledge within a domain, defining concepts, their properties, and the relationships between them, providing structured context for AI systems and data integration.
Orchestration (AI)
Orchestration in AI is the coordination of multiple components — language models, tools, data sources, and processing steps — into cohesive workflows that accomplish complex tasks.
P
Planning (AI)
Planning in AI refers to an agent's ability to break down a high-level goal into a sequence of actionable steps, determine the right tools and information needed, and adapt the plan based on intermediate results.
Pre-training
Pre-training is the initial phase of training an AI model on a large, diverse dataset to learn general patterns and knowledge, before it is fine-tuned or adapted for specific tasks.
Prompt Chaining
Prompt chaining is a technique where the output of one AI prompt becomes the input of the next, breaking complex tasks into manageable sequential steps for more reliable and controllable results.
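The pattern is simply piping one step's output into the next step's template. In the sketch below, `fake_llm` is a hypothetical stand-in that follows two instructions literally, so the chain's behaviour is visible without a real model.

```python
# A minimal prompt chain: each step's output becomes the next step's input.
def fake_llm(prompt):
    # Stand-in for a model call; pretends to follow instructions literally.
    if prompt.startswith("Summarise:"):
        return "summary of " + prompt[len("Summarise:"):].strip()
    if prompt.startswith("Translate:"):
        return "translated " + prompt[len("Translate:"):].strip()
    return prompt

def run_chain(text, steps):
    output = text
    for template in steps:
        output = fake_llm(template.format(input=output))
    return output

result = run_chain("quarterly report",
                   ["Summarise: {input}", "Translate: {input}"])
# Step 1 produces a summary; step 2 translates that summary.
```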
Prompt Engineering
Prompt engineering is the practice of designing and refining the instructions given to AI models to elicit accurate, relevant, and useful responses for specific tasks.
Prompt Management
Prompt management is the practice of systematically versioning, testing, deploying, and monitoring the prompts used in AI applications, treating prompts as critical production artefacts.
R
Rate Limiting
Rate limiting controls the number of requests that clients can make to an AI service within a given time period, preventing abuse, managing costs, and ensuring fair access for all users.
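A classic implementation is the token bucket, sketched below with illustrative capacity and refill values: each request spends one token, and tokens refill at a fixed rate up to the bucket's capacity.

```python
import time

# Token-bucket rate limiter: allow a request only if a token is available.
class TokenBucket:
    def __init__(self, capacity, refill_per_second):
        self.capacity = capacity
        self.tokens = float(capacity)
        self.refill_per_second = refill_per_second
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill tokens for the time elapsed since the last check.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_per_second)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # throttled: the client should retry later

bucket = TokenBucket(capacity=3, refill_per_second=1)
decisions = [bucket.allow() for _ in range(5)]
# With a full bucket of 3, the first three requests pass and the rest
# are throttled until tokens refill.
```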
Re-ranking
Re-ranking is a retrieval technique that uses a more powerful model to reorder an initial set of search results by relevance, significantly improving the quality of the final results presented to the user or AI model.
Red Teaming (AI)
Red teaming in AI is the practice of systematically probing AI systems for vulnerabilities, failure modes, and harmful outputs by simulating adversarial or edge-case scenarios.
Reflection (AI)
Reflection in AI is a pattern where a model evaluates its own output, identifies errors or improvements, and revises its response, leading to higher-quality results through iterative self-critique.
Reinforcement Learning
Reinforcement learning (RL) is a machine learning paradigm where an agent learns optimal behaviour through trial and error, receiving rewards or penalties for its actions and improving its strategy over time.
Reinforcement Learning from Human Feedback (RLHF)
RLHF is a training technique that uses human judgments to teach AI models which outputs are preferred, aligning model behaviour with human values and expectations for helpfulness, safety, and accuracy.
Responsible AI
Responsible AI is the practice of developing, deploying, and governing AI systems in ways that are ethical, fair, transparent, safe, and accountable, considering the impact on individuals and society.
Retrieval-Augmented Generation (RAG)
RAG is a technique that enhances large language model responses by retrieving relevant information from external knowledge sources before generating an answer, reducing hallucinations and keeping outputs grounded in factual data.
S
Semantic Search
Semantic search is an AI-powered search technique that understands the meaning and intent behind queries rather than just matching keywords, delivering more relevant and accurate results.
Shadow AI
Shadow AI refers to the use of AI tools and services by employees without official organisational approval, oversight, or governance, creating risks around data security, compliance, and quality control.
Speech-to-Text (STT)
Speech-to-text is an AI technology that automatically transcribes spoken audio into written text, enabling applications like meeting transcription, voice commands, and call centre analytics.
Streaming
Streaming is a technique where AI model responses are delivered incrementally, token by token, as they are generated, rather than waiting for the complete response before displaying it.
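In code, a streamed response is naturally modelled as a generator of tokens that the caller consumes as they arrive. `generate_tokens` below is a hypothetical stand-in for a streaming API; the token strings are invented.

```python
# Streaming sketch: the response arrives token by token, so the UI can
# display text as it is produced instead of waiting for the full reply.
def generate_tokens(prompt):
    for token in ["Stream", "ing ", "feels ", "faster."]:
        yield token

received = []
for token in generate_tokens("Explain streaming"):
    received.append(token)   # in a real app: append to the UI immediately

full_response = "".join(received)
```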
Structured Output
Structured output is a capability that constrains AI model responses to conform to a specified data schema, ensuring outputs are machine-readable and compatible with downstream systems.
Supervised Learning
Supervised learning is a machine learning approach where models are trained on labelled data — input-output pairs — so they can learn to predict the correct output for new, unseen inputs.
System Prompt
A system prompt is a set of instructions given to an AI model that defines its behaviour, personality, constraints, and response format, shaping how it interacts with users throughout a conversation.
T
Taxonomy
A taxonomy is a hierarchical classification system that organises concepts, content, or data into categories and subcategories, providing structure for navigation, search, and AI-powered classification.
Temperature
Temperature is a parameter that controls the randomness of an AI model's outputs — lower values produce more focused, deterministic responses while higher values increase creativity and variability.
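Mechanically, temperature divides the model's logits before the softmax, sharpening or flattening the resulting probability distribution. The logits below are illustrative.

```python
import math

# Temperature-scaled softmax: low temperature concentrates probability on
# the top token; high temperature spreads it out.
def softmax_with_temperature(logits, temperature):
    scaled = [l / temperature for l in logits]
    m = max(scaled)                        # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
cold = softmax_with_temperature(logits, 0.2)   # near-deterministic
hot = softmax_with_temperature(logits, 2.0)    # more uniform
# cold[0] is close to 1, while hot[0] stays well below it.
```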
TensorRT
TensorRT is NVIDIA's high-performance deep learning inference optimiser and runtime that maximises AI model speed on NVIDIA GPUs through precision calibration, layer fusion, and kernel auto-tuning.
Text-to-Speech (TTS)
Text-to-speech is an AI technology that converts written text into spoken audio, producing natural-sounding voice output for applications like virtual assistants, accessibility tools, and content narration.
Tokenisation
Tokenisation is the process of breaking text into smaller units called tokens (words, sub-words, or characters) that AI models can process, forming the fundamental input and output unit for language models.
Tool Calling
Tool calling is the mechanism by which language models generate structured requests to invoke external functions, APIs, or services, enabling them to take actions and access real-time information.
Tool Use
Tool use is the ability of language models to invoke external functions, APIs, or services during a conversation, extending their capabilities beyond text generation to interact with real-world systems.
Top-k Sampling
Top-k sampling is a text generation strategy that restricts the language model's next-token selection to the k most probable tokens, balancing creativity and coherence by filtering out unlikely choices.
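The mechanism is a truncate-and-renormalise step before sampling, sketched below with an invented token distribution.

```python
import random

# Top-k sampling: keep only the k most probable tokens, renormalise their
# probabilities, and sample from the truncated distribution.
def top_k_sample(token_probs, k, rng=random):
    top = sorted(token_probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(p for _, p in top)
    tokens = [t for t, _ in top]
    weights = [p / total for _, p in top]
    return rng.choices(tokens, weights=weights, k=1)[0]

probs = {"the": 0.5, "a": 0.3, "banana": 0.15, "zzz": 0.05}
choice = top_k_sample(probs, k=2)
# With k=2 only "the" and "a" can ever be sampled; the unlikely tokens
# are filtered out entirely.
```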
Total Cost of Ownership (AI)
Total cost of ownership for AI encompasses all direct and indirect costs of building, deploying, and maintaining an AI system over its lifecycle, including often-underestimated costs like data, talent, and ongoing operations.
Transfer Learning
Transfer learning is a machine learning technique where knowledge gained from training on one task is applied to a different but related task, dramatically reducing the data and compute needed to build effective AI models.
Transformer
The transformer is a neural network architecture based on self-attention mechanisms that has become the foundation for virtually all modern large language models, enabling them to process and generate text with remarkable capability.
Transformer Architecture
The transformer architecture is a neural network design based on self-attention mechanisms that processes input data in parallel, enabling the training of large, powerful models for language, vision, and other tasks.
V
Vector Database
A vector database is a specialised storage system designed to efficiently store, index, and search high-dimensional vectors (embeddings), enabling fast similarity-based retrieval for AI applications.
Vector Search
Vector search is a technique that finds similar items by comparing their mathematical representations (vectors) in high-dimensional space, enabling search by meaning rather than exact keyword matching.
Vision-Language Model (VLM)
A vision-language model is an AI system that can understand and reason about both images and text simultaneously, enabling tasks like image captioning, visual question-answering, and document analysis.