AI Hardware Calculator
Find out exactly what hardware you need to run AI models locally. Adjust the inputs below to match your use case.
Your Configuration
- Model Size
- Quantisation Level (lower-bit quantisation uses less VRAM, at a small cost in output quality)
- Priority
Recommended Build
- Required VRAM: 5.0 GB
- Recommended GPU: RTX 4060 (8 GB)
- System RAM: 32 GB
- CPU: AMD Ryzen 5 / Intel i5
- Storage: 1 TB NVMe SSD
- PSU: 550W
- Estimated Build Cost: £500 – £800
- Build Tier: Entry
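How does a 5.0 GB requirement turn into an 8 GB GPU recommendation? The sketch below shows one simple way to do that kind of lookup. The thresholds, GPU names, and tier labels are illustrative assumptions, not the calculator's actual rules.

```python
# Illustrative sketch: map a required-VRAM figure to the smallest GPU that covers it.
# The thresholds, GPU names, and tier labels are assumptions for demonstration only.

GPU_TIERS = [
    (8, "RTX 4060 (8 GB)", "Entry"),
    (16, "RTX 4080 (16 GB)", "Mid-range"),
    (24, "RTX 4090 (24 GB)", "High-end"),
    (48, "RTX 6000 Ada (48 GB)", "Workstation"),
]

def recommend_gpu(required_vram_gb: float):
    """Return the first GPU tier whose VRAM covers the requirement."""
    for vram_gb, gpu, tier in GPU_TIERS:
        if required_vram_gb <= vram_gb:
            return gpu, tier
    return "Multi-GPU / server hardware", "Enterprise"

print(recommend_gpu(5.0))  # -> ('RTX 4060 (8 GB)', 'Entry')
```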
What You Can Run
- 7B chatbot (Llama 3.2, Mistral 7B)
- Code completion assistant
- Document summariser
- Local embeddings for RAG
Reference
VRAM Requirements by Model & Quantisation
| Model | FP16 | Q8 | Q5 | Q4 | Q3 |
|---|---|---|---|---|---|
| 7B | 14.0 GB | 7.0 GB | 4.4 GB | 3.5 GB | 2.6 GB |
| 13B | 26.0 GB | 13.0 GB | 8.1 GB | 6.5 GB | 4.9 GB |
| 30B | 60.0 GB | 30.0 GB | 18.8 GB | 15.0 GB | 11.3 GB |
| 70B | 140.0 GB | 70.0 GB | 43.8 GB | 35.0 GB | 26.3 GB |
| 120B+ | 240.0 GB | 120.0 GB | 75.0 GB | 60.0 GB | 45.0 GB |
Base VRAM only. Add KV cache overhead for long contexts and multiple concurrent users.
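Each cell in the table is simply parameter count (in billions) multiplied by bits per weight, divided by 8 to convert to bytes, read as GB. A small sketch that reproduces the figures:

```python
# Each table cell is parameters (in billions) x bits per weight / 8,
# i.e. billions of bytes, read as GB.
BITS_PER_WEIGHT = {"FP16": 16, "Q8": 8, "Q5": 5, "Q4": 4, "Q3": 3}

def weight_vram_gb(params_billions: float, quant: str) -> float:
    """Base VRAM needed for the model weights alone, in GB."""
    return params_billions * BITS_PER_WEIGHT[quant] / 8

print(weight_vram_gb(7, "Q4"))   # 3.5, as in the table
print(weight_vram_gb(70, "Q5"))  # 43.75, which the table shows as 43.8

# KV cache is extra and grows with context length and the number of
# concurrent users, so leave headroom beyond the base figure above.
```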
Software Stack
Recommended Software
- Ollama: The easiest way to download and run models. One-command setup with a built-in API (see the Python sketch after this list).
- llama.cpp: Maximum-performance inference engine; powers most local AI tools under the hood.
- vLLM: Production serving with batching and an OpenAI-compatible API. Best for multi-user setups.
- Open WebUI: ChatGPT-like interface for local models. Works with Ollama out of the box.
- LM Studio: GUI model manager for Mac and Windows. Great for non-technical users.
- text-generation-webui: Feature-rich web interface with model loading, fine-tuning, and extensions.
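To illustrate Ollama's built-in API, here is a minimal sketch of querying a local Ollama server from Python. It assumes Ollama is installed and running on its default port (11434) and that a model such as llama3.2 has already been pulled with `ollama pull llama3.2`.

```python
# Minimal sketch: query a locally running Ollama server over its HTTP API.
# Assumes Ollama is running on its default port and the "llama3.2" model
# has already been pulled (e.g. `ollama pull llama3.2`).
import json
import urllib.request

payload = {
    "model": "llama3.2",
    "prompt": "Summarise the trade-offs of Q4 quantisation in two sentences.",
    "stream": False,  # return a single JSON object instead of a stream
}

req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req) as resp:
    body = json.load(resp)

print(body["response"])
```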
Need Help With Your Local AI Setup?
We help UK businesses deploy AI on their own infrastructure — from hardware specification to production deployment.