AI Solutions Architect

I design, build, and ship production AI systems on contract.

Independent AI Solutions Architect, available for solutions development and hands-on consultancy. I take agentic AI and machine learning from first prototype to a system your team can trust in production.

See the work · Book an intro call

Oxford AI Programme · Stanford ML · 7+ yrs · 11 direct reports

HyperAgent · recursive self-improvement · gen 01

score trajectory: 6.80

eval criteria: 7 dims

reasoning · tool use · routing · evaluator · architecture · impact · cleanup

A system that improves the system that builds the agent. When scores plateau, it expands its own rubric, and the drop from 9.3 to 6.34 exposes the reward-hacking the old metric was hiding.
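The plateau-then-expand loop described above can be pictured with a minimal sketch. This is not HyperAgent's actual implementation: the function names, the window size, the tolerance, and every number in the usage lines are hypothetical and chosen only for illustration.

```python
from statistics import mean


def is_plateau(scores, window=3, tolerance=0.05):
    """Return True when the last `window` scores barely moved.

    `window` and `tolerance` are illustrative defaults, not values from the source.
    """
    if len(scores) < window:
        return False
    recent = scores[-window:]
    return max(recent) - min(recent) <= tolerance


def expand_rubric(rubric, candidates):
    """Return a new rubric with the first candidate dimension not already present."""
    for dim in candidates:
        if dim not in rubric:
            return rubric + [dim]
    return rubric


def rescore(per_dim_scores, rubric):
    """Average over every rubric dimension; a dimension with no score counts as 0."""
    return mean(per_dim_scores.get(dim, 0.0) for dim in rubric)


# Made-up numbers: the agent looks strong on the old rubric but weak on "cleanup".
rubric = ["reasoning", "tool use"]
history = [9.28, 9.3, 9.3]
agent_scores = {"reasoning": 9.4, "tool use": 9.2, "cleanup": 3.0}

before = rescore(agent_scores, rubric)
if is_plateau(history):
    rubric = expand_rubric(rubric, ["cleanup"])
after = rescore(agent_scores, rubric)
print(f"before expansion: {before:.2f}, after expansion: {after:.2f}")
```

The point of the sketch is the shape of the effect: once a new dimension enters the rubric, a score that had flattened out can fall sharply, which is the signal that the old metric was being gamed.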

7+ years
11 direct reports led
3,700+ campaigns delivered
2,000+ hrs/yr automated
R² 0.98 · ML in production
Oxford AI Programme

Selected work

Four systems, one visual language

You just met the first, up in the hero. Here are three more, drawn in the same visual language. Scroll any of them to watch real data trace its true path through the architecture, with every stage showing what it does and what it delivers.

Optiml

Production-grade · Multi-agent execution layer · institutional B2B

A production-grade multi-agent layer for institutional sales, where every claim is citation-checked and every send is a human decision.

Discovered

Nine per-source scrapers (Companies House, GLEIF, FCA, ESMA, GRESB and more) build a deterministic substrate of candidate accounts.

→ 113 qualifying accounts, carefully entity-reconciled to avoid mis-links.

Throughput

1 operator runs the pipeline

Lifecycle

11 states · 2 human gates

Discipline

citation-checked · 189 tests

Python · SQLAlchemy · FastAPI · Next.js · Claude Opus · Postmark · Tavily / Exa

GeoForge

Proof of concept · Geospatial risk + rebuild-cost ML · insurance

From address to UPRN to climate risk to insurance rebuild cost in about 3 seconds, with a model deliberately biased to over-estimate rather than under-estimate.
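One common way to bias a regressor toward over-estimation is quantile (pinball) loss with a target quantile above 0.5. The page lists quantile regression among its ML tools but does not say this is how GeoForge does it, so treat the following as a sketch under that assumption. The quantile of 0.9 and the rebuild costs are hypothetical.

```python
def pinball_loss(y_true, y_pred, quantile=0.9):
    """Mean pinball (quantile) loss.

    With quantile > 0.5, predicting below the true value costs more than
    predicting above it by the same amount.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("need at least one observation")
    total = 0.0
    for actual, predicted in zip(y_true, y_pred):
        error = actual - predicted  # positive means the model under-estimated
        if error >= 0:
            total += quantile * error
        else:
            total += (quantile - 1.0) * error
    return total / len(y_true)


# Hypothetical rebuild cost in GBP, missed by 10,000 in each direction.
actual = [250_000.0]
under = pinball_loss(actual, [240_000.0])
over = pinball_loss(actual, [260_000.0])
print(f"under-estimate penalty: {under:.0f}, over-estimate penalty: {over:.0f}")
```

At a quantile of 0.9 the under-estimate is penalised roughly nine times as heavily as the over-estimate, so a model trained on this loss learns to err high. CatBoost, which appears in the stack, accepts a quantile objective written as `loss_function="Quantile:alpha=0.9"`.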

Address

A messy UK address or postcode, possibly with swapped columns or typos.

→ Any free-text UK location is accepted.

Flood risk · sample

0/100 · High

Accuracy

R² ≈0.985 · ~3.5% MAPE (5-fold CV)

Underwriting

~98% within BCIS band · biased to over-estimate

Speed

~3s / property · 12 open sources

Python · FastAPI · CatBoost · React · Leaflet / Three.js · 41M UPRN index

AI Visibility

Production · Generative-engine optimisation · analytics

Measure how often your brand surfaces in AI answers, across ChatGPT, Gemini, Perplexity and Google, with statistical rigour.

Prompts

A taxonomy of buyer prompts per topic and market is run against each engine.

→ Hundreds of real-world questions, not vanity queries.

Fortress map · platform × topic (platforms: ChatGPT, Gemini, Perplexity, Google; legend: Dominance, Parity, Blindspot)

Coverage

ChatGPT · Gemini · Perplexity · Google

Rigour

Wilson-score CIs · 20+ metrics

Output

Fortress Map · win/loss · SOV
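The Wilson-score intervals listed under Rigour can be computed with the standard formula for a binomial proportion. The sketch below is a generic implementation, not the platform's own code; the function name and the sample counts (12 brand mentions in 40 answers) are hypothetical.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion.

    z=1.96 corresponds to roughly 95% confidence.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    p_hat = successes / trials
    z2 = z * z
    denom = 1 + z2 / trials
    centre = (p_hat + z2 / (2 * trials)) / denom
    half_width = (z / denom) * math.sqrt(
        p_hat * (1 - p_hat) / trials + z2 / (4 * trials * trials)
    )
    # Clamp to [0, 1] to absorb floating-point overshoot at the extremes.
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Hypothetical sample: the brand appeared in 12 of 40 answers from one engine.
low, high = wilson_interval(12, 40)
print(f"mention rate 30%, interval {low:.1%} to {high:.1%}")
```

Unlike the simple normal approximation, the Wilson interval stays inside 0 to 1 and behaves sensibly for small samples and for rates near 0% or 100%, which matters when a brand is rarely or almost always mentioned.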

React + TypeScript · FastAPI · Pandas / SciPy · Plotly · GPT-4o (sentiment)

Approach

The throughline across everything I build

The same shape recurs in every project: an autonomous system, an evaluator that judges it, guardrails that keep it honest, and a human at the gate. That's what makes AI safe to put in front of a client.

Measurable by design

Every system scores its own output, using LLM-as-judge checks, reusable test registries, and regression tests. If it can't be measured, it doesn't ship.

Guardrails that refuse

Loop and error detection, safe-action checks, and hard cost ceilings. When something looks wrong, the system stops loudly instead of failing quietly.
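A minimal sketch of two of these guardrails, a hard cost ceiling and a simple loop detector, might look like the following. The class names, the USD unit, and the rule that a loop means the same action repeated consecutively are assumptions made for illustration, not details from the systems described here.

```python
class GuardrailTripped(RuntimeError):
    """Raised when a guardrail stops the run."""


class RunGuard:
    """Tracks spend and repeated actions for one agent run."""

    def __init__(self, cost_ceiling_usd, max_repeats=3):
        self.cost_ceiling_usd = cost_ceiling_usd
        self.max_repeats = max_repeats
        self.spent_usd = 0.0
        self._last_action = None
        self._repeat_count = 0

    def check(self, action, cost_usd):
        """Record one step; raise if it breaches the ceiling or looks like a loop."""
        if self.spent_usd + cost_usd > self.cost_ceiling_usd:
            raise GuardrailTripped(
                f"cost ceiling of {self.cost_ceiling_usd:.2f} USD would be exceeded"
            )
        if action == self._last_action:
            self._repeat_count += 1
        else:
            self._last_action = action
            self._repeat_count = 1
        if self._repeat_count > self.max_repeats:
            raise GuardrailTripped(
                f"action {action!r} repeated {self._repeat_count} times in a row"
            )
        # Only charge the step once it has passed every check.
        self.spent_usd += cost_usd


# Hypothetical run: the third identical search in a row trips the loop detector.
guard = RunGuard(cost_ceiling_usd=1.00, max_repeats=2)
try:
    for _ in range(3):
        guard.check("search", cost_usd=0.10)
except GuardrailTripped as exc:
    print(f"stopped: {exc}")
```

Raising an exception, rather than logging and continuing, is what makes the failure loud: the run halts at the step that tripped the guard and the caller has to decide what happens next.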

Humans at the gates

Autonomy where it's safe, human approval where it counts, backed by a full audit trail so you can replay every decision.

Built to run in production

Prompt caching, model selection, observability, Docker, CI. Prototyped fast, then hardened for reliability and cost.

Agentic AI & LLMs

Multi-agent orchestration · LLM-as-judge · RAG · tool use · guardrails · Claude SDK

Machine Learning

CatBoost · LightGBM · XGBoost · scikit-learn · quantile regression · NLP

Infra & Data

FastAPI · Docker · PostgreSQL · Redis · WebSocket · CI/CD · ETL · AWS

Leadership & delivery

Team of 11 · KPI frameworks · data governance · rapid prototyping · PRINCE2 · Agile

About

From Fortune 500 automation to autonomous AI

Seven years turning messy, data-heavy problems into systems that run themselves, and leading the teams that keep them running.

AI Solutions Architect

2025–present

Freelance / Consultancy

Built HyperAgent, a recursive self-improvement orchestrator, and its target agent Claw.
Shipped an AI-visibility platform to production, built the production-grade Optiml outreach engine, and proved out GeoForge for insurance risk.

Data Analyst → Interim Head of Data

2022–2025

Search Intelligence Ltd.

Promoted twice in three years; led 11 analysts supporting 70+ PR executives.
Owned data quality and pipelines across 3,700+ digital PR campaigns; internal tools drove 40% of deliverables.

Data Analyst & Developer

2019–2022

Pilgrim's Pride Ltd. (Fortune 500)

Automated 2,000+ business hours annually across supply chain, logistics and procurement.
Built REST APIs and ETL pipelines connecting siloed systems; QlikSense operational dashboards.

Education & certs

Oxford AI Programme · Saïd Business School

ML Specialization · Stanford (Coursera)

AWS Solutions Architect (in progress, 2026)

PRINCE2 · Google PM · IBM Agile

Book a call

Grab a 30-minute intro call

The quickest way to scope something. We'll talk through what you're building, whether that's solutions development or consultancy, and how I'd approach it. Flexible engagements, day-rate or fixed-scope.

Or just say hello

Tell me what you're building.

Prefer email to a calendar? Drop me a line or connect on LinkedIn. I'm available for contract and consulting work across agentic AI, production ML, and full-stack delivery.

kallum@kallumjames.com · LinkedIn
