AI Solutions Architect

I design, build, and ship
production AI systems
on contract.

Independent AI Solutions Architect, available for solutions development and hands-on consultancy. I take agentic AI and machine learning from first prototype to a system your team can trust in production.

Oxford AI Programme · Stanford ML · 7+ yrs · 11 direct reports
HyperAgent · recursive self-improvement · gen 01
BUILD → EVALUATE → IMPROVE → CRITIQUE → EXPAND · self-improving · Claude × GPT-4o judge
score trajectory: 6.80
eval criteria: 7 dims
reasoning · tool use · routing · evaluator · architecture · impact · cleanup

A system that improves the system that builds the agent. When scores plateau, it expands its own rubric, and the drop from 9.3 to 6.34 exposes the reward-hacking the old metric was hiding.
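The plateau trigger described above can be pictured with a minimal sketch. It assumes one score per generation and treats a plateau as "the recent window failed to beat the earlier best by a small margin". The names `has_plateaued`, `next_rubric`, `window`, and `min_gain` are hypothetical, not HyperAgent's actual interface.

```python
from typing import List, Sequence


def has_plateaued(scores: Sequence[float], window: int = 3, min_gain: float = 0.05) -> bool:
    """Return True when the last `window` scores do not beat the earlier best by `min_gain`.

    Illustrative rule only; the window and threshold are assumptions.
    """
    if len(scores) <= window:
        return False  # not enough history to judge
    earlier_best = max(scores[:-window])
    recent_best = max(scores[-window:])
    return recent_best - earlier_best < min_gain


def next_rubric(rubric: List[str], scores: Sequence[float], candidates: List[str]) -> List[str]:
    """Add one unused candidate criterion when scores plateau; otherwise keep the rubric."""
    if not has_plateaued(scores):
        return rubric
    for criterion in candidates:
        if criterion not in rubric:
            return rubric + [criterion]
    return rubric  # nothing left to add
```

Scores measured under an expanded rubric are not comparable with earlier ones, so a drop after expansion is information about the old metric rather than a regression in the agent.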

7+
years
11
direct reports led
3,700+
campaigns delivered
2,000+
hrs/yr automated
R² 0.98
ML in production
Oxford
AI Programme
Selected work

Four systems, one visual language

You just met the first, up in the hero. Here are three more, drawn in the same visual language. Scroll any of them to watch real data trace its true path through the architecture, with every stage showing what it does and what it delivers.

01

Optiml

Production-grade
Discovered (9 sources) → Classified (ICP score) → Triage (human gate 1) → Signal + Cite (4-check validator) → Drafted (Opus 4.x) → Critic v5 (6-check · can refuse) → Escalated (fail loud) → Approval (human gate 2) → Sent (DKIM · Postmark) → Replied (auto-classify) → Closed (audited)

A production-grade multi-agent layer for institutional sales, where every claim is citation-checked and every send is a human decision.

Discovered

Nine per-source scrapers (Companies House, GLEIF, FCA, ESMA, GRESB and more) build a deterministic substrate of candidate accounts.

113 qualifying accounts, carefully entity-reconciled to avoid mis-links.

Throughput
1 operator runs the pipeline
Lifecycle
11 states · 2 human gates
Discipline
citation-checked · 189 tests
Python · SQLAlchemy · FastAPI · Next.js · Claude Opus · Postmark · Tavily / Exa
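An 11-state lifecycle with two human gates can be sketched as a small state machine. The state names below come from the pipeline stages on this page; the transition edges, `GateError`, and `advance` are assumptions for illustration and not Optiml's actual code.

```python
from enum import Enum
from typing import Optional


class State(Enum):
    DISCOVERED = "discovered"
    CLASSIFIED = "classified"
    TRIAGE = "triage"            # human gate 1
    SIGNAL_CITE = "signal_cite"
    DRAFTED = "drafted"
    CRITIC = "critic"
    ESCALATED = "escalated"
    APPROVAL = "approval"        # human gate 2
    SENT = "sent"
    REPLIED = "replied"
    CLOSED = "closed"


# Assumed edges: only the stage names are taken from the page.
TRANSITIONS = {
    State.DISCOVERED: {State.CLASSIFIED},
    State.CLASSIFIED: {State.TRIAGE},
    State.TRIAGE: {State.SIGNAL_CITE, State.CLOSED},
    State.SIGNAL_CITE: {State.DRAFTED, State.ESCALATED},
    State.DRAFTED: {State.CRITIC},
    State.CRITIC: {State.APPROVAL, State.ESCALATED},
    State.ESCALATED: {State.CLOSED},
    State.APPROVAL: {State.SENT, State.CLOSED},
    State.SENT: {State.REPLIED, State.CLOSED},
    State.REPLIED: {State.CLOSED},
    State.CLOSED: set(),
}

HUMAN_GATES = {State.TRIAGE, State.APPROVAL}


class GateError(Exception):
    """Raised when a transition is illegal or a human gate lacks sign-off."""


def advance(current: State, target: State, approved_by: Optional[str] = None) -> State:
    """Return `target` if the edge exists and any human gate has a named approver."""
    if target not in TRANSITIONS[current]:
        raise GateError(f"illegal transition {current.name} -> {target.name}")
    if current in HUMAN_GATES and not approved_by:
        raise GateError(f"{current.name} requires human approval")
    return target
```

Keeping the gates in the transition function, rather than in the agent prompts, means no model output can move a record past a gate on its own.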
02

GeoForge

Proof of concept
Address (raw input) → Validate (+ AI correct) → UPRN (41M index) → Flood (EA zones) → EPC (fabric) → Elevation (LiDAR 1m) → Subsidence (BGS clay) → Wildfire (DEFRA) → Risk Engine (0–100 composite) → Rebuild ML (CatBoost ×3) → Record (100+ fields)

From address to UPRN to climate risk to insurance rebuild cost in about 3 seconds, with a model deliberately biased to over-estimate rather than under-estimate.

Address

A messy UK address or postcode, possibly with swapped columns or typos.

Any free-text UK location is accepted.

Flood risk · sample
0/100 · High
Accuracy
R² ≈0.985 · ~3.5% MAPE (5-fold CV)
Underwriting
~98% within BCIS band · downside-biased
Speed
~3s / property · 12 open sources
Python · FastAPI · CatBoost · React · Leaflet / Three.js · 41M UPRN index
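The page does not say how the rebuild-cost model is biased toward over-estimation. One common way to get that behaviour is a quantile (pinball) loss with a quantile above 0.5, which penalises under-prediction more than over-prediction. A minimal sketch, where the function name and the 0.75 quantile are assumptions for illustration:

```python
def pinball_loss(y_true: float, y_pred: float, quantile: float = 0.75) -> float:
    """Quantile (pinball) loss for a single prediction.

    With quantile > 0.5, predicting too low costs more than predicting too high,
    so a model trained on this loss leans toward over-estimates.
    """
    if not 0.0 < quantile < 1.0:
        raise ValueError("quantile must be strictly between 0 and 1")
    error = y_true - y_pred
    # Under-prediction (error > 0) costs `quantile` per unit;
    # over-prediction (error < 0) costs `1 - quantile` per unit.
    return max(quantile * error, (quantile - 1.0) * error)
```

At the 0.75 quantile, under-predicting by 10 units costs 7.5 while over-predicting by 10 units costs 2.5, a three-to-one asymmetry.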
03

AI Visibility

Production
Prompts (taxonomy) → ChatGPT (OpenAI) · Gemini (Google) · Perplexity · Google AI Overviews → Extract (de-bloat) → Score (Wilson CI) → Fortress Map (report)

Measure how often your brand surfaces in AI answers, across ChatGPT, Gemini, Perplexity and Google, with statistical rigour.

Prompts

A taxonomy of buyer prompts per topic and market is run against each engine.

Hundreds of real-world questions, not vanity queries.

Fortress map · platform × topic
ChatGPT · Gemini · Pplx · Google
Dominance · Parity · Blindspot
Coverage
ChatGPT · Gemini · Perplexity · Google
Rigour
Wilson-score CIs · 20+ metrics
Output
Fortress Map · win/loss · SOV
React + TypeScript · FastAPI · Pandas / SciPy · Plotly · GPT-4o (sentiment)
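The Wilson score interval mentioned above is a standard way to put error bars on a mention rate, and it behaves better than the naive normal approximation when samples are small or the rate is near 0 or 1. A minimal sketch using only the standard library; the function name and the default z of 1.96 (roughly 95% confidence) are illustrative choices, not the platform's code.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score confidence interval for a binomial proportion."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    p = successes / trials
    z2 = z * z
    denom = 1.0 + z2 / trials
    centre = (p + z2 / (2 * trials)) / denom
    half_width = z * math.sqrt(p * (1 - p) / trials + z2 / (4 * trials * trials)) / denom
    # Clamp to [0, 1] to absorb floating-point overshoot at the extremes.
    return max(0.0, centre - half_width), min(1.0, centre + half_width)
```

For example, a brand that surfaces in 50 of 100 answers gets an interval of roughly 0.40 to 0.60, so two platforms scoring 45% and 55% on that sample size should not be reported as different.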
Approach

The throughline across everything I build

The same shape recurs in every project: an autonomous system, an evaluator that judges it, guardrails that keep it honest, and a human at the gate. That's what makes AI safe to put in front of a client.

01

Measurable by design

Every system scores its own output, using AI-as-judge checks, reusable test registries, and regression tests. If it can't be measured, it doesn't ship.

02

Guardrails that refuse

Loop and error detection, safe-action checks, and hard cost ceilings. When something looks wrong, the system stops loudly instead of failing quietly.
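A minimal sketch of two of these guardrails, a hard cost ceiling and a repeated-action loop detector, assuming each agent step reports its cost and the action it took. `RunGuard`, `GuardrailTripped`, and their parameters are hypothetical names for illustration.

```python
from collections import Counter


class GuardrailTripped(RuntimeError):
    """Raised to stop a run loudly instead of letting it continue."""


class RunGuard:
    def __init__(self, max_cost_usd: float, max_repeats: int = 3) -> None:
        self.max_cost_usd = max_cost_usd
        self.max_repeats = max_repeats
        self.spent_usd = 0.0
        self._seen: Counter = Counter()

    def charge(self, cost_usd: float) -> None:
        """Add a step's cost and stop the run once the ceiling is exceeded."""
        self.spent_usd += cost_usd
        if self.spent_usd > self.max_cost_usd:
            raise GuardrailTripped(
                f"cost ceiling exceeded: {self.spent_usd:.2f} > {self.max_cost_usd:.2f} USD"
            )

    def record_action(self, name: str, args: str) -> None:
        """Count identical actions and stop the run when one repeats too often."""
        key = (name, args)
        self._seen[key] += 1
        if self._seen[key] > self.max_repeats:
            raise GuardrailTripped(f"possible loop: {name}({args}) repeated {self._seen[key]} times")
```

Raising an exception, rather than returning a flag the caller might ignore, is what makes the failure loud.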

03

Humans at the gates

Autonomy where it's safe, human approval where it counts, backed by a full audit trail so you can replay every decision.

04

Built to run in production

Prompt caching, model selection, observability, Docker, CI. Prototyped fast, then hardened for reliability and cost.

Agentic AI & LLMs
Multi-agent orchestration · LLM-as-judge · RAG · tool use · guardrails · Claude SDK
Machine Learning
CatBoost · LightGBM · XGBoost · scikit-learn · quantile regression · NLP
Infra & Data
FastAPI · Docker · PostgreSQL · Redis · WebSocket · CI/CD · ETL · AWS
Leadership & delivery
Team of 11 · KPI frameworks · data governance · rapid prototyping · PRINCE2 · Agile
About

From Fortune 500 automation to autonomous AI

Seven years turning messy, data-heavy problems into systems that run themselves, and leading the teams that keep them running.

AI Solutions Architect

2025–present
Freelance / Consultancy
  • Built HyperAgent, a recursive self-improvement orchestrator, and its target agent Claw.
  • Shipped an AI-visibility platform to production, built the production-grade Optiml outreach engine, and proved out GeoForge for insurance risk.

Data Analyst → Interim Head of Data

2022–2025
Search Intelligence Ltd.
  • Promoted twice in three years; led 11 analysts supporting 70+ PR executives.
  • Owned data quality and pipelines across 3,700+ digital PR campaigns; internal tools drove 40% of deliverables.

Data Analyst & Developer

2019–2022
Pilgrim's Pride Ltd. (Fortune 500)
  • Automated 2,000+ business hours annually across supply chain, logistics and procurement.
  • Built REST APIs and ETL pipelines connecting siloed systems, plus QlikSense operational dashboards.
Education & certs
Oxford AI Programme · Saïd Business School
ML Specialization · Stanford (Coursera)
AWS Solutions Architect (in progress, 2026)
PRINCE2 · Google PM · IBM Agile
Book a call

Grab a 30-minute intro call

The quickest way to scope something. We'll talk through what you're building, whether that's solutions development or consultancy, and how I'd approach it. Flexible engagements, day-rate or fixed-scope.


Or just say hello

Tell me what you're building.

Prefer email to a calendar? Drop me a line or connect on LinkedIn. I'm available for contract and consulting work across agentic AI, production ML, and full-stack delivery.