Reliability layer

I build the layer that decides whether an AI's output is safe to act on.

Not the model but the engineering around it: the checks, the boundaries, the evaluation. The point where “the model said so” stops being good enough.

First Class Mathematics, Bristol · MSc Business Analytics & AI, Imperial

GitHub · Email · LinkedIn

How I work

I ask the same few questions of every system, then build the part that answers them.

What can fail?
How do we know this output is correct?
Where should a human stay in control?

Selected work

Claim Verification Pipeline

7/7 caught

A deterministic verification layer for AI-assisted research. The model proposes; code verifies every URL, quote, and figure against primary sources that the code fetches itself; humans keep the judgment. It attacks itself in CI with spoofed domains and fabricated quotes, and it has found a real exploit that a separate review had passed as safe.

Repo · Live demo
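
A minimal sketch of the kind of deterministic check described here, not the project's actual code. It assumes the source text has already been fetched, and the names `normalize`, `quote_in_source`, and `host_is_trusted` are hypothetical. A quote passes only if it appears verbatim after normalization, and a URL passes only if its parsed hostname matches an allowed domain, which is what stops a spoofed lookalike.

```python
import re
import unicodedata
from urllib.parse import urlsplit


def normalize(text: str) -> str:
    """Fold Unicode forms, curly quotes, whitespace, and case before comparing."""
    text = unicodedata.normalize("NFKC", text)
    text = text.translate(str.maketrans({"“": '"', "”": '"', "‘": "'", "’": "'"}))
    return re.sub(r"\s+", " ", text).strip().casefold()


def quote_in_source(quote: str, source_text: str) -> bool:
    """True only if the normalized quote appears verbatim in the source."""
    needle = normalize(quote)
    return bool(needle) and needle in normalize(source_text)


def host_is_trusted(url: str, allowed_hosts: set[str]) -> bool:
    """Match the parsed hostname exactly or as a subdomain, never by substring."""
    try:
        parts = urlsplit(url)
        host = (parts.hostname or "").lower()
    except ValueError:
        return False  # malformed URL
    if parts.scheme != "https" or not host:
        return False
    return any(host == h or host.endswith("." + h) for h in allowed_hosts)
```

Comparing against the parsed hostname rather than the raw string is the design choice that matters: `https://example.org.evil.test/` contains the trusted name but resolves somewhere else.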

Tutor Analytics

0 unchecked writes

A full-stack platform that scores free-text answers with an LLM behind a strict policy boundary, then runs deterministic analytics with zero further model calls. Nothing the model returns reaches the database unchecked.

Repo
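
One way a policy boundary like this could look, as a sketch under assumptions rather than the platform's real schema. The fields `score` and `feedback`, the 0 to 10 range, and the 500-character cap are all hypothetical. The idea is that raw model text is parsed and validated into a typed value, and only that value is ever eligible to be written.

```python
import json
from dataclasses import dataclass

MAX_FEEDBACK_CHARS = 500  # hypothetical cap, chosen for illustration


class PolicyViolation(ValueError):
    """Raised when model output falls outside the allowed shape."""


@dataclass(frozen=True)
class ScoredAnswer:
    score: int
    feedback: str


def parse_model_output(raw: str, max_score: int = 10) -> ScoredAnswer:
    """Return a validated ScoredAnswer or raise; nothing else may reach storage."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise PolicyViolation("not valid JSON") from exc
    # Exact key match: extra or missing fields are rejected, not ignored.
    if not isinstance(data, dict) or set(data) != {"score", "feedback"}:
        raise PolicyViolation("unexpected fields")
    score, feedback = data["score"], data["feedback"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(score, bool) or not isinstance(score, int):
        raise PolicyViolation("score must be an integer")
    if not 0 <= score <= max_score:
        raise PolicyViolation("score out of range")
    if not isinstance(feedback, str) or len(feedback) > MAX_FEEDBACK_CHARS:
        raise PolicyViolation("feedback must be a short string")
    return ScoredAnswer(score=score, feedback=feedback)
```

A database layer that accepts only `ScoredAnswer` instances then has no path for unvalidated model text.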

Volatility Regime Forecasting

No look-ahead

An ML pipeline that classifies 30-day forward volatility regimes, built to avoid look-ahead bias and interpreted with SHAP rather than left as a black box.

GitHub
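
A small sketch of what avoiding look-ahead can mean in practice, not the pipeline's own code. It assumes daily returns in a plain list and uses population standard deviation as a stand-in for volatility; `forward_vol`, `trailing_vol`, and `purged_split` are hypothetical names. The label for day t uses only days after t, the feature uses only days up to t, and the split drops training rows whose label window would reach into the test period.

```python
from statistics import pstdev
from typing import Optional


def forward_vol(returns: list[float], horizon: int = 30) -> list[Optional[float]]:
    """Label for day t: volatility of returns t+1 .. t+horizon (strictly future)."""
    n = len(returns)
    return [
        pstdev(returns[t + 1 : t + 1 + horizon]) if t + horizon < n else None
        for t in range(n)
    ]


def trailing_vol(returns: list[float], window: int = 30) -> list[Optional[float]]:
    """Feature for day t: volatility of returns t-window+1 .. t (nothing after t)."""
    return [
        pstdev(returns[t - window + 1 : t + 1]) if t + 1 >= window else None
        for t in range(len(returns))
    ]


def purged_split(n: int, train_frac: float = 0.8, horizon: int = 30) -> tuple[range, range]:
    """Chronological split that drops the last `horizon` training rows,
    because their forward-looking labels would overlap the test period."""
    cut = int(n * train_frac)
    return range(0, max(cut - horizon, 0)), range(cut, n)
```

Any regime thresholds (for example volatility terciles) would likewise need to be fitted on the training rows only, so the test period never influences how labels are binned.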

Now

Finishing my MSc, consulting on AI implementation for a major global asset manager, and building something small that people can actually use. Open to applied AI and AI engineering roles on small teams building things people rely on.