Reliability layer
I build the layer that decides whether an AI's output is safe to act on.
Not the model, the engineering around it: the checks, the boundaries, the evaluation. The point where “the model said so” stops being good enough.
How I work
I ask the same few questions of every system, then build the part that answers them.
- What can fail?
- How do we know this output is correct?
- Where should a human stay in control?
Selected work
Claim Verification Pipeline
7/7 caught
A deterministic verification layer for AI-assisted research. The model proposes; code verifies every URL, quote, and figure against primary sources it fetches itself; humans keep the judgment. It attacks itself in CI with spoofed domains and fabricated quotes, and found a real exploit a separate review had passed as safe.
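A minimal sketch of the kind of deterministic check described above, not the pipeline's actual code. The allowlist `ALLOWED_HOSTS`, the helper names, and the `example.org` domain are all hypothetical, and the source text is passed in as a string rather than fetched, to keep the example self-contained.

```python
import re
from urllib.parse import urlsplit

# Hypothetical allowlist of primary-source hosts (an assumption for illustration).
ALLOWED_HOSTS = {"example.org"}

# Map curly quotes to straight ones so typography alone never fails a match.
_QUOTES = str.maketrans({"\u201c": '"', "\u201d": '"', "\u2018": "'", "\u2019": "'"})


def normalize(text: str) -> str:
    """Collapse whitespace, unify quote marks, and ignore case."""
    return re.sub(r"\s+", " ", text.translate(_QUOTES)).strip().casefold()


def host_is_allowed(url: str) -> bool:
    """Accept only https URLs whose parsed hostname is an allowed host or a subdomain of one."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if parts.scheme != "https":
        return False
    return host in ALLOWED_HOSTS or any(host.endswith("." + h) for h in ALLOWED_HOSTS)


def quote_is_supported(quote: str, source_text: str) -> bool:
    """True only if the normalized quote appears verbatim in the normalized source."""
    q = normalize(quote)
    return bool(q) and q in normalize(source_text)
```

Comparing the parsed hostname, rather than searching the URL string, is what rejects look-alikes such as `https://example.org.evil.io/`. An adversarial test in CI could feed in spoofed URLs and fabricated quotes and assert that every one is rejected.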
Tutor Analytics
0 unchecked writes
A full-stack platform that scores free-text answers with an LLM behind a strict policy boundary, then runs deterministic analytics with zero further model calls. Nothing the model returns reaches the database unchecked.
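One way such a policy boundary could look, as a sketch under stated assumptions: the JSON shape (`score`, `rationale`), the 0 to 5 scale, the length cap, and every name below are hypothetical and not taken from the platform itself.

```python
import json
from dataclasses import dataclass
from typing import Optional

ALLOWED_KEYS = {"score", "rationale"}  # hypothetical response schema
MAX_RATIONALE_CHARS = 500              # hypothetical cap


@dataclass(frozen=True)
class Score:
    answer_id: int
    score: int
    rationale: str


def parse_model_score(answer_id: int, raw: str) -> Optional[Score]:
    """Return a validated Score, or None if the model output breaks any rule."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    # Exact key match: no missing fields, no extra fields.
    if not isinstance(data, dict) or set(data) != ALLOWED_KEYS:
        return None
    score, rationale = data["score"], data["rationale"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(score, bool) or not isinstance(score, int) or not 0 <= score <= 5:
        return None
    if not isinstance(rationale, str) or len(rationale) > MAX_RATIONALE_CHARS:
        return None
    return Score(answer_id, score, rationale)
```

The database layer would then accept only `Score` instances, so a rejected response (`None`) has no path to a write.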
Volatility Regime Forecasting
No look-ahead
An ML pipeline that classifies 30-day forward volatility regimes, built to avoid look-ahead bias and read with SHAP rather than left as a black box.
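A rough illustration of what avoiding look-ahead bias can mean in practice, assuming a plain list of daily returns. The function names, the trailing-volatility feature, and the split rule are assumptions for this sketch, not the pipeline's actual design.

```python
import statistics


def trailing_vol(returns, t, lookback):
    """Feature at day t: std dev of returns on days t-lookback+1 .. t (past only)."""
    if t + 1 < lookback:
        return None
    return statistics.pstdev(returns[t + 1 - lookback : t + 1])


def forward_vol(returns, t, horizon):
    """Label at day t: std dev of returns on days t+1 .. t+horizon (future only)."""
    window = returns[t + 1 : t + 1 + horizon]
    if len(window) < horizon:
        return None  # incomplete future window: no label yet
    return statistics.pstdev(window)


def build_rows(returns, lookback, horizon):
    """Pair each day's past-only feature with its future-only label."""
    rows = []
    for t in range(len(returns)):
        x = trailing_vol(returns, t, lookback)
        y = forward_vol(returns, t, horizon)
        if x is not None and y is not None:
            rows.append((t, x, y))
    return rows


def time_split(rows, train_end, horizon):
    """Train only on rows whose label window closes by train_end; test strictly after it."""
    train = [r for r in rows if r[0] + horizon <= train_end]
    test = [r for r in rows if r[0] > train_end]
    return train, test
```

With a 30-day horizon, the split drops the 30 rows just before `train_end`, because their labels would depend on returns from the test period. Any regime thresholds would likewise need to be fitted on the training rows alone.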
Now
Finishing my MSc, consulting on AI implementation for a major global asset manager, and building something small that people can actually use. Open to applied AI and AI engineering roles on small teams building things people rely on.