What we do
We spend our time working with AI systems directly. Building, testing, breaking, understanding. The research comes from the work.
Current focus
Local AI infrastructure. Running frontier models on our own hardware. NVIDIA Blackwell GPUs, DGX systems, multi-GPU configurations. This gives us complete control over what we test and how. A rough sketch of a typical serving setup follows below.
Agentic workflows. Building and testing AI agents for real work. We use tools like Claude Code daily and develop our own automation systems.
Frontier systems. Working with new releases as they ship. Understanding what changes, what improves, what breaks.
Reliability observation. Documenting how AI systems actually behave when relied upon for consequential work. The gap between claimed and observed capability.
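For concreteness, this is roughly what serving a frontier-scale open model across several local GPUs can look like. It is a minimal sketch assuming vLLM as the serving library; the model name, GPU count, and prompt are placeholders, not our exact configuration.

    from vllm import LLM, SamplingParams

    # Placeholders throughout: substitute whatever open model and GPU count apply.
    llm = LLM(
        model="meta-llama/Llama-3.1-70B-Instruct",  # hypothetical example model
        tensor_parallel_size=4,                     # shard the weights across four GPUs
    )

    params = SamplingParams(temperature=0.0, max_tokens=256)
    outputs = llm.generate(["Summarise this incident report in three bullet points: ..."], params)
    print(outputs[0].outputs[0].text)

Running the stack this way means every layer, from weights to sampling parameters, is under our control and can be held fixed between tests.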
What we have learned
Working with AI systems daily surfaces patterns. Some of what we have observed:
Observations
Confidence without accuracy. Systems can sound authoritative while being wrong. That fluency can mask the failures.
Context matters. The same prompt behaves differently across models, configurations, and context lengths. Reproducibility is harder than it looks; a sketch of the kind of check we mean follows after these observations.
Agentic failures cascade. When multi-step workflows break, they rarely break cleanly: a small error early in the chain compounds through every step that follows.
Local deployment reveals more. Running systems yourself, without API smoothing, shows failure modes that cloud access hides.
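To make the reproducibility point above concrete, here is a minimal sketch of the kind of check we mean: one prompt, one local OpenAI-compatible endpoint (for example, one exposed by vLLM), several sampling configurations, and a comparison of the responses. The URL, model name, and settings are illustrative, not a fixed protocol.

    from openai import OpenAI

    # Illustrative endpoint and model id; any local OpenAI-compatible server works.
    client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")
    prompt = "List the explicit assumptions in the argument below.\n\n<argument text here>"

    configs = [
        {"temperature": 0.0, "max_tokens": 256},
        {"temperature": 0.7, "max_tokens": 256},
    ]

    responses = []
    for cfg in configs:
        reply = client.chat.completions.create(
            model="local-model",  # placeholder model id
            messages=[{"role": "user", "content": prompt}],
            **cfg,
        )
        responses.append(reply.choices[0].message.content)

    # Even nominally deterministic settings do not guarantee identical outputs.
    print("identical across configs:", len(set(responses)) == 1)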
Longer-term work
Some projects take time. We are developing a methodology for assessing when reliance on AI systems is reasonable in professional contexts. This work is informed by our daily use of these systems and will be published openly when ready.
Infrastructure
We operate local AI infrastructure for testing that cannot be done through API access alone. This gives us complete control over test conditions, the ability to run identical tests across model versions, and the freedom to publish without commercial constraints.
Our hardware includes NVIDIA Blackwell GPUs, DGX systems, and multi-GPU configurations capable of running current open-source models locally. All tests are documented with exact prompts, outputs, and analysis.
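As an illustration of what documenting a test with exact prompts and outputs can look like, here is a minimal sketch of a per-run record. The field names and output path are hypothetical, not a published schema.

    import json
    import time
    from pathlib import Path

    def log_run(model_id: str, prompt: str, output: str, config: dict,
                log_dir: Path = Path("runs")) -> Path:
        """Write one test run to disk, verbatim, so it can be re-run against later model versions."""
        ts = time.strftime("%Y%m%dT%H%M%SZ", time.gmtime())
        record = {
            "timestamp": ts,
            "model_id": model_id,   # exact model build / weights version under test
            "config": config,       # sampling settings, context length, and so on
            "prompt": prompt,       # verbatim prompt, no post-hoc editing
            "output": output,       # verbatim model response
        }
        log_dir.mkdir(parents=True, exist_ok=True)
        path = log_dir / f"{ts}_{model_id.replace('/', '_')}.json"
        path.write_text(json.dumps(record, indent=2))
        return path

Keeping runs in this form is what makes it possible to replay the same prompt against a new model version and compare the outputs directly.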
Writing
We publish what we learn. Observations, experiments, things that surprised us.
Interested?
We are open to research collaboration, advisory conversations, and support for what we are building. If this sounds interesting, reach out.