fastpriors
about fastpriors

A small firm.
An opinionated one.

We're a four-person consulting practice for AI-native companies that have outgrown hosted inference. We do migrations, optimization, custom kernels, and the unglamorous SRE work that keeps it all up at 3am.

principles

What we believe.

Stated up front so you can decide whether we'll get along before the kickoff call.

Sovereignty by default

Your weights, your data, your VPC. We never touch production credentials we don't need, and we hand them back when we leave.

Predictable bills

No revenue share, no token margin, no platform that magically gets more expensive at scale.

Six engagements a year

We stay small on purpose. You get senior engineers, not an account manager and a PowerPoint.

Code that survives our exit

Documented, testable, runbook'd. We measure success by what works after we leave, not what breaks if you fire us.

the team

Four people.
Forty years of GPUs.

Every engagement is run by a senior engineer from start to finish. No hand-offs, no junior-on-junior staffing, no offshore "delivery centers."

Founder · Inference

Aarav Shenoy

Ex-Together AI. Built distributed serving for a 1.4M-RPS RAG product.

Hardware · GPU

Mira Halvorsen

Ex-NVIDIA TensorRT. CUDA kernels, FP8 quantization, MoE routing.

Reliability · Eval

Joon-ho Park

Ex-Stripe SRE. Builds the eval harnesses that catch regressions in production.

Research advisor

Dr. Selma Okafor

Compiler optimization, sparse attention, posts at NeurIPS we mostly understand.

where we are

Distributed by design.

We work where our clients run their hardware. Office hours overlap on Wednesdays.

Bangalore: engineering · founder (IST)
San Francisco: client work · go-to-market (PST)
Berlin: research · advisory (CET)
Remote: wherever the colos are (UTC±)

Want to work together?

Book a 30-min call. We'll tell you on the call whether we can help.

Talk to an engineer →