What does a typical engagement look like?
A written cost audit (~1 week, fixed-fee, refundable against the engagement that follows), then a 6-14 week migration: Audit → Plan → Port → Tune → Ship. Shadow traffic from week 3, eval-parity proof before any cutover, hosted baseline kept warm for 30 days post-cutover.
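A minimal sketch of what an eval-parity gate before cutover can look like, assuming both the hosted baseline and the self-hosted candidate are scored on the same evaluation set. The function name, the tolerance, and the scores are hypothetical and chosen for illustration; a real engagement would define its own metrics and thresholds.

```python
from statistics import mean


def parity_holds(baseline_scores, candidate_scores, tolerance=0.01):
    """Return True when the candidate's mean eval score is no more than
    `tolerance` below the baseline's mean score.

    Both lists are assumed to hold per-example scores for the same
    evaluation set, in the same order (hypothetical setup).
    """
    if not baseline_scores or len(baseline_scores) != len(candidate_scores):
        raise ValueError("need paired, non-empty score lists")
    return mean(candidate_scores) >= mean(baseline_scores) - tolerance


# Hypothetical per-example scores from shadow traffic.
hosted = [0.90, 0.80, 0.85]
self_hosted = [0.90, 0.79, 0.85]
ready_for_cutover = parity_holds(hosted, self_hosted)
```

A mean-score comparison is the simplest possible gate; per-metric or per-segment checks would be stricter.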
What if the migration doesn't work?
You stay on your hosted API. We charge for the audit and the work performed; you keep the runbooks and the cost model. We have walked away from engagements where the math did not justify the migration; that is what the audit is for.
How big is the team?
Two founders. Abhimanyu Singh leads inference-and-infra delivery end-to-end. Kaushlendra Kumar Giri (Edinburgh M.Sc. AI) leads optimization, quantization, and evaluation work. No project managers, no junior staffing, no offshore delivery centers. If we sign with you, you get us.
How many engagements do you take a year?
A small number, on purpose. We turn down work that is not a fit so the work we take gets the depth it needs. The cost audit (~1 week) is usually available within a week of the kickoff call.
Can you start without an audit?
In rare cases, yes: usually when the migration is a follow-up to a prior engagement where we already have the cost model. For new clients, the audit is non-optional. It is the artefact that lets us both decide whether the math works.
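To make "whether the math works" concrete, here is a rough payback calculation, assuming the decision reduces to monthly hosted spend versus monthly self-hosted spend plus a one-off migration cost. The function name and all figures are hypothetical; an actual cost model would cover more (utilization, on-call, hardware commitments).

```python
def payback_months(hosted_monthly, self_hosted_monthly, migration_cost):
    """Months needed for monthly savings to repay the one-off migration cost.

    Returns None when self-hosting is not cheaper per month, i.e. the
    migration never pays for itself under these assumptions.
    """
    savings = hosted_monthly - self_hosted_monthly
    if savings <= 0:
        return None
    return migration_cost / savings


# Hypothetical figures, in the same currency unit.
months = payback_months(hosted_monthly=10_000,
                        self_hosted_monthly=4_000,
                        migration_cost=30_000)
```

A result of None, or a payback period longer than the planning horizon, is the kind of outcome that would argue for staying on the hosted API.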
Can I just hire you to optimize without migrating?
Yes, that is one of our most common engagements. If you are already self-hosted and your p95 latency or cost-per-token is the problem, the optimization engagement (kernels, quantization, batching, scaling) is fixed-scope and runs 4-8 weeks.