The External CTO Playbook for Pre-Seed AI Startups

Why AI startups need a different playbook

The classic fractional-CTO playbook (architecture audit, hiring rubric, 90-day roadmap) works fine for a pre-seed SaaS company. It misses three things that consistently sink pre-seed AI startups:

Vendor risk that can flip your unit economics overnight.
Evaluation infrastructure that determines whether the product actually works.
Agent and pipeline architecture decisions that are hard to reverse.

A good external CTO at an AI startup spends most of their first 90 days on those three things, not on the generic items. Here is what that looks like.

Vendor risk: assume the price doubles

If your product depends on a single model provider, your gross margin is hostage to that provider's pricing decisions. The OpenAI / Anthropic / Google price changes in 2024 and 2025 surprised dozens of startups; some did not survive.
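To make the margin exposure concrete, here is a small worked example. The numbers are invented for illustration (20 USD of revenue per user per month, 6 USD of model spend, 2 USD of other cost of goods sold) and the function name is hypothetical; substitute your own figures.

```python
def gross_margin(revenue: float, model_cost: float, other_cogs: float) -> float:
    """Gross margin as a fraction of revenue."""
    if revenue <= 0:
        raise ValueError("revenue must be positive")
    return (revenue - model_cost - other_cogs) / revenue


# Illustrative per-user monthly figures, not real data.
REVENUE = 20.0
MODEL_COST = 6.0
OTHER_COGS = 2.0

margin_today = gross_margin(REVENUE, MODEL_COST, OTHER_COGS)
margin_if_doubled = gross_margin(REVENUE, MODEL_COST * 2, OTHER_COGS)

print(f"margin today:           {margin_today:.0%}")
print(f"margin if price doubles: {margin_if_doubled:.0%}")
```

With these assumed inputs, the margin falls from 60% to 30%: the model bill is less than a third of revenue, yet a doubling halves the margin.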

The external CTO's job is to push the team toward provider abstraction from day one. Not because you should constantly switch, but because the ability to switch is what gives you negotiating leverage and runway protection. Concretely:

All model calls go through one internal interface, not directly to a vendor SDK.
Costs are tracked per feature, not as a single company-wide bill. You should know which feature stops being profitable if a price doubles.
At least one open-weights fallback is implemented and tested for the highest-volume path, even if you do not use it day-to-day.

This is unglamorous infrastructure work. It is the difference between surviving a vendor price change and not.
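The three points above can be sketched in a few dozen lines. This is a minimal illustration, not a production design: `ModelGateway`, the stub providers, and the prices are all hypothetical, and the stubs stand in for real vendor SDK calls and a self-hosted open-weights model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# A provider is any callable that takes a prompt and returns (text, tokens_used).
Provider = Callable[[str], Tuple[str, int]]


@dataclass
class ModelGateway:
    """Single internal entry point for all model calls (hypothetical design)."""
    providers: Dict[str, Provider]          # provider name -> callable
    price_per_1k_tokens: Dict[str, float]   # provider name -> USD per 1,000 tokens
    order: List[str]                        # preferred provider first, fallbacks after
    cost_by_feature: Dict[str, float] = field(default_factory=dict)

    def complete(self, feature: str, prompt: str) -> str:
        """Try providers in order; record the cost against the calling feature."""
        last_error = None
        for name in self.order:
            try:
                text, tokens = self.providers[name](prompt)
            except Exception as exc:  # sketch only: catch narrower errors in real code
                last_error = exc
                continue
            cost = tokens / 1000 * self.price_per_1k_tokens[name]
            self.cost_by_feature[feature] = self.cost_by_feature.get(feature, 0.0) + cost
            return text
        raise RuntimeError("all providers failed") from last_error

    def cost_if_price_multiplied(self, factor: float) -> Dict[str, float]:
        """Per-feature spend if every recorded cost were scaled by `factor`."""
        return {feat: cost * factor for feat, cost in self.cost_by_feature.items()}


# Stand-ins for a hosted vendor SDK and an open-weights fallback (assumptions).
def hosted_stub(prompt: str) -> Tuple[str, int]:
    return f"hosted:{prompt}", 1000


def open_weights_stub(prompt: str) -> Tuple[str, int]:
    return f"local:{prompt}", 1000


if __name__ == "__main__":
    gateway = ModelGateway(
        providers={"hosted": hosted_stub, "open_weights": open_weights_stub},
        price_per_1k_tokens={"hosted": 0.01, "open_weights": 0.002},  # made-up prices
        order=["hosted", "open_weights"],
    )
    gateway.complete("summarize", "quarterly report")
    print(gateway.cost_by_feature)
    print(gateway.cost_if_price_multiplied(2.0))
```

Because product code only ever calls `complete`, swapping or reordering providers is a configuration change, and the per-feature ledger answers the "what if the price doubles" question directly.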

Evaluation: if you cannot measure, you cannot improve

A pre-seed AI startup with no eval framework is flying blind. The product feels right when the founder uses it. It might be wrong half the time for real users, and the team has no way to know. The first version of the product ships, customers complain, and the team has no instrumentation to find the failure mode.

The external CTO's job in the first 30 days is to insist on a basic eval framework. It does not need to be sophisticated:

50 to 100 representative inputs covering the main use cases and the failure modes you have heard about.
An automated pass that grades outputs against expected behavior (LLM-as-judge is fine for a first pass).
A weekly run that produces a single number the team can watch trend up or down.

Once this exists, every change is testable. Without it, every change is hope. Startups that get this right tend to pull ahead of those that do not within a few months.
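A first eval harness along these lines can be very small. The sketch below is one possible shape under stated assumptions: `EvalCase`, `run_eval`, and `substring_grader` are hypothetical names, and the substring check is a deliberately crude stand-in for the LLM-as-judge grader mentioned above.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class EvalCase:
    prompt: str
    must_contain: str  # simplest possible expectation: a required substring


@dataclass
class EvalReport:
    score: float          # fraction of cases that passed, 0.0 to 1.0
    failures: List[str]   # prompts of the failed cases, for debugging


def substring_grader(output: str, case: EvalCase) -> bool:
    """Placeholder grader; swap in an LLM-as-judge call with the same signature."""
    return case.must_contain.lower() in output.lower()


def run_eval(
    model: Callable[[str], str],
    cases: List[EvalCase],
    grade: Callable[[str, EvalCase], bool] = substring_grader,
) -> EvalReport:
    """Run every case through the model and reduce the results to one number."""
    if not cases:
        raise ValueError("eval set is empty")
    failures = [c.prompt for c in cases if not grade(model(c.prompt), c)]
    return EvalReport(score=1 - len(failures) / len(cases), failures=failures)


if __name__ == "__main__":
    # A canned fake model so the sketch runs without any vendor dependency.
    canned = {"capital of France?": "Paris.", "2+2?": "5"}
    cases = [EvalCase("capital of France?", "Paris"), EvalCase("2+2?", "4")]
    report = run_eval(lambda p: canned[p], cases)
    print(f"score={report.score:.2f} failures={report.failures}")
```

Running this on a schedule and storing `score` gives the single trend number; the `failures` list is what turns a dropped score into a concrete failure mode to investigate.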

Agent architecture: the choices you cannot reverse

By 2026, most AI products have agentic components: multi-step workflows that combine model calls with code execution, tool use, and external API calls. The architectural decisions here are unusually sticky.

Three decisions in particular are hard to unwind:

State location. Do agent runs hold state in memory, in a database, or in a queue? Mixing these without a clear rule becomes a long-term source of pain.
Failure mode contract. What happens when an agent step fails partway through? "It retries silently" looks fine until it meets production traffic, and then becomes the bug you cannot reproduce.
Observability. What gets logged, with what privacy properties, and how long is it kept? Decisions you defer here lead to compliance scrambles in year two.

A good external CTO will surface these decisions explicitly and force the team to make them deliberately, not by drift.
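One way to make those three decisions explicit is to write them into the step runner itself. The sketch below is an illustration under assumptions, not a recommended framework: `run_agent`, `StepFailed`, and the `agent_steps` table are hypothetical, state lives in SQLite purely for the example, retries are bounded and logged rather than silent, and log lines carry step names and attempt counts but not payloads.

```python
import json
import logging
import sqlite3
from typing import Callable, Dict, List, Tuple

log = logging.getLogger("agent")

# A step is (name, function); the function takes the current state and returns the new one.
Step = Tuple[str, Callable[[Dict], Dict]]


class StepFailed(Exception):
    """Raised when a step exhausts its attempts; the run stops instead of limping on."""


def run_agent(conn: sqlite3.Connection, run_id: str, steps: List[Step],
              max_attempts: int = 2) -> Dict:
    """Run steps in order, persisting state after each one so a run can resume."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS agent_steps ("
        "run_id TEXT, step TEXT, state TEXT, PRIMARY KEY (run_id, step))"
    )
    state: Dict = {}
    for name, fn in steps:
        row = conn.execute(
            "SELECT state FROM agent_steps WHERE run_id = ? AND step = ?",
            (run_id, name),
        ).fetchone()
        if row is not None:
            state = json.loads(row[0])  # step already completed in an earlier attempt
            continue
        for attempt in range(1, max_attempts + 1):
            try:
                new_state = fn(dict(state))  # pass a copy so a failed step cannot corrupt state
                break
            except Exception as exc:
                # Log the step and error type only; payloads stay out of the logs.
                log.warning("run=%s step=%s attempt=%d error=%s",
                            run_id, name, attempt, type(exc).__name__)
                if attempt == max_attempts:
                    raise StepFailed(f"{name} failed after {max_attempts} attempts") from exc
        state = new_state
        conn.execute("INSERT INTO agent_steps VALUES (?, ?, ?)",
                     (run_id, name, json.dumps(state)))
        conn.commit()
    return state


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    steps = [("fetch", lambda s: {**s, "doc": "raw text"}),
             ("summarize", lambda s: {**s, "summary": s["doc"][:3]})]
    print(run_agent(conn, "run-1", steps))
```

The specific choices here (SQLite, two attempts, error type only in logs) matter less than the fact that each one is written down in a single place where it can be reviewed and changed.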

What does not need to be different

A lot does not change. The hiring rubric still matters. The deploy pipeline still needs to be solid. The incident response process still has to be written down. These items are not AI-specific; the playbook for them is the same as for any SaaS startup.

The right hire for a pre-seed AI startup

The external CTO who fits a pre-seed AI startup has shipped at least one production AI product end-to-end, has war stories about cost and reliability, and treats vendor neutrality, evaluation, and agent architecture as non-optional items in the first 60 days. The CTO who treats AI as a feature on top of a normal SaaS roadmap is probably the wrong fit. Ask in the first interview: how would you spend the first month addressing vendor risk and eval? The answer will tell you everything.