We blamed serverless for our latency. It was Next.js.

By Allan Clempe

engineering
performance
infrastructure

The complaint, and the wrong guess

People outside Australia told us the app felt slow. That tracked with how we're deployed: one AWS region, ap-southeast-2, in Sydney. From Sydney it's fast. From London or São Paulo you pay for the trip across the planet first. So our first guess was the obvious one: serverless cold starts. A Lambda that's been idle has to wake up, and if you're far away you're already starting from behind.

We were about to start warming functions and arguing about provisioned concurrency. Then we actually profiled it.

What the profiling showed

The cold start was real, but it wasn't mostly the serverless part. It was what we were loading during the cold start. The function spent most of its wake-up time bringing Next.js itself online, the App Router and all its server/client machinery, before a single line of our code ran.

We'd been blaming AWS for time that Next.js was spending. The distance to Sydney we couldn't change. The framework we were dragging into memory on every cold start we could.

That reframed the whole problem. It wasn't "make Lambda faster." It was "stop loading so much before we answer the request."
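
One way to see that split is to time the phases of a cold start separately. The following is a minimal sketch assuming a Node.js entry module. PhaseTimer, the phase names, and the module path in the comments are hypothetical, and this is not the exact profiling setup from the investigation.

```typescript
// Hypothetical helper: attribute elapsed time to named phases of a cold start.
type Phase = { name: string; ms: number };

class PhaseTimer {
  private readonly now: () => number;
  private last: number;
  private readonly phases: Phase[] = [];

  // The clock is injectable so the helper can be exercised without real waiting.
  constructor(now: () => number = Date.now) {
    this.now = now;
    this.last = now();
  }

  // Close the current phase: everything since the previous mark
  // (or since construction) is attributed to `name`.
  mark(name: string): void {
    const t = this.now();
    this.phases.push({ name, ms: t - this.last });
    this.last = t;
  }

  // Total time plus each phase's share of it, as a fraction between 0 and 1.
  report(): { totalMs: number; phases: Array<Phase & { share: number }> } {
    const totalMs = this.phases.reduce((sum, p) => sum + p.ms, 0);
    return {
      totalMs,
      phases: this.phases.map((p) => ({
        ...p,
        share: totalMs === 0 ? 0 : p.ms / totalMs,
      })),
    };
  }
}

// Intended use inside a function's entry module (paths are placeholders):
//   const timer = new PhaseTimer();           // first statement in the file
//   const app = await import("./server.js");  // framework and app bundle load here
//   timer.mark("framework init");
//   ... handle the first request ...
//   timer.mark("first request");
//   console.log(JSON.stringify(timer.report()));
```

If the "framework init" share dominates the report, warming the function treats the symptom while the bundle stays the cause.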

Why Next.js was heavy for us

Next.js isn't slow. It's a lot of framework, and we were using a fraction of it.

We run the app as a server on Lambda behind CloudFront, provisioned with SST. In that setup, two things hurt:

  • Startup overhead. The bigger the server bundle, the longer the cold start. Next.js gave us a large one, and the App Router and RSC runtime are loaded whether or not the page needs them.
  • Payload weight. RSC streams a serialized component payload next to the HTML. For pages that are basically static (landing, docs, legal) we were shipping and parsing more than the page actually needed. That's latency the user feels even after the server has responded.

None of this is wrong for the apps Next.js is built for. It was wrong for ours.

Swapping to TanStack Start

We rebuilt the app on TanStack Start: TanStack Router and Query on Vite and Nitro. Every route moved over.

The win is mostly subtraction:

  • Smaller server bundle. Vite and Nitro produce a leaner, more direct build than the Next.js output. Smaller bundle, faster cold start. That was the exact spike we set out to kill.
  • Server functions, not a framework boundary. Data loading is explicit createServerFn().handler(...) calls wired into route loaders. We send the data a page needs and nothing else. No RSC payload riding along for pages that are really just HTML.
  • The model matches the deployment. Nitro is built to target many runtimes, so "this is a server that becomes a Lambda" stopped being a translation we fought and became a supported build target.
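
To make the "data a page needs and nothing else" point concrete, here is a minimal sketch. The record shape, field names, and toPageData are hypothetical, not our actual schema. The TanStack Start wiring is left as a comment so the block runs without the framework installed, and the import paths shown there depend on the installed version.

```typescript
// Hypothetical shape of a stored document: more fields than any page renders.
type DocRecord = {
  slug: string;
  title: string;
  bodyHtml: string;
  authorEmail: string;
  revisionHistory: string[];
};

// What the page actually renders. This is all the loader should return.
type DocPageData = Pick<DocRecord, "title" | "bodyHtml">;

// Copy only the rendered fields, so nothing else is serialized to the client.
function toPageData(record: DocRecord): DocPageData {
  return { title: record.title, bodyHtml: record.bodyHtml };
}

// In a TanStack Start route file, the wiring would look roughly like this:
//
//   import { createFileRoute } from "@tanstack/react-router";
//   import { createServerFn } from "@tanstack/react-start";
//
//   const getTerms = createServerFn({ method: "GET" }).handler(async () => {
//     const record = await fetchDoc("terms"); // hypothetical data access
//     return toPageData(record);
//   });
//
//   export const Route = createFileRoute("/terms")({
//     loader: () => getTerms(),
//   });
```

The trimming itself is ordinary TypeScript. What the server function adds is an explicit boundary: whatever the handler returns is what crosses the wire.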

Page loads recovered. The cold-start spike that had us reaching for provisioned concurrency mostly went away on its own, because there was far less to load.

What we didn't change

Worth being clear, because people read "framework swap" as "infra swap." It wasn't.

The services still run on AWS Lambda in Sydney. The web app is an sst.aws.TanStackStart component on nodejs22.x. The webhook handler is still its own Lambda. The database is still one Neon Postgres instance. We changed what runs inside the function, not where the function runs.
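
For orientation, an SST config for the web app described above might look roughly like the fragment below. It is a sketch, not our real config: the app name and the component name "Web" are placeholders, the webhook Lambda and database are omitted, and the server.runtime option is assumed to behave as it does on SST's other server-rendered site components.

```typescript
/// <reference path="./.sst/platform/config.d.ts" />

// sst.config.ts (sketch). `$config` and `sst` are globals provided by SST.
export default $config({
  app() {
    return {
      name: "web-app", // placeholder app name
      home: "aws",
      // Single region, as described above: Sydney.
      providers: { aws: { region: "ap-southeast-2" } },
    };
  },
  async run() {
    // The framework changed; the function still runs on Lambda in the same region.
    new sst.aws.TanStackStart("Web", {
      server: { runtime: "nodejs22.x" }, // assumed option name, see lead-in
    });
  },
});
```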

So we took the latency we control down close to its floor. The latency we don't control, a request from London still crossing the ocean to Sydney, is still there. That's the next problem.

Workers next

We already use Cloudflare for DNS. The next step is moving the app itself onto Cloudflare Workers, and the reason it's even realistic is the choice above: Nitro can target Workers. Because TanStack Start builds through Nitro, "deploy to a Worker" is a build-target change, not another rewrite. We didn't have that option on Next.js-on-Lambda.

Workers run close to the user and have near-zero cold starts. A request from London would hit a worker in London instead of waking a function in Sydney. For the parts of the app that don't need a fresh database read on every hit (static pages, cached reads, auth checks that can run at the edge), that turns an ocean crossing into a local hop.

It isn't free. The database is still in one region, so anything that needs a real write or an uncached read still pays to reach Sydney; the edge helps most where we can serve without that round trip. Moving the data itself is a separate, harder call we haven't made. But for everyone outside Australia, pushing compute to the edge is the biggest latency win left, and near-zero cold starts should push it further.
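
The trade-off can be written down as a small model. This is a rough sketch, not a measurement: the request categories mirror the ones named above, the round-trip times are inputs you would supply from your own measurements, and the model ignores cold starts, database time, and everything else that isn't a network hop.

```typescript
// Request categories taken from the text; the type and function names are hypothetical.
type RequestKind =
  | "static-page"
  | "cached-read"
  | "edge-auth-check"
  | "uncached-read"
  | "write";

// True when the request still has to reach the single database region.
function needsOrigin(kind: RequestKind): boolean {
  switch (kind) {
    case "static-page":
    case "cached-read":
    case "edge-auth-check":
      return false;
    case "uncached-read":
    case "write":
      return true;
  }
}

// Round-trip times in milliseconds, supplied by the caller.
type RoundTrips = { userToEdgeMs: number; edgeToOriginMs: number };

// Network time only: every request pays the hop to the nearest edge location,
// and origin-bound requests also pay the hop from there to the database region.
function estimateNetworkMs(kind: RequestKind, rtt: RoundTrips): number {
  return rtt.userToEdgeMs + (needsOrigin(kind) ? rtt.edgeToOriginMs : 0);
}
```

Under this model the edge removes the long hop only for requests where needsOrigin is false, which is why the share of traffic in those categories decides how much the move is worth.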

The takeaway

There are two kinds of latency: the kind you cause and the kind geography causes. We assumed we had the geography kind and went looking for it. We mostly had the kind we caused: a framework we were loading on every cold start. Leaving Next.js for TanStack Start fixed that without moving a single server.

Geography is next, and the groundwork's already laid. Same code, closer to the user. Workers are how we get there.