For a while, kentcdodds.com had two separate deployable things living in the same git repo:
- The main site (React Router, Remix before that, SQLite, deployed to Fly)
- An OAuth worker (Cloudflare Worker)
Then yesterday, I added two more deployable things:
- A Call Kent audio worker (Cloudflare Worker)
- A Call Kent audio container (separate Docker container)
Each of them had their own package.json, their own lockfile, their own
tsconfig.json, and their own idea of how things should be wired together. The
root package.json belonged to the site and treated everything else as optional
siblings. The monorepo structure existed in the folder tree but not in the
package manager.
That wasn't catastrophic, but it was annoying, and I knew I really should just embrace the monorepo. The repo already was a monorepo; the layout just hadn't caught up.
## What we changed
The migration had a single structural rule: everything runnable lives under
services/*.
```
services/
  site/                        ← the main app
  oauth/                       ← Cloudflare OAuth worker
  call-kent-audio-worker/      ← Cloudflare audio worker
  call-kent-audio-container/   ← Docker audio container
```
The root package.json became a thin orchestration layer. It owns the
workspace declaration, Nx, and convenience scripts that forward into the site
workspace:
```json
{
	"name": "kcd-workspace",
	"private": true,
	"workspaces": ["services/*"],
	"scripts": {
		"dev": "npm run dev --workspace kentcdodds.com",
		"build": "npm run build --workspace kentcdodds.com",
		"typecheck": "npm run typecheck --workspace kentcdodds.com",
		"typecheck:all": "nx run-many -t typecheck"
	},
	"devDependencies": {
		"nx": "^22.5.4"
	}
}
```
The real app scripts stayed in `services/site/package.json`, where they belong.
`ci:verify`, `test:browser`, `build`, `postinstall` - all of it lives there,
scoped to the thing that actually needs it.
Three old nested lockfiles (`call-kent-audio-container/package-lock.json`,
`call-kent-audio-worker/package-lock.json`, `oauth/package-lock.json`) were
deleted and replaced by one root lockfile. That made the raw diff stat look
alarming (726 files, 21,000 deletions), but the bulk of it was three lockfiles
evaporating. The actual logic changes were modest.
## How Nx fits in
We kept Nx intentionally minimal. There's one `nx.json` at the root with
caching defaults and package-script inference:
```json
{
	"namedInputs": {
		"sharedGlobals": [
			"{workspaceRoot}/package-lock.json",
			"{workspaceRoot}/tsconfig.base.json",
			"{workspaceRoot}/nx.json"
		]
	},
	"targetDefaults": {
		"build": { "cache": true, "inputs": ["production", "^production"] },
		"lint": { "cache": true, "inputs": ["default", "^default"] },
		"typecheck": { "cache": true, "inputs": ["default", "^default"] },
		"test": { "cache": true, "inputs": ["default", "^default"] }
	}
}
```
No hand-authored project.json files. No plugin configuration beyond what Nx
infers. The payoff came from the structure itself, not from the tool.
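To make "inference" concrete: Nx reads each workspace's `package.json` and treats its scripts as runnable targets, so `nx run-many -t typecheck` runs every project that has a `typecheck` script. A toy sketch of the idea (illustrative only, not Nx's actual implementation):

```typescript
// Illustrative sketch only (not Nx's real code): Nx infers one runnable
// target per entry in a workspace package.json "scripts" field.
type PackageJson = { name: string; scripts?: Record<string, string> }

function inferTargets(pkg: PackageJson): string[] {
	return Object.keys(pkg.scripts ?? {})
}

// `nx run-many -t typecheck` then selects every project whose inferred
// targets include "typecheck".
const targets = inferTargets({
	name: 'kentcdodds.com',
	scripts: { typecheck: 'tsc --noEmit', build: 'react-router build' },
})
```

The real inference also wires each target up to the caching rules in `targetDefaults`, but the mental model is just: scripts become targets.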
## What the `services/*` constraint exposed
This is the part worth actually talking about.
When you enforce that every runnable thing has its own package under
services/*, you immediately learn which assumptions your code was making about
where it was running from. We found three categories of breakage.
### 1. Package import aliases stopped working
The site had a `#other/*` import alias defined in the root `package.json`. Once
the site became `services/site` and got its own package boundary, Node rejected
any import that pointed outside that boundary:
```
ERR_INVALID_PACKAGE_TARGET: Package subpath '#other/semantic-search/...'
is not defined in "services/site/package.json"
```
The alias `#other/*` resolved to `./other/*` relative to the package root, but
from `services/site`, `other/` is two levels up and outside the package. Node
refuses that. The fix was mechanical but educational: replace the aliases with
explicit relative paths:
```diff
- } from '#other/semantic-search/ignore-list-patterns.ts'
+ } from '../../../../other/semantic-search/ignore-list-patterns.ts'
```
Not pretty, but I only have two of these, so I don't really care much (I'm barely looking at the code anymore anyway).
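For completeness: Node's rule is that `imports` targets must resolve inside the package's own directory. The `#app/*` alias below is a hypothetical example of the allowed shape; a target like `"#other/*": "../../other/*"` escapes the package root, which is exactly what Node rejects with `ERR_INVALID_PACKAGE_TARGET` (other fields omitted):

```json
{
	"name": "kentcdodds.com",
	"imports": {
		"#app/*": "./app/*"
	}
}
```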
### 2. Production went down because content moved
This one stung.
The site fetches blog posts, talks, testimonials, and other content from GitHub via the API at runtime. That code had a hardcoded path prefix:
```typescript
const mdxFileOrDirectory = `content/${relativeMdxFileOrDirectory}`
```
After the migration, the content was at `services/site/content/` in the repo,
not `content/`. The GitHub API was dutifully returning 404s for everything.
Production was down.
The fix was to centralize all content path logic in a new utility:
```typescript
// services/site/app/utils/github-content-paths.server.ts
export const GITHUB_CONTENT_PATH = 'services/site/content'

export function getGitHubContentPath(relativePath: string): string {
	return `${GITHUB_CONTENT_PATH}/${relativePath}`
}
```
And then use it at every callsite:
```diff
- const mdxFileOrDirectory = `content/${relativeMdxFileOrDirectory}`
+ const mdxFileOrDirectory = getGitHubContentPath(relativeMdxFileOrDirectory)
```
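Put together, a callsite ends up looking something like this sketch (the `listContentDir` helper and the raw `fetch` to the contents API are my illustration, not the site's actual GitHub client):

```typescript
// The helper keeps the repo-relative prefix in exactly one place.
const GITHUB_CONTENT_PATH = 'services/site/content'

function getGitHubContentPath(relativePath: string): string {
	return `${GITHUB_CONTENT_PATH}/${relativePath}`
}

// Hypothetical callsite: list a content directory via the GitHub contents API.
async function listContentDir(relativeDir: string) {
	const path = getGitHubContentPath(relativeDir)
	const res = await fetch(
		`https://api.github.com/repos/kentcdodds/kentcdodds.com/contents/${path}`,
	)
	if (!res.ok) throw new Error(`GitHub API ${res.status} for ${path}`)
	return res.json()
}

const blogPath = getGitHubContentPath('blog')
// → 'services/site/content/blog'
```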
The lesson here is: don't merge a 726-file structural refactor from your phone while you're away from home without pulling it down and running it locally 😆. Honestly, I'm not sure even that would have been enough. The Cursor Cloud Agent had a working demo. The problem was that the GitHub API mock I had for local development and testing handled the path change fine, but the actual implementation didn't 🙈
Once the path was fixed, I also made the site more resilient to future GitHub API failures. Rather than crashing or returning empty pages when content can't be fetched, each relevant route now returns a graceful fallback with a message and a direct link to the GitHub repo. So at least users have somewhere to go if the integration is broken. Better late than never.
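The shape of that fallback, as a sketch (the type and function names here are my illustration, not the site's actual code):

```typescript
// Hypothetical sketch of the graceful-fallback pattern described above.
type ContentResult<T> =
	| { status: 'success'; data: T }
	| { status: 'error'; message: string; repoUrl: string }

async function withContentFallback<T>(
	fetchContent: () => Promise<T>,
): Promise<ContentResult<T>> {
	try {
		return { status: 'success', data: await fetchContent() }
	} catch {
		// Instead of crashing the route, give users somewhere to go.
		return {
			status: 'error',
			message: 'Content could not be loaded right now.',
			repoUrl: 'https://github.com/kentcdodds/kentcdodds.com',
		}
	}
}
```

Each route loader can wrap its GitHub fetch in something like `withContentFallback` and render the `repoUrl` link whenever `status` is `'error'`.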
### 3. Docker stages have their own dependency graph
After moving the site to `services/site`, the Dockerfile was updated to build
from the new path. The `production-deps` stage copied `services/site/package.json`
but not `services/site/prisma/`. Two other stages actually need the Prisma schema:

- The `deps` stage runs `npm install`, which triggers `postinstall: prisma generate`
- The `build` stage runs `npx prisma generate` explicitly before building the app
The `production-deps` stage doesn't run either of those, so it's not entirely
clear which stage the failure manifested in. But the schema was missing where it
was needed, and the fix was two lines:
```diff
  ADD services/site/package.json /app/services/site/package.json
+ ADD services/site/prisma /app/services/site/prisma
+ ADD services/site/prisma.config.ts /app/services/site/prisma.config.ts
  ADD services/oauth/package.json /app/services/oauth/package.json
```
The reason this one wasn't caught is that Cursor Cloud Agents don't have support for building Docker images (which is surprising to me; maybe I'm doing something wrong?). So when I asked it to build the Docker image to make sure things would work, it just said it couldn't, but that it was "confident" 😆 And my hubris was my demise 💀
## CI got restructured around the actual workload
Before the migration, CI ran a workspace-wide install and then ran everything. That was fine when there was effectively one package. With real service boundaries, it made more sense to optimize around the actual usage pattern.
The site changes much more often than the workers do. So site CI now does a site-only install:
```yaml
- name: 📥 Install site deps
  run: npm ci --workspace=kentcdodds.com
```
That pulls in only the site's dependencies rather than the full dependency graph. The worker pipelines mirror this: each one installs only its own workspace when it needs to run.
The other meaningful CI change: browser tests were always part of `ci:verify`,
but the Playwright browser binaries were never installed in the gate job. It
worked before because the old CI didn't include browser tests in the gate. After
the migration restructured the gate job, that assumption surfaced immediately as
a CI failure:
```
browserType.launch: Executable doesn't exist
```
Fixed by adding a cached Playwright browser install step before `ci:verify`:
```yaml
- name: 🧰 Cache Playwright browsers
  id: playwright-cache
  uses: actions/cache@v5
  with:
    path: ~/.cache/ms-playwright
    key: playwright-${{ runner.os }}-node${{ env.NODE_VERSION }}-${{ hashFiles('package-lock.json') }}

- name: 🌐 Install Playwright browsers
  if: steps.playwright-cache.outputs.cache-hit != 'true'
  run: npm run test:e2e:install --workspace kentcdodds.com
```
## What I'd take away from this
Don't ask an agent how confident it is that something won't break. Make it prove it to you. If it's not able to, then give it the tools it needs to do that or pull it down and verify things locally yourself.
For my website, it's not a huge deal if the site goes down for half an hour, so I'm generally pretty lax about this stuff. In a production application with millions of users, I would definitely be more careful, and we'd have staging environments or at least preview deploys to avoid production downtime.
Nx was useful mostly for caching. The services are technically interdependent, but they don't really share code or have hard dev-time dependencies on each other. The structure was the actual win.