Solo dev, 60 tasks, one night

Monday: project brief. Tuesday: 60 PRDs generated. Hit play, went to bed. Wednesday: 52 done, 5 failed, 3 skipped.

The project

A friend of mine runs a small gym. He asked me to build an app for his members: class bookings, check-in via QR code, payment tracking. Nothing crazy. A React Native frontend, a Node API, and a Postgres database. Standard indie dev stuff.

I said yes on Monday. He needed it in two weeks. I didn't have a team.

Monday: one brief, one prompt

I spent Monday afternoon writing a project brief. Not a spec, not a formal doc. Just a plain markdown file describing the app, the screens, the data model, and the user flows. About 1,400 words.

Then I opened Zowl's "Generate from Brief" feature and pasted the whole thing in. That feature, which turns a brief into PRDs automatically, is what made it possible for one person to queue up 60 tasks overnight.

It took about three minutes. Zowl broke the brief into 60 individual tasks, each with its own PRD. Each PRD had a description, acceptance criteria, the files it would likely touch, and a dependency list linking it to other tasks. The dependency graph meant things like "create the database schema" came before "build the booking endpoint," which came before "build the booking screen."
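To make the dependency ordering concrete, here is a minimal sketch of how a run order could be derived from per-task dependency lists. It uses Kahn's algorithm for topological sorting. The `Task` shape, the task IDs, and the `executionOrder` name are all made up for illustration; the post doesn't show Zowl's actual PRD format or scheduler.

```typescript
// Hypothetical task shape; Zowl's real PRD format is not shown in this post.
interface Task {
  id: string;
  dependsOn: string[];
}

// Kahn's algorithm: repeatedly emit tasks whose dependencies are all done.
function executionOrder(tasks: Task[]): string[] {
  const remaining = new Map<string, number>(); // unmet dependency count per task
  const dependents = new Map<string, string[]>(); // task id -> tasks waiting on it

  for (const task of tasks) {
    remaining.set(task.id, task.dependsOn.length);
    for (const dep of task.dependsOn) {
      const waiting = dependents.get(dep) ?? [];
      waiting.push(task.id);
      dependents.set(dep, waiting);
    }
  }

  const ready = tasks.filter((t) => t.dependsOn.length === 0).map((t) => t.id);
  const sorted: string[] = [];

  while (ready.length > 0) {
    const id = ready.shift()!;
    sorted.push(id);
    for (const next of dependents.get(id) ?? []) {
      const left = remaining.get(next)! - 1;
      remaining.set(next, left);
      if (left === 0) ready.push(next);
    }
  }

  // Tasks that never became ready are in a cycle or depend on an unknown id.
  if (sorted.length !== tasks.length) {
    throw new Error("Dependency cycle or unknown dependency detected");
  }
  return sorted;
}

const runOrder = executionOrder([
  { id: "booking-screen", dependsOn: ["booking-endpoint"] },
  { id: "booking-endpoint", dependsOn: ["db-schema"] },
  { id: "db-schema", dependsOn: [] },
]);
console.log(runOrder); // ["db-schema", "booking-endpoint", "booking-screen"]
```

The cycle check matters for generated graphs: if two PRDs list each other as dependencies, it's better to find out during review than at midnight.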

60 tasks from 1,400 words of English. Honestly, I didn't expect it to get the dependency order right. But it did, mostly.

Tuesday afternoon: the review pass

Here's where people make mistakes. They generate tasks and immediately hit run. Don't do that.

I spent about 90 minutes reviewing every PRD. Most were fine. But four had problems.

Task #14: "Implement payment processing." Too vague. Which payment provider? What's the flow? I added: "Use Stripe. Accept card payments via the Stripe React Native SDK. Create a /payments endpoint that creates a PaymentIntent server-side. Store transaction records in the payments table with status, amount, and stripe_payment_id."
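For a sense of what that PRD text pins down, here is a rough sketch of the server-side half. It is not the code the agent produced. The client interface is declared locally so the example runs without the SDK, and it mirrors the `paymentIntents.create({ amount, currency })` call in Stripe's Node library. The `member_id` column, the `usd` currency, and the function names are assumptions.

```typescript
// Structural subset of a Stripe PaymentIntent, declared locally so the sketch
// runs without the SDK installed.
interface PaymentIntentLike {
  id: string;
  client_secret: string | null;
  status: string;
  amount: number; // smallest currency unit, e.g. cents
}

// Mirrors the shape of stripe.paymentIntents.create({ amount, currency }).
interface PaymentsClient {
  paymentIntents: {
    create(params: { amount: number; currency: string }): Promise<PaymentIntentLike>;
  };
}

// Hypothetical row for the payments table. member_id is an assumed column.
interface PaymentRecord {
  member_id: string;
  status: string;
  amount: number;
  stripe_payment_id: string;
}

function toPaymentRecord(memberId: string, intent: PaymentIntentLike): PaymentRecord {
  return {
    member_id: memberId,
    status: intent.status,
    amount: intent.amount,
    stripe_payment_id: intent.id,
  };
}

// Body of a hypothetical POST /payments handler.
async function createPayment(
  client: PaymentsClient,
  saveRecord: (row: PaymentRecord) => Promise<void>,
  memberId: string,
  amountCents: number,
): Promise<{ clientSecret: string | null }> {
  if (!Number.isInteger(amountCents) || amountCents <= 0) {
    throw new Error("amount must be a positive integer number of cents");
  }
  // Currency is hard-coded here as an assumption.
  const intent = await client.paymentIntents.create({ amount: amountCents, currency: "usd" });
  await saveRecord(toPaymentRecord(memberId, intent));
  return { clientSecret: intent.client_secret }; // the app confirms the payment with this
}
```

Passing the client and the save function in as parameters keeps the handler testable with fakes, which helps when the first run happens unattended.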

Task #27: "Add push notifications for class reminders." The PRD said "send push notifications" but didn't specify the timing or the service. I updated it: "Use expo-notifications. Schedule a local notification 1 hour before the user's booked class. No server-side push for v1."
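The timing rule is simple enough to sketch as a pure function. This only computes when the reminder should fire; it is an illustration, not the generated code. The result would then be handed to expo-notifications' `scheduleNotificationAsync` as the trigger, and the exact trigger shape depends on the library version, so check its docs. The `reminderTime` name and the sample dates are made up.

```typescript
const ONE_HOUR_MS = 60 * 60 * 1000;

// Returns when the reminder should fire, or null if that moment has already passed.
function reminderTime(classStart: Date, now: Date = new Date()): Date | null {
  const fireAt = new Date(classStart.getTime() - ONE_HOUR_MS);
  return fireAt.getTime() > now.getTime() ? fireAt : null;
}

const classStart = new Date("2025-03-04T18:00:00Z");
const now = new Date("2025-03-04T09:00:00Z");
console.log(reminderTime(classStart, now)?.toISOString()); // "2025-03-04T17:00:00.000Z"
```

Returning `null` for a booking made less than an hour before class avoids scheduling a notification in the past.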

Task #38: "Build the admin dashboard." The PRD referenced a "stats overview page" but the brief never mentioned specific metrics. I added: "Show total active members, bookings this week, revenue this month. Three cards at the top, a line chart of daily bookings below. Use Recharts."

Task #51: "Handle expired memberships." The generated PRD assumed memberships had an end date field. They didn't, because my data model used a "months_remaining" counter instead. I rewrote the acceptance criteria to match the actual schema.

Four tasks out of 60 needed real intervention. That's a 93% hit rate on the auto-generation, which I'll take any day.

Zowl also flagged two "gap questions" during generation. Things it couldn't resolve from the brief alone. One was "Should class capacity be enforced at booking time or only shown as a warning?" and the other was "What happens when a member tries to book two overlapping classes?" Good questions. I answered both directly in the PRDs.

Tuesday night

9:47 PM. All 60 tasks reviewed. Dependencies confirmed. Pipeline set to NightLoop (pre-check, implement, validate on each task).

I hit Start All, plugged in the charger, and went to sleep.

Nothing dramatic about it. That's kind of the point.

Wednesday morning: the results

I opened Zowl at 6:30 AM with coffee in hand. The dashboard showed:

  • 52 tasks: passed
  • 5 tasks: failed
  • 3 tasks: skipped (dependencies on failed tasks)
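The three statuses fall out of a fairly small loop. Here is a simplified, synchronous sketch of how a pre-check, implement, validate pipeline could produce them, including skipping tasks whose dependencies didn't pass. It is a guess at the general shape, not Zowl's implementation. The task IDs and the `runStage` callback are hypothetical.

```typescript
type Stage = "pre-check" | "implement" | "validate";
type Outcome = "passed" | "failed" | "skipped";

interface PipelineTask {
  id: string;
  dependsOn: string[];
}

// runStage is a stand-in for whatever the agent does at each stage.
function runPipeline(
  tasksInOrder: PipelineTask[], // assumed already sorted so dependencies come first
  runStage: (taskId: string, stage: Stage) => boolean,
): Map<string, Outcome> {
  const stages: Stage[] = ["pre-check", "implement", "validate"];
  const outcomes = new Map<string, Outcome>();

  for (const task of tasksInOrder) {
    // A task is skipped if any dependency failed or was itself skipped.
    const blocked = task.dependsOn.some((dep) => outcomes.get(dep) !== "passed");
    if (blocked) {
      outcomes.set(task.id, "skipped");
      continue;
    }
    // every() stops at the first failing stage, so later stages never run.
    const ok = stages.every((stage) => runStage(task.id, stage));
    outcomes.set(task.id, ok ? "passed" : "failed");
  }
  return outcomes;
}

// Hypothetical three-task chain where the middle task fails its pre-check.
const demoOutcomes = runPipeline(
  [
    { id: "payments", dependsOn: [] },
    { id: "refunds", dependsOn: ["payments"] },
    { id: "refund-receipt", dependsOn: ["refunds"] },
  ],
  (taskId, stage) => !(taskId === "refunds" && stage === "pre-check"),
);
console.log(demoOutcomes.get("payments")); // "passed"
console.log(demoOutcomes.get("refunds")); // "failed"
console.log(demoOutcomes.get("refund-receipt")); // "skipped"
```

Skipping rather than attempting a blocked task is what keeps one failure from turning into a pile of broken code built on top of it.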

52 out of 60 on the first run. Let me walk through what failed.

Task #22: "Build the check-in QR scanner screen." Failed at validation. The agent used expo-barcode-scanner, which is deprecated. The validator caught the deprecation warning in the build output and flagged it. Good catch. I updated the PRD to specify expo-camera with barcode scanning mode instead.

Task #31: "Add class cancellation with refund logic." Failed at the pre-check stage. The agent read the codebase and realized the payments table (built by task #14) didn't have a refund_status column. The pre-check said: "Cannot implement refund tracking without a refund_status field. SKIP or update schema." That's exactly right. I added a migration step to the PRD.

Task #39: "Admin dashboard authentication." Failed at implementation. The agent tried to import a middleware function from src/middleware/auth.ts, but the file was actually at src/lib/auth.ts. My brief wasn't consistent about the directory structure, and the agent guessed wrong. Easy fix: I added the correct import path to the PRD.

Task #44: "Booking conflict detection." Failed at validation. The implementation worked, but the test the agent wrote had a timezone bug. It compared UTC timestamps against local time strings. The validator ran the tests, saw the failure, reported it. I added "all datetime comparisons must use UTC" to the PRD.
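Here is a minimal sketch of what "all datetime comparisons must use UTC" can look like in practice. It is illustrative only, with hypothetical names and bookings. Every timestamp is converted to epoch milliseconds before comparing, and strings without an explicit offset are rejected, because JavaScript parses offset-less date-time strings as local time.

```typescript
interface Booking {
  startsAt: string; // ISO 8601 with an explicit offset or "Z"
  endsAt: string;
}

const HAS_OFFSET = /(Z|[+-]\d{2}:\d{2})$/;

// Date.parse returns milliseconds since the Unix epoch (UTC), so comparing
// these numbers is timezone-independent.
function toEpochMs(iso: string): number {
  if (!HAS_OFFSET.test(iso)) {
    throw new Error(`Timestamp needs an explicit offset: ${iso}`);
  }
  const ms = Date.parse(iso);
  if (Number.isNaN(ms)) throw new Error(`Invalid timestamp: ${iso}`);
  return ms;
}

// Two half-open intervals [start, end) overlap when each starts before the other ends.
function overlaps(a: Booking, b: Booking): boolean {
  return (
    toEpochMs(a.startsAt) < toEpochMs(b.endsAt) &&
    toEpochMs(b.startsAt) < toEpochMs(a.endsAt)
  );
}

const yoga: Booking = { startsAt: "2025-03-04T18:00:00Z", endsAt: "2025-03-04T19:00:00Z" };
// 19:30 to 20:30 at +01:00 is 18:30 to 19:30 UTC.
const spin: Booking = { startsAt: "2025-03-04T19:30:00+01:00", endsAt: "2025-03-04T20:30:00+01:00" };
const pilates: Booking = { startsAt: "2025-03-04T19:00:00Z", endsAt: "2025-03-04T20:00:00Z" };

console.log(overlaps(yoga, spin)); // true
console.log(overlaps(yoga, pilates)); // false (back-to-back classes don't conflict)
```

Treating intervals as half-open is a design choice: a class ending at 19:00 and one starting at 19:00 can both be booked.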

Task #48: "Member profile photo upload." Failed at implementation. The agent reached for multer, which is Express middleware, but my API runs on Fastify. Classic framework mismatch. I updated the PRD to specify Fastify's multipart plugin.

Look at those failures for a second. Not one of them was a random hallucination or a garbage output. Every failure was a real, specific, catchable engineering problem. Missing columns, deprecated packages, wrong import paths, timezone bugs, framework mismatches. These are the same bugs a human developer would create at 2 AM. The difference is the validation step caught them before I even woke up.

The re-run

I fixed all five PRDs. Took about 20 minutes. Then I selected just those five tasks plus the three that were skipped, and ran the pipeline again.

All eight passed.

Total time from "start pipeline" to "60 tasks done": about 9 hours of compute, 8 of which I was asleep.

The human math

Here's what I actually spent time on:

| Activity | Time |
|---|---|
| Writing the project brief | ~2 hours (Monday) |
| Reviewing and fixing PRDs | ~90 minutes (Tuesday) |
| Fixing 5 failed PRDs + re-run | ~20 minutes (Wednesday) |
| Code review of all 60 outputs | ~2 hours (Wednesday) |
| Manual fixes during review | ~45 minutes (Wednesday) |
| Total human time | ~6.5 hours |

Without Zowl, 60 tasks at maybe 20 minutes each (these were small, scoped tasks) would be 20 hours of focused coding. Probably spread across 4-5 days with breaks, context switching, and the inevitable afternoon slump.
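For anyone who wants to check the arithmetic, here it is spelled out using the approximate figures from the table. The 20-minutes-per-task number is a rough guess, so treat the ratio as a ballpark.

```typescript
// Minutes per activity, in table order: brief, PRD review, fix + re-run,
// code review, manual fixes.
const activityMinutes = [120, 90, 20, 120, 45];
const totalMinutes = activityMinutes.reduce((sum, m) => sum + m, 0);

// Manual estimate: 60 tasks at roughly 20 minutes each.
const manualMinutes = 60 * 20;

console.log(`${Math.floor(totalMinutes / 60)} h ${totalMinutes % 60} min`); // "6 h 35 min"
console.log(manualMinutes / 60); // 20
console.log((manualMinutes / totalMinutes).toFixed(1)); // "3.0"
```

So the hands-on time was roughly a third of the estimated manual coding time, and that includes the code review.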

I did it in a day and a half, with most of the actual coding happening while I slept.

What I'd do differently

My PRDs for the payment and refund tasks should've been more detailed from the start. When you're dealing with money, "add payment processing" is never enough context. Be specific about providers, flows, error states, and data models.

I'd also run the first 10 tasks during the day to catch structural issues early. If task #2 puts files in the wrong directory, tasks #3 through #60 will inherit that mistake. A small daytime batch would probably have exposed the inconsistent directory structure in my brief, the root cause of the auth middleware path failure (task #39), before the overnight run.

What this actually means

I'm one person. I don't have a team. I've got a MacBook, a list of things to build, and about 6 hours of focused energy per day before my brain turns to static.

Zowl doesn't write better code than me. It writes more code than me, while I'm unconscious. And the validation pipeline catches enough mistakes that the morning review is fixing edge cases, not debugging disasters.

That gym app shipped on time. My friend doesn't know a single line was written at 3 AM by an AI agent managed by a tool that started as a bash script called nightloop.sh. He just knows it works.

If you're curious how to build similar tools or orchestrate your own agents at scale, check out how we approach CLI integration and learn more at Zowl.