How to Write a PRD: Reduce Testing Time by Over 50% with Vibe Coding

Last week, I reviewed a Product Requirement Document (PRD) and found myself restless by page 27.

It wasn’t poorly written. On the contrary, it was comprehensive. Background, goals, pages, fields, processes: everything was there. It looked like a model answer.

But one question kept popping into my mind:

Can this document help AI develop the product with less rework?

The answer was quite harsh.

No.

Because it clearly described what the product looks like but failed to clarify how developers would determine when it was finished. More troubling, it didn’t specify how testing should prove that nothing was overlooked.

This is the pitfall many encounter with vibe coding: assuming that handing requirements to AI will speed up development. The code may be produced quickly, but testing turns into a patch-up job.

You change one page, and another entry point breaks.

You fix one field, and three boundary paths collapse.

You ask AI to generate tests, and while it does, the tests are disconnected from the actual requirements.

It’s frustrating.

Why Does the PRD Consume Testing Time?

I used to think that slow testing was due to late test case writing.

But I found that wasn’t the case.

Testing is often slow because the PRD is written too much like a document and not like executable constraints.

Traditional PRDs often state things like:

Users can create tasks
Support for editing and deleting
The system automatically scores
Display statistical results

These statements seem fine.

But when testers open them, a series of questions arise:

What fields need to be validated when creating a task?
What happens to associated data when deleted?
If automatic scoring fails, is that recorded as a failure, or does it go into a pending state?
Are statistical results calculated in real-time or from cached data?
What message is displayed when a user lacks permission?

Without these questions addressed in advance, the testing phase turns into a large Q&A session.

A PM adds a comment, the development team revises, and the testers have to retest. Back and forth, an entire afternoon can be lost.

Thus, I increasingly believe that the real time savings in vibe coding come not from AI writing tests but from exposing testing issues during the PRD writing phase.

That’s where the 50% reduction comes from.

It’s not about testing half as much. It’s about eliminating half the time spent discovering that the requirements weren’t clear.

A PRD Suitable for Vibe Coding Must Be Divided into Two Layers

If you only write one massive PRD, both AI and humans will struggle.

My suggestion is to split it into two layers: one for the overall introduction and one for individual User Stories (US).

The overall introduction document does not need to elaborate on all details; it serves as a “map.” It should include product background, target users, business loops, function maps, page maps, US index, terminology, and non-functional requirements.

This is akin to reviewing a floor plan before renovation. You need to know where the kitchen is, where plumbing and electricity run, and which walls are load-bearing. Don’t start discussing the color of a drawer handle right away.

The individual US document is the work order.

It should clearly state:

As a [role], I want to [action] so that I can [benefit].
What are the preconditions?
What are the business rules?
What are the field types, required fields, and constraints?
How does the state change?
What does the page wireframe look like?
How is the normal path validated?
How is the boundary path validated?
What is explicitly not included?

Note the “not included.”

Many AI projects fail due to this oversight. If you don’t tell AI what not to do, it may enthusiastically add a bunch of “smart-sounding” features.

It may seem fine initially, but before launch, you’ll be left with a mess.
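One way to keep the US checklist honest is to hold each US as a small record and flag the sections that are still empty. The sketch below is a minimal illustration, not a prescribed schema: the class name UserStory, its field names, the helper missing_sections, and the sample story text are all assumptions made for this example (Python 3.9+).

```python
from dataclasses import dataclass, field


@dataclass
class UserStory:
    """One US document reduced to the sections listed above (names are illustrative)."""
    story: str  # "As a [role], I want to [action] so that I can [benefit]."
    preconditions: list[str] = field(default_factory=list)
    business_rules: list[str] = field(default_factory=list)
    field_specs: list[str] = field(default_factory=list)    # types, required, constraints
    state_changes: list[str] = field(default_factory=list)
    wireframe: str = ""
    normal_paths: list[str] = field(default_factory=list)
    boundary_paths: list[str] = field(default_factory=list)
    not_included: list[str] = field(default_factory=list)   # explicit out-of-scope items


def missing_sections(us: UserStory) -> list[str]:
    """Return the names of sections that are still empty, in declaration order."""
    return [name for name, value in vars(us).items() if not value]


# Hypothetical, half-finished US used only to show the check.
us = UserStory(
    story="As a teacher, I want to publish a task so that my class can answer it.",
    preconditions=["Teacher is logged in", "Class exists"],
    normal_paths=["Publishing a 10-question task creates it in a waiting state"],
)
print(missing_sections(us))
```

An empty not_included list shows up in the result just like any other gap, which makes the “what is explicitly not included” question hard to skip.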

The 50% Savings in Testing Time Come from the US Document

Let me share a specific writing method.

Each US must have two tables: normal path acceptance criteria and boundary path acceptance criteria.

What goes in the normal path?

It describes what the system must produce when a user performs an action under the preconditions.

For example:

Given	When	Then
Teacher is logged in, class exists	Publish a task containing 10 questions	The system creates the task and enters a waiting state for responses

The boundary path describes the situations most likely to cause the system to malfunction.

For example:

Given	When	Then
Task has been published	Teacher attempts to delete the task	The system prohibits deletion and prompts archiving first

Once these two tables are written, testing time shortens immediately.

Why?

Because testers no longer have to guess what the PM wants to test, and developers don’t have to guess if the AI-generated implementation is correct. Each US is a small closed loop that is ready for development, testing, and regression from the start.
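To show how directly the two tables above turn into tests, here is a minimal sketch that checks one normal-path row and one boundary-path row. TaskService is a toy in-memory stand-in invented for this example, not a real API; the state name waiting_for_responses, the archived state, and the error message are likewise assumptions.

```python
class TaskService:
    """Toy in-memory stand-in for the real system (hypothetical)."""

    def __init__(self):
        self.tasks = {}

    def publish(self, task_id, questions):
        # Publishing creates the task and puts it in a waiting state.
        self.tasks[task_id] = {"questions": questions, "state": "waiting_for_responses"}
        return self.tasks[task_id]

    def delete(self, task_id):
        # Only archived tasks may be deleted; everything else is refused.
        if self.tasks[task_id]["state"] != "archived":
            raise PermissionError("Archive the task before deleting it.")
        del self.tasks[task_id]


def test_normal_path_publish():
    # Given: teacher is logged in, class exists (not modeled in this toy)
    service = TaskService()
    # When: publish a task containing 10 questions
    task = service.publish("t1", questions=list(range(10)))
    # Then: the task is created and waits for responses
    assert "t1" in service.tasks
    assert task["state"] == "waiting_for_responses"


def test_boundary_path_delete_published():
    # Given: task has been published
    service = TaskService()
    service.publish("t1", questions=list(range(10)))
    # When: teacher attempts to delete the task
    message = ""
    try:
        service.delete("t1")
    except PermissionError as err:
        message = str(err)
    # Then: deletion is prohibited and the prompt points to archiving
    assert "t1" in service.tasks
    assert "archive" in message.lower()


test_normal_path_publish()
test_boundary_path_delete_published()
```

Each test mirrors one table row, with Given, When, and Then kept as comments so a reviewer can match the test to the PRD line by line.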

I would even go further: a US without acceptance criteria should not be allowed to enter development.

It may sound bureaucratic.

But it actually saves lives.

Agent Products Require Clear “Judgments” to Avoid Testing Explosions

If you are developing a standard CRUD product, writing fields and processes in the PRD can still hold up.

But if you’re developing an Agent product, merely writing pages isn’t enough.

The real challenge in testing Agent products isn’t the buttons.

It’s the judgments.

When a user says, “Help me organize the meeting minutes,” should the system generate a summary, extract tasks, or write a report? When should it ask follow-up questions? When can it execute directly? When must it pause for user confirmation?

If these aren’t written down, the testing phase turns into guesswork.

The same input may pass today and be deemed non-compliant tomorrow.

AI struggles most with vague requirements like these, because there is nothing concrete to pin down.

Thus, the PRD for Agent products should additionally cover three areas:

Intent space: What users may genuinely want to accomplish
Tool invocation rules: Under what conditions to call search, knowledge base, calendar, or ticketing systems
Boundary conditions: How to handle insufficient information, tool failures, high-risk actions, and conflicts

These three areas are not advanced configurations.

They are the sources of test cases.

For instance, for a high-risk action like “creating a task,” the PRD must state: if the responsible person and deadline are unclear, only a draft can be generated, and it cannot be submitted automatically. This makes testing clear: construct an input lacking a responsible person and see if the system halts.
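The “creating a task” rule above can be written as a small decision function, which makes the test case almost mechanical. This is a sketch under assumptions: TaskRequest, decide_create_task, and the returned dictionary shape are invented for illustration, and the behavior when both fields are present (returning "submit") is an assumption, since the rule only says what happens when they are missing.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaskRequest:
    """What the Agent extracted from the user's message (hypothetical shape)."""
    title: str
    responsible_person: Optional[str] = None
    deadline: Optional[str] = None


def decide_create_task(request: TaskRequest) -> dict:
    """Apply the PRD rule: unclear owner or deadline means draft only, never auto-submit."""
    missing = [
        name for name in ("responsible_person", "deadline")
        if not getattr(request, name)
    ]
    if missing:
        return {"action": "draft", "submitted": False, "missing": missing}
    # Assumption: with both fields present, the request may proceed to submission.
    return {"action": "submit", "submitted": True, "missing": []}


# Test input from the rule: a request lacking a responsible person.
decision = decide_create_task(TaskRequest(title="Send meeting minutes", deadline="Friday"))
print(decision)
```

The missing list doubles as the follow-up question the Agent should ask, so the same rule serves both the implementation and the test.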

This is far better than having users complain post-launch.

The Correct Rhythm for Vibe Coding: Not Generating the Entire Project at Once

The worst vibe coding approach I’ve seen is to hand a large PRD to AI and say:

“Help me implement the entire system.”

This does not improve efficiency.

It’s like opening a blind box.

A more stable rhythm should be:

Write the overall PRD, locking in business loops, roles, function maps, and page maps.
Break it down into US, with each US handling one minimal business loop.
Define the build order of the US, following “produce data first, then consume data.”
Write acceptance criteria and boundary paths for each US.
Let AI develop based on individual US.
After completing each US, generate a test acceptance report and an independent review report.
Only proceed to the next US after approval.

This process may seem slow, but it’s actually fast.

Because AI no longer needs to guess your intentions within a pile of context. It just needs to complete a small closed loop and then use the test report to prove it hasn’t strayed.

There’s a key principle here: first create US that produce data, then those that consume data.

For example, schools, grades, classes, and students must exist before tasks can be issued; tasks and answer records must exist before monitoring statistics can be built. If you start with the statistics page, you can only use fake data, and every change to the underlying capabilities will require rework on the statistics page.
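The “produce data first, then consume data” rule is a dependency ordering, so a build sequence can be derived rather than argued about. The sketch below uses the standard-library graphlib module (Python 3.9+). The US names are illustrative, and the assumption that answer records depend on task publishing is mine; the other dependencies follow the example above.

```python
from graphlib import TopologicalSorter

# Each US maps to the US whose data it consumes (names are illustrative).
consumes = {
    "schools": [],
    "grades": [],
    "classes": [],
    "students": [],
    "task_publishing": ["schools", "grades", "classes", "students"],
    "answer_records": ["task_publishing"],  # assumption: answers need a task
    "monitoring_statistics": ["task_publishing", "answer_records"],
}

# static_order() yields every US after all the US it depends on.
build_order = list(TopologicalSorter(consumes).static_order())
print(build_order)
```

If two US ever end up depending on each other, TopologicalSorter raises graphlib.CycleError, which is a useful signal that one of them should be split.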

This isn’t AI’s fault.

It’s a matter of construction order.

The Real Reduction in Testing Time Comes from the “Evidence Chain”

I now require each US to include a source of evidence.

Evidence can come from PRD chapters, user research, business rules, interface documents, or historical issues. The format isn’t complicated; just one table will suffice:

Evidence ID	Source Document	Summary of Original Text	What It Supports
E-001	PRD Chapter 3	Teachers can publish tasks to classes	Supports task publishing entry
E-002	Business Rules	Published tasks cannot be deleted	Supports deletion boundary

What’s the use of this table?

It ensures that testing and reviews no longer rely on intuition.

After AI produces a feature, the review report can directly ask: does this feature cover E-001? Does it violate E-002? If there’s no evidence, don’t add features; if evidence conflicts, go back to the document for corrections.
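The review questions above can be automated with very little code. The following is a minimal sketch, assuming the review receives a list of features with the evidence numbers each one cites; the feature names, the dictionary layout, and the review function are hypothetical, while E-001 and E-002 come from the table.

```python
# Evidence table from the PRD, keyed by evidence number.
evidence = {
    "E-001": "Teachers can publish tasks to classes",
    "E-002": "Published tasks cannot be deleted",
}

# Hypothetical review input: each feature the AI produced and the evidence it cites.
features = [
    {"name": "task publishing entry", "evidence": ["E-001"]},
    {"name": "smart task recommendations", "evidence": []},
]


def review(features, evidence):
    """Flag features without evidence, unknown citations, and evidence nothing covers."""
    cited = {e for f in features for e in f["evidence"]}
    return {
        "unsupported_features": [f["name"] for f in features if not f["evidence"]],
        "unknown_evidence": sorted(cited - set(evidence)),
        "uncovered_evidence": sorted(set(evidence) - cited),
    }


report = review(features, evidence)
print(report)
```

Here the recommendations feature is flagged because it cites no evidence, and E-002 is flagged because no feature covers it yet. Checking whether a feature actually violates a rule still needs a real test; this only verifies traceability.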

This may seem a bit cold.

But when the project reaches its later stages, you’ll be grateful for it, because every regression can be traced back to its source rather than to a vague recollection from a group chat.

That recollection can be very costly.

If You Remember Only One Thing: Write the PRD So Testing Can Take Over Directly

So, how should a PRD be written to allow vibe coding to reduce testing time by over 50%?

My answer is simple:

Write the PRD so that testing can take over directly, not merely so that development can understand it.

Development can start coding even with only a vague understanding of it.

But if testing can take over directly, it means that boundaries, states, acceptance criteria, scope, and evidence have all been pinned down.

From this perspective, a PRD is not just a document.

A PRD is the track for AI development and the map for testing regression.

If you lay the track straight, AI runs fast; if you draw the map clearly, testing takes less time.

I cannot guarantee that every project will consistently save 50%. This figure depends on the complexity of requirements, team habits, and AI tool capabilities.

But I can say that if your current testing time is largely spent on unclear requirements, unwritten boundaries, and retesting after rework, then breaking the PRD into an overall introduction + individual US + acceptance criteria + evidence chain can indeed save half of the wasted testing time, and that’s not an exaggeration.

It might even be more.