Writing Great Engineering Documentation

Clear documentation enables people to work independently, makes knowledge transferable, and saves time.

Mindset: Documentation should invite scrutiny. Be clear enough that errors are obvious.

This embodies /pb-preamble thinking (clear writing enables critical thinking; ambiguous docs hide flawed thinking) and applies /pb-design-rules thinking, particularly:

Key Design Rules for Documentation:

Clarity: Documentation must be crystal clear so readers immediately understand the system
Representation: Information architecture matters. Organize docs so knowledge is findable, not buried
Least Surprise: Documentation should behave like readers expect; no hidden gotchas or contradictions

Resource Hint: sonnet - Documentation writing is implementation-level work; routine quality standards.

When to Use This Command

Writing new docs - Creating READMEs, guides, API docs
Improving existing docs - Docs review found issues to fix
Onboarding prep - Ensuring docs support new team members
Knowledge transfer - Capturing tribal knowledge before someone leaves
Architecture documentation - Documenting system design decisions

Purpose

Good documentation:

Enables onboarding: New people learn faster
Preserves knowledge: Doesn’t disappear when people leave
Reduces questions: People can find answers themselves
Saves debugging time: Common issues documented with solutions
Improves quality: Explains design, catches inconsistencies
Enables async work: Remote teams need written context

Bad documentation:

Outdated (last updated 2 years ago)
Incomplete (“see code for details”)
Wrong (misleading, inaccurate)
Scattered (spread across 10 places)
Unreadable (walls of text, no examples)

Documentation Levels

Level 1: Code Comments

Purpose: Explain why code exists, not what it does.

Good code is self-documenting:

# Bad
x = y + 2  # Add 2
delay = 1000 * 60  # Delay

# Good
buffer_size = max_size + overhead  # Account for header
wait_time_ms = seconds_to_wait * 1000  # Convert to milliseconds

What to comment:

Why a non-obvious approach was chosen
Warning about common mistakes
Reference to related code
Complex logic (though needing such a comment usually means the code should be refactored instead)

from datetime import timedelta  # needed by calculate_deadline below

# Bad comment (obvious)
def add(a, b):
    # Add a and b
    return a + b

# Good comment (explains non-obvious)
def calculate_deadline(start_time):
    # Add 5 days but skip weekends (business days only)
    # See accounting_spec.md for requirements
    days = 5
    current = start_time
    while days > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 = Mon-Fri
            days -= 1
    return current

Level 2: Function/Module Documentation

Purpose: Tell someone reading code what it does and how to use it.

def create_order(customer_id, items, payment_method):
    """
    Create a new order for a customer.

    Args:
        customer_id: ID of customer placing order
        items: List of {product_id, quantity}
        payment_method: "credit_card" or "bank_transfer"

    Returns:
        Order object with fields: id, status, total, created_at

    Raises:
        ValueError: If items is empty
        PaymentError: If payment fails

    Note:
        - Inventory is decremented immediately
        - Email confirmation sent asynchronously
        - See order_processing.md for state diagram
    """

TypeScript/JavaScript:

/**
 * Fetch user profile with optional caching
 *
 * @param userId - User ID to fetch
 * @param options.useCache - Cache result for 5 minutes (default: true)
 * @returns Promise resolving to User object
 * @throws NotFoundError if user doesn't exist
 *
 * @example
 * const user = await fetchUser('user_123');
 * const freshUser = await fetchUser('user_123', { useCache: false });
 */
async function fetchUser(userId: string, options?: { useCache?: boolean }): Promise<User> {
  // ... implementation omitted
}
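The Args / Returns / Raises layout shown in the Python docstring above can be checked mechanically. The following is a minimal sketch, not a standard tool: the `REQUIRED_SECTIONS` tuple and the `missing_docstring_sections` helper are hypothetical names, and the required sections are an assumed house style.

```python
import inspect

# Assumed house style: every public function documents these sections.
REQUIRED_SECTIONS = ("Args:", "Returns:", "Raises:")


def missing_docstring_sections(func, required=REQUIRED_SECTIONS):
    """Return the required section headers absent from func's docstring."""
    doc = inspect.getdoc(func) or ""  # getdoc returns None when there is no docstring
    lines = {line.strip() for line in doc.splitlines()}
    return [section for section in required if section not in lines]


def create_order(customer_id, items, payment_method):
    """
    Create a new order for a customer.

    Args:
        customer_id: ID of customer placing order

    Returns:
        Order object

    Raises:
        ValueError: If items is empty
    """


def undocumented(x):
    return x


print(missing_docstring_sections(create_order))
print(missing_docstring_sections(undocumented))
```

A check like this could run in CI so that missing sections are caught in the same PR that adds the function.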

Level 3: API/Integration Documentation

Purpose: Help someone use the API/service without reading code.

# Payment API

## Overview
The Payment API handles charging customers, refunds, and payment status.

## Base URL
`https://api.example.com/v1`

## Authentication
All requests must include header: `Authorization: Bearer {token}`

## Endpoints

### Create Order

POST /orders
Content-Type: application/json

Request:
{
  "customer_id": "cust_123",
  "items": [
    {"product_id": "prod_1", "quantity": 2}
  ],
  "payment_method": "credit_card"
}

Response (201):
{
  "id": "order_456",
  "status": "pending_payment",
  "total": 99.99,
  "created_at": "2026-01-11T14:30:00Z"
}

Error (400):
{
  "error": "missing_required_field",
  "message": "items cannot be empty"
}


## Rate Limiting
100 requests per minute per API key

## Webhooks
- `order.created` - Order created
- `payment.succeeded` - Payment processed
- `payment.failed` - Payment failed

See webhook specification in #webhooks section
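One way to test whether API documentation like the example above is sufficient is to write a client from the docs alone. The sketch below builds (but does not send) the Create Order request using only the documented base URL, header, and body. The helper name `build_create_order_request` and the token value are illustrative assumptions.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com/v1"  # taken from the example docs above


def build_create_order_request(token, customer_id, items, payment_method):
    """Build the documented POST /orders request without sending it."""
    body = json.dumps({
        "customer_id": customer_id,
        "items": items,
        "payment_method": payment_method,
    }).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/orders",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_create_order_request(
    token="YOUR_TOKEN",  # placeholder, not a real credential
    customer_id="cust_123",
    items=[{"product_id": "prod_1", "quantity": 2}],
    payment_method="credit_card",
)
print(request.get_method(), request.full_url)
# urllib.request.urlopen(request) would actually send it.
```

If a reader cannot get this far without opening the service's source code, the API docs are missing something.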

Level 4: System Documentation

Purpose: Help someone understand how systems fit together.

What to include:

# Payment System Architecture

## Purpose
Process payments, handle refunds, track payment status.

## Components
- Payment API (Node.js)
- Payment Database (PostgreSQL)
- Stripe integration (external)
- Webhook handler (async processor)
- Audit log (for compliance)

## Diagram

User → Payment API → Stripe
            ↓
        Payment DB
        Audit Log


## Data Flow
1. User submits payment
2. API sends to Stripe
3. Stripe responds with status
4. API stores in DB
5. Webhook fires (order.paid)
6. Email sent asynchronously

## Key Decisions
- Why Stripe? See ADR-2024-001
- Why PostgreSQL? See ADR-2024-002

## Scaling Concerns
- Stripe timeout handling (retry with exponential backoff)
- Audit log growth (partition by date)

## Related Systems
- Order system (creates orders)
- Email system (sends confirmations)
- Billing system (monthly invoices)

## Runbooks
- Payment processing stuck: See runbook-payment-stuck.md
- Database grew too large: See runbook-db-size.md
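The architecture example mentions "retry with exponential backoff" for Stripe timeouts. When a doc names a technique like this, a short sketch removes ambiguity about what is meant. The version below is illustrative only: `TransientError`, `retry_with_backoff`, and the delay values are assumptions, not part of any real payment client.

```python
import time


class TransientError(Exception):
    """Hypothetical stand-in for a timeout from the payment provider."""


def retry_with_backoff(operation, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call operation(), doubling the wait after each transient failure."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...


# Usage: a fake operation that fails twice, then succeeds.
calls = {"count": 0}


def flaky_charge():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientError("timeout")
    return "charged"


delays = []
result = retry_with_backoff(flaky_charge, sleep=delays.append)  # record delays instead of sleeping
print(result, delays)
```

Injecting `sleep` keeps the example fast to run. Production implementations commonly add random jitter to the delay as well.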

Level 5: Process Documentation

Purpose: Help someone follow a process or handle an event.

# Release Process

## Overview
Releasing code to production involves building, testing, and deploying.

## Steps
1. Create release branch (release/v1.2.3)
2. Update CHANGELOG
3. Tag commit (v1.2.3)
4. Build Docker image
5. Deploy to staging and run smoke tests
6. Deploy to production and monitor for errors
7. Post-release follow-up

## Detailed Steps

### 1. Create Release Branch

git checkout -b release/v1.2.3 main

Why: Isolates release prep from ongoing development

### 2. Update Changelog

Edit CHANGELOG.md:

- Add new version (v1.2.3)
- List features added, bugs fixed, breaking changes
- Include author names

Example:

## [1.2.3] - 2026-01-11
### Added
- Support for bulk user import (#234)
- New analytics dashboard (#245)
### Fixed
- Bug: Orders not showing in some cases (#240)
### Breaking
- Removed deprecated /v1/orders endpoint

### 3. Tag Commit

git tag -a v1.2.3 -m "Release version 1.2.3"
git push origin v1.2.3

### 4. Build Docker Image

CI/CD builds the image automatically when the tag is pushed.

Check: the CI pipeline passes all checks.

### 5. Deploy to Staging

./deploy staging v1.2.3
./run-smoke-tests staging

Check:

- Smoke tests pass
- No errors in logs
- Performance acceptable
- Database migrations successful

### 6. Deploy to Production

./deploy production v1.2.3

Monitor:

- Error rate (should be same as before)
- Latency (should be same as before)
- Resource usage (should be reasonable)
- User complaints (check Slack)

### 7. Post-Release

- Send release notes to stakeholders
- Update documentation
- Monitor for issues
- Stay available for the next 2 hours

## Rollback

If something breaks:

./deploy production v1.2.2

Fast: < 2 minutes
Safe: Previous version was already tested
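A process document like the release example above is easier to keep accurate when its commands live in one place. As a rough sketch, the commands could be generated from the version number so the doc and the tooling cannot drift apart. The `release_commands` helper and the version pattern are assumptions. The `./deploy` and `./run-smoke-tests` scripts are the ones named in the example, and the changelog edit remains a manual step.

```python
import re

VERSION_PATTERN = re.compile(r"^\d+\.\d+\.\d+$")  # assumed MAJOR.MINOR.PATCH format


def release_commands(version):
    """Return the release commands, in order, as argument lists (dry run only)."""
    if not VERSION_PATTERN.match(version):
        raise ValueError(f"expected a version like 1.2.3, got {version!r}")
    tag = f"v{version}"
    return [
        ["git", "checkout", "-b", f"release/{tag}", "main"],
        # CHANGELOG.md is edited by hand between these two steps.
        ["git", "tag", "-a", tag, "-m", f"Release version {version}"],
        ["git", "push", "origin", tag],
        ["./deploy", "staging", tag],
        ["./run-smoke-tests", "staging"],
        ["./deploy", "production", tag],
    ]


for command in release_commands("1.2.3"):
    print(" ".join(command))
```

Each list could be passed to `subprocess.run(command, check=True)` to execute it. Printing first makes the process reviewable before anything runs.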


---

## Writing Guidelines

### 1. Know Your Audience

Different people need different docs:

Junior Developer:

Detailed step-by-step
Explain assumptions
Show examples
Link to further reading

Experienced Developer:

Quick reference
Why, not what
Key decisions/gotchas
Links to detailed docs

DevOps Engineer:

Architecture overview
Infrastructure requirements
Scaling considerations
Monitoring/alerting


### 2. Use Clear Structure

Bad:

The system works by first doing thing A which connects to thing B and then thing C happens which processes the data from B, so then you get the result in D. Sometimes if D fails you should check B.


Good:

How the system works

1. Data Collection (Component A): Gathers input from users
2. Processing (Component B): Transforms data according to rules
3. Storage (Component C): Saves result to database

If processing fails

- Check Component B logs for errors


### 3. Show Examples

Always show examples, even for simple things.

Bad:

Use the create_order function to create orders.


Good:

Use the create_order function to create orders:

order = create_order(
    customer_id="cust_123",
    items=[
        {"product_id": "prod_1", "quantity": 2},
        {"product_id": "prod_2", "quantity": 1}
    ],
    payment_method="credit_card"
)
print(f"Order created: {order.id}")

Common mistakes

Empty items list (will raise ValueError)
Forgetting payment method (will fail at checkout)
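Examples only help if they keep working. One option in Python is to put examples in docstrings and execute them with the standard `doctest` module. The sketch below assumes a hypothetical `order_total` function and a small `run_docstring_examples` helper. It documents both the normal case and the empty-items mistake.

```python
import doctest


def order_total(items):
    """
    Sum price * quantity over the order items.

    >>> order_total([{"price": 10.0, "quantity": 2}, {"price": 5.0, "quantity": 1}])
    25.0
    >>> order_total([])
    Traceback (most recent call last):
        ...
    ValueError: items cannot be empty
    """
    if not items:
        raise ValueError("items cannot be empty")
    return sum(item["price"] * item["quantity"] for item in items)


def run_docstring_examples(func):
    """Run the examples in func's docstring and return (failed, attempted) counts."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    tests = finder.find(func, name=func.__name__, module=False,
                        globs={func.__name__: func})
    for test in tests:
        runner.run(test)
    return runner.summarize(verbose=False)


results = run_docstring_examples(order_total)
print(results)
```

If the function's behavior changes without the docstring being updated, `results.failed` becomes non-zero, which turns a stale example into a failing check.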


### 4. Keep It Updated

**Stale docs are worse than no docs.**

Outdated docs:

Installing

1. Clone the repo
2. Install Node 14 ← Node 14 is deprecated!
3. Run npm install
4. npm start


Fix:

Installing

1. Clone the repo
2. Install Node 18+ (required)
   - macOS: brew install node@18
   - Ubuntu: sudo apt-get install nodejs=18.*
3. Run npm install
4. Run npm start

Last updated: 2026-01-11


**How to keep docs updated:**

Link docs in code review (remind people they exist)
Update docs in same PR as code change
Schedule quarterly review (is this still accurate?)
Delete docs that no longer apply
Note last-updated date prominently
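If every doc carries a "Last updated: YYYY-MM-DD" line, staleness can be flagged automatically. The following is a minimal sketch under that assumption. The `is_stale` helper and the 90-day threshold (roughly one quarter) are illustrative choices.

```python
import re
from datetime import date, timedelta

LAST_UPDATED = re.compile(r"Last updated:\s*(\d{4}-\d{2}-\d{2})")


def is_stale(doc_text, today, max_age_days=90):
    """Return True if the doc has no last-updated date or the date is too old."""
    match = LAST_UPDATED.search(doc_text)
    if match is None:
        return True  # an undated doc is treated as stale
    updated = date.fromisoformat(match.group(1))
    return today - updated > timedelta(days=max_age_days)


doc = "Installing\n\nLast updated: 2026-01-11\n"
print(is_stale(doc, today=date(2026, 2, 1)))
print(is_stale(doc, today=date(2026, 6, 1)))
```

Passing `today` explicitly keeps the function deterministic. A real script would pass `date.today()` and loop over the files under `docs/`.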


### 5. Use Visuals

Pictures convey information faster.

Text:

The system has a frontend that talks to an API which talks to a database and also talks to an external payment service.


Diagram:

┌─────────┐      ┌─────┐       ┌──────────┐
│Frontend │─────→│ API │──────→│ Database │
└─────────┘      └─────┘       └──────────┘
                    │
                    ↓
            ┌───────────────┐
            │Payment Service│
            └───────────────┘


Tools:
- **Mermaid**: Embed diagrams in markdown
- **Excalidraw**: Draw diagrams quickly
- **Lucidchart**: More complex diagrams
- **ASCII art**: Simple diagrams in text
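Diagrams kept as text are easier to review and update alongside code. As a sketch, the frontend / API / database example could be emitted as a Mermaid flowchart from a plain edge list. The `to_mermaid` and `node` helpers are hypothetical. The output uses Mermaid's `flowchart` syntax with quoted node labels.

```python
import re


def node(name):
    """Render a node as id["label"]; ids must not contain spaces."""
    ident = re.sub(r"\W+", "_", name)
    return f'{ident}["{name}"]'


def to_mermaid(edges, direction="LR"):
    """Build Mermaid flowchart source from (source, target) pairs."""
    lines = [f"flowchart {direction}"]
    for source, target in edges:
        lines.append(f"    {node(source)} --> {node(target)}")
    return "\n".join(lines)


diagram = to_mermaid([
    ("Frontend", "API"),
    ("API", "Database"),
    ("API", "Payment Service"),
])
print(diagram)
```

The printed text can be pasted into a fenced `mermaid` block in any markdown renderer that supports it.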

### 6. Link, Don't Repeat

Bad:

API Documentation

The API requires authentication… (then 500 words about auth)

Database Documentation

The database requires authentication… (same 500 words repeated)


Good:

API Documentation

See Authentication section below.

Database Documentation

See Authentication section below.

Authentication (Single Source of Truth)

[Detailed auth explanation once]
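Repeated passages can be found mechanically before they drift apart. The sketch below assumes docs are available as a mapping of file name to text. The `find_duplicate_paragraphs` helper and the 40-character minimum are illustrative choices. It reports paragraphs that appear in more than one file.

```python
from collections import defaultdict


def find_duplicate_paragraphs(docs, min_length=40):
    """Map each repeated paragraph to the sorted list of docs containing it."""
    seen = defaultdict(set)
    for name, text in docs.items():
        for paragraph in text.split("\n\n"):
            # Normalize whitespace and case so reflowed copies still match.
            normalized = " ".join(paragraph.split()).lower()
            if len(normalized) >= min_length:
                seen[normalized].add(name)
    return {p: sorted(names) for p, names in seen.items() if len(names) > 1}


docs = {
    "api.md": "API Documentation\n\n"
              "The API requires authentication with a bearer token on every request.",
    "database.md": "Database Documentation\n\n"
                   "The API requires  authentication with a bearer token\non every request.",
}
duplicates = find_duplicate_paragraphs(docs)
for paragraph, names in duplicates.items():
    print(names, "share:", paragraph)
```

Each hit is a candidate for moving into a single section that the other docs link to.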


### 7. Make It Scannable

People don't read documentation linearly. They scan.

Bad:

To set up, first you need to have docker installed, you can get it from docker.com, then you run docker-compose up which will start the database, after that you can run npm install and then npm start to start the server


Good:

Setup

Prerequisites

- Docker installed from docker.com
- Node 18+
- npm 9+

Steps

1. Start database: docker-compose up -d
2. Install dependencies: npm install
3. Start server: npm start
4. Visit http://localhost:3000


---

## Documentation Templates

### README.md Template

# Project Name

Short description of what this does.

## Features
- Feature 1
- Feature 2

## Quick Start

### Prerequisites
- Node 18+
- PostgreSQL 14+

### Installation
git clone ...
cd ...
npm install
npm run setup-db
npm start

Visit http://localhost:3000

## Documentation

## Getting Help

- Slack: #engineering
- Issues: GitHub issues
- Email: team@example.com


### API Documentation Template

# API Name

## Overview
What does this API do?

## Base URL
`https://api.example.com/v1`

## Authentication
How to authenticate?

## Endpoints

### Create Resource

POST /resources
Content-Type: application/json

Request: {…}
Response (201): {…}
Error (400): {…}


## Rate Limiting
Limits and behavior

## Webhooks
What events are available?

## SDK
Available libraries for common languages

Architecture Documentation Template

# System Architecture

## Purpose
Why does this system exist?

## Components
- Component A: What it does
- Component B: What it does

## Diagram
[Visual diagram]

## Data Flow
How data moves through system

## Key Decisions
Why were choices made?

## Scaling
How does it scale?

## Monitoring
What to watch for?

## Runbooks
- [Common issue 1](runbook-1.md)
- [Common issue 2](runbook-2.md)

Documentation Tools & Organization

Tools

| Tool | Use For | Example |
|------|---------|---------|
| README.md | Quick start, overview | How to get running |
| Markdown files | Detailed docs | Architecture, guides |
| ADR folder | Design decisions | Why we chose X |
| Runbooks | How to fix things | Recovery procedures |
| API docs | API reference | Endpoint definitions |
| Video | Complex processes | Architecture walkthrough |
| Diagrams | Visual understanding | System flows |
| Code comments | Why code exists | Explain non-obvious |

Organization

Good structure:

Project/
  README.md (Start here)
  docs/
    architecture.md (System design)
    api.md (API reference)
    getting-started.md (Setup guide)
    troubleshooting.md (Common issues)
    adr/ (Design decisions)
      adr-001-database-choice.md
      adr-002-api-versioning.md
    runbooks/ (How to fix things)
      runbook-payment-stuck.md
      runbook-database-full.md
    images/ (Diagrams, screenshots)
  src/ (Code with clear structure)
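A layout like the one above can be enforced with a small check. The sketch below assumes the file names from the example tree. The `EXPECTED_DOCS` list and `missing_docs` helper are hypothetical, and the demo builds a throwaway project in a temporary directory.

```python
import tempfile
from pathlib import Path

# Assumed minimum set, taken from the example layout above.
EXPECTED_DOCS = [
    "README.md",
    "docs/architecture.md",
    "docs/api.md",
    "docs/getting-started.md",
    "docs/troubleshooting.md",
]


def missing_docs(project_root, expected=EXPECTED_DOCS):
    """Return the expected doc paths that do not exist under project_root."""
    root = Path(project_root)
    return [rel for rel in expected if not (root / rel).is_file()]


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "docs").mkdir()
    (root / "README.md").write_text("# Project\n")
    (root / "docs" / "api.md").write_text("# API\n")
    missing = missing_docs(root)

print(missing)
```

Running this against a real repository root would list which of the expected documents still need to be written.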

Bad structure:

Project/
  README.md (Outdated, hard to follow)
  doc-old.md (Obsolete)
  NOTES.txt (Unclear)
  docs/
    stuff.md (What is this?)
    more-stuff.md (Unclear title)
  Lots of scattered documentation

Documentation Maintenance

Quarterly Review

Each quarter:

1. Read each doc
2. Is it still accurate? (Mark last-updated date)
3. Is it clear? (Ask someone else to read it)
4. Is it complete? (What's missing?)
5. Delete obsolete docs

Keep Docs in Sync with Code

Bad:

Engineer changes code but doesn't update docs
Docs become wrong
New person reads old docs, confused

Good:

Engineer changes code AND updates docs
PR review checks that docs match code
Docs stay accurate

In code review:

Reviewer: "You added a new API. Did you update docs/api.md?"
Engineer: "Yes, added new endpoint and examples"
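The reviewer's question can also be asked automatically. As a sketch, given the list of files changed in a PR, a check could flag code changes that arrive without any documentation change. The `needs_docs_update` helper and the `src/` and `docs/` prefixes are assumptions based on the example project layout.

```python
def needs_docs_update(changed_files, code_prefix="src/",
                      docs_prefixes=("docs/", "README.md")):
    """Return True when code changed but no documentation file did."""
    touched_code = any(path.startswith(code_prefix) for path in changed_files)
    touched_docs = any(path.startswith(docs_prefixes) for path in changed_files)
    return touched_code and not touched_docs


print(needs_docs_update(["src/api/orders.py"]))
print(needs_docs_update(["src/api/orders.py", "docs/api.md"]))
print(needs_docs_update(["docs/api.md"]))
```

A check like this is only a prompt for the reviewer. Some code changes legitimately need no doc update, so it is better surfaced as a warning than as a hard failure.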

Integration with Playbook

Part of SDLC:

/pb-guide - Document requirements by project size
/pb-onboarding - Good docs enable self-guided learning
/pb-adr - Documenting decisions
/pb-security - Documenting security practices

/pb-adr - How to document decisions
/pb-review-docs - Documentation quality review
/pb-sam-documentation - Clarity-first documentation review (see “When to Use” for integration)
/pb-repo-readme - Generate project README
/pb-onboarding - Using docs for training

Documentation Checklist

README exists and is current
Getting started guide works (tested)
Architecture documented with diagrams
API endpoints documented with examples
Key decisions documented (ADRs)
Common issues documented (troubleshooting)
Setup/deploy procedures documented (runbooks)
Code is self-documenting (good names, structure)
Comments explain why, not what
Last-updated date shown
Docs are linked in code (easy to find)
Broken links checked
Examples actually work
Docs reviewed quarterly
Obsolete docs deleted
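The "Broken links checked" item can be partly automated. The following is a minimal sketch that checks only relative markdown links of the form `[text](path)` against the file system. The `broken_relative_links` helper is hypothetical, and external URLs and same-page anchors are skipped rather than verified.

```python
import re
import tempfile
from pathlib import Path

LINK = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")  # captures the target of [text](target)


def broken_relative_links(markdown_path):
    """Return relative link targets in the file that do not exist on disk."""
    path = Path(markdown_path)
    broken = []
    for target in LINK.findall(path.read_text(encoding="utf-8")):
        if target.startswith(("http://", "https://", "mailto:", "#")):
            continue  # external links and in-page anchors are out of scope here
        file_part = target.split("#", 1)[0]
        if not (path.parent / file_part).exists():
            broken.append(target)
    return broken


with tempfile.TemporaryDirectory() as tmp:
    docs = Path(tmp)
    (docs / "runbook-1.md").write_text("# Runbook 1\n", encoding="utf-8")
    (docs / "architecture.md").write_text(
        "- [Common issue 1](runbook-1.md)\n"
        "- [Common issue 2](runbook-2.md)\n"
        "- [Site](https://example.com)\n",
        encoding="utf-8",
    )
    broken = broken_relative_links(docs / "architecture.md")

print(broken)
```

Here only `runbook-2.md` is reported, because that file was never created in the temporary directory.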

Created: 2026-01-11 | Category: Documentation | Tier: M/L
