
Using Session Traces to Maintain Context in Spec-Driven Development


However good Spec-Driven Development is in agentic software engineering, developers keep running into two situations that break the flow in which a spec can genuinely be considered the project's source.

  1. You, as a developer, had to nudge the model after it generated code from a spec and task, because you noticed something needed tweaking. Naturally, you handle this not by editing the code yourself but by giving the model small follow-up tasks in the chat. It finishes, you check, and if everything is OK, the task is solved. The problem: the knowledge born in that chat with the agent never gets committed.
  2. You, as a developer, sometimes have to do a small task or a series of small tasks. Keeping a huge context in your head is hard, and sometimes you want to unload it, so you pick days when you just give the agent simple tasks. But you still want specs to remain the project's main source. Even if you wrote the task down in an md file, the planning happened inside the chat with the agent, so knowledge is lost again, leaving the repository with a fragmented picture of the project.

Knowledge is the main "source" in Spec-Driven Development

First, let's turn one rule into a habit: the project's source is not the project's CODE but ALL KNOWLEDGE about the project. If the knowledge about the project exists, you can regenerate the code from it. How that knowledge is packaged (its formatting, granularity, connectedness, meaning, and presentation) defines its overall quality, and if the quality is high, a model can generate the code from it again, as long as we instruct it properly and it understands how our knowledge is composed.

But if the knowledge is missing or insufficient, the thread connecting project artifacts to that knowledge is lost. Therefore, all knowledge about the project's development needs to be managed better.

We could start building a system for properly managing knowledge as the project's source right away, so agents understand the project better. But many startups are already working in this direction, and big tech isn't lagging behind, so someone will almost certainly ship something solid this year.

Our task is to build a habit of storing knowledge about the project and its development in the repository.

Google Antigravity and the artifacts of this IDE in the work process

Surprisingly, in the Google Antigravity IDE the agent by default relies on several md files that the system builds from how the chat with the agent unfolds in the active session. Most often, there are three files:

  • Plan: plan for agreement with the user.
  • Tasks: to-dos that the agent performs and marks as completed.
  • Walkthrough: work report.

It doesn't matter how the work started, whether you came up with the task inside the session or fed the agent a prepared spec: putting the work's artifacts into the repository preserves an important piece of knowledge that makes it easier for models to onboard into your project.

Specifically in Antigravity, you can create workflows and simply call an /artifacts command after you have finished working in the session and are ready to commit.

An example of such a command looks like this:

---

description: Archive agent artifacts to docs/agents following the naming convention

---
1. Move all artifacts you created to docs/agents in the format described below. Several rounds of artifacts may have formed during the current session; we need ALL of them, so that they reflect the entire session's work from start to finish.

2. Rename them so they correspond to the following naming convention: `YYYY-MM-DD-short-name-type.md` (e.g., `2025-12-24-fix-auth-plan.md`).

3. Verify that the files were moved correctly.
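The agent performs these steps itself, but the renaming logic is mechanical enough to sketch in code. Below is a minimal Python sketch of that archival step; `archive_artifacts`, its arguments, and the source directory are illustrative assumptions, not part of Antigravity:

```python
import re
import shutil
from datetime import date
from pathlib import Path

def archive_artifacts(src_dir: str, dest_dir: str = "docs/agents") -> list[Path]:
    """Move session artifact md files into the repository, renaming them
    to the YYYY-MM-DD-short-name.md convention used in the article."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    today = date.today().isoformat()
    moved = []
    for artifact in sorted(Path(src_dir).glob("*.md")):
        # Slugify the original name: lowercase, dashes instead of
        # spaces/underscores, e.g. "Fix Auth Plan" -> "fix-auth-plan".
        slug = re.sub(r"[^a-z0-9]+", "-", artifact.stem.lower()).strip("-")
        target = dest / f"{today}-{slug}.md"
        shutil.move(str(artifact), target)
        moved.append(target)
    return moved
```

In practice you would let the workflow prompt drive the agent rather than run a script, but the sketch makes the naming convention explicit.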

This is how the directory with artifacts might look after a month of active project development ⬇️


Let's look at an example of a small task, whose story remained in trace artifacts.

I don't even remember the details anymore, but in the Docker logs of the Caddy service we needed to see the real user's IP, while by default they showed the host's IP. Apparently no spec was changed; someone just opened a chat and instructed the model. Here are the artifacts that now live in the repository.

`2025-12-02-proxy-fix-plan.md`

# Implementation Plan - Proxy Headers Configuration

The application is running behind a reverse proxy (Caddy), so `request.remote_addr` sees the proxy's IP instead of the client's. We need to use Werkzeug's `ProxyFix` middleware to trust the `X-Forwarded-For` headers set by Caddy.


## Proposed Changes  

### Backend

#### [MODIFY] [app/__init__.py](file:///../../backend/app/__init__.py)

- Import `ProxyFix` from `werkzeug.middleware.proxy_fix`.
- Wrap the Flask app with `ProxyFix`.
- Configure it to trust 1 level of proxy headers (`x_for=1`, `x_proto=1`, `x_host=1`, `x_port=1`, `x_prefix=1`).

## Verification Plan

### Manual Verification

- **Local:** Run app, verify it still works (headers might be ignored if no proxy, which is fine).
- **Remote:** Deploy and check logs. The IP should now be the real client IP.

`2025-12-02-proxy-fix-task.md`

# Tasks

- [x] Add `ProxyFix` middleware to Flask app <!-- id: 0 -->
- [x] Configure `ProxyFix` to trust 1 proxy (Caddy) <!-- id: 1 -->
- [x] Verify local behavior (should still work) <!-- id: 2 -->

`2025-12-02-proxy-fix-walkthrough.md`

# Proxy Headers Configuration Walkthrough

We have configured the Flask application to correctly handle `X-Forwarded-For` headers from the Caddy reverse proxy.

## Changes

### 1. Middleware Configuration

- Added `werkzeug.middleware.proxy_fix.ProxyFix` to `backend/app/__init__.py`.
- Configured to trust 1 level of proxy headers (`x_for=1`, `x_proto=1`, `x_host=1`, `x_port=1`, `x_prefix=1`).


## Verification Results

### Local Verification

- Ran `uv run python -c "from app import create_app; create_app()"`.
- App initialized successfully.
- **Result:** PASS

## Deployment Instructions

1. **Pull changes** on the server.
2. **Rebuild and restart** the backend container:

`docker compose up -d --build backend`

Let me remind you: these artifacts were simply moved out of Google Antigravity IDE's internal artifact storage and placed into the repository. Cursor and other IDEs can be asked to do the same; you just need to focus on the chat session itself.

Several clarifications

Question: Why so many md files?
💬 Answer: To stop writing code by hand and let models generate it while maintaining code health and a flexible architecture, without letting project complexity grow along with the number of features and the passage of time.

Question: Won't there be too many files? If a big team works on the project, that's a bunch of small md files.
💬 Answer: That's not a problem; these files are not for humans, they're for the model. For a human, by the way, it's also useful to read them to better understand the project's development flow. When there are many files, you can ask the model to pack them into one, for example, for the whole month.
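Packing a month of trace files into one digest can also be done mechanically instead of via the model. A hedged sketch, where the directory layout, `pack_month`, and the digest filename are all assumptions:

```python
from pathlib import Path

def pack_month(artifacts_dir: str, month: str) -> Path:
    """Concatenate all YYYY-MM-DD-*.md artifacts for a given month
    (e.g. "2025-12") into a single digest file a model can read."""
    src = Path(artifacts_dir)
    digest = src / f"{month}-digest.md"
    # Pick up every artifact for the month, skipping a digest from a
    # previous run so repeated packing stays idempotent.
    parts = [p for p in sorted(src.glob(f"{month}-*.md")) if p != digest]
    sections = []
    for part in parts:
        # Keep each original filename as a section header so every
        # session's trace stays identifiable inside the digest.
        sections.append(f"## {part.name}\n\n{part.read_text()}")
    digest.write_text(f"# Artifacts digest for {month}\n\n" + "\n\n".join(sections))
    return digest
```

Asking the model to do the packing, as suggested above, has the advantage that it can also summarize while merging; the script only concatenates.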

Question: Why build traces from md files, if knowledge can be pulled from tests and from commit history in the repository?
💬 Answer: Knowledge about the project is never superfluous. If we want to stop coding by hand and keep instructing models so they complete our tasks through the IDE without our interference, some duplication of knowledge helps: it makes it easier for models to make decisions during planning, code generation, debugging, and long, complex tasks.