[Spec-Driven Development in iOS: Practical Experience With Spec-kit]

Introduction

Before discovering spec-driven development (SDD), my work with AI agents was rather chaotic. I struggled to properly constrain the AI, clearly describe user stories, or specify key project details. The result was a contextual mess and model hallucinations.

SDD became my framework for AI-assisted development. I appreciated its clear structure: constitution → specification → plan → tasks → implementation. Instead of endless iterations of chatting with an AI agent, I now have a systematic process where each stage serves a specific purpose.

In this article, I'll share my practical experience applying Spec-kit in two iOS projects: creating a new application from scratch and introducing the approach to an existing project.

What is spec-driven development?

SDD is an approach where you first create a project specification, then allow an AI agent to generate code based on it. Instead of repeatedly prompting the model, hoping for the desired result, you create a document that precisely reflects your intent.

The key philosophy of Spec-kit: specifications become executable, and they directly generate working implementations rather than just guiding them.

Core Spec-kit commands

BASIC WORKFLOW

COMMAND                  PURPOSE
/speckit.constitution    Create project principles and constraints
/speckit.specify         Describe what should be built
/speckit.plan            Draft a technical plan with the chosen tech stack
/speckit.tasks           Decompose the plan into tasks
/speckit.implement       Execute the tasks

ADDITIONAL COMMANDS FOR QUALITY

COMMAND                  PURPOSE
/speckit.clarify         Clarify underspecified areas of the specification
/speckit.analyze         Analyze artifact consistency
/speckit.checklist       Generate quality checklists

TIP: The /speckit.clarify command is especially useful before creating a plan – it helps identify ambiguities in the specification before starting technical work.
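
For orientation, a full pass might chain the commands like this. The free-text arguments are illustrative, written for the audio-guide project described below, not copied from a real session:

    /speckit.constitution iOS app: SwiftUI + Combine, MVVM, typed resources only
    /speckit.specify City audio guide: POIs on a map, audio plays when the user is nearby
    /speckit.clarify
    /speckit.plan SwiftUI, Combine, MapKit for the map, AVFoundation for playback
    /speckit.tasks
    /speckit.analyze
    /speckit.implement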

The experiment: two projects

New project: city audio guide

The first project is an iOS application, essentially a city audio guide. Points of interest are placed on a map, and when the user is nearby or taps on a point, they can listen to an audio track or read a description.

⚙️ Tech stack: SwiftUI, Combine, MapKit, AVFoundation.
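
To make the domain concrete, here is a minimal Swift sketch of the core mechanic: a point of interest whose track plays once the user is within range. The type names and the 50-meter radius are my illustrative assumptions, not the app's actual code.

    import CoreLocation
    import AVFoundation

    // Hypothetical model for a point of interest shown on the map.
    struct PointOfInterest {
        let title: String
        let coordinate: CLLocationCoordinate2D
        let audioURL: URL        // local file URL of the track
        let summary: String
    }

    // Plays a POI's track when the user comes within triggerRadius meters.
    final class ProximityAudioPlayer {
        private var player: AVAudioPlayer?
        private let triggerRadius: CLLocationDistance = 50

        func userMoved(to location: CLLocation, near poi: PointOfInterest) {
            let poiLocation = CLLocation(latitude: poi.coordinate.latitude,
                                         longitude: poi.coordinate.longitude)
            guard location.distance(from: poiLocation) <= triggerRadius else { return }
            play(poi)
        }

        private func play(_ poi: PointOfInterest) {
            // AVAudioPlayer needs a local file, so tracks are assumed bundled or cached.
            player = try? AVAudioPlayer(contentsOf: poi.audioURL)
            player?.play()
        }
    }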

Existing project: museum application

The second project is a fairly large iOS application for the SJMC museum. The app is event-driven: events are based on geolocation and Bluetooth beacons, and classes interact through reactive patterns using RxSwift. Here, I created a constitution, specification, and plan, and completed several tasks for creating new screens.

⚙️ Tech stack: UIKit, RxSwift, CoreLocation, CoreBluetooth.
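
A hedged sketch of what "event-driven through reactive patterns" means in practice: a shared stream of location and beacon events that screens subscribe to with RxSwift. The MuseumEvent and EventBus names are hypothetical, not the app's real classes.

    import CoreLocation
    import RxSwift

    // Hypothetical event type: the app reacts to geolocation and beacon signals.
    enum MuseumEvent {
        case enteredRegion(CLRegion)
        case beaconRanged(major: Int, minor: Int)
    }

    // Central stream that screens subscribe to.
    final class EventBus {
        private let subject = PublishSubject<MuseumEvent>()
        var events: Observable<MuseumEvent> { subject.asObservable() }
        func publish(_ event: MuseumEvent) { subject.onNext(event) }
    }

    let bus = EventBus()
    let disposeBag = DisposeBag()

    // A screen reacting only to beacon events.
    bus.events
        .compactMap { event -> (major: Int, minor: Int)? in
            guard case let .beaconRanged(major, minor) = event else { return nil }
            return (major, minor)
        }
        .subscribe(onNext: { beacon in
            print("Show exhibit for beacon \(beacon.major).\(beacon.minor)")
        })
        .disposed(by: disposeBag)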

My workflow: two-model validation

My work cycle evolved as follows: generation through one AI agent, validation through another in a clean context.

Typical process at each stage:

  1. Execute Spec-kit command → artifact generation
  2. Open new session → validation through another model
  3. Make corrections if necessary
  4. Review the result myself

Example for constitution creation:

Codex CLI: /speckit.constitution
        ↓
New Claude session: analyze and verify the constitution
        ↓
Self-review and final edits

Validation prompt:

"Based on the executed /speckit.constitution command, the file constitution.md was created. Analyze the created document: does it meet the command requirements? Suggest fixes and improvements if needed."

Why two models? Initially, out of curiosity. But based on experience, the reviewing model handles code review better and finds edge cases that the generating model misses. The key is running validation in a clean context window to avoid confirmation bias.

Lessons learned: what worked and what didn't

What worked and improved the quality

Validation at every step. I did the first project without validating intermediate results, which was the main mistake. The better you validate the constitution, specification, and plan, the fewer problems you'll encounter during implementation. Use /speckit.clarify for structured clarification of the constitution and specification, and use /speckit.analyze to verify the plan and tasks.

Consistent implementation phases. Tasks are better broken into phases so that after each phase, the project can be launched and tested. This allows you to catch problems early.

New context window for each implementation phase. With extensive tasks, a single session requires too many steps. The model's context gets cluttered, and quality drops. The solution: start each implementation phase in a new context window.

Parallelism. While the AI agent executes /speckit.implement for one task, you can already plan the next user story. This is real acceleration: you don't idle while the model works. I haven't tried it yet, but I think you could run implementations in parallel across different branches.

Rules for systematic errors. There was an issue with resource typing: the AI agent periodically hardcoded colors and strings instead of using typed constants. After I added a rule to the constitution, the problem went away.
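
For illustration, the pattern such a rule enforces looks roughly like this in SwiftUI. The constant names are hypothetical; in practice they would typically be generated by a tool such as SwiftGen or by Xcode's asset symbol generation.

    import SwiftUI

    // Hypothetical typed resource namespaces.
    enum AppColor {
        static let accent = Color("Accent")                 // asset catalog entry
        static let cardBackground = Color("CardBackground")
    }

    enum L10n {
        static let playButton = String(localized: "poi.play")
    }

    struct POICard: View {
        var body: some View {
            // Per the constitution rule: no hardcoded colors or strings here.
            Text(L10n.playButton)
                .foregroundStyle(AppColor.accent)
                .background(AppColor.cardBackground)
        }
    }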

What didn't work: when SDD becomes overhead

Full-fledged application from the first iteration. I tried to build a complete application from a single specification, drowned in validation, and burned through a lot of tokens. It's better to start with an MVP or a skeleton and then iterate.

Too granular tasks. For five-minute tasks, SDD is overkill: writing the description takes longer than making the fix yourself.

Working without MCP Figma. Without Figma integration, I had to describe the UI manually in English. English produced more declarative requirements and task descriptions. Colleagues who tried MCP Figma gave negative feedback: screens must be perfectly designed, otherwise the model copies inaccuracies in spacing. But I think this can be prevented with Figma rules.

Practical recommendations: how to structure your SDD workflow

Managing tokens and context

  • Use a new context window for each implementation phase.
  • Don't load too many actions into a single request.
  • Pinpoint fixes without a spec are reasonable, not overkill; just leave artifacts behind if the fixes are critical or systematic.

On costs: The steps before implementation (constitution, specification, plan, tasks) cost about $7–8 on the new project. Implementation phases cost another $10–15. Previously, when I tried similar projects without constraints, costs exceeded $35.

Key points for the constitution and specification:

  • Specify the interaction pattern between the view and the logic.
  • Describe the tech stack: SwiftUI + Combine, domain structure.
  • Don't forget /speckit.clarify for clarifying the constitution and specification.
  • Immediately define resource handling: typed colors, texts, localization, and images (see the example excerpt below).
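
As an illustration, the relevant part of a constitution.md might read like this; the wording is mine, not an excerpt from either project's actual file:

    ## Architecture
    - MVVM: views bind to view models through Combine publishers; no business logic in views.
    - The domain layer is separated from UI and networking.

    ## Resources
    - Never hardcode colors, strings, or image names; use typed constants only.
    - All user-facing text goes through localization.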

Planning tasks

🔳 Tasks should be consistent: after each implementation phase, the project should build.

🔳 Better to validate tasks.md through another AI agent.

🔳 Acceptance criteria are often too specific; be sure to review them.

🔳 Describe tasks in English; it comes out more declarative.

I planned tasks this way: I created a folder for each task, put a description.md file with the user story and technical context inside it, and added a screenshot of the screen to be built.

I also kept the artifacts from completed /speckit.tasks and /speckit.implement runs inside the task folder to make validating changes easier.
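
A sketch of that layout; the file and folder names are mine, not something Spec-kit prescribes:

    tasks/
      poi-detail-screen/
        description.md    user story + technical context
        screen.png        screenshot of the screen to be built
        tasks.md          saved artifact from /speckit.tasks
        implement.md      saved artifact from /speckit.implement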

Implementation

  • If the agent systematically makes the same mistake, document it in the constitution.
  • If you notice an inconsistency in the code, describe the problem in detail and ask the AI agent to update the spec steps.
  • Don't edit the agent's proposed changes before accepting them: accept first, then correct. Otherwise the AI agent sometimes got confused and failed to build on its own changes.
  • Validate finished code with an AI agent in a clean session; it finds edge cases better (example prompt below).
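
A validation prompt in the same spirit as the constitution one above; the task ID and wording are mine, purely illustrative:

    "Here is the diff produced for task T-12 from tasks.md. Review it against
    the specification and the plan: check edge cases, resource typing, and
    consistency with the existing architecture. List concrete issues."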

Startup checklist

▢ Start with an MVP, not a full-fledged application

▢ Create a constitution with clear constraints on resources and patterns

▢ Use /speckit.clarify before moving to the plan

▢ Validate each step in a new context window

▢ Break the plan into consistent phases (3–10 tasks per phase)

▢ Run each phase in a new context window

▢ Parallelize: while the AI agent implements a task, you can already plan and describe the next one

▢ Use /speckit.analyze to check artifact consistency

Results

Code validity: During review, about 90% of the generated code was valid and required no fixes.

Typical issues:

  • Minor code duplication
  • Occasional resource typing problems (before adding rules to the constitution)
  • Small edge cases
  • In the existing project, /speckit.plan produced hallucinations: the AI agent invented entities and logic classes that didn't exist. After validation, the problem went away.


Outcome: Both projects launched and work as expected. The existing project required more thorough validation due to the domain model complexity.

Use cases: where SDD adds value

I recommend it for:

  1. Technical specialists for structuring work with AI agents.
  2. Project managers for forming user stories and plans.
  3. Designers for describing interactions.

Where it's especially useful:

  • New projects, starting with MVP.
  • Existing projects: if the project is huge, consider adding a specification for each module.
  • Forming task skeletons and decomposition.

Where it's overkill:

  • Five-minute fixes
  • Pinpoint bugfixes
  • Minor UI changes

Conclusion

Spec-driven development isn't a silver bullet, but it's a good framework for structuring work with AI agents. The main value is understanding at which stages and how to constrain the model's context.

Four key lessons:

  1. Always validate results. Whether it's a constitution, specification, plan, or finished code, verify in a clean context. One or two iterations are usually enough.
  2. Use additional commands. /speckit.clarify before the plan and /speckit.analyze after tasks will improve the project context for the AI agent.
  3. Start with an MVP. Don't try to create a full-fledged application right away. And remember: for very small tasks, it's faster to do it yourself.
  4. The main win is parallelism: while the model implements a task, you're already describing the next one.