From Chaos to System: AI Agentic Pipeline for IT Project Estimation

The problem

Initial software development estimation is painful. Clients want project estimates as quickly as possible, while delivery managers need to understand what the project entails, the amount of development time required, and the risks and constraints that exist.

Typical scenario: a brief arrives describing the project, the delivery manager formulates questions for the client based on it and then assembles a team to produce the estimate. This process takes an enormous amount of time.

We decided to try speeding this up. Not just "add AI," but build a system that uses an agent pipeline to structurally assess projects.

Why typical questions don't solve the problem

For example, you could ask ChatGPT, "How much does it cost to develop a marketplace?" It will answer. The problem is that the answer will be useless:

  • No understanding of your business context
  • No checking for contradictions in requirements
  • No structure you can show the client
  • No traceability of where each number came from

We took a different path: instead of one regular chat with an LLM, we decided to create a pipeline of steps and specialized roles, each with its own area of responsibility and methodology.

Key idea: pipeline of roles and prompts

In a real company, estimation isn't done by one person: there's a business analyst and a delivery manager who understand requirements. There's an architect who decides how to build it. There's a tech lead and a team that estimates development time.

We reproduced this structure using different AI agents with defined roles:

| Role | Area of responsibility | Output |
|---|---|---|
| Product Owner | Understand what the client wants | Structured context, user stories |
| System Architect | Decide how to build it | Architectural decisions (ADR), tech stack |
| Tech Lead | Assemble the team and calculate how long it takes | PERT estimates, task breakdown |
| Delivery Manager | Compile into a realistic plan | WBS, risks, staffing table |

Each role receives the result from the previous one and cannot change it; it can only supplement it with its own expertise.
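The "receive, don't change, only supplement" rule can be sketched as an append-only context that is passed from role to role. This is a minimal illustration, not the actual implementation: the `Context` class, the `Agent` callable, and the stub agents are hypothetical names, and each real agent would wrap an LLM call with the role's prompt.

```python
from dataclasses import dataclass
from types import MappingProxyType
from typing import Callable, Mapping

# Role names follow the roles listed above; the order is the pipeline order.
ROLES = ("Product Owner", "System Architect", "Tech Lead", "Delivery Manager")


@dataclass(frozen=True)
class Context:
    """Outputs accumulated so far, keyed by role name (read-only view)."""
    outputs: Mapping[str, str]

    def extended(self, role: str, artifact: str) -> "Context":
        # A role may add its own artifact but never overwrite an earlier one.
        if role in self.outputs:
            raise ValueError(f"{role} has already contributed")
        return Context(MappingProxyType({**self.outputs, role: artifact}))


# Hypothetical signature: an agent reads the whole context, returns its artifact.
Agent = Callable[[Context], str]


def run_pipeline(brief: str, agents: Mapping[str, Agent]) -> Context:
    ctx = Context(MappingProxyType({"Client brief": brief}))
    for role in ROLES:
        ctx = ctx.extended(role, agents[role](ctx))
    return ctx


# Stub agents stand in for real model calls.
stubs = {
    role: (lambda ctx, r=role: f"{r} output based on {len(ctx.outputs)} inputs")
    for role in ROLES
}
result = run_pipeline("need a classic crypto exchange with P2P, fiat, and KYC", stubs)
for role, artifact in result.outputs.items():
    print(f"{role}: {artifact}")
```

Because every step returns a new read-only context, a later role can see everything produced earlier but has no way to edit it.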

How it works: crypto exchange estimation example

Here, I'll show the pipeline with a real example, estimating an MVP cryptocurrency exchange (Binance/Bybit analog). For each step: prompt fragment and actual result.

Step 1: Product Owner—understand what's missing

The first agent doesn't estimate. Its task is to find gaps in requirements and ask the right questions. Each question includes a recommendation based on domain knowledge.

Input: the client brief, "need a classic crypto exchange with P2P, fiat, and KYC."

Prompt fragment:

## IDENTITY
You are an expert Product Owner. Your superpower is building User Stories
and understanding User Journeys. You always ask, "WHY?" and "WHAT FOR?"

## Phase 0: Document Assessment
Before generating questions, analyze the initial context:
- Completeness Score: Rate documentation from 1-100
- Critical Missing Information: List TOP 5-10 gaps that BLOCK estimation
- Ambiguity Level: [LOW / MEDIUM / HIGH]
- Estimation Readiness: [READY / NEEDS_CLARIFICATION / INSUFFICIENT]

## Question Categories (MANDATORY)
🟡 PROBLEM & VALUE:
- "What specific problem does this product solve?"
- "How do users currently solve this problem?"
- 💡 Tip: Without a clear problem definition, estimates will be inaccurate

🟡 END USER & PERSONA:
- "Who is the primary end user? (demographics, tech savviness)"
- 💡 Tip: Create ONE clear persona, not "everyone."

🟡 REGULATORY & COMPLIANCE:
- Policy requirements (GDPR, HIPAA, PCI-DSS, local laws)
- 💡 Tip: Compliance can add 20-40% to development effort

🟡 DATA VOLUME & PERFORMANCE:
- "How many users/transactions per day? (100 or 100,000?)"
- "What are the performance targets? (< 2s page load?)"

## CRITICAL: For EVERY question, include 💡 Tip/Recommendation

Part of the result:

| Category | Question | Why important |
|---|---|---|
| Audience | Who are the main users? (retail, institutional, miners?) | Affects UX and security requirements |
| Regulatory | What KYC/AML requirements are mandatory? | Affects provider choice and architecture |
| Budget | What's the maximum project budget? | Determines MVP vs. full scope |
| Timeline | Target launch date? | Affects feature prioritization |

The agent generated 25+ questions across 10 categories, each with justification and recommendation.
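One way to make the Product Owner's output machine-checkable is to parse it into a small structure and gate the next step on the readiness flag. The sketch below is an assumption about how this could be wired, not a description of the real system: the class names, the `next_step` gate, and the score and ambiguity values in the example are illustrative, while the field names and allowed values come from Phase 0 of the prompt.

```python
from dataclasses import dataclass, field
from enum import Enum


class Readiness(Enum):
    READY = "READY"
    NEEDS_CLARIFICATION = "NEEDS_CLARIFICATION"
    INSUFFICIENT = "INSUFFICIENT"


@dataclass
class Question:
    category: str
    text: str
    why_important: str


@dataclass
class Assessment:
    completeness_score: int  # 1-100, as requested in Phase 0
    ambiguity: str           # "LOW", "MEDIUM" or "HIGH"
    readiness: Readiness
    questions: list[Question] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject values the prompt does not allow, e.g. a malformed model reply.
        if not 1 <= self.completeness_score <= 100:
            raise ValueError("completeness_score must be in 1..100")
        if self.ambiguity not in {"LOW", "MEDIUM", "HIGH"}:
            raise ValueError("unknown ambiguity level")


def next_step(assessment: Assessment) -> str:
    """Hypothetical gate: only a READY brief moves on to the architect."""
    return "architect" if assessment.readiness is Readiness.READY else "ask_client"


assessment = Assessment(
    completeness_score=20,                    # illustrative, not from the real run
    ambiguity="HIGH",                         # illustrative
    readiness=Readiness.NEEDS_CLARIFICATION,  # illustrative
    questions=[
        Question(
            "Regulatory",
            "What KYC/AML requirements are mandatory?",
            "Affects provider choice and architecture",
        )
    ],
)
print(next_step(assessment))
```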

Step 2: Architect—decide how to build

Once the requirements are clear, the architect decides how to build it. Key principle: don't reinvent the wheel. For each complex component, a Buy vs. Build analysis is mandatory.

Input: Client answers + requirement "trading engine for order matching."

Prompt fragment:

## IDENTITY
You are a Senior System Architect. You turn "Wants" (Requirements) into
"Blueprints" (Design). You master architectural styles (Monolith, Microservices,
Serverless) and system design patterns (API Gateway, CQRS, BFF).

## GOALS
1. Avoid "Reinventing the Wheel": Prioritize "Buy" (SaaS/PaaS) or "Open Source."
2. Use C4 Model (Context, Container, Component) and ATAM (Trade-off Analysis)

## BUY vs BUILD ANALYSIS (MANDATORY for every major component)
For every complex feature, ask:
1. Does a SaaS/PaaS solution exist? (Auth0, Stripe, SendGrid)
2. Does an open-source library exist?
3. What's the effort difference?
   - Custom: X hours
   - Integration: Y hours
   - Savings: X - Y hours
Default: Prefer "Buy/Integrate" unless core IP or compliance blocks it.

## ADR FORMAT (Architecture Decision Record)
- Context: What challenge are we solving?
- Alternatives Considered: MINIMUM 2 alternatives required
- Decision: [Generic category] (CHOSEN)
 - Recommended Product: [Specific product]
- Rationale: Pros, Cons, Buy vs. Build analysis
- Effort Impact: How does this affect the timeline/cost?

## GUARDRAILS: MVP MINIMIZATION
- Single database, not distributed
- Monolith before microservices
- Managed services over custom solutions
- No Kubernetes for MVP (simple PaaS is enough)

Result (ADR-001: Trading engine):

| Alternative | Effort | Pros | Cons |
|---|---|---|---|
| Custom development | 1500-2500 h | Full control, no licensing | High security risk, time-consuming |
| Open-source (Peatio) | 500-800 h | Free, proven architecture | Requires deep blockchain expertise |
| White-label (AlphaPoint) ✅ | 300-500 h | Fast, built-in compliance | License: $50-200k/year |

Decision: White-label
Savings: 1200h (€120k at €100/h rate)
Rationale: Built-in KYC/AML processes, proven security, fast market entry
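The "Savings: X - Y hours" rule from the prompt can be reproduced on the ADR-001 numbers. The sketch below assumes the subtraction is applied to matching ends of the two effort ranges; the article does not state how the 1200 h figure was derived, but it matches the difference between the low ends (1500 h and 300 h). The `Alternative` class and `savings_range` helper are illustrative names.

```python
from dataclasses import dataclass

RATE_EUR_PER_HOUR = 100  # the rate quoted in the decision above


@dataclass(frozen=True)
class Alternative:
    name: str
    low_h: int   # lower bound of the effort range, hours
    high_h: int  # upper bound of the effort range, hours


custom = Alternative("Custom development", 1500, 2500)
open_source = Alternative("Open-source (Peatio)", 500, 800)
white_label = Alternative("White-label (AlphaPoint)", 300, 500)


def savings_range(build: Alternative, buy: Alternative) -> tuple[int, int]:
    """Savings = X - Y, applied to the low ends and to the high ends."""
    return build.low_h - buy.low_h, build.high_h - buy.high_h


low, high = savings_range(custom, white_label)
print(f"Savings: {low}-{high} h")
print(f"Low-end savings in money: EUR {low * RATE_EUR_PER_HOUR:,}")
```

The licence fee ($50-200k/year) is not part of this hour-based comparison and has to be weighed separately.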

The architect formed 10 ADRs for all critical components.

Step 3: Tech Lead—turn architecture into hours

The most complex stage. The tech lead receives project context and architecture and converts it into tasks with estimates. PERT methodology: three estimates for each task, no "single numbers."

Input: Architecture with selected solutions

Prompt fragment:

## IDENTITY
You are a Senior Technical Planner. Think of yourself as a Construction
Project Manager, NOT a Builder:
- You define "Build 3 walls, install 2 windows" (tasks)
- You estimate: "Foundation: 40h, Walls: 60h" (effort)
- You DON'T: Mix concrete, nail boards (implementation)

## PERT ESTIMATION (MANDATORY)
Provide three-point estimates for EVERY task:
- Optimistic (O): Everything works first try, no blockers
- Most Likely (M): Normal development, typical small issues
- Pessimistic (P): Major blockers, unexpected complexity
- Expected = (O + 4×M + P) / 6

Confidence Intervals:
- Standard Deviation (SD) = (P - O) / 6
- P80 (80% confidence) = Expected + 0.84 × SD
- P90 (90% confidence) = Expected + 1.28 × SD

## MVP SCOPE RED FLAGS (Challenge the PO if present)
- ❌ More than 10 epics in MVP → Too much
- ❌ Admin panel with full CRUD → NOT needed (use DB tools)
- ❌ Multiple user roles → Start with ONE
- ❌ MVP estimate > 3 months → Reduce scope

## Phase 0: Project Initialization (ALWAYS INCLUDE)
- Backend: Docker, Logging, Error Handling, API Docs
- CI/CD Pipeline: GitHub Actions, Linting, Tests
- Phase 0 Effort Rule: 10-15% of total development effort

Part of the result (wallet service):

| Task | O | M | P | Expected | Confidence |
|---|---|---|---|---|---|
| Bitcoin wallet integration (Infura/QuickNode) | 20 h | 40 h | 60 h | 40 h | MEDIUM |
| Ethereum wallet integration (Infura) | 20 h | 40 h | 60 h | 40 h | MEDIUM |
| Deposit address generation | 12 h | 24 h | 36 h | 24 h | MEDIUM |
| Withdrawal processing (hot wallet) | 16 h | 32 h | 48 h | 32 h | MEDIUM |
| Multi-signature setup (2-of-3) | 20 h | 40 h | 60 h | 40 h | LOW |
| Transaction monitoring | 12 h | 24 h | 36 h | 24 h | MEDIUM |
| ... | ... | ... | ... | ... | ... |
| Total wallet service (10 tasks) | 164 h | 328 h | 492 h | 328 h | MEDIUM |
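The PERT formulas from the prompt are easy to check against the table above. The sketch below is a plain implementation of those formulas; the function name `pert` and the returned dictionary keys are illustrative choices, and the three tasks are taken from the wallet service rows.

```python
def pert(o: float, m: float, p: float) -> dict[str, float]:
    """Three-point estimate using the formulas from the Tech Lead prompt."""
    expected = (o + 4 * m + p) / 6
    sd = (p - o) / 6
    return {
        "expected": expected,
        "sd": sd,
        "p80": expected + 0.84 * sd,  # 80% confidence
        "p90": expected + 1.28 * sd,  # 90% confidence
    }


# (O, M, P) in hours, copied from the wallet service table.
wallet_tasks = {
    "Bitcoin wallet integration": (20, 40, 60),
    "Deposit address generation": (12, 24, 36),
    "Withdrawal processing (hot wallet)": (16, 32, 48),
}

for name, (o, m, p) in wallet_tasks.items():
    r = pert(o, m, p)
    print(f"{name}: expected {r['expected']:.1f} h, "
          f"P80 {r['p80']:.1f} h, P90 {r['p90']:.1f} h")
```

For the first row this gives an expected value of (20 + 4 × 40 + 60) / 6 = 40 h, which matches the table.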

The tech lead collected estimates across all modules: Backend: 1241 h, Frontend: 1162 h, QA: 1419 h.

Step 4: Delivery Manager—compile plan with risks

Final stage: compile all estimates into a unified plan. For each risk, EMV (Expected Monetary Value) is calculated along with a mitigation strategy.

Input: Team estimates + architecture + external dependencies

Prompt fragment:

## IDENTITY
You are the Delivery Manager, the conductor of the orchestra.
You do not play an instrument (code/design), but you ensure the symphony
(project) is delivered on time, on budget, and with high quality.

## GOALS
1. Consolidate all inputs into a single source of truth
2. Create a realistic resource plan (Who does what and when)
3. Identify risks using Boehm's Top 10 + calculate expected monetary value (EMV)
4. Use the critical path method (CPM) to determine project duration

## RISK QUANTIFICATION (EMV—MANDATORY)
Formula: EMV = Probability (%) × Impact (Hours)

Expanded risk table format:
| Risk | Prob | Impact | EMV | Mitigation | Cost | Residual EMV |

Every risk mitigation MUST specify:
- Actions: Numbered list of specific steps
- Responsible: Who owns the mitigation
- Resources/Cost: Hours allocated
- Deadline: When mitigation must be completed
- Fallback: What happens if mitigation fails

Contingency budget calculation:
- Without mitigation: Sum of all EMV
- With mitigation: Residual EMV + Mitigation cost
- Recommended buffer: Residual EMV × 1.2

## STAFFING TIPS
- Month 1: 50% productivity (onboarding)
- QA: Start from Sprint 2, 1 QA per 3-4 developers
- Overlap Senior and Junior for knowledge transfer

Result (top 5 risks):

| Risk | Prob | Impact | EMV | Mitigation | Cost | Residual EMV |
|---|---|---|---|---|---|---|
| Stripe limitations for crypto | 40% | 200 h | 80 h | Apply during week 0 | 20 h | 20 h |
| Security vulnerabilities | 50% | 300 h | 150 h | Pentest + code review | 120 h | 30 h |
| Scope creep | 60% | 300 h | 180 h | Change control process | 40 h | 80 h |
| Hot wallet theft | 10% | 1000 h | 100 h | Multi-sig + cold storage | 60 h | 20 h |
| Regulatory non-compliance | 30% | 400 h | 120 h | Crypto lawyers (€30-50k) | 80 h | 40 h |
| ... | ... | ... | ... | ... | ... | ... |
| Total (all risks) | - | - | 984 h | - | 560 h | 308 h |

Buffer calculation:

  • Without mitigation: 984 h buffer needed
  • With mitigation: 308 h (residual) + 560 h (mitigation cost) = 868 h
  • Applied buffer: 25% of base estimate
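The EMV and contingency arithmetic can be reproduced in a few lines. The sketch below uses the five risks shown in the table and the document's totals (984 h, 560 h, 308 h); the five visible rows sum to 630 h of EMV, so the totals presumably cover the full risk list. The function names `emv` and `contingency` are illustrative.

```python
# (risk, probability %, impact h, mitigation cost h, residual EMV h)
TOP_RISKS = [
    ("Stripe limitations for crypto", 40, 200, 20, 20),
    ("Security vulnerabilities", 50, 300, 120, 30),
    ("Scope creep", 60, 300, 40, 80),
    ("Hot wallet theft", 10, 1000, 60, 20),
    ("Regulatory non-compliance", 30, 400, 80, 40),
]


def emv(prob_pct: float, impact_h: float) -> float:
    """EMV = Probability (%) x Impact (Hours)."""
    return prob_pct * impact_h / 100


def contingency(total_emv: float, mitigation_cost: float,
                residual_emv: float) -> dict[str, float]:
    """Buffer options as defined in the Delivery Manager prompt."""
    return {
        "without_mitigation": total_emv,
        "with_mitigation": residual_emv + mitigation_cost,
        "recommended_buffer": round(residual_emv * 1.2, 1),
    }


emvs = {name: emv(prob, impact) for name, prob, impact, _, _ in TOP_RISKS}
for name, value in emvs.items():
    print(f"{name}: EMV {value:.0f} h")

# Totals as reported in the article for the full risk list.
plan = contingency(total_emv=984, mitigation_cost=560, residual_emv=308)
print(plan)
```

Note that the prompt's "Residual EMV × 1.2" rule and the applied 25% of base estimate are two different ways to size the buffer; the sketch only computes the former.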

What we got with a limited project context

After going through the pipeline:

| Document | Volume | Key metrics |
|---|---|---|
| Clarifying questions | 25+ questions | 10 categories, each with a recommendation |
| Architecture (ADR) | 10 decisions | Savings ~€300k vs. custom |
| SWAT estimate | 50+ tasks | 3982 h (PERT), 9-10 months |
| Master plan | Full WBS | €398k, 25% risk buffer |

Traceability: Every number in the final estimate is linked to a task, which is linked to an ADR, which in turn is linked to a client requirement.
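The traceability chain can be pictured as plain linked records. The sketch below is a hypothetical data model, not the pipeline's actual storage format: the class names, the `REQ-1` identifier, the task name, and its 400 h value (an illustrative figure inside ADR-001's 300-500 h range) are assumptions, while the requirement text and ADR-001 come from the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirement:
    id: str
    text: str


@dataclass(frozen=True)
class ADR:
    id: str
    title: str
    requirement: Requirement


@dataclass(frozen=True)
class Task:
    name: str
    expected_h: float
    adr: ADR


def trace(task: Task) -> str:
    """Walk the links from an estimate back to the client requirement."""
    return (
        f"{task.expected_h:g} h <- task '{task.name}' <- {task.adr.id} "
        f"({task.adr.title}) <- requirement '{task.adr.requirement.text}'"
    )


req = Requirement("REQ-1", "trading engine for order matching")  # id is hypothetical
adr = ADR("ADR-001", "Trading engine", req)
task = Task("Trading engine integration", 400, adr)  # hypothetical task and hours
print(trace(task))
```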

Conclusion

The goal of this approach:

  1. Accelerate routine work. A draft that previously took a week is ready in an hour. The expert only needs to validate and adjust it, not start from scratch.
  2. Bring estimates closer to reality. The agents calculate with established methodologies such as PERT and COCOMO, which gives them benchmarks for more accurate estimation.
  3. Force structural thinking. Every project follows the same path: Context → Architecture → Estimate → Plan. You can't skip a step or "forget" about risks.
  4. Make estimates traceable. Every number in the final plan is linked to a specific task, which is linked to a user story, which is linked to a business goal.
  5. Give the team a foundation. Teams sometimes need deep research to find the right technologies for a project. The pipeline produces a set of candidate technologies, with estimates, that can serve as a basis for selection.

Ultimately, agents don't replace experts: they remove routine work and follow methodologies, resulting in an approximate estimate and clearer plan that the team validates.