<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://khalid-taha.github.io/feed.xml" rel="self" type="application/atom+xml" /><link href="https://khalid-taha.github.io/" rel="alternate" type="text/html" /><updated>2025-12-27T10:56:04+00:00</updated><id>https://khalid-taha.github.io/feed.xml</id><title type="html">AI Chronicles</title><subtitle>AI Chronicles is my GitHub blog sharing my journey and experiences in the world of AI.</subtitle><author><name>Khalid Taha</name></author><entry><title type="html">AI-SDLC v1.0 Spec</title><link href="https://khalid-taha.github.io/2025/12/27/AI-SDLC-Spec-v1.html" rel="alternate" type="text/html" title="AI-SDLC v1.0 Spec" /><published>2025-12-27T00:00:00+00:00</published><updated>2025-12-27T00:00:00+00:00</updated><id>https://khalid-taha.github.io/2025/12/27/AI-SDLC-Spec-v1</id><content type="html" xml:base="https://khalid-taha.github.io/2025/12/27/AI-SDLC-Spec-v1.html"><![CDATA[<p><strong>Full Specification (Single Document)</strong></p>

<hr />

<h2 id="1-scope-and-intent">1. Scope and intent</h2>

<p>This document is the <strong>normative specification</strong> for <strong>AI-SDLC v1.0</strong>.</p>

<p>It defines:</p>

<ul>
  <li>what AI-SDLC is,</li>
  <li>what it replaces,</li>
  <li>what artefacts are required,</li>
  <li>what stages exist,</li>
  <li>what rules must be enforced,</li>
  <li>what constitutes progress and completion.</li>
</ul>

<p>This is <strong>not</strong> a blog post, philosophy paper, or marketing document.
It is written to be used as an <strong>operational SDLC spec</strong>.</p>

<hr />

<h2 id="2-definitions">2. Definitions</h2>

<p><strong>AI executor</strong>
A non-human system that generates, modifies, or removes implementation artefacts.</p>

<p><strong>Human owner</strong>
A named individual accountable for intent, constraints, decisions, and outcomes.</p>

<p><strong>Intent</strong>
A concise statement of purpose, outcome, constraints, assumptions, and stop conditions.</p>

<p><strong>Specification</strong>
A machine-readable description of expected behaviour, interfaces, data, non-functional requirements, and observability.</p>

<p><strong>Evidence</strong>
Observed runtime behaviour produced by a deployed system.</p>

<p><strong>Decision</strong>
A recorded human judgement based on evidence that determines the next action.</p>

<hr />

<h2 id="3-core-premise">3. Core premise</h2>

<p>AI-SDLC v1.0 is based on the following premises:</p>

<ol>
  <li>Implementation cost is low.</li>
  <li>Change is continuous and expected.</li>
  <li>Poor decisions dominate failure modes.</li>
  <li>Execution can be automated.</li>
  <li>Accountability cannot be automated.</li>
</ol>

<p>Any process that optimises primarily for human execution speed is out of scope.</p>

<hr />

<h2 id="4-what-ai-sdlc-replaces">4. What AI-SDLC replaces</h2>

<p>AI-SDLC v1.0 explicitly replaces SDLC constructs whose primary purpose is to manage human execution and coordination:</p>

<ul>
  <li>task backlogs as the unit of progress,</li>
  <li>sprint cycles as a reporting mechanism,</li>
  <li>role hand-offs between product, design, and engineering,</li>
  <li>status reporting based on activity,</li>
  <li>delivery milestones detached from runtime behaviour.</li>
</ul>

<p>These constructs may exist locally but <strong>MUST NOT</strong> define progress.</p>

<hr />

<h2 id="5-what-ai-sdlc-retains">5. What AI-SDLC retains</h2>

<p>AI-SDLC v1.0 retains and enforces:</p>

<ul>
  <li>explicit intent and constraints,</li>
  <li>architectural decisions where consequences exist,</li>
  <li>automated quality verification,</li>
  <li>security, privacy, and compliance controls,</li>
  <li>observability and rollback,</li>
  <li>auditability of decisions.</li>
</ul>

<hr />

<h2 id="6-design-principles-normative">6. Design principles (normative)</h2>

<p>Implementations of AI-SDLC <strong>MUST</strong> follow these principles:</p>

<ol>
  <li>
    <p><strong>Intent precedes execution</strong>
No implementation begins without explicit intent.</p>
  </li>
  <li>
    <p><strong>Constraints are enforceable</strong>
Constraints are expressed in a form that automation can block.</p>
  </li>
  <li>
    <p><strong>Implementation is replaceable</strong>
Code is an output, not a long-term asset.</p>
  </li>
  <li>
    <p><strong>Quality gates are automatic</strong>
Human approval cannot bypass enforcement.</p>
  </li>
  <li>
    <p><strong>Evidence drives decisions</strong>
Decisions are based on observed behaviour, not plans.</p>
  </li>
  <li>
    <p><strong>Humans remain accountable</strong>
Every decision has a named owner.</p>
  </li>
</ol>

<hr />

<h2 id="7-required-artefacts">7. Required artefacts</h2>

<p>An AI-SDLC v1.0 system <strong>MUST</strong> maintain the following artefacts.</p>

<h3 id="71-intent-record">7.1 Intent record</h3>

<p><strong>Required fields:</strong></p>

<ul>
  <li>Problem statement</li>
  <li>Affected user or system</li>
  <li>Desired outcome</li>
  <li>Non-negotiable constraints</li>
  <li>Explicit assumptions</li>
  <li>Stop or change conditions</li>
  <li>Named intent owner</li>
</ul>

<p>The intent record <strong>MUST</strong> be concise and versioned.</p>
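<p>For illustration, the sketch below renders a hypothetical intent record with these fields filled in. The specification mandates the fields, not this YAML form; Appendix A stores intent as <code>intent/intent.md</code>, and every value here is invented.</p>

<pre><code class="language-yaml"># Illustrative only: a hypothetical intent record rendered as YAML.
# AI-SDLC v1.0 mandates the fields, not this serialisation.
intent:
  problem: "Password reset emails take more than 10 minutes to arrive"
  affected: "All end users of the customer portal"
  desired_outcome: "95% of reset emails delivered within 60 seconds"
  constraints:
    - "No change to the existing identity provider"
    - "Monthly infrastructure cost increase under 200 USD"
  assumptions:
    - "Delay is caused by the batch email queue, not the provider"
  stop_conditions:
    - "Evidence shows the delay originates outside our systems"
  owner: "jane.doe"   # named intent owner (hypothetical)
  version: 1
</code></pre>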

<hr />

<h3 id="72-formal-specification">7.2 Formal specification</h3>

<p>The specification <strong>MUST</strong> describe:</p>

<ul>
  <li>Behaviour and interfaces</li>
  <li>Inputs and outputs</li>
  <li>Data and state</li>
  <li>Error conditions</li>
  <li>Non-functional requirements</li>
  <li>Acceptance scenarios</li>
  <li>Required observability</li>
</ul>

<p>The specification <strong>MUST</strong> be machine-readable.</p>

<hr />

<h3 id="73-decision-log">7.3 Decision log</h3>

<p>Each decision <strong>MUST</strong> record:</p>

<ul>
  <li>Decision date</li>
  <li>Decision owner</li>
  <li>Evidence reviewed</li>
  <li>Decision taken</li>
  <li>Rationale</li>
  <li>Resulting action</li>
</ul>
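<p>A hypothetical decision record with these fields is sketched below. YAML is used here only to keep the fields explicit; Appendix A records decisions as markdown files under <code>decisions/</code>.</p>

<pre><code class="language-yaml"># Hypothetical decision record; the spec fixes the fields, not the format.
decision:
  date: 2025-12-20
  owner: "jane.doe"
  evidence_reviewed:
    - "observability dashboard: p95 email delivery latency, 7-day window"
    - "error-rate metric for the reset-email job"
  decision: "adjust"          # continue | adjust | stop | pivot
  rationale: "Latency improved but still misses the 60-second target"
  resulting_action: "Revise spec/nfr.yaml and rerun the loop"
</code></pre>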

<hr />

<h2 id="8-lifecycle-stages">8. Lifecycle stages</h2>

<p>AI-SDLC v1.0 defines a <strong>closed decision loop</strong>.</p>

<h3 id="stage-1-intent-definition">Stage 1: Intent definition</h3>

<p>A human owner defines intent.</p>

<p><strong>Exit condition:</strong>
Intent record exists and is approved by the intent owner.</p>

<hr />

<h3 id="stage-2-specification">Stage 2: Specification</h3>

<p>Intent is translated into a formal specification.</p>

<p><strong>Exit condition:</strong>
Specification is complete, consistent, and machine-readable.</p>

<hr />

<h3 id="stage-3-automated-implementation">Stage 3: Automated implementation</h3>

<p>AI generates:</p>

<ul>
  <li>application code,</li>
  <li>tests,</li>
  <li>infrastructure,</li>
  <li>instrumentation,</li>
  <li>supporting artefacts.</li>
</ul>

<p>Human review is permitted but not required for generation.</p>

<p><strong>Exit condition:</strong>
All required artefacts exist.</p>

<hr />

<h3 id="stage-4-automated-enforcement">Stage 4: Automated enforcement</h3>

<p>The system enforces:</p>

<ul>
  <li>correctness,</li>
  <li>security,</li>
  <li>performance,</li>
  <li>cost limits,</li>
  <li>deployment safety.</li>
</ul>

<p>Failures <strong>MUST</strong> block progression automatically.</p>

<p><strong>Exit condition:</strong>
All enforcement checks pass.</p>

<hr />

<h3 id="stage-5-deployment">Stage 5: Deployment</h3>

<p>Deployment <strong>MUST</strong> be:</p>

<ul>
  <li>incremental,</li>
  <li>reversible,</li>
  <li>observable.</li>
</ul>

<p>Deployment is not completion.</p>

<p><strong>Exit condition:</strong>
System is running and observable.</p>

<hr />

<h3 id="stage-6-observation">Stage 6: Observation</h3>

<p>The system produces evidence including:</p>

<ul>
  <li>behaviour,</li>
  <li>reliability,</li>
  <li>performance,</li>
  <li>cost,</li>
  <li>failure modes.</li>
</ul>

<p>Evidence <strong>MUST</strong> be collected continuously.</p>

<hr />

<h3 id="stage-7-decision">Stage 7: Decision</h3>

<p>A named decision owner selects one action:</p>

<ul>
  <li>continue,</li>
  <li>adjust,</li>
  <li>stop,</li>
  <li>pivot.</li>
</ul>

<p>The decision <strong>MUST</strong> be recorded.</p>

<p><strong>Exit condition:</strong>
Decision log entry exists.</p>

<p>The loop then repeats.</p>

<hr />

<h2 id="9-unit-of-progress">9. Unit of progress</h2>

<p>The <strong>only recognised unit of progress</strong> in AI-SDLC v1.0 is a <strong>decision informed by evidence</strong>.</p>

<p>Task completion, feature delivery, or code volume <strong>MUST NOT</strong> be treated as progress.</p>

<hr />

<h2 id="10-roles-and-accountability">10. Roles and accountability</h2>

<p>AI-SDLC v1.0 requires the following functions:</p>

<ul>
  <li><strong>Intent owner</strong></li>
  <li><strong>Decision owner</strong></li>
  <li><strong>System steward</strong></li>
  <li><strong>AI executor</strong></li>
</ul>

<p>One person may perform multiple functions.
Every function <strong>MUST</strong> have a named human owner.</p>

<hr />

<h2 id="11-metrics-and-measurement">11. Metrics and measurement</h2>

<p>AI-SDLC v1.0 <strong>SHOULD</strong> measure:</p>

<ul>
  <li>time from intent to evidence,</li>
  <li>number of assumptions tested,</li>
  <li>decision latency,</li>
  <li>recovery time from failure,</li>
  <li>cost of change.</li>
</ul>

<p>It <strong>MUST NOT</strong> optimise for activity metrics.</p>

<hr />

<h2 id="12-failure-modes-non-compliance">12. Failure modes (non-compliance)</h2>

<p>An implementation is <strong>non-compliant</strong> if:</p>

<ul>
  <li>code is built without intent,</li>
  <li>constraints are documented but not enforced,</li>
  <li>deployment occurs without observability,</li>
  <li>decisions lack evidence,</li>
  <li>accountability is unclear.</li>
</ul>

<hr />

<h2 id="13-versioning-and-evolution">13. Versioning and evolution</h2>

<p>AI-SDLC versions <strong>MUST</strong> be:</p>

<ul>
  <li>explicitly numbered,</li>
  <li>explicit about backwards compatibility,</li>
  <li>revised only with documented rationale.</li>
</ul>

<p>AI-SDLC v1.0 is intentionally minimal.</p>

<hr />

<h2 id="14-summary-definition">14. Summary definition</h2>

<p><strong>AI-SDLC v1.0 is a decision-driven software development lifecycle in which humans define intent and constraints, AI performs implementation, automated systems enforce quality and safety, and runtime evidence determines what happens next.</strong></p>

<hr />

<h2 id="15-entry-and-exit-criteria">15. Entry and Exit Criteria</h2>

<p>This section defines the <strong>mandatory conditions</strong> under which work may enter the AI-SDLC loop, progress between stages, and exit a loop iteration.</p>

<p>These criteria are <strong>normative</strong> and enforceable.</p>

<h3 id="151-entry-criteria-start-of-work">15.1 Entry criteria (start of work)</h3>

<p>A change, feature, experiment, or system <strong>MUST NOT</strong> enter the AI-SDLC lifecycle unless <strong>all</strong> of the following conditions are met:</p>

<ol>
  <li>An <strong>intent record exists</strong> in <code>intent/intent.md</code>.</li>
  <li>
    <p>The intent record includes:</p>

    <ul>
      <li>a clear problem statement,</li>
      <li>a desired outcome,</li>
      <li>non-negotiable constraints,</li>
      <li>explicit assumptions,</li>
      <li>stop or change conditions,</li>
      <li>a named intent owner.</li>
    </ul>
  </li>
  <li>The intent owner has explicitly approved the intent record.</li>
</ol>

<p>If any condition is not met, <strong>no implementation work is permitted</strong>, including AI-generated work.</p>

<h3 id="152-entry-criteria-for-specification">15.2 Entry criteria for specification</h3>

<p>A system <strong>MUST NOT</strong> move from intent definition to specification unless:</p>

<ol>
  <li>The intent record is complete and internally consistent.</li>
  <li>All assumptions are explicitly documented in <code>intent/assumptions.md</code>.</li>
  <li>Constraints are stated in a form that can be enforced by automation.</li>
</ol>

<p>If constraints cannot be enforced, the intent <strong>MUST</strong> be revised before proceeding.</p>

<h3 id="153-entry-criteria-for-automated-implementation">15.3 Entry criteria for automated implementation</h3>

<p>Automated implementation <strong>MUST NOT</strong> begin unless:</p>

<ol>
  <li>A formal specification exists in <code>spec/</code>.</li>
  <li>All required spec files are present, even if minimal.</li>
  <li>Acceptance scenarios are defined.</li>
  <li>Required observability is specified.</li>
</ol>

<p>Speculative or exploratory implementation without a specification is <strong>non-compliant</strong>.</p>

<h3 id="154-entry-criteria-for-deployment">15.4 Entry criteria for deployment</h3>

<p>A system <strong>MUST NOT</strong> be deployed unless:</p>

<ol>
  <li>All automated enforcement gates pass.</li>
  <li>Observability is implemented as specified.</li>
  <li>Rollback mechanisms are available and tested.</li>
  <li>A decision owner is assigned for post-deployment review.</li>
</ol>

<p>Deployment without observability is <strong>explicitly forbidden</strong>.</p>

<h3 id="155-exit-criteria-completion-of-a-loop-iteration">15.5 Exit criteria (completion of a loop iteration)</h3>

<p>A single AI-SDLC loop iteration is considered <strong>complete</strong> only when:</p>

<ol>
  <li>The system has been deployed and observed, <strong>or</strong></li>
  <li>A conscious decision has been made not to deploy, <strong>and</strong></li>
  <li>
    <p>A decision record exists in <code>decisions/</code> documenting:</p>

    <ul>
      <li>the evidence reviewed,</li>
      <li>the decision taken,</li>
      <li>the rationale.</li>
    </ul>
  </li>
</ol>

<p>Without a recorded decision, <strong>no progress has occurred</strong>, regardless of implementation activity.</p>

<h3 id="156-stop-conditions-mandatory">15.6 Stop conditions (mandatory)</h3>

<p>Every intent record <strong>MUST</strong> define stop or change conditions.</p>

<p>Work <strong>MUST STOP immediately</strong> when any stop condition is met.</p>

<p>Examples include, but are not limited to:</p>

<ul>
  <li>evidence contradicts a core assumption,</li>
  <li>constraints are violated beyond acceptable limits,</li>
  <li>cost exceeds defined budgets,</li>
  <li>risk exceeds acceptable thresholds.</li>
</ul>

<p>Stopping is treated as a <strong>successful outcome</strong> when driven by evidence.</p>

<h3 id="157-prohibited-states">15.7 Prohibited states</h3>

<p>The following states are <strong>explicitly prohibited</strong> under AI-SDLC v1.0:</p>

<ul>
  <li>implementation without intent,</li>
  <li>deployment without observability,</li>
  <li>iteration without decisions,</li>
  <li>continued work after stop conditions are met,</li>
  <li>unowned systems with no decision owner.</li>
</ul>

<p>Any occurrence places the system in <strong>non-compliant status</strong>.</p>

<h3 id="158-summary">15.8 Summary</h3>

<p>AI-SDLC v1.0 enforces disciplined flow by defining when work may begin, proceed, and stop.</p>

<p>Speed is permitted.
Speculation is not.
Progress requires decisions.</p>

<hr />

<h2 id="16-compliance-levels-and-exceptions">16. Compliance Levels and Exceptions</h2>

<p>This section defines how compliance with AI-SDLC v1.0 is assessed, how exceptions are handled, and how non-compliance is treated.</p>

<p>Compliance is <strong>explicit</strong>, <strong>auditable</strong>, and <strong>decision-owned</strong>.</p>

<h3 id="161-compliance-levels">16.1 Compliance levels</h3>

<p>Every system operating under AI-SDLC v1.0 <strong>MUST</strong> be in exactly one of the following compliance states at any time.</p>

<h4 id="1611-fully-compliant">16.1.1 Fully compliant</h4>

<p>A system is <strong>fully compliant</strong> when:</p>

<ul>
  <li>all mandatory sections of the AI-SDLC v1.0 specification are satisfied,</li>
  <li>all required artefacts exist and are up to date,</li>
  <li>all enforcement gates are active and passing,</li>
  <li>all decisions are recorded and owned.</li>
</ul>

<p>Fully compliant systems may proceed through the lifecycle without restriction.</p>

<h4 id="1612-conditionally-compliant">16.1.2 Conditionally compliant</h4>

<p>A system is <strong>conditionally compliant</strong> when:</p>

<ul>
  <li>one or more AI-SDLC requirements are intentionally unmet,</li>
  <li>the deviation is explicitly documented,</li>
  <li>a decision owner has approved the deviation,</li>
  <li>the deviation is time-bounded or scope-bounded.</li>
</ul>

<p>Conditional compliance <strong>MUST</strong> be recorded as a decision in <code>decisions/</code>.</p>

<p>Conditional compliance is an exception, not a default state.</p>

<h4 id="1613-non-compliant">16.1.3 Non-compliant</h4>

<p>A system is <strong>non-compliant</strong> when:</p>

<ul>
  <li>mandatory requirements are unmet without an approved exception,</li>
  <li>enforcement gates are bypassed,</li>
  <li>decisions lack evidence or ownership,</li>
  <li>prohibited states defined in Section 15.7 occur.</li>
</ul>

<p>Non-compliant systems <strong>MUST NOT</strong> be deployed or progressed.</p>

<h3 id="162-exception-handling">16.2 Exception handling</h3>

<p>Exceptions are permitted only under controlled conditions.</p>

<p>An exception <strong>MUST</strong>:</p>

<ol>
  <li>be explicitly requested,</li>
  <li>be reviewed by a named decision owner,</li>
  <li>be approved or rejected explicitly,</li>
  <li>be recorded in the decision log,</li>
  <li>define scope, duration, and review conditions.</li>
</ol>

<p>Silent or implicit exceptions are <strong>forbidden</strong>.</p>

<h3 id="163-exception-record-requirements">16.3 Exception record requirements</h3>

<p>An exception decision record <strong>MUST</strong> include:</p>

<ul>
  <li>the requirement being excepted,</li>
  <li>the reason for the exception,</li>
  <li>the risk introduced,</li>
  <li>mitigation measures,</li>
  <li>the duration or condition for expiry,</li>
  <li>the decision owner.</li>
</ul>

<p>Exceptions without an expiry condition are <strong>non-compliant</strong>.</p>
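<p>The sketch below shows what such an exception record might contain. All values are hypothetical and the YAML form is illustrative, not mandated.</p>

<pre><code class="language-yaml"># Hypothetical exception record, captured as a decision in decisions/.
exception:
  requirement: "Rollback mechanism tested before deployment (15.4)"
  reason: "Staging environment unavailable during provider migration"
  risk_introduced: "Slower recovery if the first production release fails"
  mitigation: "Deploy behind a feature flag limited to internal users"
  expires: 2026-01-15         # exceptions without expiry are non-compliant
  decision_owner: "jane.doe"
</code></pre>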

<h3 id="164-expiry-and-review-of-exceptions">16.4 Expiry and review of exceptions</h3>

<p>All exceptions <strong>MUST</strong> be reviewed:</p>

<ul>
  <li>at the next decision point, or</li>
  <li>when defined expiry conditions are met, whichever occurs first.</li>
</ul>

<p>Expired exceptions <strong>MUST</strong> either:</p>

<ul>
  <li>be removed, restoring full compliance, or</li>
  <li>be explicitly renewed via a new decision.</li>
</ul>

<p>Automatic or indefinite rollover of exceptions is <strong>not permitted</strong>.</p>

<h3 id="165-enforcement-of-compliance">16.5 Enforcement of compliance</h3>

<p>Compliance status <strong>MUST</strong> be visible and machine-checkable.</p>

<p>Automation <strong>SHOULD</strong>:</p>

<ul>
  <li>block deployment of non-compliant systems,</li>
  <li>warn on conditional compliance nearing expiry,</li>
  <li>surface compliance status alongside runtime evidence.</li>
</ul>

<p>Human approval <strong>MUST NOT</strong> override automated blocking of non-compliant states.</p>

<h3 id="166-summary">16.6 Summary</h3>

<p>AI-SDLC v1.0 treats compliance as a first-class system property.</p>

<p>Rules may be bent deliberately.
They may not be bent silently.
Accountability is explicit.</p>

<hr />

<h2 id="17-risk-classification">17. Risk Classification</h2>

<p>This section defines how systems are classified by risk under AI-SDLC v1.0 and how risk affects enforcement, review, and decision-making.</p>

<p>Risk classification is <strong>mandatory</strong>.
All systems <strong>MUST</strong> be assigned a risk level before automated implementation begins.</p>

<h3 id="171-purpose-of-risk-classification">17.1 Purpose of risk classification</h3>

<p>Risk classification exists to ensure that:</p>

<ul>
  <li>higher-risk systems receive stronger controls,</li>
  <li>low-risk systems are not burdened by unnecessary process,</li>
  <li>enforcement scales with potential impact.</li>
</ul>

<p>Risk is assessed based on <strong>impact</strong>, not effort.</p>

<h3 id="172-risk-levels">17.2 Risk levels</h3>

<p>AI-SDLC v1.0 defines three risk levels.</p>

<p>Each system <strong>MUST</strong> be classified into exactly one level.</p>

<h4 id="1721-low-risk">17.2.1 Low risk</h4>

<p>A system is <strong>low risk</strong> if failure would result in:</p>

<ul>
  <li>no material user harm,</li>
  <li>no regulatory or legal impact,</li>
  <li>no financial loss beyond defined tolerance,</li>
  <li>no exposure of sensitive data.</li>
</ul>

<p>Examples include:</p>

<ul>
  <li>internal tools,</li>
  <li>prototypes,</li>
  <li>experiments with limited scope.</li>
</ul>

<p>Low-risk systems may operate with minimal enforcement, provided all core AI-SDLC rules are satisfied.</p>

<h4 id="1722-medium-risk">17.2.2 Medium risk</h4>

<p>A system is <strong>medium risk</strong> if failure could result in:</p>

<ul>
  <li>user-facing disruption,</li>
  <li>moderate financial impact,</li>
  <li>operational instability,</li>
  <li>handling of non-sensitive customer data.</li>
</ul>

<p>Examples include:</p>

<ul>
  <li>customer-facing applications,</li>
  <li>internal systems supporting revenue,</li>
  <li>systems with availability or performance commitments.</li>
</ul>

<p>Medium-risk systems require full enforcement of AI-SDLC artefacts and gates.</p>

<h4 id="1723-high-risk">17.2.3 High risk</h4>

<p>A system is <strong>high risk</strong> if failure could result in:</p>

<ul>
  <li>significant financial loss,</li>
  <li>regulatory or legal exposure,</li>
  <li>safety or security incidents,</li>
  <li>handling of sensitive or regulated data.</li>
</ul>

<p>Examples include:</p>

<ul>
  <li>financial systems,</li>
  <li>regulated platforms,</li>
  <li>security-critical infrastructure.</li>
</ul>

<p>High-risk systems require enhanced controls and stricter decision review.</p>

<h3 id="173-risk-declaration-requirements">17.3 Risk declaration requirements</h3>

<p>Risk classification <strong>MUST</strong> be declared in the intent record.</p>

<p>The declaration <strong>MUST</strong> include:</p>

<ul>
  <li>assigned risk level,</li>
  <li>justification for the classification,</li>
  <li>named decision owner responsible for the classification.</li>
</ul>

<p>Risk classification <strong>MUST</strong> be reviewed whenever system scope or impact changes.</p>
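<p>A hypothetical risk declaration, as it might appear inside an intent record, is sketched below. The field names are illustrative; only the three declared elements above are required.</p>

<pre><code class="language-yaml"># Hypothetical risk declaration carried in the intent record.
risk:
  level: medium               # low | medium | high
  justification: "Customer-facing, no sensitive data, revenue-supporting"
  decision_owner: "jane.doe"
  last_reviewed: 2025-12-20
</code></pre>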

<h3 id="174-impact-on-enforcement">17.4 Impact on enforcement</h3>

<p>Risk level directly affects enforcement.</p>

<p>At minimum:</p>

<ul>
  <li>
    <p><strong>Low risk</strong></p>

    <ul>
      <li>Standard AI-SDLC lifecycle</li>
      <li>Minimal review overhead</li>
    </ul>
  </li>
  <li>
    <p><strong>Medium risk</strong></p>

    <ul>
      <li>Full enforcement of all AI-SDLC requirements</li>
      <li>Mandatory observability and rollback</li>
    </ul>
  </li>
  <li>
    <p><strong>High risk</strong></p>

    <ul>
      <li>
        <p>Full enforcement plus:</p>

        <ul>
          <li>stricter acceptance criteria,</li>
          <li>enhanced observability,</li>
          <li>explicit decision review before deployment,</li>
          <li>tighter exception controls.</li>
        </ul>
      </li>
    </ul>
  </li>
</ul>

<p>Risk level <strong>MUST NOT</strong> be used to bypass core AI-SDLC principles.</p>

<h3 id="175-misclassification">17.5 Misclassification</h3>

<p>Intentional or negligent misclassification of risk is treated as <strong>non-compliance</strong>.</p>

<p>If observed impact exceeds declared risk level:</p>

<ul>
  <li>work <strong>MUST</strong> pause,</li>
  <li>risk classification <strong>MUST</strong> be reassessed,</li>
  <li>a decision record <strong>MUST</strong> document corrective action.</li>
</ul>

<h3 id="176-summary">17.6 Summary</h3>

<p>AI-SDLC v1.0 scales discipline with impact.</p>

<p>Low-risk systems move fast.
High-risk systems move carefully.
All systems remain accountable.</p>

<hr />

<h2 id="18-non-goals">18. Non-Goals</h2>

<p>This section defines what <strong>AI-SDLC v1.0 explicitly does not attempt to define or solve</strong>.</p>

<p>These non-goals are intentional.
They protect the specification from scope creep, misinterpretation, and misuse.</p>

<h3 id="181-team-structure-and-roles">18.1 Team structure and roles</h3>

<p>AI-SDLC v1.0 does <strong>not</strong> define:</p>

<ul>
  <li>team sizes or compositions,</li>
  <li>job titles or reporting lines,</li>
  <li>organisational design or management structure.</li>
</ul>

<p>The SDLC defines <strong>accountability and function</strong>, not organisational charts.</p>

<h3 id="182-tooling-and-vendor-selection">18.2 Tooling and vendor selection</h3>

<p>AI-SDLC v1.0 does <strong>not</strong> mandate:</p>

<ul>
  <li>specific AI models or providers,</li>
  <li>programming languages or frameworks,</li>
  <li>CI/CD platforms,</li>
  <li>observability tools,</li>
  <li>infrastructure vendors.</li>
</ul>

<p>Tool choice is an implementation concern and is intentionally left open.</p>

<h3 id="183-business-governance-and-approval-processes">18.3 Business governance and approval processes</h3>

<p>AI-SDLC v1.0 does <strong>not</strong> replace:</p>

<ul>
  <li>business case approval,</li>
  <li>budget approval,</li>
  <li>legal or regulatory sign-off,</li>
  <li>executive governance processes.</li>
</ul>

<p>It assumes these exist externally and integrates with them via intent, constraints, and decisions.</p>

<h3 id="184-human-performance-management">18.4 Human performance management</h3>

<p>AI-SDLC v1.0 does <strong>not</strong>:</p>

<ul>
  <li>measure individual productivity,</li>
  <li>evaluate human performance,</li>
  <li>define incentives or compensation,</li>
  <li>optimise for utilisation metrics.</li>
</ul>

<p>The SDLC governs systems and decisions, not people management.</p>

<h3 id="185-velocity-optimisation">18.5 Velocity optimisation</h3>

<p>AI-SDLC v1.0 does <strong>not</strong> optimise for:</p>

<ul>
  <li>speed of delivery as an end in itself,</li>
  <li>volume of output,</li>
  <li>number of features shipped.</li>
</ul>

<p>Speed is a byproduct of clarity and automation, not a primary goal.</p>

<h3 id="186-ai-autonomy-beyond-execution">18.6 AI autonomy beyond execution</h3>

<p>AI-SDLC v1.0 does <strong>not</strong> grant AI authority to:</p>

<ul>
  <li>define intent,</li>
  <li>change constraints,</li>
  <li>approve exceptions,</li>
  <li>make final decisions,</li>
  <li>assume accountability.</li>
</ul>

<p>AI executes. Humans decide.</p>

<h3 id="187-universal-applicability">18.7 Universal applicability</h3>

<p>AI-SDLC v1.0 does <strong>not</strong> claim to be suitable for:</p>

<ul>
  <li>every organisation,</li>
  <li>every regulatory environment,</li>
  <li>every system type.</li>
</ul>

<p>Adoption requires judgement and may require adaptation beyond v1.0.</p>

<h3 id="188-summary">18.8 Summary</h3>

<p>AI-SDLC v1.0 is deliberately narrow.</p>

<p>It defines <strong>how systems are built and evolved</strong> when AI performs execution and humans retain responsibility.</p>

<p>Anything outside that boundary is explicitly out of scope.</p>

<hr />

<h1 id="appendix-a-file-and-folder-standard">Appendix A: File and Folder Standard</h1>

<h2 id="a1-purpose">A.1 Purpose</h2>

<p>This appendix defines the <strong>mandatory file and folder structure</strong> for systems operating under <strong>AI-SDLC v1.0</strong>.</p>

<p>The structure is designed to:</p>

<ul>
  <li>make intent, decisions, and constraints first-class,</li>
  <li>separate human judgement from AI execution,</li>
  <li>support automation and enforcement,</li>
  <li>ensure auditability and traceability.</li>
</ul>

<p>This is a <strong>normative standard</strong>, not a suggestion.</p>

<hr />

<h2 id="a2-root-structure-canonical">A.2 Root structure (canonical)</h2>

<p>Every AI-SDLC–compliant repository <strong>MUST</strong> follow this structure:</p>

<pre><code class="language-text">/
├─ intent/
├─ spec/
├─ decisions/
├─ src/
├─ tests/
├─ infra/
├─ observability/
├─ compliance/
├─ runbooks/
├─ ai/
└─ README.md
</code></pre>

<p>No directory may be omitted unless explicitly stated.</p>

<hr />

<h2 id="a3-intent--human-purpose-and-constraints">A.3 intent/ — human purpose and constraints</h2>

<h3 id="purpose">Purpose</h3>

<p>Holds <strong>human-authored intent records</strong>.</p>

<h3 id="rules">Rules</h3>

<ul>
  <li>Written by humans only.</li>
  <li>Small, explicit, and versioned.</li>
  <li>No implementation detail.</li>
</ul>

<h3 id="required-files">Required files</h3>

<pre><code class="language-text">intent/
├─ intent.md
└─ assumptions.md
</code></pre>

<h4 id="intentintentmd-required">intent/intent.md (required)</h4>

<p>Must include:</p>

<ul>
  <li>Problem statement</li>
  <li>Affected user or system</li>
  <li>Desired outcome</li>
  <li>Non-negotiable constraints</li>
  <li>Stop / change conditions</li>
  <li>Intent owner</li>
</ul>

<h4 id="intentassumptionsmd-required">intent/assumptions.md (required)</h4>

<ul>
  <li>Explicit list of assumptions</li>
  <li>Each assumption must be testable</li>
  <li>Each assumption must link to evidence later</li>
</ul>

<hr />

<h2 id="a4-spec--executable-system-definition">A.4 spec/ — executable system definition</h2>

<h3 id="purpose-1">Purpose</h3>

<p>Holds <strong>machine-readable specifications</strong> used by AI executors.</p>

<h3 id="rules-1">Rules</h3>

<ul>
  <li>Source of truth for behaviour.</li>
  <li>No narrative prose.</li>
  <li>Must be consumable by automation.</li>
</ul>

<h3 id="required-structure">Required structure</h3>

<pre><code class="language-text">spec/
├─ system.yaml
├─ interfaces.yaml
├─ data.yaml
├─ nfr.yaml
├─ acceptance.yaml
└─ observability.yaml
</code></pre>

<p>Each file <strong>MUST</strong> exist, even if minimal.</p>

<hr />

<h2 id="a5-decisions--accountability-and-evidence">A.5 decisions/ — accountability and evidence</h2>

<h3 id="purpose-2">Purpose</h3>

<p>Records <strong>human decisions</strong> made using runtime evidence.</p>

<h3 id="rules-2">Rules</h3>

<ul>
  <li>Append-only.</li>
  <li>One decision per file.</li>
  <li>Decisions are immutable once recorded.</li>
</ul>

<h3 id="structure">Structure</h3>

<pre><code class="language-text">decisions/
├─ 0001-initial-scope.md
├─ 0002-adjust-constraint.md
└─ 0003-pivot.md
</code></pre>

<p>Each decision file <strong>MUST</strong> include:</p>

<ul>
  <li>Date</li>
  <li>Decision owner</li>
  <li>Evidence reviewed (links)</li>
  <li>Decision taken</li>
  <li>Rationale</li>
  <li>Resulting action</li>
</ul>

<hr />

<h2 id="a6-src--ai-generated-implementation">A.6 src/ — AI-generated implementation</h2>

<h3 id="purpose-3">Purpose</h3>

<p>Holds <strong>implementation artefacts</strong>.</p>

<h3 id="rules-3">Rules</h3>

<ul>
  <li>Primarily AI-generated.</li>
  <li>Replaceable at any time.</li>
  <li>No business intent defined here.</li>
</ul>

<h3 id="notes">Notes</h3>

<ul>
  <li>Humans may edit, but code is not authoritative.</li>
  <li>Spec and intent override code.</li>
</ul>

<hr />

<h2 id="a7-tests--automated-verification">A.7 tests/ — automated verification</h2>

<h3 id="purpose-4">Purpose</h3>

<p>Holds tests enforcing correctness and constraints.</p>

<h3 id="rules-4">Rules</h3>

<ul>
  <li>Tests are mandatory.</li>
  <li>Generated by AI, reviewed by humans.</li>
  <li>Blocking in CI.</li>
</ul>

<h3 id="typical-contents">Typical contents</h3>

<pre><code class="language-text">tests/
├─ unit/
├─ integration/
├─ contract/
└─ security/
</code></pre>

<hr />

<h2 id="a8-infra--deployment-and-environment">A.8 infra/ — deployment and environment</h2>

<h3 id="purpose-5">Purpose</h3>

<p>Defines infrastructure and deployment behaviour.</p>

<h3 id="rules-5">Rules</h3>

<ul>
  <li>Must support rollback.</li>
  <li>Must support staged deployment.</li>
  <li>Must expose observability hooks.</li>
</ul>

<h3 id="typical-contents-1">Typical contents</h3>

<pre><code class="language-text">infra/
├─ environments/
├─ pipelines/
└─ policies/
</code></pre>

<hr />

<h2 id="a9-observability--evidence-production">A.9 observability/ — evidence production</h2>

<h3 id="purpose-6">Purpose</h3>

<p>Defines how <strong>runtime evidence</strong> is produced and collected.</p>

<h3 id="rules-6">Rules</h3>

<ul>
  <li>Required before deployment.</li>
  <li>Enforced automatically.</li>
</ul>

<h3 id="typical-contents-2">Typical contents</h3>

<pre><code class="language-text">observability/
├─ metrics.yaml
├─ logs.yaml
├─ alerts.yaml
└─ dashboards.yaml
</code></pre>

<hr />

<h2 id="a10-compliance--enforced-constraints">A.10 compliance/ — enforced constraints</h2>

<h3 id="purpose-7">Purpose</h3>

<p>Captures <strong>formal constraints</strong> that automation must enforce.</p>

<h3 id="rules-7">Rules</h3>

<ul>
  <li>Constraints must be machine-checkable.</li>
  <li>Documentation alone is insufficient.</li>
</ul>

<h3 id="typical-contents-3">Typical contents</h3>

<pre><code class="language-text">compliance/
├─ security.md
├─ privacy.md
├─ cost-limits.yaml
└─ performance-budgets.yaml
</code></pre>
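<p>The specification does not define schemas for these files. As an illustration only, one possible shape for <code>cost-limits.yaml</code> that automation could evaluate against billing data is sketched below; every name and value is an assumption.</p>

<pre><code class="language-yaml"># Hypothetical shape for compliance/cost-limits.yaml; not a normative schema.
cost_limits:
  monthly_budget: 500
  currency: USD
  alert_threshold_percentage: 80   # warn before the hard limit is reached
  enforcement: block_deployment    # action taken when the limit is breached
</code></pre>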

<hr />

<h2 id="a11-runbooks--operational-response">A.11 runbooks/ — operational response</h2>

<h3 id="purpose-8">Purpose</h3>

<p>Defines how humans respond to failures.</p>

<h3 id="rules-8">Rules</h3>

<ul>
  <li>Short and explicit.</li>
  <li>No architecture descriptions.</li>
</ul>

<h3 id="typical-contents-4">Typical contents</h3>

<pre><code class="language-text">runbooks/
├─ rollback.md
├─ incident-response.md
└─ escalation.md
</code></pre>

<hr />

<h2 id="a12-ai--ai-execution-configuration">A.12 ai/ — AI execution configuration</h2>

<h3 id="purpose-9">Purpose</h3>

<p>Defines how AI systems operate within the SDLC.</p>

<h3 id="rules-9">Rules</h3>

<ul>
  <li>Controls AI behaviour.</li>
  <li>Never contains intent.</li>
</ul>

<h3 id="typical-contents-5">Typical contents</h3>

<pre><code class="language-text">ai/
├─ prompts/
├─ policies.yaml
├─ guardrails.yaml
└─ memory.md
</code></pre>

<hr />

<h2 id="a13-readmemd--orientation-only">A.13 README.md — orientation only</h2>

<h3 id="purpose-10">Purpose</h3>

<p>Human-readable overview.</p>

<h3 id="rules-10">Rules</h3>

<ul>
  <li>Points to intent, spec, and decisions.</li>
  <li>No duplication of authoritative content.</li>
</ul>

<hr />

<h2 id="a14-authority-order-critical">A.14 Authority order (critical)</h2>

<p>When conflicts exist, authority is resolved in this order:</p>

<ol>
  <li>intent/</li>
  <li>spec/</li>
  <li>decisions/</li>
  <li>compliance/</li>
  <li>observability/</li>
  <li>infra/</li>
  <li>tests/</li>
  <li>src/</li>
</ol>

<p>Code <strong>never overrides intent or decisions</strong>.</p>

<hr />

<h2 id="a15-compliance-rule">A.15 Compliance rule</h2>

<p>A repository is <strong>AI-SDLC v1.0 compliant</strong> only if:</p>

<ul>
  <li>all required directories exist,</li>
  <li>required files are present,</li>
  <li>decisions are logged,</li>
  <li>constraints are enforceable,</li>
  <li>observability exists before deployment.</li>
</ul>

<hr />

<h2 id="a16-summary">A.16 Summary</h2>

<p>This file and folder standard ensures that:</p>

<ul>
  <li>humans control purpose and judgement,</li>
  <li>AI controls execution,</li>
  <li>automation enforces safety,</li>
  <li>evidence drives change,</li>
  <li>accountability is explicit.</li>
</ul>

<p>It is intentionally strict.</p>

<hr />

<h1 id="appendix-b-yaml-specification-schemas">Appendix B: YAML Specification Schemas</h1>

<p>This appendix defines the normative YAML schemas used by AI-SDLC v1.0.
These schemas are machine-checkable and enforceable.</p>

<hr />

<h2 id="b1-specsystemyaml">B.1 spec/system.yaml</h2>

<p>Defines system identity, ownership, and scope.</p>

<pre><code class="language-yaml">system:
  name: string
  version: string
  description: string

ownership:
  intent_owner: string
  decision_owner: string
  system_steward: string

scope:
  in_scope:
    - string
  out_of_scope:
    - string
</code></pre>

<p>Rules:</p>

<ul>
  <li>system.name, ownership.intent_owner, and ownership.decision_owner are required.</li>
  <li>Version changes require a recorded decision in decisions/.</li>
</ul>
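<p>A minimal, hypothetical instance of this schema might look as follows; all names are invented.</p>

<pre><code class="language-yaml"># Hypothetical instance of spec/system.yaml.
system:
  name: reset-email-service
  version: "1.2.0"
  description: "Sends password reset emails for the customer portal"

ownership:
  intent_owner: "jane.doe"
  decision_owner: "jane.doe"
  system_steward: "ops-team"

scope:
  in_scope:
    - "Password reset email delivery"
  out_of_scope:
    - "Identity provider changes"
</code></pre>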

<hr />

<h2 id="b2-specinterfacesyaml">B.2 spec/interfaces.yaml</h2>

<p>Defines externally visible behaviour.</p>

<pre><code class="language-yaml">interfaces:
  - name: string
    type: api | event | job
    description: string

    endpoint:
      method: GET | POST | PUT | PATCH | DELETE
      path: string

    input:
      schema_ref: string

    output:
      schema_ref: string

    errors:
      - code: string
        description: string

    auth:
      type: none | api_key | oauth2 | jwt
</code></pre>

<p>Rules:</p>

<ul>
  <li>Every interface MUST define errors.</li>
  <li>Authentication MUST be explicit (none is allowed, but must be stated).</li>
</ul>
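<p>A hypothetical instance with a single API interface, including the required errors and an explicit authentication type, might look like this.</p>

<pre><code class="language-yaml"># Hypothetical instance of spec/interfaces.yaml with one API interface.
interfaces:
  - name: request-password-reset
    type: api
    description: "Queues a password reset email for a user"
    endpoint:
      method: POST
      path: /v1/password-resets
    input:
      schema_ref: schemas/password-reset-request.json
    output:
      schema_ref: schemas/password-reset-accepted.json
    errors:
      - code: USER_NOT_FOUND
        description: "No account matches the supplied email address"
      - code: RATE_LIMITED
        description: "Too many reset requests for this account"
    auth:
      type: api_key
</code></pre>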

<hr />

<h2 id="b3-specdatayaml">B.3 spec/data.yaml</h2>

<p>Defines persistent and transient state.</p>

<pre><code class="language-yaml">data:
  stores:
    - name: string
      type: postgres | mysql | dynamodb | redis | filesystem
      purpose: string

      entities:
        - name: string
          fields:
            - name: string
              type: string
              nullable: boolean
              primary_key: boolean
</code></pre>

<p>Rules:</p>

<ul>
  <li>Every store MUST declare purpose.</li>
  <li>Data creation MUST be explicit (no implicit entities).</li>
</ul>
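<p>A hypothetical instance with one store and one explicitly declared entity might look like this.</p>

<pre><code class="language-yaml"># Hypothetical instance of spec/data.yaml.
data:
  stores:
    - name: reset-requests
      type: postgres
      purpose: "Tracks outstanding password reset requests"
      entities:
        - name: reset_request
          fields:
            - name: id
              type: uuid
              nullable: false
              primary_key: true
            - name: requested_at
              type: timestamp
              nullable: false
              primary_key: false
</code></pre>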

<hr />

<h2 id="b4-specnfryaml">B.4 spec/nfr.yaml</h2>

<p>Defines enforceable non-functional requirements.</p>

<pre><code class="language-yaml">nfr:
  availability:
    target_percentage: number

  performance:
    latency_p95_ms: number
    throughput_rps: number

  reliability:
    error_rate_percentage: number

  cost:
    monthly_budget_limit: number
    currency: string

  scalability:
    max_users: number
</code></pre>

<p>Rules:</p>

<ul>
  <li>At least one NFR section MUST be present.</li>
  <li>NFRs MUST be measurable at runtime.</li>
</ul>
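<p>A hypothetical instance might look as follows; each value is a runtime-measurable target, as the rules above require.</p>

<pre><code class="language-yaml"># Hypothetical instance of spec/nfr.yaml; every target must be measurable at runtime.
nfr:
  availability:
    target_percentage: 99.9
  performance:
    latency_p95_ms: 300
    throughput_rps: 50
  reliability:
    error_rate_percentage: 0.5
  cost:
    monthly_budget_limit: 500
    currency: USD
</code></pre>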

<hr />

<h2 id="b5-specacceptanceyaml">B.5 spec/acceptance.yaml</h2>

<p>Defines correctness via scenarios.</p>

<pre><code class="language-yaml">acceptance:
  - scenario: string
    given:
      - string
    when:
      - string
    then:
      - string
</code></pre>

<p>Rules:</p>

<ul>
  <li>Each interface MUST have at least one acceptance scenario.</li>
  <li>Scenarios MUST be testable.</li>
</ul>
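<p>A hypothetical scenario for an example interface is sketched below; each line is phrased so that automation can turn it into a test.</p>

<pre><code class="language-yaml"># Hypothetical acceptance scenario for a password-reset interface.
acceptance:
  - scenario: "Reset email is queued for a known user"
    given:
      - "A user exists with the email address alice@example.com"
    when:
      - "POST /v1/password-resets is called with that address"
    then:
      - "The request is accepted"
      - "A reset email is queued within 60 seconds"
</code></pre>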

<hr />

<h2 id="b6-specobservabilityyaml">B.6 spec/observability.yaml</h2>

<p>Defines required runtime evidence.</p>

<pre><code class="language-yaml">observability:
  metrics:
    - name: string
      type: counter | gauge | histogram
      description: string

  logs:
    - name: string
      level: info | warn | error
      description: string

  alerts:
    - name: string
      condition: string
      severity: low | medium | high | critical
</code></pre>

<p>Rules:</p>

<ul>
  <li>Observability MUST exist before deployment.</li>
  <li>Alerts MUST map to NFR breaches or constraint violations.</li>
</ul>
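<p>A hypothetical instance is sketched below. The metric makes a latency NFR observable and the alert fires when that NFR is breached, which is the NFR-to-metric-to-alert mapping the cross-schema invariants in B.8 require.</p>

<pre><code class="language-yaml"># Hypothetical instance of spec/observability.yaml.
observability:
  metrics:
    - name: reset_email_latency_ms
      type: histogram
      description: "End-to-end latency of reset email delivery"
  logs:
    - name: reset_email_failed
      level: error
      description: "A reset email could not be delivered"
  alerts:
    - name: reset_latency_p95_breach
      condition: "p95(reset_email_latency_ms) above 300 ms for 10 minutes"
      severity: high          # maps to the latency_p95_ms NFR
</code></pre>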

<hr />

<h2 id="b7-specsecurityyaml-optional-but-recommended">B.7 spec/security.yaml (optional but recommended)</h2>

<p>Defines security posture.</p>

<pre><code class="language-yaml">security:
  data_classification: public | internal | confidential | restricted

  controls:
    - name: string
      enforced: boolean

  secrets:
    storage: vault | env | kms
</code></pre>

<p>Rules:</p>

<ul>
  <li>If data_classification is not public, at least one controls entry MUST be present.</li>
  <li>Secrets MUST NOT be stored in code or committed to the repo.</li>
</ul>
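<p>A hypothetical instance for a non-public data classification might look like this; note the required controls entry.</p>

<pre><code class="language-yaml"># Hypothetical instance of spec/security.yaml.
security:
  data_classification: confidential
  controls:
    - name: "Encrypt reset tokens at rest"
      enforced: true
    - name: "Single-use, time-limited reset links"
      enforced: true
  secrets:
    storage: vault
</code></pre>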

<hr />

<h2 id="b8-cross-schema-invariants">B.8 Cross-schema invariants</h2>

<p>Automation MUST enforce all of the following:</p>

<ul>
  <li>Every item in interfaces has at least one matching acceptance scenario in acceptance.</li>
  <li>Every NFR defined in nfr.yaml has at least one corresponding metric in observability.metrics.</li>
  <li>Every alert corresponds to a constraint breach (NFR or security control).</li>
  <li>No production deployment is allowed without observability.</li>
</ul>

<p>If any invariant fails, the pipeline MUST block deployment.</p>

<hr />

<h2 id="b9-authority-order-reminder">B.9 Authority order reminder</h2>

<p>When conflicts exist, resolve them in this order:</p>

<ol>
  <li>intent/</li>
  <li>spec/</li>
  <li>decisions/</li>
  <li>compliance/</li>
  <li>implementation (src/)</li>
</ol>

<p>Code MUST NOT override intent or decisions.</p>]]></content><author><name>Khalid Taha</name></author><category term="Other" /><summary type="html"><![CDATA[Full Specification (Single Document)]]></summary></entry><entry><title type="html">AI-SDLC v1.0</title><link href="https://khalid-taha.github.io/2025/12/25/AI-SDLC-v1.html" rel="alternate" type="text/html" title="AI-SDLC v1.0" /><published>2025-12-25T00:00:00+00:00</published><updated>2025-12-25T00:00:00+00:00</updated><id>https://khalid-taha.github.io/2025/12/25/AI-SDLC-v1</id><content type="html" xml:base="https://khalid-taha.github.io/2025/12/25/AI-SDLC-v1.html"><![CDATA[<p><em>A short position paper</em></p>

<h2 id="purpose">Purpose</h2>

<p>This paper proposes <strong>AI-SDLC v1.0</strong>, a software development lifecycle designed for a world where <strong>AI performs most implementation work</strong> and <strong>humans own intent, constraints, and decisions</strong>.</p>

<p>It does not promote a tool, framework, or methodology brand.<br />
It describes a structural shift in how software is built.</p>

<hr />

<h2 id="the-historical-bottleneck">The historical bottleneck</h2>

<p>Traditional SDLCs evolved to address a single dominant constraint: <strong>human execution</strong>.</p>

<p>Historically:</p>
<ul>
  <li>writing code was slow and expensive,</li>
  <li>refactoring carried high risk,</li>
  <li>coordination between people was costly,</li>
  <li>late errors were hard to correct.</li>
</ul>

<p>Processes such as Waterfall and Agile are optimised around these realities.<br />
Their practices exist to manage execution cost and coordination risk.</p>

<hr />

<h2 id="what-ai-changed">What AI changed</h2>

<p>AI collapses the implementation cost.</p>

<p>Today, systems can:</p>
<ul>
  <li>generate and rewrite code quickly,</li>
  <li>refactor large codebases cheaply,</li>
  <li>produce tests and infrastructure automatically.</li>
</ul>

<p>Execution is no longer the primary bottleneck in many environments.</p>

<hr />

<h2 id="the-new-bottleneck">The new bottleneck</h2>

<p>When execution becomes cheap, a different constraint dominates:</p>

<ul>
  <li>unclear intent,</li>
  <li>weak constraints,</li>
  <li>untested assumptions,</li>
  <li>delayed or misleading feedback,</li>
  <li>poor decisions made without evidence.</li>
</ul>

<p>Speed amplifies outcomes, both good and bad.<br />
Building faster does not help if the direction is wrong.</p>

<p>The SDLC must therefore optimise for <strong>decision quality</strong>, not task throughput.</p>

<hr />

<h2 id="design-goals-of-an-ai-native-sdlc">Design goals of an AI-native SDLC</h2>

<p>An AI-native SDLC should:</p>

<ul>
  <li>make intent explicit before execution,</li>
  <li>treat constraints as enforceable system rules,</li>
  <li>allow implementation to be replaced easily,</li>
  <li>enforce quality and safety automatically,</li>
  <li>rely on observed runtime behaviour,</li>
  <li>preserve clear human accountability.</li>
</ul>

<p>These goals define AI-SDLC v1.0.</p>

<hr />

<h2 id="ai-sdlc-v10-lifecycle">AI-SDLC v1.0 lifecycle</h2>

<p>AI-SDLC v1.0 is a <strong>closed decision loop</strong>, not a linear delivery pipeline.</p>

<h3 id="1-intent">1. Intent</h3>
<p>A human defines:</p>
<ul>
  <li>the problem,</li>
  <li>the desired outcome,</li>
  <li>non-negotiable constraints,</li>
  <li>key assumptions,</li>
  <li>stop or change conditions.</li>
</ul>

<h3 id="2-specification">2. Specification</h3>
<p>The intent is expressed in a machine-readable form describing:</p>
<ul>
  <li>behaviour and interfaces,</li>
  <li>data and state,</li>
  <li>non-functional requirements,</li>
  <li>acceptance scenarios,</li>
  <li>required observability.</li>
</ul>

<h3 id="3-automated-implementation">3. Automated implementation</h3>
<p>AI generates:</p>
<ul>
  <li>code,</li>
  <li>tests,</li>
  <li>infrastructure,</li>
  <li>instrumentation,</li>
  <li>supporting artefacts.</li>
</ul>

<p>Humans review outcomes rather than hand-crafting implementation.</p>

<h3 id="4-automated-enforcement">4. Automated enforcement</h3>
<p>Quality, security, performance, and cost limits are enforced automatically.<br />
Failures block progression.</p>

<h3 id="5-deployment">5. Deployment</h3>
<p>The system is deployed incrementally and safely.</p>

<h3 id="6-observation">6. Observation</h3>
<p>The running system produces evidence through:</p>
<ul>
  <li>user and system behaviour,</li>
  <li>reliability and performance data,</li>
  <li>operational cost.</li>
</ul>

<h3 id="7-decision">7. Decision</h3>
<p>A named decision owner chooses to:</p>
<ul>
  <li>continue,</li>
  <li>adjust,</li>
  <li>stop,</li>
  <li>or pivot.</li>
</ul>

<p>The decision closes the loop.</p>

<hr />

<h2 id="unit-of-progress">Unit of progress</h2>

<p>In AI-SDLC v1.0, progress is measured by <strong>decisions informed by evidence</strong>, not by the number of tasks completed.</p>

<p>Progress means:</p>
<ul>
  <li>uncertainty was reduced,</li>
  <li>assumptions were tested,</li>
  <li>direction became clearer.</li>
</ul>

<hr />

<h2 id="accountability">Accountability</h2>

<p>AI-SDLC v1.0 does not remove human responsibility.</p>

<p>Humans remain accountable for:</p>
<ul>
  <li>intent,</li>
  <li>constraints,</li>
  <li>decisions,</li>
  <li>outcomes.</li>
</ul>

<p>AI executes. Humans decide.</p>

<hr />

<h2 id="conclusion">Conclusion</h2>

<p>AI changes the economics of software development.<br />
When execution is cheap, decision quality becomes the dominant factor.</p>

<p>AI-SDLC v1.0 aligns the SDLC with this reality by treating intent, constraints, evidence, and accountability as first-class elements, and by positioning AI as the primary executor rather than the primary decision-maker.</p>]]></content><author><name>Khalid Taha</name></author><category term="Other" /><summary type="html"><![CDATA[A short position paper]]></summary></entry><entry><title type="html">Rethinking the Software Development Lifecycle for an AI Builder</title><link href="https://khalid-taha.github.io/2025/12/24/AI-SDLC.html" rel="alternate" type="text/html" title="Rethinking the Software Development Lifecycle for an AI Builder" /><published>2025-12-24T00:00:00+00:00</published><updated>2025-12-24T00:00:00+00:00</updated><id>https://khalid-taha.github.io/2025/12/24/AI-SDLC</id><content type="html" xml:base="https://khalid-taha.github.io/2025/12/24/AI-SDLC.html"><![CDATA[<p>Abstract</p>

<p>The software development lifecycle (SDLC) has historically been designed around the constraints of human execution. Planning frameworks, coordination mechanisms, and delivery processes emerged to manage slow, manual implementation and high integration risk. Recent advances in artificial intelligence fundamentally change these constraints. When AI systems can generate, modify, and discard code at low cost, execution ceases to be the primary bottleneck. This paper argues that the SDLC must be redesigned accordingly. It proposes AI-SDLC v1.0, a decision-driven lifecycle in which humans define intent and constraints, AI performs implementation, automated systems enforce quality and safety, and observed runtime behaviour determines direction.</p>

<p>⸻</p>

<h2 id="1-introduction">1. Introduction</h2>

<p>For decades, software delivery has been limited by the cost and risk of implementation. Writing code, integrating changes, testing, and deploying reliably required significant human effort and coordination. The SDLC evolved to manage these realities.</p>

<p>Artificial intelligence changes this equation. Large language models and AI-assisted tooling can now generate working systems, refactor existing codebases, and produce tests and infrastructure at speeds that were previously impractical. As a result, implementation is no longer the dominant constraint in many software projects.</p>

<p>This shift requires a corresponding change in how software development is organised and governed.</p>

<p>⸻</p>

<h2 id="2-the-historical-sdlc-constraint">2. The historical SDLC constraint</h2>

<p>Traditional SDLC models assumed:</p>

<ul>
  <li>implementation was expensive and slow,</li>
  <li>errors were costly to correct late,</li>
  <li>coordination between people was a major risk,</li>
  <li>parallel work increased integration complexity.</li>
</ul>

<p>Waterfall, Agile, and their variants are optimisations around these assumptions. Their practices—backlogs, sprint cycles, hand-offs, and ceremonies—exist primarily to manage human execution and coordination.</p>

<p>When these assumptions no longer hold, the effectiveness of the process degrades.</p>

<p>⸻</p>

<h2 id="3-the-impact-of-ai-on-execution-cost">3. The impact of AI on execution cost</h2>

<p>AI systems significantly reduce the marginal cost of:</p>

<ul>
  <li>writing and rewriting code,</li>
  <li>generating tests and scaffolding,</li>
  <li>refactoring across large codebases,</li>
  <li>producing infrastructure and configuration artefacts.</li>
</ul>

<p>As execution cost falls, new risks dominate:</p>

<ul>
  <li>unclear intent,</li>
  <li>poorly defined constraints,</li>
  <li>untested assumptions,</li>
  <li>delayed or misleading feedback,</li>
  <li>decisions made without evidence.</li>
</ul>

<p>In this environment, delivering software faster does not guarantee better outcomes. Speed amplifies both correct and incorrect decisions.</p>

<p>⸻</p>

<h2 id="4-the-new-primary-constraint-decision-quality">4. The new primary constraint: decision quality</h2>

<p>In an AI-enabled environment, the dominant constraint shifts upstream:</p>

<ul>
  <li>What problem should be solved?</li>
  <li>What outcome matters?</li>
  <li>What constraints must not be violated?</li>
  <li>What assumptions are being made?</li>
  <li>What evidence will justify continuing, changing, or stopping?</li>
</ul>

<p>The SDLC must therefore optimise for decision quality, not task throughput.</p>

<p>⸻</p>

<h2 id="5-design-goals-for-an-ai-native-sdlc">5. Design goals for an AI-native SDLC</h2>

<p>An SDLC designed for an AI builder should:</p>

<ol>
  <li>Make intent explicit before execution.</li>
  <li>Treat constraints as enforceable system properties.</li>
  <li>Allow implementation to be replaced without ceremony.</li>
  <li>Enforce quality, security, and safety automatically.</li>
  <li>Base decisions on observed system behaviour.</li>
  <li>Maintain clear human accountability.</li>
</ol>

<p>These goals inform AI-SDLC v1.0.</p>

<p>⸻</p>

<h2 id="6-ai-sdlc-v10">6. AI-SDLC v1.0</h2>

<p>AI-SDLC v1.0 is a decision-driven lifecycle composed of a closed loop rather than a linear delivery pipeline.</p>

<h3 id="61-intent-definition">6.1 Intent definition</h3>

<p>A human defines:</p>

<ul>
  <li>the problem being addressed,</li>
  <li>the affected users or systems,</li>
  <li>the desired outcome,</li>
  <li>non-negotiable constraints,</li>
  <li>key assumptions,</li>
  <li>conditions that would justify stopping or changing direction.</li>
</ul>

<p>This intent is concise and explicit. No implementation begins without it.</p>

<p>⸻</p>

<h3 id="62-formal-specification">6.2 Formal specification</h3>

<p>The intent is translated into a machine-readable specification describing:</p>

<ul>
  <li>system behaviour and interfaces,</li>
  <li>data and state,</li>
  <li>non-functional requirements,</li>
  <li>acceptance scenarios,</li>
  <li>required observability.</li>
</ul>

<p>The specification captures assumptions and boundaries rather than promising a fixed solution.</p>

<p>⸻</p>

<h3 id="63-automated-implementation">6.3 Automated implementation</h3>

<p>AI systems generate:</p>

<ul>
  <li>application code,</li>
  <li>tests,</li>
  <li>infrastructure configuration,</li>
  <li>instrumentation,</li>
  <li>supporting documentation.</li>
</ul>

<p>Human involvement focuses on review and judgement, not manual construction.</p>

<p>⸻</p>

<h3 id="64-automated-enforcement">6.4 Automated enforcement</h3>

<p>Automated gates enforce:</p>

<ul>
  <li>correctness through tests,</li>
  <li>security and dependency controls,</li>
  <li>performance and cost limits,</li>
  <li>deployment safety mechanisms.</li>
</ul>

<p>Failures block progression without negotiation.</p>

<p>⸻</p>

<h3 id="65-deployment">6.5 Deployment</h3>

<p>Deployment is incremental and controlled. It is treated as the start of observation rather than the end of development.</p>

<p>⸻</p>

<h3 id="66-observation">6.6 Observation</h3>

<p>The running system produces evidence, including:</p>

<ul>
  <li>user and system behaviour,</li>
  <li>reliability and performance data,</li>
  <li>operational cost,</li>
  <li>failure modes and friction points.</li>
</ul>

<p>This evidence must be defined prior to implementation.</p>

<p>⸻</p>

<h3 id="67-decision">6.7 Decision</h3>

<p>A named decision owner reviews the evidence and chooses one action:</p>

<ul>
  <li>continue,</li>
  <li>adjust,</li>
  <li>stop,</li>
  <li>pivot.</li>
</ul>

<p>The decision and its basis are recorded, closing the loop.</p>

<p>⸻</p>

<h2 id="7-unit-of-progress">7. Unit of progress</h2>

<p>In AI-SDLC v1.0, the primary unit of progress is a decision informed by evidence, not a completed task.</p>

<p>Progress is measured by:</p>

<ul>
  <li>reduced uncertainty,</li>
  <li>validated or invalidated assumptions,</li>
  <li>clearer direction.</li>
</ul>

<p>⸻</p>

<h2 id="8-accountability-and-roles">8. Accountability and roles</h2>

<p>AI-SDLC v1.0 does not eliminate human responsibility. It clarifies it.</p>

<p>Required functions include:</p>

<ul>
  <li>intent ownership,</li>
  <li>decision accountability,</li>
  <li>system stewardship,</li>
  <li>automated execution.</li>
</ul>

<p>These functions may be combined or separated as appropriate.</p>

<p>⸻</p>

<h2 id="9-implications">9. Implications</h2>

<p>AI-SDLC v1.0 does not reject prior methodologies. It recognises that they were optimised for a different constraint set. As implementation cost collapses, SDLCs must prioritise clarity, evidence, and decision-making.</p>

<p>Organisations that continue optimising for execution speed alone risk becoming faster at producing the wrong outcomes.</p>

<p>⸻</p>

<h2 id="10-conclusion">10. Conclusion</h2>

<p>AI fundamentally alters the economics of software development. When execution is cheap, decision quality becomes the dominant factor. AI-SDLC v1.0 proposes a lifecycle aligned with this reality: one that treats intent, constraints, evidence, and accountability as first-class elements, and positions AI as the primary executor rather than the primary decision-maker.</p>

<p>⸻</p>]]></content><author><name>Khalid Taha</name></author><category term="Other" /><summary type="html"><![CDATA[Abstract]]></summary></entry><entry><title type="html">Moving Past the Data: Challenges in Adopting AI in Industry</title><link href="https://khalid-taha.github.io/2025/08/07/Moving-Past-the-Data.html" rel="alternate" type="text/html" title="Moving Past the Data: Challenges in Adopting AI in Industry" /><published>2025-08-07T00:00:00+00:00</published><updated>2025-08-07T00:00:00+00:00</updated><id>https://khalid-taha.github.io/2025/08/07/Moving-Past-the-Data</id><content type="html" xml:base="https://khalid-taha.github.io/2025/08/07/Moving-Past-the-Data.html"><![CDATA[<p><strong>Author:</strong> Khalid Taha<br />
<strong>Date:</strong> 7 August 2025</p>

<h2 id="abstract">Abstract</h2>

<p>High-quality data is crucial for the successful adoption of AI, yet many initiatives falter due to various other factors. This paper highlights four key structural challenges that obstruct AI implementation: complexity of integration, organizational inertia, ambiguity regarding return on investment, and an excessive focus on data perfection. Insights from industry benchmarks and recent studies suggest that achieving AI readiness requires a holistic approach that goes beyond merely having clean data; it demands systemic alignment throughout the organization.</p>

<h2 id="1-introduction">1. Introduction</h2>

<p>Some industry experts link the failures of AI projects to issues with data quality. Yet, few studies systematically assess this belief, and many documented failures stem from broader challenges that aren’t solely related to data inputs. In their 2023 study, Zha et al. introduce a framework for data-centric AI that evaluates readiness in three key areas: the development of training data, the preparation of inference data, and ongoing maintenance. They suggest that placing too much emphasis on model architecture has shifted focus away from critical data and systems engineering challenges. Their findings advocate for a more comprehensive view that emphasizes the need for alignment between technical and organizational systems, rather than just improved datasets.</p>

<h2 id="2-integration-complexity">2. Integration Complexity</h2>

<p>AI systems often need to work alongside legacy software, batch processes, and disconnected workflows. Many operational systems struggle to handle real-time outputs or incorporate predictive decisions effectively. This challenge is illustrated by Mazumder et al. (2024) in their study of the Gen-QOT inventory control framework. They model realistic constraints such as specific lead times from suppliers, rules for batch shipments, and patterns of disruption. Even when using clean, simulated data, the outputs of these models can fall flat unless they are seamlessly integrated into existing business workflows. The main hurdle isn’t the quality of the data; instead, it’s the difficulty of applying predictions to decision-making in legacy systems. Numerous projects come to a standstill because downstream systems cannot utilize model outputs.</p>

<h2 id="3-organizational-inertia">3. Organizational Inertia</h2>

<p>The success of AI hinges on the people involved and the processes in place, not merely on the algorithms themselves. Often, internal resistance, insufficient training, and ambiguous accountability can hinder deployment efforts.</p>

<p>The DataPerf suite (Mazumder et al., 2023) evaluates data-focused interventions in areas like vision, speech, and retail. Their research indicates that human-driven initiatives, such as targeted data cleaning and effective sample selection, led to performance improvements that surpassed those achieved through model tuning. Organizations that engage in ongoing collaboration with their data teams generally outpace those that strive for a flawless dataset from the start.</p>

<p>Ultimately, organizational readiness—characterized by teamwork, iterative development, and clear ownership—proves to be more influential than infrastructure alone.</p>

<h2 id="4-economic-friction-and-roi-ambiguity">4. Economic Friction and ROI Ambiguity</h2>

<p>Many businesses are often reluctant to adopt AI due to uncertainties around its short-term value. Complex solutions frequently struggle to provide quick or apparent returns. In a study by Chandra et al. (2024), various forecasting models were assessed using retail data. Surprisingly, simpler machine learning methods, such as LightGBM and XGBoost, which are based on decision trees, outperformed deep neural networks. These models not only trained faster and offered more precise explanations for their outcomes but also required considerably fewer resources. Their success stemmed not from sophisticated data or computing power, but from their practical fit with real-world deployment challenges. By selecting models that align closely with business needs rather than focusing solely on technical innovation, companies can effectively lower costs and mitigate risks.</p>

<h2 id="5-tolerance-of-ai-to-imperfect-data">5. Tolerance of AI to Imperfect Data</h2>

<p>AI has shown remarkable resilience, performing effectively even in situations where data is noisy, incomplete, or censored, as when stockouts hide true demand. A notable development in this area is the FreshRetailNet-50K benchmark introduced by Singh et al. (2024) for demand forecasting during stockouts. Their research explored various time-series models like TimesNet and ImputeFormer, which leverage attention mechanisms to fill in the gaps left by missing demand signals. Despite a significant 30% data loss, these models managed to maintain low bias and high accuracy, challenging the traditional belief that missing data renders forecasting unreliable. Thanks to these advancements, modern models are now equipped to handle the messiness often found in real-world scenarios. As a result, businesses can begin their data-driven initiatives with imperfect information rather than holding out for complete data quality.</p>

<h2 id="6-conclusion">6. Conclusion</h2>

<p>The successful deployment of AI involves more than just having clean data. It requires the seamless integration of existing systems, alignment with business priorities, and strong collaboration among teams. Benchmarks like DataPerf and FreshRetailNet-50K highlight that readiness stems from organizational adaptability, not merely the quality of input data. Enhancing AI maturity entails investing in system design, refining organizational processes, and embracing iterative practices—not solely focusing on datasets.</p>

<h2 id="references">References</h2>

<p>Zha, D., Bhat, Z. P., Lai, K. H., Yang, F., Jiang, Z., Zhong, S., &amp; Hu, X. (2023). Data-centric Artificial Intelligence: A Survey. <em>arXiv preprint arXiv:2303.10158</em>.</p>

<p>Mazumder, M., et al. (2023). DataPerf: Benchmarks for Data-Centric AI Development. <em>arXiv preprint arXiv:2207.10062</em>.</p>

<p>Mazumder, M., et al. (2024). Learning an Inventory Control Policy with General Inventory Arrival Dynamics. <em>arXiv preprint arXiv:2405.17533</em>.</p>

<p>Chandra, R., Ruj, S., &amp; Pal, A. (2024). Comparative Analysis of Modern Machine Learning Models for Retail Sales Forecasting. <em>arXiv preprint arXiv:2506.05941</em>.</p>

<p>Singh, K. K., et al. (2024). FreshRetailNet-50K: A Stockout-Annotated Censored Demand Dataset for Latent Demand Recovery and Forecasting in Fresh Retail. <em>arXiv preprint arXiv:2405.10468</em>.</p>]]></content><author><name>Khalid Taha</name></author><category term="Other" /><summary type="html"><![CDATA[Author: Khalid Taha Date: 7 August 2025]]></summary></entry><entry><title type="html">Making a Stateless LLM Project‑Aware</title><link href="https://khalid-taha.github.io/2025/07/26/Making-a-Stateless-LLM-Project-Aware.html" rel="alternate" type="text/html" title="Making a Stateless LLM Project‑Aware" /><published>2025-07-26T00:00:00+00:00</published><updated>2025-07-26T00:00:00+00:00</updated><id>https://khalid-taha.github.io/2025/07/26/Making-a-Stateless-LLM-Project%E2%80%91Aware</id><content type="html" xml:base="https://khalid-taha.github.io/2025/07/26/Making-a-Stateless-LLM-Project-Aware.html"><![CDATA[<p>Large language models have goldfish memories—they don’t recall past calls unless you hand them that context every single time. Yet you <em>can</em> run a weeks‑long coding project with ChatGPT (or any other LLM) if you wrap a thin layer of “state management” around each request.<br />
Below you’ll find the key ideas in plain, practical language. They work no matter how you connect: web chat, REST API, CLI, whatever.</p>

<hr />

<h3 id="1persist-state-outside-the-conversation">1  Persist state outside the conversation</h3>

<p>Treat the model like a freelance coder who clears their desk at the end of the day. Anything you don’t file away will vanish.<br />
Keep three small documents in your own storage—Git, S3, a database row, anything you control:</p>

<table>
  <thead>
    <tr>
      <th>File</th>
      <th>What’s inside</th>
      <th>Why it matters</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><strong>Blueprint / spec</strong></td>
      <td>The lasting requirements and architecture notes.</td>
      <td>Gives the model a north‑star every call.</td>
    </tr>
    <tr>
      <td><strong>Journal / log</strong></td>
      <td>A running human‑readable diary of what changed and why.</td>
      <td>Lets you audit progress and decisions.</td>
    </tr>
    <tr>
      <td><strong>State file</strong></td>
      <td>Exactly one entry like <code>next_task: "build parser"</code> plus a <code>status</code>.</td>
      <td>Tells the model what to do <em>right now</em>.</td>
    </tr>
  </tbody>
</table>

<p>The assistant only reads and edits these files; it never invents its own version of reality.</p>
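
<p>For concreteness, here’s a minimal sketch of that state file and a tiny helper for reading and writing it. The JSON layout, the field names (<code>next_task</code>, <code>status</code>), and the <code>state.json</code> filename are illustrative assumptions, not a required format; swap in whatever your storage expects.</p>

<pre><code class="language-python">import json
from pathlib import Path

STATE_FILE = Path("state.json")  # assumed filename; keep it anywhere you control

def load_state():
    """Read the tiny control-plane record: exactly one task plus its status."""
    return json.loads(STATE_FILE.read_text())

def save_state(next_task, status):
    """Overwrite the state file so there is never more than one open task."""
    STATE_FILE.write_text(json.dumps({"next_task": next_task, "status": status}, indent=2))

# The whole file is just something like:
# {
#   "next_task": "build parser",
#   "status": "todo"
# }
</code></pre>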

<hr />

<h3 id="2embed-a-deterministic-contract-in-every-prompt">2  Embed a deterministic contract in every prompt</h3>

<p>Before you hit “send,” your wrapper code builds a prompt that repeats the ground rules:</p>

<ol>
  <li>Load the state file.</li>
  <li>If there’s one task marked <strong>todo</strong>, finish it in this turn.</li>
  <li>If nothing is todo, pull the next milestone from the spec and write it into the state file.</li>
  <li>Make sure there’s <strong>never more than one open task</strong>.</li>
</ol>

<p>Because you restate the contract every time, the model can’t drift—even if the chat history is empty.</p>
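
<p>As a sketch, the wrapper might assemble each prompt like this. The wording of the contract and the section markers are assumptions to adapt, not a fixed format.</p>

<pre><code class="language-python">CONTRACT = """You are the project executor. Rules for this turn:
1. Read the state file below and act on it.
2. If the task status is 'todo', finish that task in this turn.
3. If nothing is 'todo', pull the next milestone from the spec and write it into the state file.
4. Keep exactly one open task, and return every file you touched in full.
"""

def build_prompt(spec_text, journal_text, state_text):
    """Restate the ground rules on every call so the model cannot drift."""
    return "\n\n".join([
        CONTRACT,
        "=== SPEC ===\n" + spec_text,
        "=== JOURNAL (recent) ===\n" + journal_text[-2000:],  # keep prompts lean
        "=== STATE ===\n" + state_text,
    ])
</code></pre>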

<hr />

<h3 id="3return-full-files-not-diffs">3  Return full files, not diffs</h3>

<p>Ask for the <em>entire</em> contents of each file it touched. Why? You can overwrite the old file without worrying about merge conflicts, and your test runner can prove everything still works. The transport doesn’t matter—fenced code blocks, JSON parts, multipart: just send the whole thing.</p>
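
<p>One way to consume whole-file replies, assuming you and the model agree on a convention where each fenced block’s info string carries the target path (a convention you define yourself, not a built-in feature of any model):</p>

<pre><code class="language-python">import re
from pathlib import Path

# Matches blocks of the agreed form:
# ```path=src/parser.py
# ...entire file contents...
# ```
FILE_BLOCK = re.compile(r"```path=(\S+)\n(.*?)```", re.DOTALL)

def apply_reply(reply_text):
    """Overwrite each returned file in full; no diffs, no merge conflicts."""
    for path, body in FILE_BLOCK.findall(reply_text):
        target = Path(path)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(body)
</code></pre>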

<hr />

<h3 id="4keep-each-round-bitesized-and-atomic">4  Keep each round bite‑sized and atomic</h3>

<ul>
  <li>One todo → one model call → one test run.</li>
  <li>Tests happen <strong>after</strong> files are written.</li>
  <li>If tests fail, you feed the error log back; the same todo stays put.</li>
  <li>If tests pass, mark the task <strong>done</strong> and queue the next one.</li>
</ul>

<p>Small, clear steps mean bugs are easy to trace and roll back.</p>

<hr />

<h3 id="5park-heavy-data-elsewhere">5  Park heavy data elsewhere</h3>

<p>Chat messages should stay text‑only. Big binaries (images, model weights, year‑long CSVs) live in cloud storage; the state file just holds a link or hash. That keeps prompts fast and avoids attachment headaches.</p>
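
<p>A small sketch of the pointer idea: store the artefact somewhere durable and keep only a reference plus a content hash in the state file. The field names and the example URI below are made up for illustration.</p>

<pre><code class="language-python">import hashlib
from pathlib import Path

def reference_artifact(local_path, remote_uri):
    """Return a small, prompt-friendly record instead of the heavy file itself."""
    digest = hashlib.sha256(Path(local_path).read_bytes()).hexdigest()
    return {"artifact_uri": remote_uri, "sha256": digest}

# Example entry the state file might carry:
# {"artifact_uri": "s3://my-bucket/weights-v3.bin", "sha256": "9f2c..."}
</code></pre>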

<hr />

<h3 id="6automate-the-envelope-let-the-model-do-the-thinking">6  Automate the envelope, let the model do the thinking</h3>

<p>A lightweight driver script can:</p>

<ol>
  <li>Read the three docs from storage.</li>
  <li>Build the prompt with the contract above.</li>
  <li>Call the LLM.</li>
  <li>Parse the reply, save the files.</li>
  <li>Run your test suite; if green, commit and push; if red, send the errors back as the next prompt.</li>
</ol>

<p>The script handles routine plumbing; the model focuses on writing code and updating the state doc.</p>
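
<p>Putting the pieces together, one round of the loop might look like the sketch below. It reuses the helpers from the earlier sketches, and <code>call_llm</code> is a placeholder for whatever API client you use; none of this is tied to a specific SDK.</p>

<pre><code class="language-python">import subprocess
from pathlib import Path

# Assumes load_state/save_state, build_prompt, and apply_reply from the sketches above.

def run_tests():
    """Run the test suite; return (passed, combined output)."""
    result = subprocess.run(["pytest", "-q"], capture_output=True, text=True)
    return result.returncode == 0, result.stdout + result.stderr

def drive_one_round(call_llm):
    """One todo, one model call, one test run."""
    prompt = build_prompt(
        Path("spec.md").read_text(),
        Path("journal.md").read_text(),
        Path("state.json").read_text(),
    )
    reply = call_llm(prompt)   # your API client of choice goes here
    apply_reply(reply)         # overwrite the files the model returned

    passed, output = run_tests()
    if passed:
        state = load_state()
        save_state(state["next_task"], "done")
        subprocess.run(["git", "commit", "-am", "AI round: " + state["next_task"]])
    else:
        # Feed the failure log back next round; the same todo stays put.
        with Path("journal.md").open("a") as journal:
            journal.write("\nTest failure:\n" + output)
</code></pre>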

<hr />

<p><strong>Take‑away</strong><br />
Long‑running work with an LLM isn’t magic; it’s a simple protocol. Persist a tiny control plane (spec, journal, state) and remind the model of the rules at every turn. Follow that rhythm and your AI teammate will pick up exactly where it left off—no matter how or when you call it.</p>]]></content><author><name>Khalid Taha</name></author><category term="Other" /><summary type="html"><![CDATA[Large language models have goldfish memories—they don’t recall past calls unless you hand them that context every single time. Yet you can run a weeks‑long coding project with ChatGPT (or any other LLM) if you wrap a thin layer of “state management” around each request. Below you’ll find the key ideas in plain, practical language. They work no matter how you connect: web chat, REST API, CLI, whatever.]]></summary></entry><entry><title type="html">From Code to Conversation: Embracing High-Level Prompt Languages</title><link href="https://khalid-taha.github.io/2025/02/07/From_Code_to_Conversation.html" rel="alternate" type="text/html" title="From Code to Conversation: Embracing High-Level Prompt Languages" /><published>2025-02-07T00:00:00+00:00</published><updated>2025-02-07T00:00:00+00:00</updated><id>https://khalid-taha.github.io/2025/02/07/From_Code_to_Conversation</id><content type="html" xml:base="https://khalid-taha.github.io/2025/02/07/From_Code_to_Conversation.html"><![CDATA[<p>Programming is evolving. As AI continues to advance, the role of developers is shifting from writing every line of code to specifying what software should do at a much higher level of abstraction. This evolution suggests a future where developers may work primarily with a standardized prompt language above traditional programming languages like Python.</p>

<hr />

<p><strong>A New Layer of Abstraction</strong></p>

<p>For decades, the programming landscape has moved upward along the abstraction ladder. Initially, programmers had to manage low-level machine instructions. The advent of higher-level languages allowed us to focus more on logic and less on hardware details. With AI-powered code generation, there’s potential to push this even further. Rather than manually coding every detail, we can describe user stories, requirements, and edge cases in natural or formal prompt language.</p>

<p>Imagine a scenario where you write a detailed prompt that encapsulates what the software should do. This prompt—validated by a syntax checker and processed by a compiler or interpreter designed for the prompt language—would then be translated by AI into working code. This approach could streamline development, reduce ambiguity, and let developers concentrate on the strategic aspects of system design.</p>
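
<p>As a purely illustrative sketch (the section names and rules below are invented, not an existing standard), the “syntax checker” step could start as nothing more than validating that a structured specification carries the required sections before it ever reaches the model:</p>

<pre><code class="language-python">REQUIRED_SECTIONS = ["goal", "inputs", "outputs", "constraints", "edge_cases"]

def check_prompt_spec(spec):
    """Reject a high-level specification that is missing required sections."""
    missing = [name for name in REQUIRED_SECTIONS if not spec.get(name)]
    if missing:
        raise ValueError("Prompt spec is missing sections: " + ", ".join(missing))
    return spec

# Example usage:
# check_prompt_spec({
#     "goal": "Parse supplier invoices into line items",
#     "inputs": "PDF invoices up to 10 pages",
#     "outputs": "JSON records, one per line item",
#     "constraints": "No external network calls",
#     "edge_cases": "Multi-currency invoices, scanned images",
# })
</code></pre>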

<hr />

<p><strong>The Advantages of a Prompt Language</strong></p>

<ol>
  <li>
    <p><strong>Higher-Level Thinking:</strong>
Prompt language allows us to operate at a higher level than traditional code. Instead of focusing on minute details, developers would define the software’s goal, letting AI handle the translation into executable code.</p>
  </li>
  <li>
    <p><strong>Clarity and Precision:</strong>
With a standardized syntax and semantics, a formal prompt language could reduce the ambiguities inherent in natural language. Early error detection through syntax checking would improve the development process and reduce the need for multiple follow-up prompts.</p>
  </li>
  <li>
    <p><strong>Enhanced Collaboration:</strong>
As AI handles more low-level implementation, developers can focus on system architecture, user experience, and overall strategy. This shift would foster a more collaborative environment where human insight and machine precision work hand in hand.</p>
  </li>
</ol>

<hr />

<p><strong>Challenges on the Path Forward</strong></p>

<p>Despite its promise, adopting a prompt language is not without challenges:</p>

<ul>
  <li>
    <p><strong>Ambiguity in Natural Language:</strong>
While natural language is flexible and accessible, its inherent ambiguity can lead to misinterpretations. A formalized prompt language must strike a balance between accessibility and precision.</p>
  </li>
  <li>
    <p><strong>Learning Curve:</strong>
Developers may need to acquire new skills to craft high-level prompts effectively. Transitioning from writing detailed code to articulating comprehensive requirements will require adjustments in mindset and practice.</p>
  </li>
  <li>
    <p><strong>Human Oversight Remains Crucial:</strong>
Even as AI-generated code becomes more reliable, human expertise is indispensable. Developers will still need to validate the code, manage edge cases, and ensure the final product is robust and secure. AI acts as a powerful tool, but the developer is ultimately responsible for quality and integration.</p>
  </li>
</ul>

<hr />

<p><strong>A Collaborative Future</strong></p>

<p>As AI matures, we can expect a future where the line between coding and conversation blurs. Developers might spend more time refining their high-level specifications and less time wrestling with syntax errors or debugging low-level code. This shift could democratize software development, making it accessible to a broader range of people and fostering greater innovation.</p>

<p>In this emerging paradigm, the relationship between humans and machines is collaborative. Developers provide the vision and context, while AI handles the detailed implementation. This partnership increases productivity and opens new avenues for creativity as attention shifts toward conceptual design and system architecture.</p>

<hr />

<p><strong>Conclusion</strong></p>

<p>The evolution toward a standardized prompt language represents a profound shift in how we approach programming. By moving to higher levels of abstraction, we can simplify the development process and leverage AI to do the heavy lifting. While challenges remain—especially in balancing precision with accessibility—this new model promises a future where programming is less about writing code line by line and more about designing robust, innovative systems through precise, high-level specifications.</p>

<p>Embracing this shift will require developers to rethink their roles, transitioning from code writers to architects of ideas. As we refine our methods and tools, the promise of a prompt-driven future could fundamentally transform the software development landscape.</p>]]></content><author><name>Khalid Taha</name></author><category term="Other" /><summary type="html"><![CDATA[Programming is evolving. As AI continues to advance, the role of developers is shifting from writing every line of code to specifying what software should do at a much higher level of abstraction. This evolution suggests a future where developers may work primarily with a standardized prompt language above traditional programming languages like Python.]]></summary></entry><entry><title type="html">Orchestrating LLMs: Building a Python Bookkeeping System with AI Collaboration</title><link href="https://khalid-taha.github.io/2024/11/07/Orchestrating-LLMs.html" rel="alternate" type="text/html" title="Orchestrating LLMs: Building a Python Bookkeeping System with AI Collaboration" /><published>2024-11-07T00:00:00+00:00</published><updated>2024-11-07T00:00:00+00:00</updated><id>https://khalid-taha.github.io/2024/11/07/Orchestrating-LLMs</id><content type="html" xml:base="https://khalid-taha.github.io/2024/11/07/Orchestrating-LLMs.html"><![CDATA[<p>In my recent project, I embarked on an exciting journey to develop a <strong>bookkeeping system using Python</strong>, heavily leveraging the capabilities of a <strong>large language model (LLM)</strong>. The goal was to build an application and explore how LLMs can assume various roles in software development—acting as <strong>business analysts, system designers, and code developers</strong>.</p>

<p>The process began by engaging the LLM to create <strong>user stories and tasks</strong> and identify their dependencies. The LLM provided detailed tables outlining these dependencies, which I manually entered into GitHub’s project board. Although the automation wasn’t complete, the collaboration was seamless. I structured the project board with a simple workflow:</p>

<ul>
  <li><strong>User Story</strong></li>
  <li><strong>Backlog</strong></li>
  <li><strong>To-Do</strong></li>
  <li><strong>In Progress</strong></li>
  <li><strong>Testing</strong></li>
  <li><strong>Done</strong></li>
</ul>

<p>This was supplemented with <strong>labels and milestones</strong>—all guided by the LLM’s output.</p>

<p>For each user story, I asked the LLM to generate tasks, and I requested <strong>developer instructions</strong> for each task. I then generated <strong>code snippets</strong> by prompting the LLM to act as a developer. I was an <strong>AI Orchestrator</strong>, coordinating this process—copying code into <strong>VSCode</strong>, testing it using <strong>scripts generated by the LLM</strong>, and troubleshooting errors with its assistance.</p>

<p>The system features <strong>APIs and dummy machine-learning plugins</strong>, hinting at future integrations. The LLM also helped generate comprehensive <strong>API and plugin user guides</strong>, covering the application’s <strong>design, implementation, and testing phases</strong>.</p>

<p>Throughout this experiment, I tested several <strong>LLM models</strong> from different providers. While some fell short, one stood out in delivering exceptional results. Although I prefer not to advertise it openly, I’m open to sharing details collaboratively. This project underscores the <strong>potential of LLMs in software development</strong> and the evolving role of developers in orchestrating AI capabilities.</p>

<p>I’m excited about where this synergy between <strong>human coordination and AI</strong> can lead, especially with plans to enhance the <strong>ML plugins</strong>. This approach doesn’t replace developers but <strong>augments our ability to build complex systems efficiently</strong>.</p>

<p>You can explore the project’s code on my <a href="https://github.com/khalid-taha/bookkeeping"><strong>GitHub repository</strong></a>, linked to my <a href="https://www.linkedin.com/in/khalid-a-taha"><strong>LinkedIn profile</strong></a>. I’m sharing this experience through my startup, <a href="https://robotface.co.uk"><strong>RobotFace AI</strong></a>, as a testament to the innovative possibilities when humans and AI collaborate.</p>]]></content><author><name>Khalid Taha</name></author><category term="Other" /><summary type="html"><![CDATA[In my recent project, I embarked on an exciting journey to develop a bookkeeping system using Python, heavily leveraging the capabilities of a large language model (LLM). The goal was to build an application and explore how LLMs can assume various roles in software development—acting as business analysts, system designers, and code developers.]]></summary></entry></feed>