
Abe's AI Methodology

A practitioner's framework for building with AI. Developed from a decade in enterprise systems.

Where I started

Before “AI” was the popular term, Business Intelligence (BI) dominated the corporate world. The technology we now call AI was already there, embedded in enterprise products but far from mature, and without a consumer footprint. My undergrad and research years were in that era. I worked directly with SAP products, earning an institutional SAP certificate in data warehousing, business intelligence, and ERP systems: real-time analytics, in-memory processing, predictive modeling integrated into ERP. Organizations used these tools to compress forecasting cycles, optimize supply chains, and outpace competitors who were still running batch reports. Every company wanted “data-driven” on their slide deck, the way they would later want “blockchain,” and “AI” after that. Vendors rebranded old products as “intelligent.” The executives chasing the label wasted money. The practitioners who connected each initiative to a stakeholder’s actual problem were the ones who shipped results. That cycle taught me the lesson I still operate on: the technology is never the hard part. Connecting the technology to what actually drives the value proposition is.

The tooling

By 2018-2020, Machine Learning (ML) had entered the mainstream as a subset of AI focused on pattern recognition and prediction. When OpenAI launched ChatGPT in November 2022, the distinction became clear: ML learns from data, generative AI generates from it. The same hype cycle played out again. Companies stamped “AI-powered” on products that hadn’t changed. The SEC fined two firms in 2024 over AI claims that were fabricated outright. The industry called it AI washing. But BI never went away either, and AI practitioners are now evolving that space with capabilities that would have been unimaginable a decade ago.

On the development side, GitHub Copilot and other third-party IDE extensions improved the developer workflow with inline suggestions and autocomplete. They lived inside the editor and augmented how you wrote code. Anthropic’s Claude Code took a different approach entirely. Claude Code is a CLI-based agentic coding tool that operates directly in your terminal with full access to your filesystem. I’ve used it since August 2025 and followed its earlier incarnations back in the Sonnet days. Even then, it was promising and provided a unique developer experience in the AI space: planning, code generation, execution, and testing brought into a single interface, operating inside your development environment rather than alongside it.

Then Anthropic released Claude Opus 4.5 in November 2025, and my skepticism broke. I did not see AI development progressing much further at that point. Diminishing returns felt inevitable. Models training on AI-generated data felt like a ceiling. Opus 4.5 proved me wrong. The leap in reasoning, code generation, and sustained context handling across long sessions was tangible in every project I ran through it. On a broader scale, I watched software development itself become democratized. Anyone could bring an idea to a functional prototype in minutes. That spawned real concerns: the flood of AI slop, the complacency of vibe coding, and codebases that look complete but collapse under real-world pressure. But seasoned software engineers and developers understood this moment for what it was: the bottleneck had shifted from keystrokes to judgment. Architecture, systems thinking, domain knowledge, knowing when the AI is wrong: these fundamentals became more valuable, not less. AI didn’t close the gap between disciplined and undisciplined practitioners. It widened it.

What changed my practice was the broader conversation moving from Prompt Engineering to Context Engineering. Prompt engineering focused on crafting the right question. Context engineering focuses on architecting the right environment: what information the model sees, what persists across interactions, what tools are available, and how outputs flow back into the pipeline. When I started applying that to my own development process, the methodology below came together: encoding project strategy into instruction files, structuring PRDs as agent-readable contracts, and building validation loops that feed findings back into design.
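
As a mental model (not any framework’s actual API; every name here is illustrative), context engineering can be pictured as an assembly step that decides what the model sees on each turn:

```python
# Deliberately naive sketch: context engineering as assembly.
# All function and section names are invented for illustration.

def assemble_context(
    instructions: str,       # persistent ground rules (e.g. a CLAUDE.md)
    memory: list[str],       # what persists across interactions
    tool_specs: list[str],   # what tools are available
    task: str,               # the current request
) -> str:
    """Compose the environment the model will actually see."""
    sections = [
        "## Instructions\n" + instructions,
        "## Memory\n" + "\n".join(memory),
        "## Tools\n" + "\n".join(tool_specs),
        "## Task\n" + task,
    ]
    return "\n\n".join(sections)

prompt = assemble_context(
    "Ship only what the PRD specifies.",
    ["Validation found a layout regression last run."],
    ["run_tests", "take_screenshot"],
    "Fix the regression and re-run the gate.",
)
```

The point of the sketch is the shift in emphasis: the question (`task`) is one input among several, and the rest of the environment is engineered deliberately.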

How I work with AI

The industry already has useful pieces: Spec-Driven Development for defining intent, Test-Driven Development with AI agents, multi-agent orchestration through frameworks like CrewAI and LangGraph. None of them close the full loop from strategy to validation. What I care about is integrating the full pipeline, from business strategy through execution and validation, in a single loop.

It starts with encoding the strategic foundation directly into the AI tooling. Instruction files like CLAUDE.md and AGENTS.md carry the project’s mission, constraints, and ground rules where the agents can actually use them: the strategy, scoped and made executable. From there, I write a Product Requirements Document where every acceptance criterion maps to a testable outcome. The PRD is a contract between me and my agents. If I can’t write the test for it, the requirement is too vague.
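
As a sketch of that contract, a hypothetical criterion like “search returns results in under 200 ms” maps one-to-one onto a test. The criterion ID and the measurement helper below are invented for illustration, not taken from any real PRD:

```python
# One hypothetical PRD acceptance criterion expressed as a testable outcome.

def measure_search_latency_ms(query: str) -> float:
    """Stand-in for a real measurement against the running system."""
    return 120.0  # placeholder value for illustration

def test_prd_3_1_search_latency() -> None:
    # PRD 3.1: "Search returns results in under 200 ms."
    assert measure_search_latency_ms("quarterly report") < 200.0

test_prd_3_1_search_latency()  # passes silently when the criterion holds
```

If the criterion can’t be written this way, it isn’t specific enough to hand to an agent.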

Design and planning come next, evaluating the tech stack against requirements and mapping how end users will actually interact with the system. Then execution: AI agents running parallel workstreams with test-driven development as the quality gate. Each iteration measures against the PRD criteria. You ship when the MVP satisfies the contract.
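
One way to picture that quality gate, as a sketch with hypothetical criterion IDs and stubbed checks rather than any real tooling:

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Criterion:
    """One PRD acceptance criterion and the test that encodes it."""
    id: str
    description: str
    check: Callable[[], bool]

def quality_gate(criteria: List[Criterion]) -> List[str]:
    """Return IDs of unmet criteria; the MVP ships only when this is empty."""
    return [c.id for c in criteria if not c.check()]

# Illustrative criteria with stubbed checks.
criteria = [
    Criterion("3.1", "search returns in under 200 ms", lambda: True),
    Criterion("4.2", "export produces valid CSV", lambda: False),
]
unmet = quality_gate(criteria)  # a non-empty list means another iteration
```

Each iteration runs the gate; shipping is a consequence of the contract being satisfied, not a judgment call.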

The step most people skip is validation. I use pre-programmed inspection agents to run visual and functional checks against the original success criteria. They catch what human review gets lazy about: layout regressions, broken interaction flows, scope drift between what was specified and what was built. If the gate fails, I loop back to design with specific findings.
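
The gate’s logic is simple even if the inspection agents behind it aren’t. A minimal sketch, with invented check names and findings:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Finding:
    """One inspection result: which check failed and the specific detail."""
    check: str   # e.g. "layout", "interaction", "scope"
    detail: str

def validation_gate(findings: List[Finding]) -> str:
    """Pass the gate, or loop back to design with specific findings."""
    if not findings:
        return "ship"
    return "loop back to design: " + "; ".join(
        f"{f.check}: {f.detail}" for f in findings
    )

verdict = validation_gate([Finding("scope", "settings page not in PRD")])
```

The findings are the point: a failed gate hands design a concrete list, not a vague sense that something is off.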

Scaling means running multiple independent loops simultaneously, each with its own PRD, its own agents, its own validation gates, across feature suites or entirely different systems. I built a framework called Banh Mi Ops to handle that orchestration layer.
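
The shape of that orchestration layer can be sketched as independent loops running concurrently. This is an illustrative stub under my own naming, not Banh Mi Ops internals:

```python
from concurrent.futures import ThreadPoolExecutor

def run_loop(prd_name: str) -> tuple[str, bool]:
    """One full loop (strategy -> execution -> validation) for a single PRD.
    Stubbed here; a real loop would drive agents and validation gates."""
    passed_gate = True  # placeholder outcome
    return prd_name, passed_gate

# Each loop is independent: its own PRD, its own agents, its own gates.
prd_names = ["reporting-suite", "auth-service", "export-pipeline"]
with ThreadPoolExecutor(max_workers=len(prd_names)) as pool:
    results = dict(pool.map(run_loop, prd_names))
# results maps each PRD to whether its validation gate passed
```

Because the loops share nothing but the orchestrator, one feature suite failing its gate doesn’t block the others.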

Every individual phase here exists somewhere in the industry. TDD with agents has academic papers. Spec-driven development has dedicated tooling. Multi-agent frameworks are mature. I built with all of them. The integration is what matters. The person who understands the stakeholder’s problem is the same person running the agents that solve it, with no handoff where intent gets diluted.

AI Applications Strategist Engineer

I call this role the AI Applications Strategist Engineer. The full vertical, holding business strategy, technical architecture, agent-driven execution, and quality validation in a single loop, is a gap the industry hasn’t named yet.

The strategist defines the value proposition. The engineer ships it. Same person.

Every system runs inside a set of constraints. Product requirements, security policies, organizational rules, the rules of the domain it operates in. I read those constraints and write the code that enforces them. The industry treats that as two jobs. For this role it’s one muscle.

The BI era needed this role but the technology couldn’t support it. Now it can.

