The engagement model determines the outcome

Two organisations hire AI consultants for the same type of project: a predictive model for operational optimisation. Organisation A hires a firm on a time-and-materials basis with a vague scope: “build us an AI solution for demand forecasting.” Organisation B hires a firm with a phased engagement model (assessment → POC → production build → handoff) with a defined decision gate between each phase.

Organisation A’s project runs for 8 months. The scope shifts three times. The team delivers a model that works in a notebook but is not integrated with the ERP system. The stakeholders are unsure whether the project succeeded.

Organisation B’s project runs for 5 months. Each phase has a defined deliverable and a go/no-go decision. The assessment phase identifies a data quality issue that is resolved before the POC begins. The POC proves feasibility and quantifies ROI. The production build delivers an integrated, monitored system. The handoff transfers operational knowledge to the internal team.

McKinsey’s 2023 State of AI report found that organisations with structured AI adoption processes are 2.4× more likely to report significant value from AI investments. Industry experience consistently shows that phased AI engagements with explicit decision gates reduce project failure rates significantly compared to open-ended implementations. These are survey-based correlations from large samples, not controlled experiments: they indicate a strong directional pattern, not a guaranteed outcome.

The difference is not the technical talent: both firms have competent engineers. The difference is the engagement structure. This pattern is consistent across our own engagements and across the broader industry data: structure prevents failure more reliably than talent does.

What does each phase of a structured AI engagement deliver?

Phase 1: Assessment

The assessment phase answers a simple question: should this project be started?
The assessment is short, focused, and explicitly designed to produce a go/no-go recommendation before significant investment is committed.

Data readiness evaluation. Hands-on examination of the actual data: not a metadata review, but inspection of data quality, coverage, and accessibility against the specific requirements of the proposed model. We connect to the data sources, examine representative samples, measure quality metrics (completeness, consistency, timeliness), and identify gaps that would prevent the model from achieving the required performance.

Use case viability assessment. Is the proposed use case technically feasible with the available data and current model capabilities? Is there a simpler non-AI solution that would deliver the same outcome? Does the use case have measurable success criteria and quantifiable business value?

Integration complexity mapping. What systems must the model integrate with? What are the API capabilities and limitations of each system? What is the estimated integration effort?

Risk identification. What are the specific risks to project success (data quality, integration complexity, organisational readiness, regulatory constraints), and what is the mitigation strategy for each?

Assessment deliverable. A concise report with a recommendation: proceed to POC (with defined scope), modify the scope (with specific modifications), or do not proceed (with specific reasons). The assessment report is valuable regardless of the recommendation, because it provides the organisation with a data-backed evaluation of their AI readiness for this use case. McKinsey’s 2023 State of AI survey found that organisations using structured assessment before committing to AI projects reported 2.5× higher satisfaction with project outcomes.

Phase 2: Proof of Concept (4–8 weeks)

The POC phase tests the technical approach with the actual data, against the success criteria defined during the assessment.
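
The quality metrics named in the data readiness evaluation (completeness, consistency, timeliness) can be made concrete with a short script run against a representative sample. The following is a minimal sketch in plain Python, not a description of any particular toolchain. The field names (`sku`, `qty`, `updated`), the validity rule, and the 7-day freshness window are all hypothetical and would be replaced by the requirements of the proposed model.

```python
from datetime import datetime, timedelta


def completeness(rows, required_fields):
    """Share of rows in which every required field is present and non-empty."""
    if not rows:
        return 0.0
    ok = sum(
        1 for r in rows
        if all(r.get(f) not in (None, "") for f in required_fields)
    )
    return ok / len(rows)


def consistency(rows, rules):
    """Share of rows that pass every rule; each rule is a predicate on a row."""
    if not rows:
        return 0.0
    return sum(1 for r in rows if all(rule(r) for rule in rules)) / len(rows)


def timeliness(rows, ts_field, as_of, max_age):
    """Share of rows whose timestamp is at most max_age old at time as_of."""
    if not rows:
        return 0.0
    ok = sum(
        1 for r in rows
        if r.get(ts_field) is not None and as_of - r[ts_field] <= max_age
    )
    return ok / len(rows)


# Hypothetical sample standing in for rows pulled from a source system.
as_of = datetime(2024, 1, 10)
sample = [
    {"sku": "A1", "qty": 12, "updated": datetime(2024, 1, 9)},
    {"sku": "A2", "qty": None, "updated": datetime(2024, 1, 8)},
    {"sku": "", "qty": -3, "updated": datetime(2023, 11, 1)},
    {"sku": "A4", "qty": 7, "updated": datetime(2024, 1, 10)},
]

report = {
    "completeness": completeness(sample, ["sku", "qty"]),
    # Assumed rule: a quantity, when present, must not be negative.
    "consistency": consistency(
        sample, [lambda r: r["qty"] is None or r["qty"] >= 0]
    ),
    "timeliness": timeliness(sample, "updated", as_of, timedelta(days=7)),
}
print(report)
```

Each metric is reported as a share between 0 and 1, so the assessment report can state a measured gap (for example, half the sampled rows lack a required field) instead of a general impression of data quality.
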
The POC structure described in our POC methodology includes four required sections: technical approach, success criteria, ROI measurement, and packageable value.

Scope boundary. The POC explicitly does not build for production. It tests feasibility on representative data at a manageable scale. The POC code is prototype quality: functional but not production-hardened. The purpose is to de-risk the production investment, not to deliver the production system.

Decision gate. At the end of the POC, the results are evaluated against the predefined success criteria. Three outcomes are possible: proceed to production build (the POC met the criteria and the ROI justifies the investment), iterate (the POC showed promise but needs additional work before the production decision, typically data quality improvements or model refinement), or stop (the POC did not meet the criteria and the evidence does not justify further investment). The decision gate is the structural element that prevents the engagement from becoming an open-ended exploration. The decision is made on evidence (the POC results against the criteria), not on opinion or momentum.

Phase 3: Production Build (8–16 weeks)

The production build phase takes the validated POC approach and builds it for production operation: hardened code, integration with production systems, monitoring infrastructure, automated evaluation pipelines, and deployment automation.

Architecture design. The production architecture is designed for the operational requirements: latency, throughput, availability, scalability, and maintainability. The architecture may differ significantly from the POC architecture: the POC ran in a notebook on a single machine; the production system may require API serving, load balancing, GPU infrastructure, and database integration.

Integration development.
The integration work identified during the assessment phase is executed: connecting to data sources, building API endpoints for downstream systems, implementing authentication and access control, and building data pipelines that feed the model with current data.

Monitoring and evaluation. Automated monitoring tracks model performance in production (accuracy metrics, latency, error rates, data drift), with alerts that trigger when performance degrades below the defined thresholds. Automated evaluation pipelines periodically assess the model against the test set to detect quality regression.

Testing. Unit tests, integration tests, and end-to-end tests validate the complete pipeline. Load testing verifies that the system can handle the expected production volume. Regression tests run automatically on every code change.

Phase 4: Handoff (2–4 weeks)

The handoff phase transfers operational knowledge and responsibility from the consulting team to the client’s internal team. This phase is explicitly planned and resourced; it is not an afterthought.

Documentation. Architecture documentation (what the system components are and how they interact), operational runbooks (how to monitor, how to troubleshoot common issues, how to retrain the model), and decision logs (why specific technical choices were made, what alternatives were considered, and under what conditions the team should reconsider those choices).

Training. Hands-on training sessions for the internal team covering model monitoring and alert response, retraining procedures, evaluation pipeline operation, and integration troubleshooting. The training is practice-based: the internal team performs the operations with guidance from the consulting team, rather than just observing a demonstration.

Support transition. A defined support period (typically 4–8 weeks) during which the consulting team is available for questions and escalation while the internal team operates independently.
The support period has a defined end date: it is a transition mechanism, not an ongoing dependency.

Handoff completion criteria. The handoff is complete when the internal team has independently operated the system through at least one retraining cycle, one monitoring alert response, and one operational incident, demonstrating that they can maintain the system without the consulting team.

Why the structure matters

The phased structure with decision gates serves two purposes. For the client, it limits financial exposure (each phase is committed independently, and the engagement can be stopped at any decision gate without losing the value from completed phases). For the project, it ensures that prerequisites are met before downstream phases begin (data readiness before POC, feasibility validation before production build, production system before handoff).

The enterprise AI project failures we observe most often are projects that skipped the assessment phase (built on data that was not ready), skipped the POC phase (committed to production without validating feasibility), or skipped the handoff phase (delivered a system that the client cannot maintain). In regulated industries the cost of delay compounds further: pharma companies that delay AI adoption face ongoing manufacturing losses while waiting for organisational alignment that a structured engagement would provide from the start.

When external AI consultants are not the right choice

Not every AI initiative benefits from an external consulting engagement. There are situations where consultants are a poor investment, and recognising them early saves budget and avoids misaligned expectations.

The organisation has no data to work with. If the key datasets for the proposed use case do not exist and cannot be collected within the project timeframe, consultants cannot compensate for the absence of data. The right investment is data infrastructure and collection processes, not model development.

The problem is already solved by an off-the-shelf product. If a commercial SaaS product addresses the use case at an acceptable quality level, custom AI development is unnecessary. Consultants who recommend a build when a buy would suffice are optimising for their engagement, not for the client’s outcome.

The organisation has strong internal ML engineering but lacks strategic direction. If the gap is executive alignment on AI priorities rather than technical execution, a strategy advisory engagement (days, not months) is more appropriate than a full consulting build.

Stakeholder commitment does not exist. If the executive sponsor is uncommitted, the budget is provisional, or the business team has not agreed to integrate the model output into their workflow, the consulting engagement will deliver a technical artifact that no one adopts. The prerequisite is organisational commitment, which consultants cannot create.

The project is exploratory with no defined success criteria. Open-ended “explore what AI can do for us” engagements rarely produce actionable outcomes. Consultants work best when there is a specific question to answer or a specific problem to solve, not when the engagement is a substitute for internal strategic thinking.

The phased structure with decision gates is not unique to any single consultancy; it is the standard engagement model for AI projects where the technical risk justifies incremental commitment over a single large contract.
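
A decision gate of the kind described for the POC phase only works if the success criteria are written down as checkable thresholds before the results come in. The sketch below shows one possible encoding, under stated assumptions. The criterion names and threshold values are hypothetical, ROI is treated as one more criterion, and the `near_miss` tolerance is an invented rule for separating "iterate" from "stop", which this article does not define numerically.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    PROCEED = "proceed"
    ITERATE = "iterate"
    STOP = "stop"


@dataclass(frozen=True)
class Criterion:
    name: str
    threshold: float
    higher_is_better: bool = True
    # Assumption: the largest shortfall that still counts as "showed promise".
    near_miss: float = 0.0

    def shortfall(self, value):
        """Amount by which value misses the threshold; 0.0 if it is met."""
        if self.higher_is_better:
            diff = self.threshold - value
        else:
            diff = value - self.threshold
        return max(0.0, diff)


def decide(criteria, results):
    """Map measured POC results to a gate outcome.

    Raises KeyError if a criterion has no measured result, so the gate
    cannot pass on missing evidence.
    """
    misses = []
    for c in criteria:
        s = c.shortfall(results[c.name])
        if s > 0.0:
            misses.append((c, s))
    if not misses:
        return Outcome.PROCEED
    if all(s <= c.near_miss for c, s in misses):
        return Outcome.ITERATE
    return Outcome.STOP


# Hypothetical criteria, fixed during the assessment and before the POC runs.
criteria = [
    Criterion("forecast_mape", threshold=0.15, higher_is_better=False, near_miss=0.05),
    Criterion("roi_ratio", threshold=1.5, near_miss=0.25),
]

print(decide(criteria, {"forecast_mape": 0.12, "roi_ratio": 2.0}))  # all criteria met
print(decide(criteria, {"forecast_mape": 0.18, "roi_ratio": 2.0}))  # one near miss
print(decide(criteria, {"forecast_mape": 0.40, "roi_ratio": 0.8}))  # clear misses
```

Whatever rule is chosen, fixing the thresholds and the tolerance in advance keeps the gate decision tied to the evidence instead of to the momentum of the project.
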