Annex 11 is narrower than its reputation suggests
EU GMP Annex 11 is often cited as the blanket regulatory framework for software in European pharmaceutical manufacturing. In practice, its scope is both more specific and more consequential than that framing implies. Annex 11 governs computerised systems — defined as systems that create, modify, maintain, archive, retrieve, or transmit data required by GMP — used in the manufacture of medicinal products in the European Union. It does not apply to every piece of software in a pharmaceutical facility, and it does not apply identically to every computerised system within its scope.
Understanding exactly what Annex 11 requires, and where it stops, is a prerequisite for deploying AI in EU pharmaceutical operations without either under-validating (a compliance gap) or over-validating (wasted effort). The regulatory text runs to seventeen sections; the operational implications, particularly for AI systems, are concentrated in a handful of them.
EudraLex Volume 4 Annex 11 Section 7 requires that data be secured against damage by both physical and electronic means (7.1), and that the integrity and accuracy of backup data be checked during validation and monitored periodically (7.2). The MHRA’s Data Integrity guidance (2018) established that organisations must maintain data throughout the complete retention period, typically 15–25 years for pharmaceutical batch records.
What Annex 11 actually governs
The scope statement in the Principle defines the boundary: Annex 11 applies to all forms of computerised systems used as part of GMP-regulated activities. A system is “computerised” if it involves electronic data processing, which includes AI/ML models that process manufacturing data, generate quality predictions, or classify inspection results.
Importantly, Annex 11 does not apply to computerised systems used in non-GMP activities. A scheduling optimisation system, an energy management system, or a business intelligence tool operating in a pharmaceutical facility falls outside Annex 11 scope unless it creates or modifies GMP-regulated data. The GxP scope assessment framework applies here: the question is whether the system participates in GMP-regulated data handling, not whether it runs in a pharmaceutical building.
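To make the scope question concrete, here is a minimal sketch of how such an assessment could be encoded. The SystemProfile fields and the in_annex11_scope rule are illustrative assumptions, not terms from the regulation; the point is that the test hinges on GMP data handling, not physical location.

```python
from dataclasses import dataclass

@dataclass
class SystemProfile:
    # Illustrative fields for a scope assessment; not regulatory terminology.
    name: str
    creates_gmp_data: bool    # creates or modifies GMP-regulated records
    transmits_gmp_data: bool  # moves GMP records between systems
    archives_gmp_data: bool   # stores, retrieves, or archives GMP records

def in_annex11_scope(system: SystemProfile) -> bool:
    """In scope if the system participates in GMP-regulated data handling,
    regardless of where it runs."""
    return (system.creates_gmp_data
            or system.transmits_gmp_data
            or system.archives_gmp_data)

# Energy management: runs in the facility, touches no GMP data -> out of scope.
assert not in_annex11_scope(SystemProfile("energy-mgmt", False, False, False))
# Vision model classifying inspection results: creates GMP records -> in scope.
assert in_annex11_scope(SystemProfile("visual-inspection-ml", True, False, False))
```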
The sections most relevant to AI system deployment:
Section 3 — Suppliers and Service Providers. When a pharmaceutical company uses an AI system developed by a third party — a vendor-supplied quality inspection model, a commercial process monitoring platform, or an outsourced ML development — the company retains responsibility for ensuring the system meets Annex 11 requirements. The practical implication: purchasing a vendor’s “validated” AI product does not transfer the validation obligation. The pharmaceutical company must verify that the vendor’s validation approach is adequate for the intended GMP use, and must maintain documented evidence of that verification.
Section 4 — Validation. Computerised systems must be validated proportionate to the risk and complexity of the application, in line with the GAMP 5 Second Edition risk-based approach: validation intensity scales with the system’s impact on product quality and data integrity. Annex 11 Section 4 does not prescribe a specific validation methodology. Its subsections, notably 4.4 (user requirements specification), 4.6 (quality and performance assessment for custom systems), and 4.7 (documented test evidence), require that the methodology be documented and risk-proportionate, and that it produce evidence that the system performs its intended function reliably.
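As a sketch of what “risk-proportionate” can look like in practice, the mapping below ties a risk class to a set of validation activities. The classes, the classification rule, and the activity lists are illustrative assumptions, not text from Annex 11 or GAMP 5.

```python
# Illustrative risk classes and activity sets; assumptions, not regulatory text.
VALIDATION_ACTIVITIES = {
    "high": [
        "user requirements specification (4.4)",
        "supplier assessment (Section 3)",
        "full documented test evidence (4.7)",
        "performance qualification against acceptance criteria",
        "periodic evaluation plan (Section 11)",
    ],
    "medium": [
        "user requirements specification (4.4)",
        "risk-targeted test evidence (4.7)",
        "periodic evaluation plan (Section 11)",
    ],
    "low": [
        "vendor documentation review",
        "smoke testing of intended use",
    ],
}

def risk_class(impacts_product_quality: bool, impacts_data_integrity: bool) -> str:
    """Toy classification rule: both impact dimensions present -> high risk."""
    if impacts_product_quality and impacts_data_integrity:
        return "high"
    if impacts_product_quality or impacts_data_integrity:
        return "medium"
    return "low"

# A defect-classification model affecting batch disposition scores high.
print(VALIDATION_ACTIVITIES[risk_class(True, True)])
```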
Section 7 — Data Storage. Data must be secured against damage by both physical and electronic means (Section 7.1). Stored data must be checked for accessibility, readability, and accuracy, with regular back-ups verified for integrity (Section 7.2). For AI systems that generate GxP records (inspection classifications, deviation flags, process parameter predictions), the stored data includes both the system’s outputs and the metadata required for traceability — model version, input data reference, and confidence score.
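A minimal sketch of what such a record could look like, assuming a visual inspection model; the field names (model_version, input_data_ref, confidence) are illustrative, chosen to carry the traceability metadata the paragraph describes.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
import json

@dataclass(frozen=True)
class InspectionRecord:
    """One GxP record from an AI inspection system: the output plus the
    metadata needed to trace it back to a model version and input."""
    record_id: str
    model_version: str   # exact production model that produced the output
    input_data_ref: str  # pointer to the archived input (image, batch data)
    output: str          # e.g. "pass", "fail", "review"
    confidence: float    # model confidence at the operating threshold
    created_utc: str     # ISO 8601 timestamp

record = InspectionRecord(
    record_id="INSP-2024-000123",
    model_version="defect-classifier-3.2.1",
    input_data_ref="archive://line4/img-000123.png",
    output="fail",
    confidence=0.97,
    created_utc=datetime.now(timezone.utc).isoformat(),
)
print(json.dumps(asdict(record), indent=2))  # serialise for storage and backup
```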
Section 9 — Audit Trail. Annex 11 requires that any GMP-relevant changes to data are recorded in an audit trail that captures the old value, the new value, the identity of the person (or system) making the change, and the timestamp. For AI systems, this extends to model version changes: when a retrained model replaces the production version, the audit trail must record the change, the rationale, and the evidence that the new version meets acceptance criteria.
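The same traceability logic applies to the audit trail itself. Below is a sketch of an entry recording a model version change, with hypothetical field names; Annex 11 does not prescribe a schema, only that the change, its author, and its rationale are captured.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AuditTrailEntry:
    """One audit trail entry in the Section 9 spirit; fields are illustrative."""
    timestamp_utc: str
    actor: str         # person or system making the change
    entity: str        # what was changed
    old_value: str
    new_value: str
    reason: str        # documented rationale for the change
    evidence_ref: str  # link to acceptance-criteria evidence

entry = AuditTrailEntry(
    timestamp_utc="2024-05-14T09:30:00+00:00",
    actor="j.smith (QA)",
    entity="production model version",
    old_value="defect-classifier-3.2.1",
    new_value="defect-classifier-3.3.0",
    reason="Retrained after new product variant introduced on line 4",
    evidence_ref="VAL-2024-017 (revalidation report)",
)
print(entry)
```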
Section 10 — Change and Configuration Management. Any change to a validated computerised system must follow a documented change management process. For AI systems, this covers model retraining, preprocessing pipeline modifications, infrastructure changes that could affect model behaviour, and training data updates. The change management process must include impact assessment and, where the risk warrants it, revalidation.
Section 11 — Periodic Evaluation. Computerised systems must be evaluated periodically to confirm they remain in a validated state. This is where continuous monitoring becomes a regulatory requirement, not just an operational best practice. An AI model whose performance has not been assessed since initial validation cannot claim ongoing compliance with Annex 11 Section 11 — the periodic evaluation must demonstrate that the system continues to perform within its documented acceptance criteria.
How does Annex 11 intersect with AI-specific challenges?
Several AI-specific operational realities create friction with Annex 11 requirements that were designed for deterministic software. Understanding these friction points is necessary for designing AI systems that meet the regulation’s intent without producing validation artefacts that test the wrong properties. The stakes are measurable: according to the European Medicines Agency (2024), Annex 11 non-compliance was cited in 22% of GMP inspection findings across EU manufacturing sites, making it one of the most frequently cited regulatory deficiencies, and PIC/S reports that data integrity findings under Annex 11 increased by 35% between 2019 and 2023 across member state inspections.
Model drift and Section 11
Deterministic software does not change behaviour between validated versions — the same code running on the same infrastructure produces the same results. ML models can drift: the production data distribution changes, the model’s performance degrades, and the degradation may be invisible without monitoring. Section 11’s periodic evaluation requirement maps directly to this challenge, but the evaluation frequency must be informed by the model’s drift characteristics, not by an arbitrary schedule (quarterly, annually) designed for stable software.
For pharmaceutical AI systems, our recommendation is continuous automated monitoring with periodic human review — the monitoring infrastructure detects performance shifts as they occur, and the periodic review provides documented evidence for Annex 11 compliance. The monitoring should track not just headline accuracy but domain-specific metrics: false positive rate at the operating threshold, performance across data subsets (e.g., different product variants, different lighting conditions), and input data distribution statistics that flag drift before it affects output quality.
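A minimal monitoring sketch under these assumptions: NumPy arrays of labels and scores, a two-sample Kolmogorov–Smirnov test for input drift, and a hypothetical operating threshold of 0.8. The significance level and window sizes are illustrative; real acceptance criteria belong in the validation documentation.

```python
import numpy as np
from scipy.stats import ks_2samp

def false_positive_rate(y_true, y_score, threshold):
    """FPR at the documented operating threshold."""
    y_pred = y_score >= threshold
    negatives = y_true == 0
    return float(y_pred[negatives].mean()) if negatives.any() else 0.0

def input_drift_check(reference, live, alpha=0.01):
    """Two-sample KS test on one input feature: flags distribution shift
    before it shows up in output quality. alpha is illustrative."""
    stat, p_value = ks_2samp(reference, live)
    return {"ks_stat": float(stat), "p_value": float(p_value),
            "drifted": bool(p_value < alpha)}

rng = np.random.default_rng(0)
reference = rng.normal(0.0, 1.0, 5000)  # feature values from validation window
live = rng.normal(0.3, 1.0, 5000)       # simulated drifted production window
print(input_drift_check(reference, live))

# Subset performance: FPR per product variant at the operating threshold.
y_true = rng.integers(0, 2, 1000)
y_score = rng.random(1000)
variant = rng.choice(["variant-A", "variant-B"], 1000)
for v in ("variant-A", "variant-B"):
    mask = variant == v
    print(v, false_positive_rate(y_true[mask], y_score[mask], threshold=0.8))
```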
Explainability and Sections 8.1–8.2
Sections 8.1 and 8.2 (requiring clear printed copies of electronically stored data and, for records supporting batch release, printouts indicating whether data have been altered since original entry) and the broader data integrity expectations imply that GMP-regulated computerised system outputs must be understandable to the quality team responsible for them. For deterministic software, this is straightforward: the output is a calculated value from known inputs using documented logic. For ML models, the mapping from input to output is not transparent in the same way.
Annex 11 does not explicitly require model explainability, but the regulatory expectation that quality teams can investigate and understand system outputs creates an implicit requirement for AI systems operating in GMP contexts. A quality engineer investigating a deviation flagged by an AI system must be able to determine why the system flagged that particular event — not at the individual-neuron-weight level, but at a level sufficient to evaluate whether the flag is meaningful or spurious. We see this as the most underestimated Annex 11 requirement for AI: the regulation does not name explainability, but the intent demands it.
Practical approaches to this include: feature importance analysis (SHAP values, gradient-based attribution) for tabular models, saliency mapping for computer vision models (which regions of the image drove the defect classification), and documented decision boundaries for classification models. The explainability approach should be part of the validation documentation — not because Annex 11 prescribes it, but because the ability to investigate AI system outputs is a prerequisite for meeting the regulation’s intent around data traceability and quality decision transparency.
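As one concrete instance: for a linear model the attribution is exact, since each feature’s contribution to the log-odds is its coefficient times its deviation from a baseline; SHAP values and saliency maps generalise the same idea to nonlinear models. A sketch with hypothetical process features follows.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
feature_names = ["fill_volume_ml", "line_speed_upm", "head_pressure_bar"]  # hypothetical
X = rng.normal(size=(500, 3))
# Synthetic labels: deviations driven mostly by fill volume and head pressure.
y = (X[:, 0] + 0.5 * X[:, 2] + rng.normal(scale=0.5, size=500) > 0).astype(int)

model = LogisticRegression().fit(X, y)
baseline = X.mean(axis=0)

def explain_flag(x):
    """Per-feature contribution to the log-odds of the flagged class,
    relative to the training baseline. Exact for linear models; SHAP or
    gradient attribution plays the same role for nonlinear ones."""
    contributions = model.coef_[0] * (x - baseline)
    return sorted(zip(feature_names, contributions), key=lambda t: -abs(t[1]))

for name, contribution in explain_flag(X[42]):
    print(f"{name:>18}: {contribution:+.3f}")
```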
Electronic signatures and Section 14
Section 14 requires that electronic signatures have the same legal significance as handwritten signatures. For AI systems, this is relevant when the system’s output is used as the basis for a GMP quality decision — a batch release determination, a deviation classification, or an inspection pass/fail judgment. The electronic signature of the person accepting the AI system’s recommendation must be traceable, attributable, and linked to the specific system output being accepted.
In practice, this means the AI system’s user interface must present the model’s output (prediction, classification, recommendation), capture the human operator’s acceptance or rejection of that output with an electronic signature, and link the signature to the specific model version, input data, and output. This is a UX and audit trail design requirement — straightforward to implement, but frequently overlooked until audit preparation reveals the gap.
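A sketch of that linkage, assuming a hash binds the signature record to the exact content being signed; the field names are illustrative, and the signing mechanism itself (PKI, identity provider) sits outside this sketch.

```python
from dataclasses import dataclass
import hashlib
import json

@dataclass(frozen=True)
class SignedDecision:
    """Ties an operator's e-signature to the exact AI output accepted."""
    signer_id: str
    action: str          # "accept" or "reject"
    model_version: str
    input_data_ref: str
    model_output: str
    payload_sha256: str  # binds the signature to this exact content
    timestamp_utc: str

def bind_signature(signer_id, action, model_version, input_ref, output, ts):
    # Canonical serialisation of what is being signed, then a digest of it.
    payload = json.dumps({
        "model_version": model_version,
        "input_data_ref": input_ref,
        "model_output": output,
        "timestamp_utc": ts,
    }, sort_keys=True)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return SignedDecision(signer_id, action, model_version,
                          input_ref, output, digest, ts)

decision = bind_signature(
    "qa.engineer.04", "accept",
    "defect-classifier-3.3.0",
    "archive://line4/img-000124.png",
    "fail",
    "2024-05-14T10:02:00+00:00",
)
print(decision.payload_sha256[:16], decision.action)
```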
Annex 11 in the context of the broader EU regulatory landscape
Annex 11 does not operate in isolation. Pharmaceutical companies deploying AI in EU manufacturing must also consider the EU AI Act requirements that apply to GxP systems, which introduce additional classification and transparency obligations for high-risk AI systems. The intersection of Annex 11 (GMP-specific computerised system requirements) and the EU AI Act (general AI regulation with sector-specific implications) creates a compliance landscape that requires system-by-system assessment rather than blanket policy application.
For pharmaceutical manufacturers operating across both EU and US jurisdictions, the Annex 11 requirements must be reconciled with the FDA’s CSA guidance and 21 CFR Part 11. The requirements are compatible in principle — both are risk-based frameworks that emphasise proportionate validation and data integrity — but they differ in terminology, documentation expectations, and audit practices. A validation strategy designed only for one jurisdiction will have gaps when audited by the other.
The organisations that navigate this landscape most effectively are those that assess each AI system individually against the applicable regulatory frameworks rather than applying a single validation template across all systems. If your EU pharmaceutical operations are evaluating which AI systems fall within Annex 11 scope and what the proportionate validation approach should be for each, a GxP Regulatory Scope Analysis maps the regulatory requirements per system across applicable jurisdictions.