GAMP Software Categories: How to Classify Pharmaceutical Systems for Validation

The category determines the validation effort

GAMP software categories are the classification framework that determines how much validation effort a computerised system requires in a pharmaceutical environment. The rule is simple in shape: more complex and more configurable software requires more thorough validation. The difficulty lies in applying it to modern software — particularly AI and machine learning, where a single deployed system mixes commercial frameworks, pre-trained model weights, custom training pipelines, and configured inference infrastructure.

ISPE’s GAMP 5 framework defines four active software categories: 1, 3, 4, and 5. Category 2, formerly firmware, was retired when GAMP 5 was first published and remains unused in the Second Edition. We treat the category not as a label but as a budget: it tells the validation team how much evidence to gather and where to focus risk-based testing.

Category definitions and validation requirements

Category	Name	Description	Validation approach	Examples
1	Infrastructure software	Provides the computing environment	Qualification — verify installation and configuration	Operating systems, databases, virtualisation platforms, network firmware
3	Non-configured products	Used as delivered, without configuration	Verification of intended use, vendor documentation review	Laboratory instruments with embedded firmware, standard calculators
4	Configured products	Configured for the specific application	Configuration verification, functional testing of configured features	ERP, LIMS, MES, SCADA, CRM with workflow configuration
5	Custom applications	Developed specifically for the intended use	Full lifecycle validation — requirements, design, code review, testing	Bespoke manufacturing control systems, custom analytics applications

Category 1 systems require documented evidence that they are installed correctly and operate as expected — no detailed functional testing. Category 5 systems require full lifecycle documentation: user requirements, functional specifications, design specifications, code review, unit testing, integration testing, and user acceptance testing. Category 3 and 4 sit between those poles, with the configured surface area governing how much functional testing is appropriate.
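As a rough illustration, the category-to-evidence relationship described above can be written as a lookup table. This is a minimal sketch: the names `CategoryProfile`, `GAMP_CATEGORIES`, and `deliverables_for` are hypothetical, and the deliverable lists paraphrase this article rather than quoting the GAMP 5 guide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CategoryProfile:
    name: str
    deliverables: tuple


# Deliverables paraphrase the table and paragraph above; a real validation
# plan would adjust them through a documented risk assessment.
GAMP_CATEGORIES = {
    1: CategoryProfile(
        "Infrastructure software",
        ("installation qualification record", "configuration record"),
    ),
    3: CategoryProfile(
        "Non-configured products",
        ("intended-use verification", "vendor documentation review"),
    ),
    4: CategoryProfile(
        "Configured products",
        ("configuration verification", "functional tests of configured features"),
    ),
    5: CategoryProfile(
        "Custom applications",
        (
            "user requirements",
            "functional specification",
            "design specification",
            "code review",
            "unit testing",
            "integration testing",
            "user acceptance testing",
        ),
    ),
}


def deliverables_for(category: int) -> tuple:
    """Return the baseline deliverables for an active GAMP 5 category."""
    if category not in GAMP_CATEGORIES:
        # Category 2 is retired, so it is rejected like any unknown value.
        raise ValueError(f"{category} is not an active GAMP 5 category")
    return GAMP_CATEGORIES[category].deliverables
```

Keeping the table as data rather than scattered conditionals makes it easy to review alongside the validation plan, and a request for the retired Category 2 fails loudly instead of silently defaulting.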

How is AI/ML software classified under GAMP 5 — Category 3, 4, 5, or something new?

Machine learning systems do not fit cleanly into a single traditional category, and forcing them into one is the most common classification mistake we see when reviewing pharmaceutical AI projects.

Consider a computer vision model for pharmaceutical quality inspection. The components decompose as follows:

A commercial ML framework (PyTorch, TensorFlow) — Category 1 infrastructure.
A pre-trained model architecture (ResNet, YOLO): Category 3 when used unmodified; once fine-tuned on company data, the fine-tuning component moves toward Category 5.
A training pipeline written in-house that ingests facility data, applies augmentation, and produces model weights — Category 5 custom development.
Inference hardware with configured drivers (CUDA, TensorRT, container runtime) — Category 4 configured product, on top of Category 1 infrastructure.

The system spans multiple categories simultaneously. The GAMP 5 Second Edition resolves this by directing teams to classify based on the system’s overall risk to product quality rather than forcing each component into a single bucket. In our own validation engagements, the pattern we observe is that the training pipeline and the resulting model are classified as Category 5 while the underlying framework and infrastructure stay at their respective lower categories, and the validation plan is built around the highest-risk component rather than the lowest.
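The decomposition of the vision-inspection example can be sketched as a small per-component register. The names `Component`, `VISION_SYSTEM`, `group_by_category`, and `plan_drivers` are hypothetical, and the sketch assumes the category number can stand in for risk, which a real project would replace with its documented risk assessment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    category: int  # GAMP 5 category: 1, 3, 4 or 5


# Component names follow the vision-inspection example above.
VISION_SYSTEM = [
    Component("ML framework", 1),
    Component("Pre-trained architecture, unmodified", 3),
    Component("Inference runtime configuration", 4),
    Component("In-house training pipeline", 5),
    Component("Fine-tuned model weights", 5),
]


def group_by_category(components):
    """Map each category to the component names assigned to it."""
    grouped = {}
    for component in components:
        grouped.setdefault(component.category, []).append(component.name)
    return grouped


def plan_drivers(components):
    """Return the components in the highest category present.

    Assumption: category number stands in for risk here; a real plan
    would rank components using the documented risk assessment.
    """
    top = max(component.category for component in components)
    return [c.name for c in components if c.category == top]
```

Calling `plan_drivers(VISION_SYSTEM)` returns the training pipeline and the fine-tuned weights, which matches the pattern of building the validation plan around the custom-developed parts while the framework stays at its lower category.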

The detailed methodology — including how the ISPE GAMP AI guidance reframes this classification when models retrain continuously — is covered in how to classify and validate AI/ML software under GAMP 5.

Common classification mistakes

A short diagnostic checklist for teams reviewing their own GAMP classification:

Classifying commercial ML platforms as Category 3 across the board. A pre-trained model used without modification may genuinely be Category 3. The same model fine-tuned on company data has a Category 5 component — the fine-tuning step itself. The platform classification cannot absorb the training pipeline.
Treating all custom code as Category 5. A Python script that reformats CSV data is technically custom software, but its risk to product quality may not warrant full Category 5 validation. GAMP 5 explicitly supports risk-proportional treatment; documentation depth follows risk, not file count.
Ignoring infrastructure classification. The GPU hardware, CUDA drivers, and container runtime that an ML model runs on are Category 1 infrastructure. They still require qualification — a model validated on one GPU configuration is not automatically valid on another, because the inference behaviour can shift with kernel version, driver, or numerical precision setting.
Static classification for systems that retrain. A model that updates weekly on new production data is not a snapshot. Treating it as a one-time Category 5 deliverable misses the entire continuous-validation problem the GAMP AI guidance was written to address.
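The infrastructure point in the checklist above lends itself to a simple automated check: compare the environment a model currently runs on against the environment it was validated on. The following is a minimal sketch with hypothetical function names and placeholder setting values; which settings to record, and what counts as a change requiring requalification, are assumptions a real change-control procedure would define.

```python
import hashlib
import json


def fingerprint(environment: dict) -> str:
    """Hash an environment description in a key-order-independent way."""
    canonical = json.dumps(environment, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def changed_settings(validated: dict, current: dict) -> list:
    """List the settings whose values differ between the two descriptions."""
    keys = set(validated) | set(current)
    return sorted(k for k in keys if validated.get(k) != current.get(k))


# Placeholder values for illustration, not real version strings.
VALIDATED = {
    "gpu_model": "gpu-model-a",
    "driver_version": "driver-1",
    "container_runtime": "runtime-1",
    "numerical_precision": "fp32",
}
CURRENT = dict(VALIDATED, numerical_precision="fp16")

if fingerprint(CURRENT) != fingerprint(VALIDATED):
    # A mismatch only flags the need for review; it does not decide the outcome.
    print("Requalification review needed:", changed_settings(VALIDATED, CURRENT))
```

Storing the fingerprint with the validation record gives a single value to compare at deployment time, while the list of changed settings tells the reviewer where to look.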

The practical decision

Classification is not an academic exercise. It determines how much time, effort, and documentation the validation team must produce before the system can be used in production. Over-classification — treating every component as Category 5 — wastes resources and slows deployment. Under-classification — treating a custom-trained model as Category 3 — creates regulatory exposure that surfaces during inspection. The answer is accurate classification based on system architecture, a documented risk assessment, and the specific GAMP 5 guidance, followed by proportionate validation effort.

We see two patterns consistently. Teams that start with the artifact (an architecture diagram, a data-flow map) and assign categories per component land closer to a defensible classification. Teams that start with a procurement label (“it’s a SaaS product, so Category 4”) tend to misclassify the parts that matter most for ML — the training data and the model weights themselves.

How do you handle systems that span multiple GAMP categories?

Modern pharmaceutical systems frequently combine components from multiple GAMP categories. An MES (Manufacturing Execution System) typically includes Category 1 infrastructure components (operating system, database), Category 4 configured software (the MES platform), and Category 5 custom components (site-specific business logic, integrations with other systems).

Our validation approach for mixed-category systems: assess each component against its applicable category, but validate the system as an integrated whole. Component-level testing verifies individual functions. Integration testing verifies that components interact correctly. System-level testing (OQ, PQ) verifies end-to-end workflows that span multiple components.

The risk assessment for mixed-category systems focuses on the interfaces between components, where failures are most likely. A misconfigured integration between the MES and the LIMS may result in incorrect test results being associated with the wrong batch — a high-impact failure that occurs at the interface rather than within either system individually. This is an observed pattern across our regulated-systems engagements rather than a benchmarked failure rate.

We document mixed-category systems using a system architecture diagram that maps each component to its GAMP category and identifies the interfaces between components. This diagram becomes a key input to the risk assessment and a reference document for change control — when a change is proposed to one component, the diagram shows which interfaces, and therefore which integration tests, may be affected.
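The change-control use of the architecture diagram can be sketched as a lookup from a changed component to the interfaces and integration tests it touches. Everything named here is hypothetical: the `Interface` type, the component names, and test identifiers such as `IT-001` are placeholders rather than details of any real system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interface:
    source: str
    target: str
    integration_tests: tuple


# Component-to-category map, following the MES example above.
COMPONENT_CATEGORY = {
    "Operating system": 1,
    "Database": 1,
    "MES platform": 4,
    "Site business logic": 5,
    "LIMS integration": 5,
}

# Interfaces between components, each with its placeholder test identifiers.
INTERFACES = [
    Interface("MES platform", "Database", ("IT-001",)),
    Interface("MES platform", "LIMS integration", ("IT-002", "IT-003")),
    Interface("Site business logic", "MES platform", ("IT-004",)),
]


def affected_tests(changed_component: str) -> list:
    """Return integration tests on every interface touching the component."""
    if changed_component not in COMPONENT_CATEGORY:
        raise KeyError(f"Unknown component: {changed_component}")
    tests = set()
    for interface in INTERFACES:
        if changed_component in (interface.source, interface.target):
            tests.update(interface.integration_tests)
    return sorted(tests)
```

For example, `affected_tests("LIMS integration")` returns `["IT-002", "IT-003"]`, while a change to the MES platform pulls in all four tests because every listed interface touches it.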
