How does "automated visual inspection" work in practice on a production line? The phrase gets applied to everything from a basic blob-detection script to a multi-camera deep learning pipeline running at 2,000 parts per minute, and the engineering challenge is different at each end of that spectrum. This article focuses on the middle ground: inspection systems that require machine learning rather than classical image processing, deployed on real production lines where uptime and false-reject cost matter. For the hardware and deployment context, see the manufacturing inspection decision framework.

## Hardware setup for automated inspection

The hardware stack for automated visual inspection has four components: imaging hardware, compute, integration layer, and rejection mechanism. Getting any one wrong limits what the software can achieve.

Imaging hardware means camera, lens, and illumination as a system, not three separate decisions. The lens determines field of view and depth of field; the camera determines resolution, frame rate, and dynamic range; the illumination determines whether defects are visible at all. Specify minimum detectable defect size first, then work backwards to pixel size, then field of view, then sensor resolution; a worked example of this calculation follows at the end of this section.

For most production inspection:

- GigE Vision cameras are the standard interface (deterministic, well-supported, industrial-grade)
- Monochrome sensors outperform colour sensors for contrast-based defect detection (higher quantum efficiency per pixel)
- Colour cameras are necessary for colour-based defects (wrong component colour, discolouration)

Compute placement matters. Edge-deployed inference (a GPU or accelerator card co-located with the camera) gives deterministic low latency and avoids network dependencies. Cloud or server inference introduces latency and a single point of failure across multiple inspection stations.

Illumination control requires shutting out ambient light or making the inspection enclosure light-tight. Ambient light variation between day and night shifts degrades model performance more than almost any other variable.
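To make the sizing rule concrete, here is a minimal back-of-envelope sketch. The 3-pixels-across-the-defect rule of thumb, the defect size, the field of view, and the sensor formats in the comments are illustrative assumptions, not specifications from this article.

```python
import math

def required_resolution(min_defect_mm: float,
                        fov_mm: float,
                        pixels_per_defect: int = 3) -> tuple[float, int]:
    """Pixels needed along one axis to resolve the smallest defect.

    pixels_per_defect is the number of pixels that must span the defect
    for it to be reliably detectable; 3 is a common rule of thumb, and
    low-contrast defects often need more.
    """
    object_pixel_mm = min_defect_mm / pixels_per_defect   # mm covered per pixel
    pixels_needed = math.ceil(fov_mm / object_pixel_mm)   # pixels across the FOV
    return object_pixel_mm, pixels_needed

# Example (illustrative numbers): detect 0.2 mm scratches across a 120 mm part.
pixel_mm, sensor_px = required_resolution(min_defect_mm=0.2, fov_mm=120.0)
print(f"{pixel_mm:.4f} mm/pixel -> at least {sensor_px} px along that axis")
# ~0.0667 mm/pixel -> at least 1800 px: a 5 MP (2448 x 2048) sensor covers
# this with margin, while a 2 MP (1920 x 1200) sensor is borderline.
```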
## Practical comparison

The three main model types for visual inspection have different trade-offs:

| Model Type | Use Case | Annotation Requirement | Inference Speed | Interpretability |
|---|---|---|---|---|
| Classification | Is this part good or bad? | Image-level labels only | Fast | Low (no spatial output) |
| Object detection | Locate and classify defects | Bounding box annotation | Moderate | Medium (shows defect location) |
| Segmentation | Precisely delineate defect area | Pixel-level masks | Slower | High (shows exact defect extent) |

In our experience, object detection is the right starting point for most defect inspection. It provides the spatial output (where is the defect?) that operators need to understand and verify rejections, it handles multiple defect types and multiple defects per image without modification, and its annotation effort is lower than segmentation's.

Classification is appropriate when the only output needed is pass/fail and spatial localisation is not required; for example, verifying that a label is present and correctly aligned without needing to identify specific label defects.

Segmentation is necessary when defect area or shape is part of the accept/reject criterion; for example, a scratch covering more than 2 mm² must be rejected, but smaller scratches are acceptable.

## Training data requirements

The most common failure mode in automated inspection projects is insufficient training data for rare defect types. Typical issues:

- Production defect rates of 0.1–1% mean that capturing enough defective samples during normal production takes weeks or months
- Defects are not uniformly distributed; some defect types are far rarer than others
- Models trained on too few samples of a defect type learn unreliable decision boundaries

Practical approaches to the data scarcity problem:

- Deliberate defect generation: produce defective samples intentionally during setup for training purposes
- Augmentation: geometric transforms, lighting variation, and noise injection expand the effective dataset but do not replace real defect variation
- Synthetic data: for structured defects with known appearance (scratches, dents), synthetic rendering can supplement real data, but verify that synthetic defects match real defect statistics before relying on them
- Anomaly detection approaches: for very rare defects, train on good parts only, using reconstruction-based or feature-distribution methods (PatchCore, PaDiM); acceptable when defect appearance is unpredictable

## Deployment on production lines

Deploying to production requires more than a working model. These are the integration steps that are typically underestimated:

- Model serving: the model must run within the inspection cycle time. Profile inference latency on the target hardware before integration. If a 50 ms cycle time is required and inference takes 40 ms, there is no margin for anything else.
- Warm-up and startup: deep learning models have GPU warm-up latency on the first inference. Do not start the line until the model has processed at least one batch; otherwise the first parts through are uninspected.
- Result persistence: log every inference result with the part image, timestamp, and decision. This is essential for post-hoc analysis when false-reject rates are higher than expected, and for auditing.
- Model versioning: when you retrain and redeploy, the new model must pass a validation gate (measured against a fixed test set) before going live. Avoid "update and hope" deployments.
- Drift monitoring: production conditions change. Lighting ages, part geometry drifts within tolerance, surface treatment varies by supplier batch. Monitor pass/fail rates and score distributions over time; a sudden shift in false-reject rate is a diagnostic signal, not just a nuisance.

## Managing false-reject rates

False rejects are the primary operational complaint about automated inspection systems. In our experience, teams underestimate the false-reject rate (FRR) during commissioning because commissioning conditions are more controlled than steady-state production.

### False-reject diagnostic checklist

- Illumination stable across the full operating shift? Check pass/fail rate by time of day.
- Part fixturing consistent? Variable orientation changes the lighting geometry.
- Part cleanliness controlled? Coolant residue, dust, and condensation are common FRR triggers.
- Training data representative of current production? Check whether part appearance has changed since training.
- Confidence threshold calibrated on a held-out validation set? The threshold should never be tuned on training data.
- Multiple defect detectors interfering? Check whether overlapping detection regions cause double-counting.

A sustained FRR above 1% typically justifies a full re-evaluation of illumination or training data rather than threshold adjustment. Threshold adjustment reduces FRR by increasing the false-accept rate, which is the wrong trade-off for most inspection applications.
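Two of the recurring points above, threshold calibration and FRR drift monitoring, are easy to get wrong in code, so here is a minimal sketch of both. It assumes the model emits a per-part score where higher means more defect-like; the names (`calibrate_threshold`, `FrrMonitor`), the simulated scores, and the window sizes are illustrative, not from any specific library. Real false-reject labels would come from operator review of rejected parts.

```python
import numpy as np
from collections import deque

def calibrate_threshold(good_scores: np.ndarray, target_frr: float = 0.005) -> float:
    """Pick the reject threshold from held-out GOOD parts only, never training data.

    Parts scoring above the threshold are rejected, so setting it at the
    (1 - target_frr) quantile of good-part scores yields roughly the
    target false-reject rate on this set.
    """
    return float(np.quantile(good_scores, 1.0 - target_frr))

class FrrMonitor:
    """Rolling false-reject rate over the last `window` parts.

    Alerts when the rolling FRR exceeds the alarm level; the article's
    1% figure is used as the default alarm.
    """
    def __init__(self, window: int = 2000, alarm_frr: float = 0.01):
        self.outcomes = deque(maxlen=window)  # 1 = false reject, 0 = otherwise
        self.alarm_frr = alarm_frr

    def record(self, rejected: bool, truth_good: bool) -> None:
        # A false reject is a part the system rejected but review found good.
        self.outcomes.append(1 if (rejected and truth_good) else 0)

    def frr(self) -> float:
        return sum(self.outcomes) / len(self.outcomes) if self.outcomes else 0.0

    def in_alarm(self) -> bool:
        # Only alarm once the window is full, to avoid noisy early readings.
        return len(self.outcomes) == self.outcomes.maxlen and self.frr() > self.alarm_frr

# Calibration on held-out good parts (scores simulated here for illustration).
rng = np.random.default_rng(0)
good_scores = rng.normal(0.2, 0.05, size=5000)
threshold = calibrate_threshold(good_scores, target_frr=0.005)
print(f"reject threshold: {threshold:.3f}")
```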
## Production readiness criteria

Before signing off an automated inspection system as production-ready:

- Detection rate on a held-out test set meets specification (typically ≥99% for critical defects)
- FRR on a held-out good-parts set meets the operational threshold (typically ≤0.5%)
- System runs without failure for 72 hours in a soak test at production throughput
- Operator interface for reviewing rejected parts is usable and understood by line operators
- Model performance monitoring dashboard is live and assigned to a responsible engineer
- Rollback procedure to manual inspection is documented and tested

Meeting these criteria before go-live avoids the common outcome where a system "goes live" in a degraded state and requires months of remediation before it outperforms manual inspection.
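The first two criteria are the same numbers the validation gate from the deployment section should check, so a retrained model can be gated mechanically before go-live. Below is a minimal sketch with the article's example thresholds (≥99% detection, ≤0.5% FRR) as defaults; the `validation_gate` function and its result format are assumptions for illustration, not a standard API.

```python
from dataclasses import dataclass

@dataclass
class GateResult:
    detection_rate: float  # recall on critical defects in the fixed test set
    frr: float             # false rejects / good parts in the fixed test set
    passed: bool
    reasons: list[str]

def validation_gate(defect_hits: int, defect_total: int,
                    false_rejects: int, good_total: int,
                    min_detection: float = 0.99,
                    max_frr: float = 0.005) -> GateResult:
    """Pass/fail gate a retrained model must clear before replacing the live one.

    Defaults follow the article's examples: >=99% detection on critical
    defects and <=0.5% FRR on a held-out good-parts set. Always measure
    against the same fixed test set so versions are comparable.
    """
    detection_rate = defect_hits / defect_total
    frr = false_rejects / good_total
    reasons = []
    if detection_rate < min_detection:
        reasons.append(f"detection {detection_rate:.3%} < {min_detection:.0%}")
    if frr > max_frr:
        reasons.append(f"FRR {frr:.3%} > {max_frr:.1%}")
    return GateResult(detection_rate, frr, not reasons, reasons)

# Example: a candidate model scored against the fixed test set.
result = validation_gate(defect_hits=496, defect_total=500,
                         false_rejects=4, good_total=1000)
print("deploy" if result.passed else f"block: {result.reasons}")
# 99.2% detection and 0.4% FRR clear both gates -> deploy
```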