Statistical Process Control Examples for CV Defect Detection on the Line

A computer-vision inspection model ships to the line, and the team keeps watching it the way they watched it in the lab: one rolling accuracy number on a dashboard. That single number is exactly what hides the failure modes the line actually produces. By the time aggregate accuracy crosses an arbitrary threshold, the line has often already passed defective units downstream — the regression announced itself only after it had cost something.

The fix is not a better threshold. It is treating the model’s output as a statistical process and watching it the way a quality engineer has watched a stamping press or a fill line for the better part of a century. Statistical process control gives you control charts with limits derived from the line’s own baseline, and a small set of rules that flag a process drifting before it crosses the accuracy cliff. A run of points trending toward a control limit is a signal a single accuracy number will never show you.

This article is the worked-examples companion to our broader treatment of SPC on the production line for CV inspection and the chart-selection walkthrough. Here the focus is concrete: what the charts look like, how the limits are computed, and which signals map to model drift.

How Does Statistical Process Control Work for CV Defect Detection?

SPC monitors a process by plotting a metric over time against limits that describe what the process does when nothing is wrong. Those limits — typically a centre line and upper/lower control limits set at roughly three standard deviations of the baseline variation — are not targets. They are the voice of the process: the range the metric naturally wanders within when the line is stable. A point outside them, or a non-random pattern inside them, means something changed.

For a CV inspection model, the “process” is the model’s behaviour against the stream of parts crossing the camera. The metric is not loss or validation accuracy. It is something the line produces and you can act on: the defect rate the model reports, the rate at which it rejects good parts, or the distribution of its confidence scores. The shift from a lab dashboard to a control chart is the shift from “how accurate is the model” to “is the model’s output behaving the way it did when we validated it.” Those are different questions, and only the second one catches a packaging redesign or a lighting change as it happens.

The mechanism matters because of what it surfaces early. When a defect-detection model moves from pilot to the production line, the environment stops being stationary — lighting drifts across a shift, a supplier changes a substrate, a camera mount loosens by a fraction of a degree. Each of these nudges the model’s output before it degrades accuracy enough to be visible in aggregate. SPC is built to catch exactly that kind of slow, directional movement.

Three Worked Chart Examples (With Explicit Assumptions)

Below are three concrete examples. The numbers are illustrative — chosen to show how the limits are derived, not as benchmarks from any one line. Each assumes you have collected a baseline window during a known-good production run.

Example 1 — Defect Rate on a p-Chart

You inspect parts in subgroups (say, hourly batches) and the metric is the proportion flagged as defective. A proportion of a count over variable subgroup sizes is the textbook case for a p-chart.

Assumption: baseline run of 40 hourly subgroups, average subgroup size ~500 parts, observed average defect rate p̄ = 0.018 (1.8%).
Centre line: 0.018.
Control limits: p̄ ± 3·√(p̄(1−p̄)/n). For n = 500 that is 0.018 ± 3·√(0.018·0.982/500) ≈ 0.018 ± 0.0178, so roughly 0.0002 to 0.0358.
What a violation means: a subgroup at 4.5% defect rate is above the upper limit. Either the line genuinely got worse, or the model started over-flagging. The chart does not tell you which — it tells you to investigate before the next shift.

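The arithmetic follows the standard p-chart formula, rounded to four decimal places; the rates themselves are illustrative, not measured on a named line. The limits above use the average subgroup size. When subgroup sizes vary from hour to hour, recompute the limits for each subgroup from its own n: a smaller subgroup gets wider limits, a larger one gets tighter limits.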
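The limit calculation in Example 1 can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name p_chart_limits and the alternative subgroup sizes (350 and 650) are assumptions made for the example, and the rates are the illustrative ones from the text.

```python
import math


def p_chart_limits(p_bar: float, n: int, sigmas: float = 3.0) -> tuple[float, float]:
    """Return (lcl, ucl) for one p-chart subgroup of size n."""
    half_width = sigmas * math.sqrt(p_bar * (1.0 - p_bar) / n)
    lcl = max(0.0, p_bar - half_width)  # a proportion cannot fall below 0
    ucl = min(1.0, p_bar + half_width)  # or rise above 1
    return lcl, ucl


# Baseline from Example 1: p-bar = 0.018, subgroup size 500.
lcl, ucl = p_chart_limits(p_bar=0.018, n=500)
print(f"n=500: LCL={lcl:.4f}  UCL={ucl:.4f}")

# Hypothetical subgroup sizes, to show that limits widen as n shrinks.
for n in (350, 650):
    lo, hi = p_chart_limits(0.018, n)
    print(f"n={n}: LCL={lo:.4f}  UCL={hi:.4f}")

# The 4.5% subgroup from the text sits above the upper limit.
observed = 0.045
print("investigate:", observed > ucl)
```

Clamping the lower limit at zero matters for small subgroups: at n = 350 the computed lower limit is negative, so only upward excursions can trip the limit rule on that subgroup.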

Example 2 — False-Reject Rate on an np- or p-Chart

The defect rate alone can look stable while the model quietly starts rejecting good parts — the cost shows up as scrap and operator overrides, not as missed defects. Track the false-reject rate separately, validated against operator dispositions or a downstream re-check.

Assumption: baseline false-reject proportion of 0.6%, subgroup size 500.
A run of seven consecutive subgroups all above the centre line — none individually outside the limits — is itself a signal under the standard runs rules. That pattern is the fingerprint of a lighting drift or a contrast change that is pushing borderline-good parts over the rejection boundary.

Holding false-reject rate within its control limits is one of the directly measurable ROI outcomes of SPC-grounded monitoring: it is the number that keeps the line from quietly bleeding good product.
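For a fixed subgroup size, the same baseline can be charted as a count on an np-chart. The sketch below works through the numbers in Example 2 (0.6% false rejects, 500 parts per subgroup). The function name np_chart_limits is an assumption for illustration; in practice the false-reject counts would come from whatever operator dispositions or downstream re-checks the line already records.

```python
import math


def np_chart_limits(p_bar: float, n: int, sigmas: float = 3.0) -> tuple[float, float, float]:
    """Return (lcl, centre, ucl) in counts for an np-chart with fixed subgroup size n."""
    centre = n * p_bar
    half_width = sigmas * math.sqrt(n * p_bar * (1.0 - p_bar))
    lcl = max(0.0, centre - half_width)  # a count cannot be negative
    return lcl, centre, centre + half_width


lcl, centre, ucl = np_chart_limits(p_bar=0.006, n=500)
print(f"centre={centre:.1f} false rejects per subgroup")
print(f"LCL={lcl:.2f}  UCL={ucl:.2f}")
```

With a baseline this low, the lower limit clamps to zero, so the limit rule can only fire upward. That is one reason the run rule described in this example carries so much of the early-warning load on a false-reject chart.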

Example 3 — Confidence Distribution on an X̄-R Chart

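Confidence scores are continuous, so a proportion chart does not fit. Plot the mean and range of the model’s confidence for a subgroup of parts on an X̄-R chart (mean chart paired with a range chart). The range chart suits small, fixed-size subgroups (five parts is a common choice); for larger subgroups, an S chart is the usual replacement for the range chart.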

Assumption: baseline mean confidence 0.91 with a stable within-subgroup range.
What it catches: a downward trend in mean confidence — say six points each lower than the last — before any part is misclassified. The model is becoming less certain on the new part appearance, which is the earliest possible warning of a distribution shift. The range chart catches the opposite failure: confidence becoming erratic, widening, before the mean has moved.
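A sketch of how X̄-R limits fall out of a baseline, assuming subgroups of five confidence scores. The constants A2, D3, and D4 are the standard Shewhart values for a subgroup size of five; the baseline scores are made up so that the grand mean lands on the 0.91 used in this example, and the function name xbar_r_limits is hypothetical.

```python
from statistics import mean

# Standard Shewhart control-chart constants for subgroups of size 5.
A2, D3, D4 = 0.577, 0.0, 2.114


def xbar_r_limits(subgroups: list[list[float]]) -> dict[str, tuple[float, float, float]]:
    """Return (lcl, centre, ucl) for the mean chart and the range chart."""
    if any(len(s) != 5 for s in subgroups):
        raise ValueError("constants above are only valid for subgroups of size 5")
    xbars = [mean(s) for s in subgroups]
    ranges = [max(s) - min(s) for s in subgroups]
    grand_mean, r_bar = mean(xbars), mean(ranges)
    return {
        "xbar": (grand_mean - A2 * r_bar, grand_mean, grand_mean + A2 * r_bar),
        "range": (D3 * r_bar, r_bar, D4 * r_bar),
    }


# Made-up baseline: four subgroups of five confidence scores each.
baseline = [
    [0.93, 0.90, 0.91, 0.89, 0.92],
    [0.92, 0.91, 0.90, 0.93, 0.89],
    [0.90, 0.92, 0.91, 0.88, 0.94],
    [0.91, 0.89, 0.93, 0.92, 0.90],
]
limits = xbar_r_limits(baseline)
for chart, (lcl, centre, ucl) in limits.items():
    print(f"{chart}: LCL={lcl:.3f}  centre={centre:.3f}  UCL={ucl:.3f}")
```

Four subgroups are far too few for a real baseline; they are used here only to keep the arithmetic easy to check by hand.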

Which Chart for Which Metric?

The chart type is determined by the data the metric produces, not by preference. This table is the decision surface.

Metric	Data type	Chart	Catches early
Defect rate	Proportion of variable-size subgroups	p-chart	Genuine quality drop or model over-flagging
Defect count (fixed subgroup)	Count, fixed n	np-chart	Same, simpler arithmetic when n is constant
False-reject rate	Proportion	p-chart (separate chart)	Lighting/contrast drift pushing good parts over the boundary
Confidence (mean)	Continuous	X̄ (with R or S)	Model losing certainty before misclassifying
Confidence (spread)	Continuous	R or S chart	Erratic confidence — distribution shift, mount vibration

The single most common mistake is to plot only the defect rate. A defect-rate chart can sit comfortably inside its limits while false-reject rate climbs and confidence erodes. Two or three charts, watched together, are the minimum honest instrumentation for a line-side model.
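The table can also be written down as a small lookup, which is one way to keep chart selection out of the realm of preference. This is only a sketch of the table above; the function name choose_chart and its string arguments are hypothetical.

```python
def choose_chart(data_type: str, fixed_subgroup_size: bool = False, track: str = "level") -> str:
    """Map the kind of data a metric produces to a chart type, per the table above."""
    if data_type == "proportion":
        # Counts out of a constant n can use the simpler np-chart.
        return "np-chart" if fixed_subgroup_size else "p-chart"
    if data_type == "continuous":
        # Level is watched on the mean chart, spread on a range or S chart.
        return "X-bar chart" if track == "level" else "R or S chart"
    raise ValueError(f"unknown data type: {data_type!r}")


print(choose_chart("proportion"))                            # defect rate, variable n
print(choose_chart("proportion", fixed_subgroup_size=True))  # defect count, fixed n
print(choose_chart("continuous", track="level"))             # mean confidence
print(choose_chart("continuous", track="spread"))            # confidence spread
```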

How Do You Set Control Limits From the Line’s Own Baseline?

This is the part teams skip, and skipping it is what turns SPC back into the arbitrary-threshold habit it was meant to replace. Control limits are not chosen — they are computed from a baseline window of stable, known-good production.

The procedure:

1. Collect a baseline during a run you trust: same lighting, same supplier, same parts the model was validated against. Make it long enough to span the normal sources of variation (a full shift at minimum; several shifts is better).
2. Compute the centre line as the baseline average of the metric.
3. Compute the control limits from the baseline’s own variation using the chart’s standard formula (the ±3-sigma expressions above), not a number someone liked.
4. Freeze the limits. They describe the validated process. Recompute them only after a deliberate, documented change (a recalibration, a new model version, a known process change), never to accommodate drift.

The discipline is in step four. If you keep widening the limits because points keep falling outside, you have re-derived the accuracy dashboard with extra steps. The limits are the contract; a violation is the line telling you the contract is broken. For the broader tooling around collecting and charting this baseline, see our walkthrough of pairing SPC tooling with CV defect detection.
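One way to make the freeze in step four harder to bypass is to store the limits in an immutable record that carries the documented reason for its existence. The sketch below assumes a p-chart metric and uses a frozen dataclass; the class name FrozenLimits, the function baseline_p_limits, and the baseline counts are all hypothetical.

```python
import math
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class FrozenLimits:
    metric: str
    centre: float
    lcl: float
    ucl: float
    reason: str  # the documented change that justified computing these limits


def baseline_p_limits(metric: str, defect_counts: list[int],
                      subgroup_sizes: list[int], reason: str) -> FrozenLimits:
    """Compute p-chart limits from a known-good baseline window and freeze them."""
    p_bar = sum(defect_counts) / sum(subgroup_sizes)
    n_avg = sum(subgroup_sizes) / len(subgroup_sizes)
    half_width = 3.0 * math.sqrt(p_bar * (1.0 - p_bar) / n_avg)
    return FrozenLimits(metric, p_bar, max(0.0, p_bar - half_width),
                        min(1.0, p_bar + half_width), reason)


# Made-up baseline: four subgroups of 500 parts with 9, 8, 10, and 9 flagged.
limits = baseline_p_limits("defect_rate", [9, 8, 10, 9], [500] * 4,
                           reason="initial validation run")
print(limits)

# Widening a limit in place is rejected; a new record with a new reason is required.
try:
    limits.ucl = 0.05
    mutated = True
except FrozenInstanceError:
    mutated = False
print("limits were mutated:", mutated)
```

The immutability is a convention, not a security boundary. Its value is that changing the limits requires constructing a new record, which is a natural place to demand a written reason.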

Which SPC Rules Signal Drift Before Accuracy Degrades?

A point outside the control limits is the obvious signal. The more valuable ones are the pattern rules — the Western Electric / Nelson rules — because they fire while every individual point is still inside the limits. These are what give SPC its early-warning property over a single accuracy number.

The rules that map cleanly to CV model drift:

One point beyond a control limit — an abrupt shift. A camera knocked out of position, a sudden lighting failure, a wrong model version deployed.
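A run of seven (or more) consecutive points on one side of the centre line signals a sustained level shift, such as a supplier change or a recalibration that moved the operating point. Rule sets differ on the run length: Western Electric calls for eight and Nelson for nine, so seven is the more sensitive choice. Pick one length and hold it fixed.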
A trend of six or more points steadily increasing or decreasing — directional drift. The classic signature of lighting changing across a shift, lens contamination accumulating, or a model degrading against a slowly shifting part appearance.
Increasing spread on a range chart — the process becoming erratic before its mean moves. Often a mechanical issue: vibration, an intermittent connection, inconsistent part presentation.

The trend and run rules are the point of the exercise. A model that is slowly losing accuracy produces a trend on a confidence chart long before the aggregate number moves enough to trip a threshold. That lead time is the difference between recalibrating during a planned stop and discovering the regression in a customer complaint.
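The first three rules above are simple enough to sketch directly. The code below is a minimal illustration, assuming one chart's points arrive as a list of floats; the function names, the example limits, and the series of confidence means are invented for the example. The run length is a parameter because published rule sets use different values.

```python
def beyond_limits(points: list[float], lcl: float, ucl: float) -> list[int]:
    """Indices of points outside the control limits."""
    return [i for i, x in enumerate(points) if x < lcl or x > ucl]


def runs_one_side(points: list[float], centre: float, length: int = 7) -> list[int]:
    """Indices at which a run of `length` or more points on one side is in progress."""
    hits, run, side = [], 0, 0
    for i, x in enumerate(points):
        s = (x > centre) - (x < centre)  # +1 above, -1 below, 0 on the line
        if s != 0 and s == side:
            run += 1
        else:
            side, run = s, (1 if s != 0 else 0)
        if run >= length:
            hits.append(i)
    return hits


def trends(points: list[float], length: int = 6) -> list[int]:
    """Indices at which `length` or more points have moved steadily in one direction."""
    hits, run, direction = [], 1, 0
    for i in range(1, len(points)):
        d = (points[i] > points[i - 1]) - (points[i] < points[i - 1])
        if d != 0 and d == direction:
            run += 1
        else:
            direction, run = d, (2 if d != 0 else 1)
        if run >= length:
            hits.append(i)
    return hits


# Made-up subgroup means of confidence: every point is inside the limits,
# but the last six fall steadily.
confidence_means = [0.910, 0.912, 0.909, 0.911, 0.908, 0.905, 0.902, 0.899, 0.896]
print("beyond limits:", beyond_limits(confidence_means, lcl=0.884, ucl=0.936))
print("run on one side:", runs_one_side(confidence_means, centre=0.910))
print("trend:", trends(confidence_means))
```

In this series the limit rule and the run rule stay silent while the trend rule fires on the last point, which is the early-warning behaviour described above.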

How Does an Out-of-Control Signal Connect to the Rollback Decision?

A control chart that no one acts on is decoration. The value of SPC-grounded monitoring is that an out-of-control signal feeds a defined response — and on a production line, that response is usually one of: alert and continue, recalibrate, or roll back to a known-good model version.

In our experience instrumenting industrial CV deployments, the signal-to-action mapping is worth defining before go-live, not improvised during an excursion. A single point beyond a limit on the defect-rate chart might pause auto-disposition and route parts to manual check. A confirmed downward trend on confidence might trigger a scheduled recalibration. A sharp shift coincident with a deployment is the cleanest rollback trigger there is — the chart timestamps the regression to the change.
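A signal-to-action mapping agreed before go-live can be as plain as a lookup table. The sketch below encodes the three examples from the paragraph above; the enum members, chart names, and the respond function are hypothetical, and a real plan would be agreed with the line's quality and operations owners.

```python
from enum import Enum


class Signal(Enum):
    POINT_BEYOND_LIMIT = "point beyond limit"
    RUN_ONE_SIDE = "run on one side"
    TREND = "trend"


class Action(Enum):
    ALERT_AND_CONTINUE = "alert and continue"
    PAUSE_AUTO_DISPOSITION = "pause auto-disposition, route to manual check"
    SCHEDULE_RECALIBRATION = "schedule recalibration"
    ROLL_BACK = "roll back to known-good model version"


# Agreed in advance; anything not listed falls back to alert-and-continue.
RESPONSE_PLAN = {
    ("defect_rate", Signal.POINT_BEYOND_LIMIT): Action.PAUSE_AUTO_DISPOSITION,
    ("confidence_mean", Signal.TREND): Action.SCHEDULE_RECALIBRATION,
}


def respond(chart: str, signal: Signal, coincides_with_deployment: bool = False) -> Action:
    """Look up the pre-agreed response for a signal on a given chart."""
    # An abrupt shift right after a deployment points at the deployment itself.
    if coincides_with_deployment and signal is Signal.POINT_BEYOND_LIMIT:
        return Action.ROLL_BACK
    return RESPONSE_PLAN.get((chart, signal), Action.ALERT_AND_CONTINUE)


print(respond("defect_rate", Signal.POINT_BEYOND_LIMIT).value)
print(respond("confidence_mean", Signal.TREND).value)
print(respond("defect_rate", Signal.POINT_BEYOND_LIMIT, coincides_with_deployment=True).value)
```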

These control charts and limit definitions are not just operational hygiene. They are the monitoring evidence that a reliability artefact set for industrial CV inspection is signed off against, and the drift signals feed directly into what a production AI reliability audit tests. The chart is what makes “we monitor for drift” auditable rather than aspirational.

Designing the monitoring scheme is part of how a computer-vision inspection system earns the right to run unattended on a line, and it is one of the first things we set up in an engagement scoped to a line-side CV deployment.

FAQ

How does statistical process control work for CV defect detection, and what does it mean in practice?

SPC plots a metric the line produces — defect rate, false-reject rate, or confidence — over time against control limits derived from a stable baseline. In practice it reframes monitoring from “how accurate is the model” to “is the model’s output behaving the way it did at validation,” which is the question that catches lighting drift and process changes early.

What are concrete SPC control-chart examples for monitoring an inspection model on the line?

A p-chart for defect rate (proportion of variable-size subgroups), a separate p- or np-chart for false-reject rate, and an X̄-R chart for the mean and spread of confidence scores. Each catches a different failure: genuine quality drops, lighting drift pushing good parts over the rejection boundary, and the model losing certainty before it misclassifies.

How do you set control limits from a line’s own baseline rather than an arbitrary accuracy threshold?

Collect a baseline during a trusted known-good run, compute the centre line as the baseline average, and compute the limits from the baseline’s own variation using the chart’s ±3-sigma formula. Then freeze the limits — recompute only after a deliberate, documented change, never to accommodate drift, or you have re-created the arbitrary threshold.

Which SPC signals indicate model drift before aggregate accuracy degrades?

The pattern rules fire while individual points are still inside the limits: a run of seven points on one side of the centre line signals a sustained shift, and a trend of six steadily moving points signals directional drift. A trend on a confidence chart appears long before aggregate accuracy moves enough to trip a threshold — that lead time is the whole point.

How does an out-of-control SPC signal connect to the rollback decision on the production line?

An out-of-control signal feeds a pre-defined response — alert and continue, recalibrate, or roll back to a known-good model version. A sharp shift coincident with a deployment is the cleanest rollback trigger, because the chart timestamps the regression to the change.

What should you track on the chart — defect rate, false-reject rate, or confidence distributions?

All three, on separate charts watched together. A defect-rate chart can sit inside its limits while false-reject rate climbs and confidence erodes, so plotting only the defect rate is the most common mistake. Two or three charts is the minimum honest instrumentation for a line-side model.

What are the standard SPC control-chart rules and how do they map to CV model drift?

The pattern rules come from the Western Electric and Nelson rule sets: one point beyond a limit (abrupt shift, such as a knocked camera or a wrong model version), a run of consecutive points on one side of the centre line (sustained shift, such as a supplier change; this article uses seven, Western Electric uses eight, Nelson uses nine), and a trend of six steadily moving points (directional drift, such as accumulating contamination or slow degradation). Increasing spread on the range chart is a companion check for an erratic process (vibration, inconsistent part presentation). The trend and run rules are what give SPC its early-warning property over a single accuracy number.

What does an SPC control chart for an inspection model look like, and which chart type fits which metric?

Proportion metrics like defect rate and false-reject rate fit p-charts (or np-charts at fixed subgroup size); continuous confidence scores fit an X̄ chart paired with a range or S chart. The data type the metric produces determines the chart, not preference — a continuous score does not belong on a proportion chart.

A control chart is only as honest as the baseline behind it and the response defined for a violation. The open question on most lines is not which chart to draw — it is whether anyone has agreed, in advance, what an out-of-control run on the confidence chart obliges the line to do before the next shift starts.