# Plan: Hybrid trackId ALPRChecker with Auto-Detection of Pipeline vs Full-Frame Mode

**Date**: 2026-04-05
**Status**: Approved for implementation
**Affects**: ANSALPR_OD (Layer 2: ALPRChecker, Layer 3: ensureUniquePlateText)

---

## Problem

When ANSALPR is used in a **pipeline** (vehicle detector crops each vehicle → ALPR runs on each crop independently), ALPRChecker (Layer 2) merges plates from different vehicles because:

1. LP bounding boxes are **crop-relative**: plates from different vehicles end up at similar (x, y) positions within their respective crops
2. ALPRChecker matches by **IoU on bounding boxes** → high IoU between crop-relative boxes → false merges
3. **Levenshtein fallback** can also merge similar plates (e.g., "29BA-1234" and "29BA-1235", distance=1)
4. **Proximity guard** catches remaining cases → all vehicles get the same locked plate text

**Result**: All detected vehicles return the same license plate.

---

## Solution

Three-part fix:

1. **Auto-detect** full-frame vs pipeline mode by tracking image size consistency per camera
2. **Full-frame mode**: Enable Layer 2 + Layer 3, with **hybrid trackId matching** (trackId primary, Levenshtein fallback for lost tracks)
3. **Pipeline/crop mode**: Disable both layers and pass raw OCR through immediately

---

## Design: Tri-State Mode Flag

```
_alprCheckerMode:
  -1 = auto-detect (default)
   0 = explicitly disabled (raw OCR always)
   1 = explicitly enabled (ALPRChecker + dedup always)
```

| `_alprCheckerMode` | Image size (auto-detect result) | Layer 2 (ALPRChecker) | Layer 3 (ensureUnique) |
|---|---|---|---|
| `0` (explicit off) | Any | **OFF** (raw OCR pass-through) | **OFF** |
| `1` (explicit on) | Any | **ON** (hybrid trackId + Levenshtein) | **ON** |
| `-1` (auto, default) | Varies (pipeline detected) | **OFF** (raw OCR pass-through) | **OFF** |
| `-1` (auto, default) | Constant for 5+ frames (full-frame detected) | **ON** (hybrid trackId + Levenshtein) | **ON** |

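
The table above can be read as a two-step decision: an explicit mode always wins, and only `-1` defers to the image-size detector. A minimal sketch of that resolution follows. The enum `CheckerMode`, the helpers `normalizeMode`, `toRaw` and `layersEnabled`, and the choice to treat out-of-range values as auto-detect are assumptions for illustration; the plan itself only specifies a plain `int _alprCheckerMode`.

```cpp
#include <cassert>

// Hypothetical names throughout: the plan stores a plain int, not an enum.
enum class CheckerMode : int { Auto = -1, Off = 0, On = 1 };

// Assumption: values outside {-1, 0, 1} fall back to auto-detect.
inline CheckerMode normalizeMode(int mode) {
    switch (mode) {
        case 0:  return CheckerMode::Off;
        case 1:  return CheckerMode::On;
        default: return CheckerMode::Auto;
    }
}

// Converts back to the raw int a getter such as GetALPRCheckerMode
// would presumably hand to callers.
inline int toRaw(CheckerMode mode) {
    const int raw = static_cast<int>(mode);
    assert(raw >= -1 && raw <= 1);  // only the three documented values exist
    return raw;
}

// Resolves one table row: explicit modes ignore image size, auto defers
// to the detector's verdict (true once the size has been constant 5+ frames).
inline bool layersEnabled(CheckerMode mode, bool detectedFullFrame) {
    if (mode == CheckerMode::Off) return false;
    if (mode == CheckerMode::On)  return true;
    return detectedFullFrame;
}
```
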

---

## Design: Hybrid trackId ALPRChecker

### Why trackId is better than IoU for full-frame mode

| Scenario | Current (IoU) | Hybrid (trackId) |
|---|---|---|
| Two vehicles side-by-side, similar plates | **False merge** (IoU overlap) | **Correct** (different trackIds) |
| Fast-moving vehicle | **May lose history** (IoU=0) | **Correct** (ByteTrack tracks motion) |
| Identical plates in frame (fleet) | **Merges into one** (Levenshtein=0) | **Correct** (separate trackIds) |
| Plate occluded, reappears with new trackId | **Recovers** (text similarity) | **Recovers** (Levenshtein fallback migrates history) |

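
Both columns lean on edit distance for recovery, so it may help to pin down what a Levenshtein threshold of 1 means. The following is a minimal sketch of the standard two-row dynamic-programming distance; the function name `levenshtein` is an assumption, and the helper already used by ALPRChecker may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Classic Levenshtein distance: insertions, deletions and substitutions
// each cost 1. Uses two rolling rows instead of the full matrix.
// The name is illustrative only.
inline std::size_t levenshtein(const std::string& a, const std::string& b) {
    std::vector<std::size_t> prev(b.size() + 1), curr(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j) prev[j] = j;  // "" -> b[0..j)

    for (std::size_t i = 1; i <= a.size(); ++i) {
        curr[0] = i;  // a[0..i) -> ""
        for (std::size_t j = 1; j <= b.size(); ++j) {
            const std::size_t substitution =
                prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
            curr[j] = std::min({prev[j] + 1, curr[j - 1] + 1, substitution});
        }
        std::swap(prev, curr);
    }
    return prev[b.size()];  // after the final swap, prev holds the last row
}
```

With this definition, `levenshtein("29BA-1234", "29BA-1235")` is 1, which is why a threshold of 1 on its own cannot tell two genuinely different but similar plates apart.
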

### Algorithm: `checkPlateByTrackId(cameraId, ocrText, trackId)`

```
Step 1: Age all plates for this camera (framesSinceLastSeen++)
Step 2: Periodic pruning (every 30 calls, remove stale entries >180 frames)

Step 3 (primary): hash lookup plates[trackId]
  If found:
    → Append raw OCR to textHistory (raw, not corrected, to avoid a feedback loop)
    → majorityVote() on history
    → Lock logic:
        - Not locked + 3 consistent votes → LOCK
        - Locked + exact match → return locked text (fast path)
        - Locked + vote drifted (Levenshtein > 1) + 3 new votes → RE-LOCK
        - Locked + noise → return locked text (resist)
    → Return result immediately

Step 4 (fallback): Levenshtein scan for lost tracks
  For each existing plate entry:
    If Levenshtein(ocrText, lockedText) ≤ 1:
      → MIGRATE: move history from old trackId to new trackId
      → Return locked text
    If not locked, check last 3 history entries:
      → Same migration logic

Step 5 (no match): create new entry
  plates[trackId] = { textHistory=[ocrText] }
  Return raw OCR text immediately
```
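
The primary path is the part most easily misread, so here is a minimal sketch of Steps 3 and 5 under stated assumptions. "Three consistent votes" is interpreted as three consecutive calls whose majority vote has the same winner, and `TrackedPlate`, `majorityVote` and `checkPlateSketch` are illustrative names rather than the planned implementation. Aging, pruning, the Levenshtein fallback and re-locking are left out to keep it short.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical per-track state; field names are illustrative.
struct TrackedPlate {
    std::vector<std::string> textHistory;  // raw OCR reads, never corrected text
    std::string lastVote;                  // winner of the previous vote
    int consistentVotes = 0;               // consecutive votes with the same winner
    std::string lockedText;                // empty while unlocked
};

// Most frequent string in the history. Ties go to the lexicographically
// smallest candidate, an arbitrary choice made for this sketch.
inline std::string majorityVote(const std::vector<std::string>& history) {
    assert(!history.empty());
    std::map<std::string, int> counts;
    for (const auto& text : history) ++counts[text];
    std::string best;
    int bestCount = 0;
    for (const auto& entry : counts) {
        if (entry.second > bestCount) {
            best = entry.first;
            bestCount = entry.second;
        }
    }
    return best;
}

// Steps 3 and 5 only: no aging, pruning, fallback scan or re-lock.
inline std::string checkPlateSketch(std::unordered_map<int, TrackedPlate>& plates,
                                    int trackId, const std::string& ocrText) {
    auto it = plates.find(trackId);
    if (it == plates.end()) {              // Step 5: first sighting of this track
        plates[trackId].textHistory.push_back(ocrText);
        return ocrText;                    // raw OCR, returned immediately
    }
    TrackedPlate& plate = it->second;
    plate.textHistory.push_back(ocrText);  // Step 3: append the raw read
    if (!plate.lockedText.empty()) return plate.lockedText;  // locked: resist noise

    const std::string vote = majorityVote(plate.textHistory);
    plate.consistentVotes = (vote == plate.lastVote) ? plate.consistentVotes + 1 : 1;
    plate.lastVote = vote;
    if (plate.consistentVotes >= 3) plate.lockedText = vote;  // LOCK
    return vote;
}
```

Under this interpretation, feeding one trackId the reads `29BA-12345`, `29BA-12345`, `29B4-12345`, `29BA-12345` returns `29BA-12345` on every call and sets the lock on the fourth.
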

### Frame-by-Frame Behavior (what LabVIEW sees)

| Frame | OCR Read | Returned to LabVIEW | Internal State |
|-------|----------|---------------------|----------------|
| 1 | "29BA-12345" | **"29BA-12345"** (instant) | New entry, history=[1 read] |
| 2 | "29BA-12345" | **"29BA-12345"** (voted) | history=[2 reads], not locked (need 3) |
| 3 | "29B4-12345" | **"29BA-12345"** (voted, corrected OCR error) | history=[3 reads], not locked |
| 4 | "29BA-12345" | **"29BA-12345"** | **LOCKED** (3 consistent votes) |
| 5+ | "29B4-12345" | **"29BA-12345"** (locked, resists noise) | Lock held |
| 50+ | consistently "30CD-567" | **"30CD-567"** | **RE-LOCKED** to new plate |

**Key**: Every frame gets an immediate response. No waiting, no buffering. Frame 1 returns raw OCR. Subsequent frames return increasingly stable text.

---

## Design: Auto-Detection (`shouldUseALPRChecker`)

Tracks image size per camera. If size is constant for 5+ consecutive frames → full-frame mode. If size changes → pipeline mode.

```cpp
bool shouldUseALPRChecker(const cv::Size& imageSize, const std::string& cameraId) {
    if (_alprCheckerMode == 0) return false;  // explicit off
    if (_alprCheckerMode == 1) return true;   // explicit on

    // Auto-detect: check image size consistency
    auto& tracker = _imageSizeTrackers[cameraId];
    if (imageSize == tracker.lastSize) {
        tracker.consistentCount++;
        if (tracker.consistentCount >= 5) tracker.detectedFullFrame = true;
    } else {
        tracker.lastSize = imageSize;
        tracker.consistentCount = 1;
        tracker.detectedFullFrame = false;
    }
    return tracker.detectedFullFrame;
}
```
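
To see how the counter behaves without pulling in OpenCV, the same logic can be exercised with a stand-in size type. This is a sketch under assumptions: `Size2i`, `ImageSizeTrackerSketch` and `AutoDetector` are hypothetical substitutes for `cv::Size`, `ImageSizeTracker` and the `ANSALPR_OD` members, and a zero-initialised `lastSize` is assumed. One caveat it makes visible: a pipeline that resizes every crop to one fixed size would look like full-frame input to this check, which is the kind of case the explicit `0` mode covers.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Stand-in for cv::Size so the sketch compiles without OpenCV.
struct Size2i {
    int width;
    int height;
    Size2i(int w = 0, int h = 0) : width(w), height(h) {}
    bool operator==(const Size2i& other) const {
        return width == other.width && height == other.height;
    }
};

// Mirrors the fields that shouldUseALPRChecker touches in the plan.
struct ImageSizeTrackerSketch {
    Size2i lastSize;
    int consistentCount = 0;
    bool detectedFullFrame = false;
};

class AutoDetector {
public:
    explicit AutoDetector(int mode = -1) : mode_(mode) {
        assert(mode >= -1 && mode <= 1);  // -1 auto, 0 off, 1 on
    }

    bool shouldUse(const Size2i& imageSize, const std::string& cameraId) {
        if (mode_ == 0) return false;  // explicit off
        if (mode_ == 1) return true;   // explicit on
        ImageSizeTrackerSketch& tracker = trackers_[cameraId];
        if (imageSize == tracker.lastSize) {
            if (++tracker.consistentCount >= kThreshold) {
                tracker.detectedFullFrame = true;
            }
        } else {                       // any size change resets detection
            tracker.lastSize = imageSize;
            tracker.consistentCount = 1;
            tracker.detectedFullFrame = false;
        }
        return tracker.detectedFullFrame;
    }

private:
    static constexpr int kThreshold = 5;  // same threshold as the plan
    int mode_;
    std::unordered_map<std::string, ImageSizeTrackerSketch> trackers_;
};
```
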

---

## Files to Modify

| File | Change |
|------|--------|
| `modules/ANSLPR/ANSLPR.h` | Add `TrackedPlateById` struct, `trackedPlatesById` map, `checkPlateByTrackId()` declaration to ALPRChecker class |
| `modules/ANSLPR/ANSLPR.cpp` | Implement `checkPlateByTrackId()` (after existing `checkPlate()` at line 288) |
| `modules/ANSLPR/ANSLPR_OD.h` | Add `_alprCheckerMode`, `ImageSizeTracker`, `shouldUseALPRChecker()`, public `SetALPRCheckerMode()`/`GetALPRCheckerMode()` |
| `modules/ANSLPR/ANSLPR_OD.cpp` | Implement `shouldUseALPRChecker()`; guard 5 `checkPlate` + 3 `ensureUniquePlateText` call sites |
| `modules/ANSLPR/dllmain.cpp` | Add `ANSALPR_SetALPRCheckerMode` DLL export |

---

## Call Sites to Guard

### 5 checkPlate call sites (replace with conditional):

| Line | Function | Image size source |
|------|----------|-------------------|
| 975 | `RunInferenceSingleFrame` | `frameWidth`, `frameHeight` |
| 1476 | `Inference` (no-bbox path) | `input.cols`, `input.rows` |
| 1655 | `Inference` (bbox path) | `input.cols`, `input.rows` |
| 1707 | `Inference` (full-frame fallback) | `input.cols`, `input.rows` |
| 2312 | `RunInference` (batch) | `input.cols`, `input.rows` |

Pattern at each site:

```cpp
// Before:
lprObject.className = alprChecker.checkPlate(cameraId, ocrText, lprObject.box);

// After (use the image size source listed in the table for this site,
// e.g. input.cols / input.rows instead of frameWidth / frameHeight):
if (shouldUseALPRChecker(cv::Size(frameWidth, frameHeight), cameraId)) {
    lprObject.className = alprChecker.checkPlateByTrackId(cameraId, ocrText, lprObject.trackId);
} else {
    lprObject.className = ocrText;  // raw OCR pass-through
}
```

### 3 ensureUniquePlateText call sites (wrap with conditional):

| Line | Function | Image size source |
|------|----------|-------------------|
| 997 | `RunInferenceSingleFrame` | `frameWidth`, `frameHeight` |
| 1726 | `Inference` | `input.cols`, `input.rows` |
| 2330 | `RunInference` (batch) | `input.cols`, `input.rows` |

Pattern at each site:

```cpp
// Before:
ensureUniquePlateText(output, cameraId);

// After (use the image size source listed in the table for this site):
if (shouldUseALPRChecker(cv::Size(frameWidth, frameHeight), cameraId)) {
    ensureUniquePlateText(output, cameraId);
}
```
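
Because `shouldUseALPRChecker` as written advances `consistentCount` on every call, guarding both a `checkPlate` site and an `ensureUniquePlateText` site in the same frame counts that frame more than once. One option, sketched below with hypothetical names (`FrameGate`, `beginFrame`, `decide`) and a stubbed detector, is to evaluate the gate once per frame and reuse the answer. Whether the real call sites need this depends on how many plates and guarded sites run per frame, so treat it as a possibility to weigh rather than part of the approved plan.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical helper: caches the gate decision for one frame so that
// several guarded call sites share a single detector update.
class FrameGate {
public:
    // `detector` stands in for a bound shouldUseALPRChecker(size, cameraId).
    explicit FrameGate(std::function<bool()> detector)
        : detector_(std::move(detector)) {
        assert(static_cast<bool>(detector_));  // an empty function would throw on call
    }

    // Call at the top of each frame, before any guarded call site.
    void beginFrame() { decided_ = false; }

    // The first call in a frame queries the detector; later calls reuse it.
    bool decide() {
        if (!decided_) {
            enabled_ = detector_();
            decided_ = true;
        }
        return enabled_;
    }

private:
    std::function<bool()> detector_;
    bool decided_ = false;
    bool enabled_ = false;
};
```
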

---

## DLL Export API

```cpp
// Declaration (ANSLPR.h):
extern "C" ANSLPR_API int ANSALPR_SetALPRCheckerMode(ANSCENTER::ANSALPR** Handle, int mode);

// Implementation (dllmain.cpp):
extern "C" ANSLPR_API int ANSALPR_SetALPRCheckerMode(ANSCENTER::ANSALPR** Handle, int mode) {
    if (!Handle || !*Handle) return -1;   // null handle
    auto* od = dynamic_cast<ANSCENTER::ANSALPR_OD*>(*Handle);
    if (!od) return -2;                   // handle is not an ANSALPR_OD instance
    od->SetALPRCheckerMode(mode);
    return 1;                             // success
}
```

**LabVIEW usage**:
- `ANSALPR_SetALPRCheckerMode(handle, -1)` → auto-detect (default, no call needed)
- `ANSALPR_SetALPRCheckerMode(handle, 0)` → force disable (guaranteed raw OCR)
- `ANSALPR_SetALPRCheckerMode(handle, 1)` → force enable (guaranteed stabilization)

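
For C or C++ callers, the three return codes are worth handling explicitly. The sketch below reproduces the export's control flow against stand-in classes so that it compiles on its own; `EngineBase`, `EngineOD`, `EngineOther`, `setCheckerMode` and `describeResult` are hypothetical stand-ins for `ANSCENTER::ANSALPR`, `ANSCENTER::ANSALPR_OD`, some other engine type, the real export and a caller-side helper.

```cpp
#include <cassert>

// Stand-ins for the real class hierarchy (names are hypothetical).
// The virtual destructor makes the base polymorphic, which dynamic_cast needs.
struct EngineBase {
    virtual ~EngineBase() = default;
};
struct EngineOD : EngineBase {
    int checkerMode = -1;  // auto-detect by default
    void SetALPRCheckerMode(int mode) { checkerMode = mode; }
};
struct EngineOther : EngineBase {};  // an engine without Layer 2/3

// Same control flow as ANSALPR_SetALPRCheckerMode in the plan.
inline int setCheckerMode(EngineBase** handle, int mode) {
    if (!handle || !*handle) return -1;  // null handle
    auto* od = dynamic_cast<EngineOD*>(*handle);
    if (!od) return -2;                  // handle is not the OD engine
    od->SetALPRCheckerMode(mode);
    return 1;                            // success
}

// Caller-side helper: maps a return code to a short message.
inline const char* describeResult(int code) {
    switch (code) {
        case 1:  return "mode applied";
        case -1: return "invalid handle";
        case -2: return "engine does not support ALPRChecker modes";
        default: return "unexpected return code";
    }
}
```
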

---

## Backward Compatibility

- Default `_alprCheckerMode = -1` (auto) with pipeline input (varying sizes): both layers stay disabled and raw OCR is returned, **the same as if ALPRChecker never existed**
- Default auto with full-frame input (constant sizes): both layers auto-enable after 5 frames, giving **improved accuracy** over the current IoU-based approach
- Explicit `mode = 0`: **guaranteed off** regardless of image size; raw OCR always
- Explicit `mode = 1`: **guaranteed on** regardless of image size
- Existing `checkPlate()` methods are **not modified** and remain available for other code
- New `checkPlateByTrackId()` is additive; no existing API changes

---

## Verification

1. **Pipeline mode**: Call ALPR with different-sized vehicle crops → each returns independent OCR, no cross-contamination
2. **Full-frame mode**: Call ALPR with same-sized frames → after 5 frames, Layer 2+3 auto-enable, trackId-based stabilization active
3. **Track recovery**: Occlude a plate → ByteTrack assigns new trackId → Levenshtein fallback migrates history, lock preserved
4. **Explicit disable**: `ANSALPR_SetALPRCheckerMode(handle, 0)` → raw OCR always, no stabilization
5. **Explicit enable**: `ANSALPR_SetALPRCheckerMode(handle, 1)` → both layers always active
6. **Build**: Compile DLL, verify no linker errors

---

## Performance Comparison: Current vs Hybrid

### Matching step (per plate, per frame)

| Aspect | Current (IoU + Levenshtein) | Hybrid (trackId + Levenshtein fallback) |
|---|---|---|
| Primary lookup | O(n) linear scan + IoU | O(1) hash map |
| Fallback | O(n) Levenshtein scan | O(n) Levenshtein scan (only on miss) |
| Memory | vector (contiguous) | unordered_map (heap nodes) |
| False merges | Possible (IoU overlap or Levenshtein ≤ 1) | **Impossible** via primary path |
| False splits | Rare (IoU + text recovers) | Possible (new trackId after occlusion), **recovered by fallback** |

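
The O(1) lookup and the per-node heap cost in the table both follow from keying state by trackId. Below is a sketch of such a store, including the aging and pruning from Steps 1 and 2 of the algorithm, assuming one `std::unordered_map` per camera. `PlateEntry`, `PlateStore` and their members are illustrative names; only the 180-frame staleness limit is taken from the plan.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical per-track record, reduced to what aging and lookup need.
struct PlateEntry {
    std::string lockedText;
    int framesSinceLastSeen = 0;
};

class PlateStore {
public:
    // Step 1: age every entry for this camera.
    void ageAll(const std::string& cameraId) {
        for (auto& kv : plates_[cameraId]) ++kv.second.framesSinceLastSeen;
    }

    // Step 2: drop entries unseen for more than maxAge frames.
    void prune(const std::string& cameraId, int maxAge = 180) {
        assert(maxAge >= 0);
        auto& byTrack = plates_[cameraId];
        for (auto it = byTrack.begin(); it != byTrack.end();) {
            if (it->second.framesSinceLastSeen > maxAge) {
                it = byTrack.erase(it);  // erase returns the next valid iterator
            } else {
                ++it;
            }
        }
    }

    // Primary lookup: average O(1); returns nullptr on a miss.
    PlateEntry* find(const std::string& cameraId, int trackId) {
        auto cam = plates_.find(cameraId);
        if (cam == plates_.end()) return nullptr;
        auto it = cam->second.find(trackId);
        return it == cam->second.end() ? nullptr : &it->second;
    }

    // Creates the entry if needed and marks it as seen in this frame.
    PlateEntry& touch(const std::string& cameraId, int trackId) {
        PlateEntry& entry = plates_[cameraId][trackId];
        entry.framesSinceLastSeen = 0;
        return entry;
    }

private:
    std::unordered_map<std::string, std::unordered_map<int, PlateEntry>> plates_;
};
```
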

### Accuracy

| Scenario | Current | Hybrid |
|---|---|---|
| Dense traffic, similar plates | Degrades (false merges) | **Better** (trackId separation) |
| Fast-moving vehicles | May lose history | **Better** (ByteTrack tracks motion) |
| Frequent occlusions | Good recovery (text similarity) | Good recovery (Levenshtein fallback migrates) |
| Fleet vehicles (identical plates) | Merges | **Better** (separate trackIds) |