BmpToJpeg was slow (~25-45ms for 4K) due to two bottlenecks:

1. cv::imdecode for BMP parsing (unnecessary for uncompressed BMP)
2. TurboJPEG CPU encoding (~11ms for 4K)

Fix 1: Zero-copy BMP parsing. Parse the header directly and wrap the pixel data in a cv::Mat without allocation or copy. Eliminates ~47MB of heap allocations per 4K frame.

Fix 2: NvJpegCompressor class using the nvJPEG hardware encoder on NVIDIA GPUs (~1-2ms for 4K). Integrated into CompressJpegToString so all 5 JPEG encoding callsites benefit automatically. A reusable GPU buffer avoids per-frame cudaMalloc/cudaFree. Silent fallback to TurboJPEG on Intel/AMD or if nvJPEG fails.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
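
For reference, the header parsing described under Fix 1 could look roughly like the sketch below. It is not the code from this change: `BmpView` and `ParseBmpNoCopy` are hypothetical names, and the sketch assumes a standard 14-byte file header followed by a BITMAPINFOHEADER, uncompressed (BI_RGB) data, and 24- or 32-bit pixels. It returns a pointer into the caller's buffer and never allocates or copies pixel data.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>

// Hypothetical view type: points into the caller's buffer and owns nothing.
struct BmpView {
    const std::uint8_t* pixels;  // first pixel byte inside the input buffer
    int width;
    int height;
    std::size_t stride;          // bytes per row, padded to a multiple of 4
    int channels;                // 3 for 24-bit BGR, 4 for 32-bit BGRA
    bool bottom_up;              // true when the bottom row is stored first
};

// BMP header fields are little-endian; read them byte by byte so the
// code does not depend on host endianness or alignment.
static std::uint16_t ReadU16(const std::uint8_t* p) {
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

static std::uint32_t ReadU32(const std::uint8_t* p) {
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

// Validates the header and returns a view of the pixel data, or nullopt
// if the file is not a plain uncompressed 24/32-bit BMP.
std::optional<BmpView> ParseBmpNoCopy(const std::uint8_t* data, std::size_t size) {
    constexpr std::size_t kMinHeader = 14 + 40;  // file header + BITMAPINFOHEADER
    if (data == nullptr || size < kMinHeader) return std::nullopt;
    if (data[0] != 'B' || data[1] != 'M') return std::nullopt;

    const std::uint32_t pixel_offset = ReadU32(data + 10);
    const std::uint32_t info_size = ReadU32(data + 14);
    const auto width = static_cast<std::int32_t>(ReadU32(data + 18));
    const auto raw_height = static_cast<std::int32_t>(ReadU32(data + 22));
    const std::uint16_t bpp = ReadU16(data + 28);
    const std::uint32_t compression = ReadU32(data + 30);

    if (info_size < 40 || compression != 0) return std::nullopt;
    if (bpp != 24 && bpp != 32) return std::nullopt;
    if (width <= 0 || raw_height == 0 || raw_height == INT32_MIN) return std::nullopt;

    // A negative height means the rows are stored top-down.
    const std::int32_t height = raw_height < 0 ? -raw_height : raw_height;
    const std::uint64_t stride =
        ((static_cast<std::uint64_t>(width) * bpp + 31) / 32) * 4;
    const std::uint64_t needed =
        pixel_offset + stride * static_cast<std::uint64_t>(height);
    if (pixel_offset < kMinHeader || needed > size) return std::nullopt;

    return BmpView{data + pixel_offset, width, height,
                   static_cast<std::size_t>(stride), bpp / 8, raw_height > 0};
}
```

A view like this can be handed to the `cv::Mat` constructor that takes an external data pointer and a row step, which wraps the memory without copying. The input buffer must outlive the resulting `cv::Mat`, and bottom-up row order still has to be handled somewhere before or during encoding.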