Improve ANSCV with software decoder:
- Thread-local staging Mat (video_player.cpp:1400-1407): single biggest win. Eliminates the 12 MB per-call malloc/free cycle.
- Contiguous get_buffer2 allocator (video_decoder.cpp:35-102): keeps the 3 bulk memcpys cache-friendly. Would also enable FAST/zero-copy for resolutions where visible_h % 64 == 0.
- SW-decoder thread config (video_decoder.cpp:528-540): thread_count=0, thread_type=FRAME|SLICE. FRAME is downgraded to SLICE-only by AV_CODEC_FLAG_LOW_DELAY, but decode throughput is sufficient for your input rate.
- SetTargetFPS(100) delivery throttle (already there): caps onVideoFrame post-decode work at 10 FPS. Keeps the caller path warm-cached.
- Instrumentation ([MEDIA_DecInit] / [MEDIA_Convert] / [MEDIA_SWDec] / [MEDIA_Timing] / [MEDIA_JpegTiming]): always-on regression detector, zero cost when ANSCORE_DEBUGVIEW=OFF.
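The thread-local staging idea can be sketched as below. This is a minimal illustration, not the code at video_player.cpp:1400-1407: it uses a plain `std::vector` instead of `cv::Mat` so that it stays self-contained, and the names `stagingBuffer` and `packRows` are hypothetical. With OpenCV the same effect is usually obtained from a `thread_local cv::Mat` plus `Mat::create`, which returns immediately when the requested shape and type already match.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: returns a per-thread scratch buffer holding at least
// `bytes` bytes. The vector grows on demand and is never shrunk, so
// steady-state calls at a fixed resolution do no heap allocation.
inline std::uint8_t* stagingBuffer(std::size_t bytes) {
    thread_local std::vector<std::uint8_t> buf;
    if (buf.size() < bytes) {
        buf.resize(bytes);  // only on first use or when the frame grows
    }
    return buf.data();
}

// Hypothetical helper: copies `height` rows of `rowBytes` bytes from a
// padded source plane (row pitch `srcStride`) into the staging buffer,
// dropping the padding. Returns the tightly packed result.
inline std::uint8_t* packRows(const std::uint8_t* src, int srcStride,
                              int rowBytes, int height) {
    const std::size_t total =
        static_cast<std::size_t>(rowBytes) * static_cast<std::size_t>(height);
    std::uint8_t* dst = stagingBuffer(total);
    for (int r = 0; r < height; ++r) {
        std::memcpy(dst + static_cast<std::size_t>(r) * rowBytes,
                    src + static_cast<std::size_t>(r) * srcStride,
                    static_cast<std::size_t>(rowBytes));
    }
    return dst;
}
```

The returned pointer stays valid only until the next call on the same thread, so a caller that has to keep the frame must copy it out first.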
@@ -89,6 +89,15 @@ target_link_libraries(ANSCV
     PRIVATE CUDA::nvjpeg
 )
 
+# libyuv — vendored at 3rdparty/libyuv (added by top-level CMakeLists when
+# the submodule is present). Provides SIMD-accelerated I420→RGB24 used by
+# CVideoPlayer::avframeYUV420PToCvMat for the SW-decode fast path.
+if(ANSCORE_HAS_LIBYUV)
+    target_link_libraries(ANSCV PRIVATE yuv)
+    target_include_directories(ANSCV PRIVATE ${CMAKE_SOURCE_DIR}/3rdparty/libyuv/include)
+    target_compile_definitions(ANSCV PRIVATE ANSCORE_HAS_LIBYUV=1)
+endif()
+
 # Platform-specific libs
 if(WIN32)
     target_link_directories(ANSCV PRIVATE
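For orientation, here is one way the `ANSCORE_HAS_LIBYUV` definition from this hunk might be consumed on the C++ side. It is a sketch under assumptions, not the body of `CVideoPlayer::avframeYUV420PToCvMat`: the wrapper name `i420ToBgr24` is hypothetical, and the scalar fallback uses the common integer BT.601 limited-range approximation, which may differ from what the project does. `libyuv::I420ToRGB24` is the real libyuv entry point. libyuv's "RGB24" is understood to be B, G, R in memory, matching OpenCV's default channel order, but that is worth verifying against the vendored headers.

```cpp
#include <algorithm>
#include <cstdint>

#if defined(ANSCORE_HAS_LIBYUV) && ANSCORE_HAS_LIBYUV
#include "libyuv.h"
#endif

// Hypothetical wrapper: converts planar I420 to packed 24-bit BGR.
// Uses libyuv when the build defines ANSCORE_HAS_LIBYUV=1, otherwise a
// scalar BT.601 limited-range fallback (an assumption for illustration).
inline void i420ToBgr24(const std::uint8_t* y, int yStride,
                        const std::uint8_t* u, int uStride,
                        const std::uint8_t* v, int vStride,
                        std::uint8_t* dst, int dstStride,
                        int width, int height) {
#if defined(ANSCORE_HAS_LIBYUV) && ANSCORE_HAS_LIBYUV
    libyuv::I420ToRGB24(y, yStride, u, uStride, v, vStride,
                        dst, dstStride, width, height);
#else
    auto clamp8 = [](int x) {
        return static_cast<std::uint8_t>(std::min(255, std::max(0, x)));
    };
    for (int r = 0; r < height; ++r) {
        for (int c = 0; c < width; ++c) {
            // Chroma planes are subsampled 2x2 in I420.
            const int C = y[r * yStride + c] - 16;
            const int D = u[(r / 2) * uStride + c / 2] - 128;
            const int E = v[(r / 2) * vStride + c / 2] - 128;
            std::uint8_t* px = dst + r * dstStride + 3 * c;
            px[0] = clamp8((298 * C + 516 * D + 128) >> 8);            // B
            px[1] = clamp8((298 * C - 100 * D - 208 * E + 128) >> 8);  // G
            px[2] = clamp8((298 * C + 409 * E + 128) >> 8);            // R
        }
    }
#endif
}
```

Keeping the guard inside a single wrapper means builds without the submodule still compile and only lose the SIMD path.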