
An example FFmpeg library and wxWidgets player with video filters and face detection

Primary Language: C++

FFVideo

Last update July 31, 2021

An example FFmpeg library and wxWidgets player application with video filters and face detection. It is a no-audio video player intended for video experiments and for developers learning how to code media applications.


FFVideo Player supports multiple simultaneous minimum-delay video playback windows, seek, scrubbing, and basic face detection, plus the surrounding code necessary for a moderately professional application, such as persistence for end-user settings and an embedded web browser providing a 'help window'.

The idea of this application is to provide a basic video app framework for developers who want to learn about and experiment with video filters without the overhead of audio processing or the normal frame delays of ordinary video playback. If you are training models with video data and want to preserve the maximum amount of resources for training, this provides a nice framework for doing so. This is a C++ Visual Studio 2019 IDE Solution and Project.

Interfaces

Some of the coding examples in this project include:

  • An FFmpeg library supporting
    • video files, USB Cameras, IP Cameras and IP video services
    • video file seeks and scrubbing
    • unlimited, chained, single-source AVFilterGraph filters
    • frame exporting
    • This project uses the author's modified FFmpeg located here
    • This project also uses the author's SQLite3 wrapper library, located here:
    • And this project uses Jorge L Rodriguez's stb image scaling header-only library:
  • A multi-threaded wxWidgets Video Player application
    • Multiple simultaneous video windows
    • Exported video frames collected and re-encoded as H.264 .MP4 and .264 Elementary Streams
    • Easy access to AVFilterGraph Video Filters and video experimentation
    • Integration with Dlib and a basic example of Face Detection and of Face Landmark Recovery
    • An embedded web browser as the "Help" window
    • Lots of in-code documentation describing how, what and why
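
As a small illustration of the "chained single-source filters" item above, here is a minimal sketch of how a linear chain can be written in FFmpeg's textual filtergraph syntax, where the filters of one chain are separated by commas. It assumes the chain is handed to libavfilter as a description string (for example via avfilter_graph_parse_ptr()); whether this project builds its graphs that way is an assumption. The helper name BuildFilterChain is hypothetical and is not part of this project's library.

```cpp
#include <string>
#include <vector>

// Hypothetical helper, not part of FFVideo: joins individual filter
// specifications into one linear chain description.
// In FFmpeg's filtergraph syntax, filters within a chain are comma-separated.
std::string BuildFilterChain(const std::vector<std::string>& filters)
{
    std::string chain;
    for (const std::string& f : filters) {
        if (f.empty()) continue;            // skip blank entries
        if (!chain.empty()) chain += ',';   // separator between filters
        chain += f;
    }
    return chain;
}
```

For example, passing the filters hflip, negate and scale=640:360 produces the single description string "hflip,negate,scale=640:360", which describes one source feeding three filters in order.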

When playing an HD film trailer from a local SSD, frame rates as high as 70 fps can be achieved while using only 2 cores of a Ryzen 7 (rates reached 700 fps before reverting to FFmpeg 4.2.3). Extended time tests show no memory leakage and no accumulation of stale or dead threads.

Although vcpkg is used, integration issues led to independent builds of Boost, FFmpeg, GLEW, and wxWidgets. For this reason, the following environment variables are used within the project's Visual Studio 2019 solution and VS projects to locate these libraries:

  • BoostRoot: set to the root of the Boost 1_76_0 directory hierarchy
  • FFmpegRoot: set to the FFmpeg installation directory root; the author first used version 4.4 and then rebuilt with version 4.2.3, which is more stable (see Known issues below)
  • FFmpegDebugRoot: set to the FFmpeg debug build directory root
  • FFvideoRoot: set to the github root of this project
  • GLEWRoot: set to the root of GLEW 2.2.0
  • kvsRoot: set to the github root of https://github.com/bsenftner/kvs
  • vcpkgRoot: set to the installation root of vcpkg
  • WXWIN: set to the installation root of wxWidgets; the author used version 3.1.3, built with the optional wxWebView component, which this project requires
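
Before opening the solution, it can help to confirm that all of these variables are actually defined. The following is a minimal sketch of such a check using only the C++ standard library. The names kRequiredVars and MissingVars are hypothetical helpers written for this README; they are not part of the project.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// The environment variables named in the list above.
const std::vector<std::string> kRequiredVars = {
    "BoostRoot", "FFmpegRoot", "FFmpegDebugRoot", "FFvideoRoot",
    "GLEWRoot", "kvsRoot", "vcpkgRoot", "WXWIN"
};

// Hypothetical helper: returns the names that are unset or set to an empty string.
std::vector<std::string> MissingVars(const std::vector<std::string>& names)
{
    std::vector<std::string> missing;
    for (const std::string& name : names) {
        const char* value = std::getenv(name.c_str());
        if (value == nullptr || *value == '\0') {
            missing.push_back(name);
        }
    }
    return missing;
}
```

Calling MissingVars(kRequiredVars) from a small console program and printing the result gives a quick list of what still needs to be set. This only checks that the variables exist, not that they point at valid builds.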

Through vcpkg integration with Visual Studio, the TurboJPEG, jpeg, png, zlib, and Dlib libraries are also used. Vcpkg also failed to integrate with Visual Studio 2019 out of the box, so a custom triplet file named x64-windows-static-142.cmake was placed inside the $(vcpkgRoot)\triplets directory with these contents:

set(VCPKG_TARGET_ARCHITECTURE x64)
set(VCPKG_CRT_LINKAGE static)
set(VCPKG_LIBRARY_LINKAGE static)
set(VCPKG_PLATFORM_TOOLSET v142)

Once that custom vcpkg triplet is in place, vcpkg correctly generates Visual Studio 2019 x64 static builds using the v142 platform toolset.

OpenGL and an embedded web browser are incorporated through wxWidgets. The embedded web browser requires a custom build of wxWidgets.

Dlib's face detection model file shape_predictor_68_face_landmarks.dat is also required for this project. It can be downloaded in compressed form from:

http://dlib.net/files/shape_predictor_68_face_landmarks.dat.bz2

After decompression, it should be placed in the project's bin directory, just beneath the project's git root.

New: Added a menu option for display of detected faces. Beta; it only displays the first detected face per frame at the moment.

New July 28, 2021: Modified the display of detected faces to optionally standardize the collected face images so they are presented as close as possible to Standard Passport Format, with the eyes rotated to be level and the face rescaled to a 300x300 pixel image, regardless of source dimensions.

[Images: demonstrating tilted head registration; demonstrating multiple heads]
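
The geometry behind "eyes rotated to be level" can be sketched as follows. This is not the project's code; it is a minimal illustration under the assumption that the two eye centers are already known (for example, derived from the face landmarks). The Point type and the functions EyeTiltRadians, RotateAbout and ScaleToOutput are hypothetical names introduced here.

```cpp
#include <cmath>

struct Point { double x, y; };   // hypothetical 2D point in pixel coordinates

// Angle, in radians, of the line from the left eye to the right eye.
// Rotating by the negative of this angle makes that line horizontal.
double EyeTiltRadians(Point leftEye, Point rightEye)
{
    return std::atan2(rightEye.y - leftEye.y, rightEye.x - leftEye.x);
}

// Rotate p about center using the standard 2D rotation matrix.
Point RotateAbout(Point p, Point center, double radians)
{
    const double c = std::cos(radians);
    const double s = std::sin(radians);
    const double dx = p.x - center.x;
    const double dy = p.y - center.y;
    return { center.x + dx * c - dy * s, center.y + dx * s + dy * c };
}

// Uniform scale factor that maps a square crop of side cropSide onto an
// output of side outSide (300 pixels by default, as described above).
double ScaleToOutput(double cropSide, double outSide = 300.0)
{
    return outSide / cropSide;
}
```

Rotating both eye points about their midpoint by -EyeTiltRadians(...) leaves them with the same y coordinate, which is the "level eyes" condition. Applying the same rotation and scale to the whole face crop is left to the imaging library in use.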

New July 29, 2021:

Switched to using the SIMD Library for RGBA to RGB and to grayscale conversions. Multiple face detection code changes, such as adding a precision control and switching to doing face detections in grayscale.
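
For readers unfamiliar with these conversions, here is a plain scalar reference of what they compute. This is not the SIMD Library's API and not the project's code, just a minimal sketch. It assumes tightly packed 8-bit RGBA pixels and uses integer BT.601 luma weights, which is an assumption about the grayscale formula. The function names RgbaToRgb and RgbaToGray are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Drop the alpha channel: 4 bytes per pixel in, 3 bytes per pixel out.
std::vector<std::uint8_t> RgbaToRgb(const std::vector<std::uint8_t>& rgba)
{
    std::vector<std::uint8_t> rgb;
    rgb.reserve(rgba.size() / 4 * 3);
    for (std::size_t i = 0; i + 3 < rgba.size(); i += 4) {
        rgb.push_back(rgba[i]);       // R
        rgb.push_back(rgba[i + 1]);   // G
        rgb.push_back(rgba[i + 2]);   // B
    }
    return rgb;
}

// Weighted luma with integer weights 77, 150, 29 (they sum to 256 and
// approximate 0.299, 0.587, 0.114). Alpha is ignored.
std::vector<std::uint8_t> RgbaToGray(const std::vector<std::uint8_t>& rgba)
{
    std::vector<std::uint8_t> gray;
    gray.reserve(rgba.size() / 4);
    for (std::size_t i = 0; i + 3 < rgba.size(); i += 4) {
        const unsigned y = 77u * rgba[i] + 150u * rgba[i + 1] + 29u * rgba[i + 2];
        gray.push_back(static_cast<std::uint8_t>((y + 128u) >> 8));   // round, divide by 256
    }
    return gray;
}
```

Running detection on the grayscale buffer means the detector reads one byte per pixel instead of four, which is the usual motivation for converting first. A SIMD implementation computes the same per-pixel arithmetic on many pixels at once.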

New July 31, 2021:

Experimenting with the 81 point face landmark model from https://github.com/codeniko/shape_predictor_81_face_landmarks
In RenderCanvas.cpp, at the bottom of the RenderCanvas() constructor, there is a call to
m_faceDetectMgr.SetFaceModel( FACE_MODEL::eightyone );
Change that to m_faceDetectMgr.SetFaceModel( FACE_MODEL::sixtyeight ); to revert to the original face landmarks.
Note that getting the "face chips" to work with the 81 point face model requires some glue logic.


Known issues:

(Rebuilding against FFmpeg 4.2.3 seems to have removed the replay instabilities, but it's not as fast anymore. See note, mid-readme)

PlayAll has been modified to play USB cameras first, one by one, and then to play the other stream types simultaneously. This seems to sidestep USB stream startups being non-thread-safe.