Access the power of hardware-accelerated video codecs in your Windows applications via FFmpeg/libavcodec

Since 2011, all Intel GPUs (Intel integrated and discrete graphics products) have included Intel Quick Sync Video (QSV), a dedicated hardware core for video encoding and decoding. Intel QSV is supported by all major video processing frameworks on many operating systems, including FFmpeg. This tutorial focuses on Intel QSV-based video encoding and decoding acceleration in native (desktop) Windows applications using FFmpeg/libavcodec. The open source 3D Streaming Toolkit is used to illustrate the concepts described.

Introduction


FFmpeg is a free and open source software project that includes a large number of libraries for handling multimedia. The functionality of these libraries is used not only by the command line-based FFmpeg executable, but also by open source and commercial products via API calls to the corresponding FFmpeg libraries.

Note: Although FFmpeg has supported Intel QSV since version 2.8, it is highly recommended to use the latest version of FFmpeg, as new Intel QSV-related features are constantly added and existing ones improved with each release.

FFmpeg is part of the workflow of hundreds of software projects related to video processing and streaming. However, not all take advantage of the video processing capabilities of Intel's GPU hardware, leaving significant room for potential performance improvements.

One of the applications on this list is the 3D Streaming Toolkit, a Windows-based application implemented with:

  • the FFmpeg libavcodec library for software-based h264 video decoding on all systems
  • a proprietary library for hardware-accelerated h264 real-time video encoding on NVIDIA GPU-based systems, and the software-based openh264 library on all other systems.

What is the 3D Streaming Toolkit? It is an open source toolset for creating cloud-based 3D experiences that stream frames in real time to other devices on the network. In particular, the 3DStreamingToolkit uses WebRTC with extensions for streaming 3D content. The SDK is available as a native C++ plugin and can be added to any rendering engine to enable WebRTC-based streaming over the network.

We will add FFmpeg/libavcodec based hardware h264 video encoding and decoding to this app.

Requirements

The first step is to make sure that the FFmpeg/libavcodec build used by the application is compatible with Intel QSV.

This means that you need to configure and compile FFmpeg with the following options:

--enable-dxva2 --enable-d3d11va --enable-libmfx

Prebuilt FFmpeg packages available for download already have these options enabled. If you build FFmpeg yourself from source, see the FFmpeg Build Guide.
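For reference, configuring and verifying such a build might look roughly like this. This is a sketch only; the exact flags depend on your toolchain and environment (omit --toolchain=msvc for MinGW-based builds), so consult the FFmpeg Build Guide:

./configure --enable-dxva2 --enable-d3d11va --enable-libmfx --toolchain=msvc
make

# Verify that the resulting build exposes the QSV codecs and the DXVA2/D3D11VA hwaccels:
ffmpeg -hide_banner -decoders | findstr qsv
ffmpeg -hide_banner -encoders | findstr qsv
ffmpeg -hide_banner -hwaccels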

Hardware video decoding via FFmpeg/libavcodec on Windows: instructions.

Hardware video decoding of h264 via the DXVA2 API can be described as the following sequence of actions (a consolidated code sketch follows the list):

Note: We only move to the next step if the current one has completed successfully. Otherwise, the corresponding error must be handled and processing terminated.
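For example, every libavcodec call below returns an int status code. A minimal sketch of the error-handling pattern, using avcodec_send_packet() and av_strerror() (requires <stdio.h> and <libavutil/error.h>; the same pattern applies to every call in the steps below):

int ret = avcodec_send_packet(av_context, &packet);
if (ret < 0) {
    char errbuf[AV_ERROR_MAX_STRING_SIZE];
    av_strerror(ret, errbuf, sizeof(errbuf));           /* turn the code into a readable message */
    fprintf(stderr, "avcodec_send_packet failed: %s\n", errbuf);
    /* handle the error here and stop processing */
}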

  1. Find a suitable decoder, e.g. the h264 decoder. FFmpeg supports several ways to do this, either by name or by ID, as shown:
    AVCodec* codec = avcodec_find_decoder(AV_CODEC_ID_H264);
  2. Create the AVCodecContext context with this decoder as input argument:
    AVCodecContext* av_context = avcodec_alloc_context3(codec);
  3. Initialize the avcodec context parameters. While there are a number of parameters and various avcodec API functions to configure, all that is needed in this step is the data pixel format, in our case:
    av_context->pix_fmt = AV_PIX_FMT_YUV420P;
  4. Create the hardware device context of the given type and bind it by setting the appropriate pointer to its reference:
    AVBufferRef *hw_device_ctx = NULL;
    av_hwdevice_ctx_create(&hw_device_ctx, AV_HWDEVICE_TYPE_DXVA2, NULL, NULL, 0);
    av_context->hw_device_ctx = av_buffer_ref(hw_device_ctx);
  5. Allocate the software and hardware frames. Note that this does not allocate memory for their data buffers!
    AVFrame *frame = NULL, *sw_frame = NULL;
    frame = av_frame_alloc();
    sw_frame = av_frame_alloc();
  6. Allocate a linear buffer to store the frame data for later use. Note that at this point we need to know the decoded frame format and dimensions (width and height) to calculate the buffer size. If these are not known in advance, we can allocate the buffer after the first call to the avcodec_receive_frame() function, which returns the format and dimensions of the frame as part of the AVFrame structure.
    int size = av_image_get_buffer_size(frame->format, frame->width, frame->height, 1);
    uint8_t *buffer = av_malloc(size);
  7. Open the decoder:
    avcodec_open2(av_context, codec, NULL);
  8. Loop over the video frames; for each frame:
    1. Create an AVPacket and point it at the incoming encoded video data.
      AVPacket packet;
      av_init_packet(&packet);
      // here input_image is the encoded video frame provided by the application
      packet.data = input_image.buffer;
      packet.size = static_cast<int>(input_image.length);
    2. Provide the raw data packet as input to the decoder:
      avcodec_send_packet(av_context, &packet);
    3. Receive a decoded video frame from this packet:
      avcodec_receive_frame(av_context, frame);
    4. Check the received frame format. If it is a hardware format, copy the frame data from the hardware frame to the software frame. Otherwise, just use the received frame directly: it is already a software frame. A hardware frame cannot be used directly!
      if (frame->format == AV_PIX_FMT_DXVA2_VLD) {
          /* retrieve the data from GPU to CPU */
          av_hwframe_transfer_data(sw_frame, frame, 0);
      }
    5. Copy the image data to the pre-allocated linear buffer for later use, e.g. to preview it or write it to a file. Note that if the data was transferred to sw_frame in the previous step, the copy must be made from sw_frame rather than from the hardware frame:
      AVFrame *out = (frame->format == AV_PIX_FMT_DXVA2_VLD) ? sw_frame : frame;
      av_image_copy_to_buffer(buffer, size,
          (const uint8_t * const *)out->data,
          (const int *)out->linesize, out->format,
          out->width, out->height, 1);
    6. Unreference the received packet:
      av_packet_unref(&packet);
  9. Do the final cleanup: free the avcodec context, the software and hardware frames, and the linear buffer, and unreference the hardware device context:
    avcodec_free_context(&av_context);
    av_frame_free(&frame);
    av_frame_free(&sw_frame);
    av_freep(&buffer);
    av_buffer_unref(&hw_device_ctx);
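Putting the steps together, here is a minimal consolidated sketch of the decoding sequence above. It is illustrative only: decode_frame_dxva2 is a hypothetical helper name, the caller is assumed to supply one complete encoded h264 frame, and all error handling is omitted for brevity.

#include <libavcodec/avcodec.h>
#include <libavutil/hwcontext.h>
#include <libavutil/imgutils.h>

/* Hypothetical helper: decodes one encoded h264 frame via DXVA2 and copies
   the decoded image into a newly allocated linear buffer. */
int decode_frame_dxva2(const uint8_t *enc_data, int enc_size,
                       uint8_t **out_buf, int *out_size)
{
    AVCodec *codec = avcodec_find_decoder(AV_CODEC_ID_H264);        /* step 1 */
    AVCodecContext *av_context = avcodec_alloc_context3(codec);     /* step 2 */
    av_context->pix_fmt = AV_PIX_FMT_YUV420P;                       /* step 3 */

    AVBufferRef *hw_device_ctx = NULL;                              /* step 4 */
    av_hwdevice_ctx_create(&hw_device_ctx, AV_HWDEVICE_TYPE_DXVA2, NULL, NULL, 0);
    av_context->hw_device_ctx = av_buffer_ref(hw_device_ctx);

    AVFrame *frame = av_frame_alloc();                              /* step 5 */
    AVFrame *sw_frame = av_frame_alloc();

    avcodec_open2(av_context, codec, NULL);                         /* step 7 */

    AVPacket packet;                                                /* step 8.1 */
    av_init_packet(&packet);
    packet.data = (uint8_t *)enc_data;
    packet.size = enc_size;

    avcodec_send_packet(av_context, &packet);                       /* step 8.2 */
    avcodec_receive_frame(av_context, frame);                       /* step 8.3 */

    AVFrame *out = frame;                                           /* step 8.4 */
    if (frame->format == AV_PIX_FMT_DXVA2_VLD) {
        av_hwframe_transfer_data(sw_frame, frame, 0);
        out = sw_frame;
    }

    /* Step 6, deferred: format and dimensions are known after decoding. */
    *out_size = av_image_get_buffer_size(out->format, out->width, out->height, 1);
    *out_buf = av_malloc(*out_size);
    av_image_copy_to_buffer(*out_buf, *out_size,                    /* step 8.5 */
                            (const uint8_t * const *)out->data,
                            out->linesize, out->format,
                            out->width, out->height, 1);

    av_packet_unref(&packet);                                       /* step 8.6 */
    avcodec_free_context(&av_context);                              /* step 9 */
    av_frame_free(&frame);
    av_frame_free(&sw_frame);
    av_buffer_unref(&hw_device_ctx);
    return 0;
}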

The FFmpeg example hw_decode.c, which works unmodified on Windows with the arguments "dxva2" or "d3d11va" and an input file, can be used as a reference for the steps above, with some differences caused by the example taking a video file as input rather than a sequence of video frames.

Note that this code will work without modification on any GPU capable of hardware video acceleration, including systems based on third-party discrete GPUs.

Hardware video encoding via FFmpeg/libavcodec on Windows: instructions.

Hardware video encoding of h264 on Intel QSV compatible systems can be described as the following sequence of actions (again, a consolidated code sketch follows the list):

Note: We only move to the next step if the current one has completed successfully. Otherwise, the corresponding error must be handled and processing terminated.

  1. Find the right encoder, e.g. the h264 encoder. FFmpeg supports multiple ways to do this: by name or by ID. We search by name to ensure that the Intel Quick Sync Video compatible encoder is selected:
    AVCodec* codec = avcodec_find_encoder_by_name("h264_qsv");

    Create the AVCodecContext context with this encoder as input argument:
    AVCodecContext* av_context = avcodec_alloc_context3(codec);
  2. Initialize the avcodec context parameters needed for encoding. That is, the following parameters must be set according to the video stream data, as shown in the example below; more parameters can be applied to achieve the desired encoding quality, see the AVCodecContext description.
    av_context->width = width;
    av_context->height = height;
    av_context->time_base = (AVRational){1, 25};
    av_context->framerate = (AVRational){25, 1};
    av_context->sample_aspect_ratio = (AVRational){1, 1};
    av_context->pix_fmt = AV_PIX_FMT_QSV;
  3. Create the hardware device context of the specified type:
    AVBufferRef *hw_device_ctx = NULL;
    av_hwdevice_ctx_create(&hw_device_ctx, AV_HWDEVICE_TYPE_QSV, NULL, NULL, 0);
  4. Allocate the hardware frames context tied to the hardware device context:
    AVBufferRef *hw_frames_ref = NULL;
    hw_frames_ref = av_hwframe_ctx_alloc(hw_device_ctx);
  5. Fill in the hardware frames context parameters and finalize the context before use. Although several parameters can be defined, the ones required in this step are listed below:
    AVHWFramesContext *frames_ctx;
    frames_ctx = (AVHWFramesContext *)(hw_frames_ref->data);
    frames_ctx->format = AV_PIX_FMT_QSV;
    frames_ctx->sw_format = AV_PIX_FMT_YUV420P;
    frames_ctx->width = width;
    frames_ctx->height = height;
    av_hwframe_ctx_init(hw_frames_ref);
  6. Bind the hardware frames context to the avcodec context by setting the appropriate pointer to its reference:
    av_context->hw_frames_ctx = av_buffer_ref(hw_frames_ref);
  7. Allocate the software and hardware frames. Note that this does not allocate memory for their data buffers!
    AVFrame *hw_frame = NULL, *sw_frame = NULL;
    hw_frame = av_frame_alloc();
    sw_frame = av_frame_alloc();
  8. Allocate data buffers to store the software frame video data and populate the sw_frame input data. Note that at this point, to calculate the buffer size, we need to define the format and dimensions (width and height) of the input frame as shown:
    // Allocate the buffer
    sw_frame->width = width;
    sw_frame->height = height;
    sw_frame->format = AV_PIX_FMT_YUV420P;
    av_frame_get_buffer(sw_frame, 0);
    // Fill sw_frame data with an fread- or memcpy-type operation,
    // or set the pointers to your externally preallocated data directly:
    sw_frame->data[0] = input_video.ptrY;
    sw_frame->data[1] = input_video.ptrU;
    sw_frame->data[2] = input_video.ptrV;
  9. Allocate data buffers to store hardware frame video data:
    av_hwframe_get_buffer(av_context->hw_frames_ctx, hw_frame, 0);
  10. Open the encoder:
    avcodec_open2(av_context, codec, NULL);
  11. Loop over the video frames; for each frame:
    1. Copy the data from the software frame to the hardware frame for encoding:
      av_hwframe_transfer_data(hw_frame, sw_frame, 0);
    2. Create a data packet:
      AVPacket packet;
      av_init_packet(&packet);
    3. Provide the encoder with raw hardware frame data:
      avcodec_send_frame(av_context, hw_frame);
    4. Read the encoded data from the encoder:
      avcodec_receive_packet(av_context, &packet);
    5. At this point, the packet's data field contains the encoded data, and its size field contains the size of the encoded data. Use them directly for storage, network transmission, etc.:
      uint8_t* encoded_image = packet.data;
      // or
      fwrite(packet.data, packet.size, 1, output);
    6. Unreference the received packet:
      av_packet_unref(&packet);
  12. Do the final cleanup: free the avcodec context and the software and hardware frames, and unreference the hardware device context:
    avcodec_free_context(&av_context);
    av_frame_free(&sw_frame);
    av_frame_free(&hw_frame);
    av_buffer_unref(&hw_device_ctx);
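As with decoding, here is a minimal consolidated sketch of the encoding sequence above. It is illustrative only: encode_frame_qsv is a hypothetical helper name, the copy of the application's frame data is left as a comment, and error handling is omitted. Two notes beyond the steps above: initial_pool_size is set because QSV uses fixed-size frame pools, and avcodec_receive_packet() may return AVERROR(EAGAIN) for the first few frames until the encoder has enough input.

#include <stdio.h>
#include <libavcodec/avcodec.h>
#include <libavutil/hwcontext.h>

/* Hypothetical helper: encodes one YUV420P frame via Intel QSV and writes
   the encoded data to the given output file. */
int encode_frame_qsv(int width, int height, FILE *output)
{
    AVCodec *codec = avcodec_find_encoder_by_name("h264_qsv");      /* step 1 */
    AVCodecContext *av_context = avcodec_alloc_context3(codec);
    av_context->width = width;                                      /* step 2 */
    av_context->height = height;
    av_context->time_base = (AVRational){1, 25};
    av_context->framerate = (AVRational){25, 1};
    av_context->sample_aspect_ratio = (AVRational){1, 1};
    av_context->pix_fmt = AV_PIX_FMT_QSV;

    AVBufferRef *hw_device_ctx = NULL;                              /* step 3 */
    av_hwdevice_ctx_create(&hw_device_ctx, AV_HWDEVICE_TYPE_QSV, NULL, NULL, 0);

    AVBufferRef *hw_frames_ref = av_hwframe_ctx_alloc(hw_device_ctx); /* step 4 */
    AVHWFramesContext *frames_ctx = (AVHWFramesContext *)hw_frames_ref->data; /* step 5 */
    frames_ctx->format = AV_PIX_FMT_QSV;
    frames_ctx->sw_format = AV_PIX_FMT_YUV420P;
    frames_ctx->width = width;
    frames_ctx->height = height;
    frames_ctx->initial_pool_size = 20; /* QSV uses fixed-size frame pools */
    av_hwframe_ctx_init(hw_frames_ref);

    av_context->hw_frames_ctx = av_buffer_ref(hw_frames_ref);       /* step 6 */

    AVFrame *hw_frame = av_frame_alloc();                           /* step 7 */
    AVFrame *sw_frame = av_frame_alloc();

    sw_frame->width = width;                                        /* step 8 */
    sw_frame->height = height;
    sw_frame->format = AV_PIX_FMT_YUV420P;
    av_frame_get_buffer(sw_frame, 0);
    /* Copy the application's Y, U and V planes into sw_frame->data[0..2] here. */

    av_hwframe_get_buffer(av_context->hw_frames_ctx, hw_frame, 0);  /* step 9 */
    avcodec_open2(av_context, codec, NULL);                         /* step 10 */

    av_hwframe_transfer_data(hw_frame, sw_frame, 0);                /* step 11.1 */
    AVPacket packet;                                                /* step 11.2 */
    av_init_packet(&packet);
    avcodec_send_frame(av_context, hw_frame);                       /* step 11.3 */
    avcodec_receive_packet(av_context, &packet);                    /* step 11.4 */
    fwrite(packet.data, packet.size, 1, output);                    /* step 11.5 */
    av_packet_unref(&packet);                                       /* step 11.6 */

    avcodec_free_context(&av_context);                              /* step 12 */
    av_frame_free(&sw_frame);
    av_frame_free(&hw_frame);
    av_buffer_unref(&hw_frames_ref);
    av_buffer_unref(&hw_device_ctx);
    return 0;
}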

Hardware-accelerated video encoding on Windows does not have a corresponding FFmpeg example, but vaapi_encode.c, which is designed for the Linux family of operating systems, can be easily adapted by changing the encoder name and the hardware pixel format used.
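Assuming the variable names used in FFmpeg's vaapi_encode.c example (verify against your FFmpeg version), the required changes amount to roughly the following substitutions:

/* In vaapi_encode.c, switch the encoder and hardware types from VAAPI to QSV: */
codec = avcodec_find_encoder_by_name("h264_qsv");   /* was "h264_vaapi" */
av_hwdevice_ctx_create(&hw_device_ctx, AV_HWDEVICE_TYPE_QSV, NULL, NULL, 0);
                                                    /* was AV_HWDEVICE_TYPE_VAAPI */
frames_ctx->format = AV_PIX_FMT_QSV;                /* was AV_PIX_FMT_VAAPI */
avctx->pix_fmt = AV_PIX_FMT_QSV;                    /* was AV_PIX_FMT_VAAPI */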

3D Streaming Toolkit: hardware video processing on Intel GPUs. Changes and results.

All of the libavcodec hardware video encoding and decoding steps on Intel GPUs described above were contributed by Intel to the open source 3D Streaming Toolkit code. Specifically, the h264_decoder_impl.cc and h264_encoder_impl.cc files of the Microsoft 3D Streaming Toolkit's h264 WebRTC video codec module were modified accordingly.

The results were then evaluated on Intel Gen 9 GPU-based systems (using the DirectX-based SpinningCubeServer-v2.0 sample of the 3DStreamingToolkit for video encoding and the DirectX-NativeClient-v2.0 sample for decoding) and compared with the results of the original 3DStreamingToolkit.

The original results of the 3DStreamingToolkit on Intel GPU-based systems, using the openh264 library for software video encoding and the libavcodec library for h264 software video decoding, are as follows: while the encoder-decoder pair reaches more than 60 FPS at 1280 x 720 resolution, the video quality is very low, as the picture below shows, and CPU usage is around 30%:

[Screenshot: software encoding at low quality, >60 FPS, ~30% CPU load]

Programmatically setting the highest encoding quality for openh264 increases the CPU load to 100%, and the FPS drops:

[Screenshot: openh264 at the highest quality setting, ~100% CPU load, reduced FPS]

With the newly implemented libavcodec-based hardware video encoding and decoding, FPS stays above 60 with high image quality, while CPU load is around 5%:

[Screenshot: hardware-accelerated encoding and decoding, >60 FPS, ~5% CPU load]

The following table summarizes the results obtained:

                                                   FPS    CPU load
Original FFmpeg implementation    Low quality:     >60    30%
                                  High quality:    ~40    100%
HW-based FFmpeg implementation    Low quality:     >60    5%
                                  High quality:    >60    5%

Conclusions

Hardware-accelerated video processing on Intel GPUs in native Windows applications via the libavcodec library is easy to implement; it provides significant benefits in overall media workload performance and image quality, reduces CPU load, and leaves headroom for other CPU tasks.
