How Machine Learning Is Changing Video Compression

Artificial intelligence is bringing new solutions for nearly every industry. Now, AI innovators are setting out to solve video compression issues.  

The video industry hasn’t changed significantly since the introduction of video encoding for broadcast, almost two decades ago. However, developments in AI are gearing up to change that. The increasing popularity of video content is pushing companies to create and upload high-quality videos constantly, but quality videos are heavy and tend to slow page load times. 

Effective video compression minimizes the bit rate while preserving image quality. Machine learning-enhanced algorithms overcome this challenge through a series of techniques, for example, intelligent motion estimation. This article provides an overview of video compression and how machine learning is improving its techniques. 

What Is Video Compression?

Video compression techniques and tools aim to reduce the size of a video by eliminating redundancies. The reduction in size results in lower bandwidth requirements and smaller storage needs. The tools that compress the video files are called video codecs.

Video codecs

Codec, which stands for “coder-decoder,” is software that applies algorithms to the video. The algorithm searches for redundancies and deletes them, thus reducing the size of the file. Redundant data can be, for example, repetitive images or a background that repeats throughout a video. 
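As an illustration of redundancy removal (a toy sketch, not a production codec), the snippet below applies run-length encoding to a scanline of pixel values, collapsing a repeated background value into a handful of (value, count) runs:

```python
def rle_encode(pixels):
    """Collapse runs of identical values into [value, count] pairs."""
    encoded = []
    for p in pixels:
        if encoded and encoded[-1][0] == p:
            encoded[-1][1] += 1          # extend the current run
        else:
            encoded.append([p, 1])       # start a new run
    return encoded

def rle_decode(encoded):
    """Expand [value, count] pairs back into the original values."""
    out = []
    for value, count in encoded:
        out.extend([value] * count)
    return out

# A scanline dominated by a repeated background value compresses well.
scanline = [255] * 90 + [12, 13, 14] + [255] * 7
encoded = rle_encode(scanline)
assert rle_decode(encoded) == scanline   # the round trip is exact
print(len(scanline), "values ->", len(encoded), "runs")  # -> 100 values -> 5 runs
```

Real codecs use far more sophisticated redundancy models, but the principle is the same: describe repeated data once instead of storing every copy.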

Compression changes the original format of the video into a format supported by the video player. The video codec determines the format of the video. 

Video compression methods

Compression can be lossy or lossless. Lossy compression permanently discards data that is considered less important. This method allows you to compress the file to much smaller sizes. However, because the discarded data cannot be recovered, quality suffers when the file is decoded. Lossless compression, on the other hand, eliminates only redundant data, without affecting quality. Unfortunately, lossless compression does not allow for drastically reducing the file size. 
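The trade-off can be sketched with Python’s standard zlib module on synthetic noisy data (an illustration, not a video codec): lossless compression round-trips exactly but barely shrinks noisy input, while quantizing the data first, a simple stand-in for lossy compression, permanently discards detail in exchange for a much smaller result:

```python
import random
import zlib

# Synthetic noisy bytes standing in for hard-to-compress raw media data.
random.seed(0)
data = bytes(random.randrange(256) for _ in range(16384))

# Lossless: the original is recovered exactly, but noise barely shrinks.
lossless = zlib.compress(data, level=9)
assert zlib.decompress(lossless) == data

# "Lossy" stand-in: quantize each byte to 16 levels, then compress.
# The discarded low bits are gone for good, but the output is much smaller.
quantized = bytes(b & 0xF0 for b in data)
lossy = zlib.compress(quantized, level=9)

assert len(lossy) < len(lossless)        # smaller file...
assert zlib.decompress(lossy) != data    # ...but the original detail is lost
print("original:", len(data), "lossless:", len(lossless), "lossy:", len(lossy))
```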

Video compression can be done according to two approaches: intra-frame and inter-frame. Intra-frame means that the codec compresses each frame independently, using only the data within that frame. Inter-frame compression means the codec eliminates redundant data across successive video frames. 
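A minimal sketch of the difference (using zlib as a stand-in compressor on tiny made-up frames): intra-frame compresses each frame on its own, while inter-frame stores the first frame plus only the per-pixel difference to the next one, which is nearly all zeros when little changes between frames:

```python
import zlib

# Two tiny 8x8 "frames"; frame2 differs from frame1 by a single pixel.
frame1 = [[(x + y) % 16 for x in range(8)] for y in range(8)]
frame2 = [row[:] for row in frame1]
frame2[3][4] = 15

flat1 = bytes(v for row in frame1 for v in row)
flat2 = bytes(v for row in frame2 for v in row)

# Intra-frame: compress each frame independently.
intra = len(zlib.compress(flat1)) + len(zlib.compress(flat2))

# Inter-frame: keep frame1, then store only the difference to frame2.
delta = bytes((b - a) % 256 for a, b in zip(flat1, flat2))
inter = len(zlib.compress(flat1)) + len(zlib.compress(delta))

# The decoder rebuilds frame2 exactly from frame1 plus the delta.
rebuilt = bytes((a + d) % 256 for a, d in zip(flat1, delta))
assert rebuilt == flat2

print("intra-frame bytes:", intra, "inter-frame bytes:", inter)
```

Real inter-frame codecs go further, using motion estimation to predict where blocks moved between frames rather than differencing pixels in place.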

These differences matter when it comes to choosing video codec standards. For example, the Motion JPEG (M-JPEG) standard uses intra-frame compression, whereas the Moving Picture Experts Group (MPEG) standards use inter-frame compression.

Some popular compression standards are:

  • M-JPEG—this standard has low CPU utilization. However, it degrades the quality of frames with many details or textures. 
  • MPEG—good for video streaming, as it is compatible with all major video players. However, the compression is not as efficient for complex frames, and CPU usage is high. 
  • H.264—this standard is the most efficient for video streaming, and it’s compatible with all major video players. The downside is that it has the highest CPU utilization. 

Video formats

Video codecs should not be confused with video formats. The format is the extension you see after the file name, for example, .mp4 or .mov. Videos are packaged into data containers called wrapper formats. These formats contain the information required to play the video, including the audio, images, and metadata. The most common video formats are .mov, .mp4, and .mpeg. The format needs to be compatible with the browser or video player. 

Bit rates

Another important concept in video compression is bit rate, the number of bits per second transmitted over the Internet at a given time. A bit is the basic unit of information representing the data in an audio or video file. Bit rates are used as a measure of quality: the higher the bit rate of a file, the higher its quality. 
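Bit rate also translates directly into file size and bandwidth. A quick back-of-the-envelope calculation (size = bit rate × duration, converted from bits to bytes):

```python
def stream_size_mb(bitrate_kbps, duration_s):
    """Approximate file size from bit rate (kilobits/second) and duration."""
    bits = bitrate_kbps * 1000 * duration_s
    return bits / 8 / 1_000_000   # bits -> bytes -> megabytes

# A 10-minute video at 5,000 kbps, a typical 1080p streaming rate:
print(stream_size_mb(5000, 600), "MB")  # -> 375.0 MB
```

This is why halving the bit rate at the same perceived quality, the goal of better compression, halves both storage and bandwidth costs.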

Artificial Intelligence Concepts: Machine Learning vs Deep Learning

Machine learning and deep learning are subsets of artificial intelligence. Below are two simple explanations of the terms. 


  • Machine learning—a technique that assists a machine in learning from structured data without human intervention.
  • Deep learning—a subset of machine learning, where the algorithms function in layers that each provide a different interpretation of the source data. This network of algorithms, called an Artificial Neural Network (ANN), is inspired by the neural networks in the human brain. 

Machine learning is the most commonly used technique in the first generation of AI-based video compression software. Innovators have started applying deep learning techniques to improve AI-based video compression. For example, Convolutional Neural Networks are being used to improve video compression, especially for video streaming. 

Machine Learning Algorithms for Video Compression

Machine learning algorithms can be classified into three categories: supervised, unsupervised, and reinforcement learning. Supervised learning means that the algorithm learns from labeled data, much like a student learns from a teacher: a human supervises the learning process. 

In supervised learning, the machine learns a function that maps an input variable to an output variable, inferring the mapping from example input-output pairs. Supervised learning algorithms are the ones used for video encoding and compression. 
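A minimal sketch of that idea in pure Python, using made-up numbers rather than real encoder measurements: fit a linear map from example input-output pairs, here a hypothetical frame-complexity score mapped to the bits an encoder needed for that frame:

```python
# Hypothetical example pairs: frame complexity score -> bits the encoder used.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [12.0, 19.0, 31.0, 42.0, 48.0]

# Learn the mapping y ~ w*x + b by ordinary least squares.
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
w = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
b = my - w * mx

print(f"learned mapping: bits ~ {w:.1f} * complexity + {b:.1f}")
print("predicted bits for an unseen complexity of 6:", round(w * 6 + b, 1))
```

Once the mapping is learned from examples, the model can predict outputs for inputs it has never seen, which is exactly how an ML-assisted encoder can choose settings for a new frame without trying every option.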

Artificial intelligence is present in modern video compression tools. These software solutions feature machine learning techniques that automate the compression and formatting of videos. This enables you to compress videos during upload. You can also compress the videos after uploading them when delivering to users.

Benefits of Machine Learning for Video Compression

Traditional video compression requires a sizeable amount of skill, time, and effort. Applying machine learning addresses these challenges by automating the processes. Other benefits of machine learning include:

  • Development savings—video codecs are complex algorithms, and coding one may take months of work. Machine learning shortens the process, since the software learns and adapts the algorithms with little human intervention.
  • Improved encoder density—some ML implementations require less computing power than traditional algorithms. Many ML algorithms run on Graphics Processing Units (GPUs) instead of Central Processing Units (CPUs). GPUs have many smaller logical cores, which enables them to process simple computations in parallel. 

AI-based video compression trends

  • Customization—today’s compression tools are generic, but as compression algorithms evolve at a fast pace, it is likely that in the near future they can be customized. 
  • Autoencoders—applying neural networks to video compression has led to the development of autoencoders: ANNs that learn a compressed representation of an input and then reconstruct the input from it, without supervision. This is especially useful for the field of video compression.
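To make the autoencoder idea concrete, here is a minimal linear autoencoder in NumPy (an illustrative toy, not a video codec): it learns, with no labels, to squeeze 16-dimensional "frames" through a 2-dimensional bottleneck and reconstruct them, its training target being the input itself:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "frames": 16-dim vectors that secretly lie on a 2-dim subspace,
# standing in for frames that share a lot of structure.
latent = rng.normal(size=(200, 2))
basis = rng.normal(size=(2, 16))
frames = latent @ basis

# Linear autoencoder: encode 16 -> 2 (compress), decode 2 -> 16 (reconstruct).
W_enc = rng.normal(scale=0.1, size=(16, 2))
W_dec = rng.normal(scale=0.1, size=(2, 16))

def mse():
    recon = frames @ W_enc @ W_dec
    return float(np.mean((recon - frames) ** 2))

mse_before = mse()
lr = 0.01
for _ in range(500):
    code = frames @ W_enc                   # compressed representation
    err = code @ W_dec - frames             # reconstruction error
    grad_dec = code.T @ err / len(frames)   # gradient of squared error
    grad_enc = frames.T @ (err @ W_dec.T) / len(frames)
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc

mse_after = mse()
assert mse_after < mse_before   # it learned to reconstruct its own input
print(f"reconstruction MSE: {mse_before:.3f} -> {mse_after:.3f}")
```

Deep-learning compressors replace the linear maps with deep convolutional encoders and decoders and add an entropy model over the bottleneck, but the compress-then-reconstruct structure is the same.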

Wrap Up

Video compression technology is accelerating its development thanks to machine learning algorithms. As video streaming becomes the norm, and the number of videos online grows exponentially, the need for AI-based compression increases. 

For a long time, machine learning has been the basis of AI-based compression. Now that deep learning has taken off, we’re seeing more advanced AI-based compression. Algorithms with neural networks are set to help video compression technology reach a new and improved level. 


Article update (March 9, 2020): Sections of this article are attributed to “AI Technology is Changing the Future of Video Compression,” written by Jean Louis Diascorn and published at the 73rd Annual NAB Broadcast Engineering and Information Technology Conference.

Note: This is a guest post, and the opinion in this article is of the guest writer. If you have any issues with any of the articles posted at www.marktechpost.com please contact at asif@marktechpost.co

Gilad David Maayan is a technology writer who has worked with over 150 technology companies including SAP, Samsung NEXT, NetApp and Imperva, producing technical and thought leadership content that elucidates technical solutions for developers and IT leadership.
