Friday, 12 July 2013

A fantastic article from Learning Solutions Magazine on video compression.

Sign up with Learning Solutions Magazine (free) to receive excellent e-learning information like the video compression article below:

Video Compression Secrets: Smaller Files, Better Quality


by Stephen Haskin
July 8, 2013
Feature
Last year, YouTube reported that users were uploading 48 hours of video to the service every minute. That’s over 69,000 hours of video a day, every day. Petabytes of video, every day (a petabyte is a million gigabytes). This only works because YouTube compresses all that video during upload. But you aren’t dealing in petabytes of video, so why does this matter to you?
Most of us already use, or are beginning to use, video in our eLearning. If you’re in the “most of us” category, you are compressing your video whether you know it or not. Some of us can host video on our own servers rather than on YouTube, Vimeo, or some other video service, but we all have file-size limitations and bandwidth issues.
How you compress your video and what format you compress it to makes a difference. But what do most of us know about compressing video beyond running it through the Adobe Media Encoder, a standalone program like Sorenson Squeeze, or one of the several other video compressors (from DivX, Microsoft, Apple, etc.)? Does the file type you create matter? Yes. Does the file size matter? Yes. Let’s talk about file size first.

What affects file size?

Any timeline-based project (Figure 1) must be compressed before you can put it on YouTube, Vimeo, your own servers, or wherever you need to store and access it. You can stream an uncompressed video file (.avi), but it’s not easy: the bandwidth requirements are such that your IT department will not have anything nice to say about it.

Figure 1: Timeline sample

The video you upload to YouTube gets compressed the way YouTube wants to compress it, no matter how you compressed it originally. Although YouTube’s storage seemingly approaches infinity, the only file-size worry at upload time is how long a large file takes to transfer; it’s the streaming part that matters. YouTube, and a few of the other video services, also automatically deliver different video codecs to different platforms, such as iOS or Windows. If you’re delivering to multiple platforms, that’s worth a look.
Video file size depends on many variables: HD (High Definition) vs. SD (Standard Definition), frame rate, color depth, even the amount of movement in the video. There are three measurements that count for a lot in determining the final file size.
First is the actual pixel dimensions of the finished product (Figure 2). Standard-definition (SD) video is generally 720 X 480 pixels, which equals 345,600 total pixels per frame. Native HD video is 1,920 X 1,080 pixels, which equals 2,073,600 pixels. That’s a lot more pixels per frame to stream.

Figure 2: Standard video frame size (blue center area) vs. native HD video frame size (green outer area)
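
To make those numbers concrete, here is a quick Python sketch of the arithmetic; nothing in it goes beyond the figures quoted above.

# Compare pixels per frame for SD and HD video (matches the figures above).
sd_pixels = 720 * 480        # 345,600 pixels per SD frame
hd_pixels = 1920 * 1080      # 2,073,600 pixels per HD frame

print(f"SD frame: {sd_pixels:,} pixels")
print(f"HD frame: {hd_pixels:,} pixels")
print(f"HD has {hd_pixels / sd_pixels:.1f}x the pixels of SD")  # exactly 6.0x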

The second measurement is the length of the video. Whenever anything changes on the timeline you’re creating, the rendered file size changes. Change something else on your timeline and the resulting file size changes again. And so on.
The third measurement is frame rate. Most of the video we do is about 30 frames per second (fps). Cut the frame rate in half and the file size drops by roughly half as well (the sketch after the list below makes this concrete).
There are more variables, though. Here are some of them, including the three just mentioned:
  • Pixel dimensions
  • Frame rate (15, 24, 25, 30, or whatever)
  • Keyframe frequency (a keyframe is a frame where the entire image is recorded)
  • Progressive or interlaced frames
  • Constant or variable bitrate streaming
  • Buffer size
  • Audio sample rate
  • Render quality
There’s even more, if you want to dig into it. No wonder video quality can be so variable, and no wonder finishing your video for streaming can be so confusing.
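
To get a feel for how the first few of those variables multiply together, here is a rough back-of-the-envelope sketch. It assumes uncompressed video at 8 bits per color channel and three channels per pixel, which is a simplification for illustration, not how any real codec stores video:

def raw_data_rate(width, height, fps, bytes_per_pixel=3):
    """Uncompressed video data rate in megabytes per second."""
    return width * height * bytes_per_pixel * fps / 1_000_000

# SD at 30 fps vs. HD at 30 fps, and the effect of halving the frame rate.
print(f"SD 30 fps: {raw_data_rate(720, 480, 30):,.0f} MB/s")    # ~31 MB/s
print(f"HD 30 fps: {raw_data_rate(1920, 1080, 30):,.0f} MB/s")  # ~187 MB/s
print(f"HD 15 fps: {raw_data_rate(1920, 1080, 15):,.0f} MB/s")  # half of the above

Numbers like 187 MB per second for uncompressed HD are exactly why your IT department won’t say anything nice about streaming .avi files.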

An explanation of containers and codecs

When it comes to your rendered video, there are two kinds of software in play. The first is the delivery format: a container that the server sends to the browser. The viewer’s browser needs the correct decoder add-on installed to play the video. There are several of these containers, but most of the time we use the Flash (.flv or .swf), QuickTime (.mov), or Windows Media Video (.wmv) containers. There are more, to be sure, but I’m going to keep this explanation simple.
The second part of a video is the codec inside the container. A codec consists of two parts: the encoder, which you use while encoding the video from your timeline or file, and the decoder, which resides on the viewer’s computer as an add-on to decode the video inside the container. The codec is what actually encodes and compresses the video when you render it.
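
The tools in this article are all GUI encoders, but the container/codec split is easiest to see on a command line. As an illustration only: the free ffmpeg encoder (my example here, not one of the tools discussed in this article, and with hypothetical file names) makes you pick both explicitly.

import subprocess

# ffmpeg makes the container/codec distinction explicit: the output file
# extension picks the container, while -c:v and -c:a pick the video and
# audio codecs that go inside it. File names here are hypothetical.
subprocess.run([
    "ffmpeg",
    "-i", "timeline_export.avi",  # source: uncompressed render from the timeline
    "-c:v", "libx264",            # codec: H.264 video
    "-c:a", "aac",                # codec: AAC audio
    "output.mp4",                 # container: MP4
], check=True)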
A very common codec is H.264. It’s called a block codec because it looks for differences in blocks of video as it encodes and plays back. H.264 is the usual choice for HD video compression, though there is no reason it can’t compress standard-definition video; the same is true of all codecs. Most of my clients work in SD (NTSC) video to keep streaming rates lower. You should also note that if your finished video is going into an Articulate Studio or Storyline project, Articulate will not show HD video ... yet. The working dimensions for Articulate have been 640 X 480, a size that fills the template screen.
A second widely used codec is Windows Media. Its standardized form, VC-1, sits alongside H.264 as one of the standard codecs for Blu-ray discs. Really. The .wmv file is itself a container and can hold other codecs, but for purposes of this article I’m going to use .wmv because it is easy to encode from the timeline in Premiere Pro.
Conceptually speaking, compressing video is actually pretty simple. If you look at a video from one frame to the next, some pixels change and some are exactly the same as in the frame before. Why encode a pixel that’s exactly the same as the one in the frame before? That’s a waste of bytes. So the software essentially ignores the pixels that don’t change and encodes only the ones that do. If only it were that simple. There are very complex algorithms behind this, but the essence is the same.
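
Here is a toy sketch of that idea. Real codecs like H.264 work on motion-compensated blocks and frequency transforms, not raw per-pixel differences, so treat this purely as an illustration of “only encode what changed”:

import numpy as np

def encode_delta(prev_frame, frame):
    """Toy inter-frame step: keep only the pixels that changed."""
    changed = frame != prev_frame      # boolean mask of changed pixels
    return changed, frame[changed]     # positions + new values only

def decode_delta(prev_frame, changed, values):
    """Rebuild the frame from the previous frame plus the stored changes."""
    frame = prev_frame.copy()
    frame[changed] = values
    return frame

# Two mostly identical 8x8 grayscale frames: only 3 pixels change.
prev = np.zeros((8, 8), dtype=np.uint8)
curr = prev.copy()
curr[2, 3], curr[4, 4], curr[5, 1] = 200, 128, 64

mask, values = encode_delta(prev, curr)
print(f"Stored {values.size} of {curr.size} pixels")  # 3 of 64
assert np.array_equal(decode_delta(prev, mask, values), curr)

Note that the mask itself also costs space to store; real codecs spend a great deal of cleverness (entropy coding, block structure) on keeping that overhead small.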

Lossy vs. non-lossy codecs

Lossy means just what it sounds like: some of the data in a video is lost when it’s encoded. Can you see the loss? That depends on how good (or bad, in this case) your encoder is. There’s actually a lot of superfluous data in a frame of video, and if some of it gets lost, so much the better for file size, because you can’t see it anyway. Analog TV lost about 10% of the picture to scanning and over-scanning, and digital video doesn’t have any of the “safe area” problems the good old analog version had. In analog video, you had to keep your screen supers (superimposed titles) within the safe area, inset somewhat more than 10% from the edges of the picture. If you ran a super or other information to the edge of the screen, chances are that by the time it got to someone’s home it looked cut off … and it was!
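
As a toy example of what “lossy” means in practice: real codecs quantize frequency coefficients rather than raw pixels, but the principle, collapsing nearby values the eye can’t tell apart, looks like this:

import numpy as np

# Toy lossy step: quantize 8-bit pixel values to coarser levels.
pixels = np.array([18, 19, 21, 200, 203, 255], dtype=np.uint8)
step = 8
lossy = (pixels // step) * step   # nearby values collapse to the same level
print(pixels)  # [ 18  19  21 200 203 255]
print(lossy)   # [ 16  16  16 200 200 248]

The original values are gone for good, but runs of identical values compress far better, and at small step sizes the difference is invisible on screen.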

The problem with codecs and formats

The problem with all those codecs and formats is that there are just too many of them. When you look at the dropdown menu in the Adobe Media Encoder (Figure 3), you’ll be presented with no fewer than 26 media formats.

Figure 3: Adobe Media Encoder dropdown menu

And that’s 26 formats without the Sorenson presets included in the dropdown. Include them and there are over 40 formats to choose from, everything from uncompressed .avi to Windows Media .wmv. Depending on the format you select, you’ll either have just a few codecs and sizes to choose from (try .f4v, a Flash video format), or you’ll get a second dropdown menu with a bewildering number of codecs for encoding your project (Figure 4).

Figure 4: Codec formats

But wait … there’s more. Depending on the codec you select, there are other moving targets that will make your video file larger or smaller, give it better or worse temporal quality, or determine which devices and browsers can play it at all. This is where the real problem begins for delivery to multiple devices and browser versions. Most of us know that Flash won’t play on iOS, but .mp4 will when it’s encoded with H.264; that codec-and-format combination, however, won’t play in older versions of Firefox or Internet Explorer. So you can be forced into a lose-lose situation when you need to play your project across multiple devices and platforms.

Video cards

Video cards have had a great impact on video editing and rendering over the last few years. It used to be that your video card didn’t make much difference when it came to editing, encoding, and compressing video, but it sure does now. The newest programs from Sorenson and Adobe can make use of the GPU (Graphics Processing Unit, the chip on your video card) on many video cards.
A decent video card now has about two to four gigabytes of RAM. Three years ago, video cards weren’t used for much other than displaying video and rendering video games. The fact is, a GPU handles editing and rendering significantly faster than an Intel or AMD CPU, because it has far more cores, and those cores were designed for nothing but graphics work, unlike the general-purpose processors that power all the other stuff on our computers. More cores mean shorter render times. With the launch of Creative Suite 5 (CS5), Adobe released the Mercury Playback Engine, which renders video on the Premiere Pro timeline in real time (well, not exactly real time, but explaining that would take another article) by using the GPU to do the heavy lifting on timeline rendering. Adobe has had some issues with its Dynamic Link engine (automatically installed with the CS5 or CS6 suites) rendering some dynamic links with After Effects, but playback without rendering still works well.

Finally: the rendering process

There are two kinds of video rendering, and the kind you’re planning on or need to do will greatly influence the tool you use. The first (and simplest) is rendering from the timeline in Adobe Captivate, Premiere Pro, Vegas, Final Cut Pro, or whatever timeline-based program you’re using. This is direct rendering off the timeline, and you’re pretty much stuck with the encoding software that came with your video editing package. It works well for those who are placing their video in PowerPoint, Articulate, or the like.
When you need to render to multiple platforms, dimensions, and codecs, or you’re serving your own video directly, you’ll probably render from your timeline to an uncompressed .avi or .mov file first, and then render to the different formats using a tool like Sorenson Squeeze. There are several such products in the marketplace, and I don’t want to write reviews of them here, but with a separate rendering package you can set up a render “queue” with all the file types and dimensions you’ll need for your finished product. You can also do that with the Adobe Media Encoder, but a product like Squeeze has more options if you need them.
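
For readers who want to see the queue idea without buying Squeeze, here is a sketch of the same workflow using the free ffmpeg encoder driven from Python. The target list, codec choices, and file names are illustrative assumptions, not the presets discussed in this article:

import subprocess

# Illustrative render queue: one uncompressed master, several delivery targets.
master = "master.avi"  # hypothetical uncompressed export from the timeline
targets = [
    ("delivery_h264.mp4", ["-c:v", "libx264", "-c:a", "aac"]),      # iOS/Android
    ("delivery_flash.flv", ["-c:v", "flv", "-c:a", "libmp3lame"]),  # legacy Flash
    ("delivery_sd.mp4", ["-c:v", "libx264", "-s", "640x480",
                         "-c:a", "aac"]),                           # Articulate-sized
]

for outfile, codec_args in targets:
    subprocess.run(["ffmpeg", "-y", "-i", master, *codec_args, outfile],
                   check=True)
    print(f"Rendered {outfile}")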

Methodology

Table 1 shows you the results (size, quality, and time-to-render) that I got from a variety of format and codec combinations. Let me explain what I did before you look at the table.
After a lot of consideration, I simplified the process as much as I could. Call it Occam’s razor: simpler is better, and there are just too many variables. No matter what I did, your results will be somewhat different. Simplest is best … at least in this case.
The method: I used a 10-second video clip from a time-lapse project I did last year. The clip has some movement, though not constant movement; there are, however, a lot of light changes, and to a video encoder that is movement. I rendered as close to 30 fps as I could (some codecs only allowed 29.97 fps), with progressive scan. If your project doesn’t have much change or movement, 15 fps might be good enough, but to compare apples to apples I needed to pick something.
To do the rendering, I used a variety of formats and keyframe rates (not frames per second) so that there is some variation you might be able to key in on. There are so many usable formats that I limited it to five, with a two-keyframe interval where that was available as an option. I also limited the number of codecs; it gets silly after a while, and some of the formats can only render one codec. The temporal quality is purely a judgment call on my part: if the video looked substantially different when I watched it, I marked it down or up, and when there was blocky pixelation it really got marked down. The size speaks for itself.
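
If you want to reproduce the keyframe comparison outside Adobe’s tools, the keyframe interval maps to the GOP-size setting in most encoders. A hedged sketch using ffmpeg (my flag choices and file names, not the article’s setup):

import subprocess

# Render the same clip with two keyframe intervals to compare size/quality.
# -g sets the GOP size (frames between keyframes); file names are hypothetical.
for interval in (1, 30):
    subprocess.run([
        "ffmpeg", "-y", "-i", "clip_10s.avi",
        "-c:v", "libx264", "-g", str(interval),
        f"clip_keyframe_{interval}.mp4",
    ], check=True)

A keyframe on every frame stores each frame whole (large files, easy scrubbing); a 30-frame interval stores mostly deltas between keyframes (much smaller), which matches the size differences visible in Table 1.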
Table 1: Results from various combinations of format and codec
Format            | Codec                | Size     | Temporal quality       | Render time (min:sec)
.avi              | Uncompressed         | 272 MB   | Excellent              | 0:09.5
.avi              | NTSC DV              | 37 MB    | Excellent              | 0:09.1
.avi              | Custom               | 37 MB    | Excellent              | 0:07.6
.f4v              | Match Source         | 2.28 MB  | Fair                   | 0:07.6
.f4v              | Custom               | 7.74 MB  | Good                   | 0:19.4
H.264             | Custom (Android)     | 5.29 MB  | Good                   | 0:17.1
H.264             | Apple TV             | 4.22 MB  | Fair                   | 0:08.0
QuickTime         | H.264 (keyframe 1)   | 10.41 MB | Good                   | 0:06.8
QuickTime         | H.264 (keyframe 30)  | 4.22 MB  | Fair                   | 0:07.8
QuickTime         | mp4 (keyframe 1)     | 14.12 MB | Good                   | 0:09.1
QuickTime         | mp4 (keyframe 30)    | 2.40 MB  | Fair                   | 0:09.3
Sorenson Flash    | .flv (720)           | 2.22 MB  | Fair                   | 1:01.8
Squeeze mp4       | mp4                  | 4.12 MB  | Fair                   | 0:35.3
Squeeze QuickTime | Export JPEG          | 14.12 MB | Good                   | 0:15.8
Squeeze           | wmv 2k               | 1.12 MB  | Good to very good      | 0:53.7
Windows Media     | DV NTSC              | 2.7 MB   | Very good              | 0:19.5
Windows Media     | Custom               | 2.59 MB  | Very good              | 0:20.1
Windows Media     | Custom 30p           | 2.65 MB  | Very good to excellent | 0:11.2

Conclusions

There are 18 versions of the 10-second video in Table 1; there could easily have been over 100. It’s difficult to draw concrete conclusions and make recommendations, because everyone’s video projects, bandwidth allowances, needs, and quality requirements are different.
As a general conclusion, I would say that if you can use YouTube, by all means do so. Yes, you get the frame around the picture, but YouTube can adapt to the constantly changing video environment and deliver your content to different platforms. If you’re only encoding one or two videos at a time, Sorenson Squeeze is not worth the money; but if you have a training department and are encoding tens or dozens of videos at a time, it becomes a bargain. I didn’t get to test the Sorenson Server, but I’m sure it’s good: the idea of encoding on the fly when there are a lot of different formats and devices to deliver to makes a product like Squeeze an attractive alternative to rendering everything in advance.
The best codec for keeping file sizes small is Windows Media (.wmv). Microsoft has done an excellent job of combining small file size with very good temporal quality. The format isn’t much good for iOS devices, but if you can manage to get your projects onto YouTube (where you can make a video private or unlisted), you will be able to deliver everywhere. And that’s a plus, in my opinion.
Note that in some cases there are innumerable codecs and sizes for a given format. In the H.264 format (not the codec) there are 24 presets for Android phones and tablets and 21 presets for iDevices! Sometimes you’ve just got to choose one and let it go.
