GStreamer: state of the union
The annual GStreamer conference took place October 21-22 in Prague, (unofficially) co-located with the Embedded Linux Conference Europe. GStreamer is a library for connecting media elements, such as sources, encoders and decoders, filters, streaming endpoints, and output sinks of all sorts, into a fully customizable pipeline. Its features include cross-platform support, a large set of plugins, support for modern streaming and codec formats, and hardware acceleration. Kicking off this year's conference was Tim-Philipp Müller with his report on the last 12 months of development and what we can look forward to next.
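As a minimal illustration of the pipeline model (a sketch assuming GStreamer 1.x and its command-line tools are installed), a source, a converter, and a sink can be linked directly from the shell:

```shell
# Generate a test pattern, convert the video format as needed, and
# display it in a window using whatever video sink the platform provides.
gst-launch-1.0 videotestsrc ! videoconvert ! autovideosink
```

Each `!` links one element's source pad to the next element's sink pad, forming exactly the kind of element graph described above.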
The core team has been sticking to a roughly six-month release schedule, adjusted somewhat to fit other projects' timelines. The project is aiming to land the next release, 1.14, far enough ahead of the Ubuntu 18.04 long-term support version that developers will have a relatively recent version to base their work on for the next few years.
Project components
There is a system of categorization (worth a read for the amusing descriptions) for plugins based on the film The Good, the Bad and the Ugly. These plugins comprise most of the useful functionality of GStreamer. "gst-plugins-good" is made up of plugins with solid documentation, tests, and well-written code that should be used as examples for writing new plugins. "gst-plugins-ugly" is similar in quality to "good" but may pose distribution issues because of patents or licenses. "gst-plugins-bad" holds all the rest: plugins of varying quality that are perhaps poorly documented, not recommended for production use, or unsuitable as a basis for new plugins.
There is an ongoing mission in the GStreamer project to consolidate the platform by trying to promote plugins and pieces of code from the "bad" repository into "good" by fixing whatever may be at issue. There is now an effort to clearly document why a plugin remains in the "bad" category so that contributors know what needs to be fixed and maintainers can remember why a plugin was considered unfit and re-assess it at a future date.
Patents on the MP3 and AC-3 audio codecs have expired. The
mpg123 decoder and the lame MP3 encoder have been
moved to "good", though the GPL liba52-based
a52dec decoder for AC-3 must remain in "ugly". GStreamer
itself is released under the LGPL, so GPL plugins that could pose problems
for distributors wind up in "ugly".
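Which plugin provides a given element, and under what license, can be checked from the element's metadata; for example (a sketch assuming an installation that includes the lame plugin), gst-inspect-1.0 reports the containing plugin file and its license:

```shell
# Show the plugin details for the lamemp3enc element, including the
# "Filename" and "License" fields; after the patent expiry the mpg123
# and lame plugins ship in gst-plugins-good.
gst-inspect-1.0 lamemp3enc | grep -iE 'filename|license'
```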
Performance and multiprocess improvements
Ongoing efforts to improve parallelism have paid off; the operations
of video scaling and conversion are now multithreaded, and an upcoming
ipcpipeline element can be used to split pipelines across multiple
processes. Multiprocessing can be
used to isolate potentially dangerous demuxers, parsers, and decoders. This
is a concern for anyone parsing user input in the form of media
files, which is a longstanding source of application vulnerabilities.
There have been fruitful efforts to use DMABuf as an abstracted zero-copy
mechanism for passing buffers between sources, elements, and
sinks. Memory-allocation queries for the tee pipeline element are now
aggregated to enable zero-copy.
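To see where aggregated allocation queries matter, consider a tee fanning one stream out to two branches (a sketch using standard element names):

```shell
# Split a single test stream into two branches; with aggregated
# allocation queries, tee can negotiate buffers that both downstream
# branches are able to consume without intermediate copies.
gst-launch-1.0 videotestsrc ! tee name=t \
    t. ! queue ! autovideosink \
    t. ! queue ! fakesink
```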
High-speed playback of the DASH
HTTP adaptive streaming format, used by Chromecast among other devices, is
being enhanced. Playing a media file faster than the normal rate, such
as listening to a podcast at 2x speed, normally consumes more bandwidth
than regular playback. New support has landed to reduce those bandwidth
requirements for DASH.
Hardware-acceleration support continues to improve in the form of
better integration with the video acceleration
API (VA API). Encoders are now ranked so that hardware-accelerated
implementations can be preferred automatically. Support for
libva 2.0 (VA API 1.0) has been enhanced. All of the static-analysis
issues found by the
Coverity tool
were fixed. There is a new low-latency mode for H.264 decoding. Constant bit rate
for VP9, variable bit rate for H.265, and a "quality" parameter for
encoding are all now supported.
Other new features
There is now comprehensive support in the upstream gst-plugins-bad for using
Intel's Media SDK, which is offered for recent Intel chip platforms such as
Apollo Lake, Haswell, Broadwell, and Skylake. This enables hardware
acceleration for encoding and decoding common video formats, video
post-processing, and rendering. The goal is to make it easy for developers
to "use MSDK in their GStreamer-based applications with minimal knowledge
of the MSDK API".
Work
to allow x264enc (which uses the GPL-licensed libx264 encoder)
to be used with multiple bit depths was described as "very hard" because
the library's bit depth must be specified at compile time. Now multiple
versions of the library can be compiled and then selected at run time.
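With run-time selection in place, the bit depth is chosen through caps negotiation. A sketch (assuming a GStreamer build whose libx264 includes the 10-bit variant) would force a 10-bit format upstream of the encoder:

```shell
# Request 10-bit 4:2:0 video from the source; x264enc can then select
# the matching 10-bit build of libx264 at run time (this only works
# when that variant of the library was compiled in).
gst-launch-1.0 videotestsrc ! video/x-raw,format=I420_10LE ! \
    x264enc ! fakesink
```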
Timed Text Markup Language support has been added. This is part of the SMPTE closed-captioning standard for online video, and has potential to be a general intermediary representation for text subtitles and captions.
rtpbin has been
enhanced to accept and demultiplex bundled RTP streams for the
purpose of constructing a WebRTC pipeline. This greatly simplifies the
process of doing W3C-standards-based live video streaming and conferencing
in a web browser using GStreamer; bundling will soon be mandated by the
WebRTC specification. More on new WebRTC features below.
splitmuxsink has been rewritten
to be more deterministic about splitting output streams based on time or
size. Typical uses of this element involve segmenting recordings, such
as surveillance video, inside older container formats (e.g. classic
MP4) that cannot be split at arbitrary locations without properly
finalizing the file by writing out headers and tables and beginning a new
file.
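A typical invocation (a sketch using the element's standard properties; max-size-time is in nanoseconds) records a stream into a sequence of properly finalized MP4 files:

```shell
# Encode a test stream and split it into 10-second MP4 fragments, each
# with complete headers and tables; -e sends EOS on Ctrl-C so that the
# final file is finalized correctly rather than truncated.
gst-launch-1.0 -e videotestsrc ! x264enc ! h264parse ! \
    splitmuxsink location=segment%05d.mp4 max-size-time=10000000000
```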
In the upcoming release, the hlssink2 element will take
elementary streams as input and output chunks for HTTP Live Streaming (HLS),
making use of the splitmuxsink element internally.
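Based on that design, an hlssink2 pipeline would look something like the following sketch (the property names shown follow the existing hlssink element and may differ in the final release):

```shell
# Feed an H.264 elementary stream to hlssink2, which internally uses
# splitmuxsink to write MPEG-TS segments plus an m3u8 playlist.
gst-launch-1.0 -e videotestsrc ! x264enc ! h264parse ! \
    hlssink2 location=segment%05d.ts playlist-location=playlist.m3u8 \
    target-duration=5
```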
In addition, GstShark was demonstrated at the conference, which enables rich pipeline tracing and graphing capabilities. It is particularly useful for pinpointing causes of poor frame rates or high latency.
A casual mention was made of the fact that GStreamer now has the first implementation of the Real Time Streaming Protocol (RTSP) version 2.0, both for client and server. RTSP is in wide use for controlling live media streams from devices such as IP cameras.
There is interest in using the systems programming language Rust for GStreamer development to improve memory and thread safety and to use modern language features. In another talk, Sebastian Dröge described the current state of the GStreamer Rust bindings. They have been in development for quite some time and many people are actively developing with them. The bindings provide a mostly complete set of functionality for both application and plugin development, and no longer require the use of "unsafe" sections by users of the bindings. They are mostly auto-generated now via GObject introspection and have a native Rust feel while retaining the same API usage patterns familiar to anyone used to working with GStreamer.
Future work
Debugging improvements slated for the coming 1.14 release include debug-log ring-buffer storage, which is useful for keeping recent logs for long-running tasks or in disk-constrained environments, Müller said. A "new more reliable" leak tracer that is more thread-safe and supports snapshots and dumping a list of live objects is also planned. The leak tracer is currently very Unix-specific because it relies on Unix signals, so work is needed to come up with a suitable mechanism for Windows as well.
Plans of varying concreteness were mentioned for future work and improvements:
- Adaptive streaming with multiple bit rates could be improved for DASH and HLS.
- The internal stream-handling API could be implemented in more demuxers, along with better handling of stream deactivation.
- More native sink elements for output on Windows, iOS, and Android are needed along with native UI toolkits.
- Windows code needs to be updated to use newer APIs and legacy support for Windows XP should be dropped.
- Android, iOS, and macOS "need some love" to catch up with the latest versions.
- Support could be added for the ONVIF surveillance-camera interoperability standard, including features such as the audio back channel and special modes.
- Integration with the popular OpenCV computer-vision library could be improved.
- Support for virtual-reality formats could be added.
The conference kickoff concluded with an open question about how the project can better interact with users and contributors. The existing workflow of attaching patch files to Bugzilla entries may feel cumbersome to some contributors (such as the author) compared to modern pull-request workflows. There is a desire to move to an open-source solution such as GitLab, which would provide pull requests and help track sets of changes that span multiple repositories. GitHub was explicitly ruled out because of its proprietary nature, which is at odds with the project's free-software values. While there is already a mailing list and an excellent IRC channel, a "proper" forum may be coming soon as well, giving people a place to have discussions and post multimedia.
GStreamer and WebRTC
A hot topic right now, both at the recent IETF meeting and at the GStreamer conference, was WebRTC support; Müller mentioned that he was getting asked 30 times a day "how do I stream to my web browser?" WebRTC is a draft standard being developed by the W3C and IETF to enable live video and audio streaming in a web browser, something that until recently was only achievable in practice with Flash, or in limited server-side use with HLS, for cross-platform and browser compatibility. WebRTC makes peer-to-peer videoconferencing in a web browser possible, although it has other use cases, such as simplifying the streaming of live video to or from a web browser, and even telephony, as in several existing WebRTC-to-SIP gateways.
Development has been active and ongoing for rich WebRTC support in GStreamer. Matthew Waters came to Prague all the way from Australia to talk about building a selective-forwarding server that can support multi-party conferencing with GStreamer. WebRTC is a peer-to-peer protocol, so a multi-party conference without a server in the middle handling media streams quickly becomes prohibitively expensive, because each peer must stream to every other peer in a mesh-style network. Another possible design for multi-party WebRTC is a central mixing server, also called an MCU (Multipoint Control Unit), which moves most of the cost to the provider instead. A middle ground, where the server only forwards the media streams to the other peers (a Selective Forwarding Unit), is a good compromise for sharing the computational and bandwidth costs between the user and the provider.
Waters achieved this by creating a new element,
webrtcbin, that provides the necessary network transports
for a WebRTC session, such as DTLS and SRTP (encrypted datagram
formats for streaming media) and trickle ICE for network traversal and
shorter call-setup times, along with the trusty rtpbin element for RTP,
all wrapped up in an API similar to the W3C JavaScript PeerConnection API.
While it can be complicated to write a server that handles the many moving parts required to do WebRTC well, GStreamer makes it eminently practical to construct fully customized client and server applications with this relatively new protocol. As is frequently the case, the GStreamer project is not only at the forefront of emerging media technologies; its talented and dedicated community is also quick to provide examples and demonstrations of how to use new features in non-trivial applications, making it all look easy.
[Videos of this year's talks can be found online.]
| Index entries for this article | |
|---|---|
| GuestArticles | Spiegelmock, Mischa |
| Conference | GStreamer Conference/2017 |