If you’re a “beast mode” developer, you’re probably here because you said to yourself, “I want to build a real-time communication app without proprietary technologies.” If you’re a novice developer, you probably said, “I want a chat app, and I want it now.” Either way, you’re in the right place. I’m going to show you how to build FaceTime in an hour… OK, not really, but I will show you a quick and easy way to get a video chat application going. So let’s start this thing off right, with the all-important question that starts every blog: “What is ___?”
What is WebRTC?
WebRTC is one of many technologies looking to solve the problem of real-time communication in the browser. Under the hood, WebRTC is doing the same kinds of things Adobe did with RTMFP. The main difference is that WebRTC is not proprietary and doesn’t require a plugin.
The three APIs that really make WebRTC tick are:
- getUserMedia (camera and microphone access)
- PeerConnection (sending and receiving media)
- DataChannels (sending non-media data directly between clients)
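To give you a taste of the first of those APIs, here’s a browser-only sketch that requests the camera and microphone and pipes the live stream into a `<video>` tag. (Modern browsers expose this as `navigator.mediaDevices.getUserMedia`; the `<video>` selector is just an assumption about your page.)

```javascript
navigator.mediaDevices.getUserMedia({ video: true, audio: true })
  .then(function (stream) {
    // Attach the live camera stream to a <video> element on the page.
    document.querySelector('video').srcObject = stream;
  })
  .catch(function (err) {
    // The user denied permission, or no camera/mic is available.
    console.error('No permission or no device:', err);
  });
```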
Does this stuff work for native mobile applications?
So were you like me a few years ago, when that shiny new toy called WebRTC came to be? You got really excited, maybe shed a few tears of joy, but then you searched for… “WebRTC on mobile”. Yeah, my heart dropped too. Nothing. Today things are different. The core libraries Google uses for Chrome now allow for different build targets, including Objective-C and Java. What’s really awesome is that there are now build scripts out there that make compiling for these targets easier.
Will you just get to the code already?
Alright. Alright. Enough small talk. Let’s get to some code.
The first thing we need to do is set up our HTML.
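Here’s a minimal sketch of that markup. (The `localVideo` and `remotes` ids are just my picks; any ids work as long as they match the SimpleWebRTC configuration you’ll see in a moment.)

```html
<!-- Your own camera feed renders here -->
<video id="localVideo" autoplay muted></video>

<!-- SimpleWebRTC appends a <video> tag here for each connected peer -->
<div id="remotes"></div>
```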
As you can see, we use a `<video>` tag for the stream coming from your desktop computer’s camera. In addition, you can see the remote-video container. This container will be used to display a `<video>` tag for the stream coming from your peer’s device, in our case an iPhone camera. SimpleWebRTC does a little magic for us to populate this container every time a new peer connects to your chat room.
We first need to initialize SimpleWebRTC. We provide it with a few configuration parameters.
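Assuming you’ve loaded the simplewebrtc.js script on the page, the initialization looks like this. (The element ids and the signal server URL are placeholders; swap in your own.)

```javascript
var webrtc = new SimpleWebRTC({
  localVideoEl: 'localVideo',    // id of the <video> tag for your camera
  remoteVideosEl: 'remotes',     // id of the container for peers' streams
  autoRequestMedia: true,        // ask for camera/mic access right away
  url: 'https://your-signalserver.example.com' // your signal server
});
```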
So SimpleWebRTC needs a reference to the target elements. The `localVideoEl` parameter is the id of the `<video>` tag we created earlier in our HTML. Next you see the `remoteVideosEl` parameter, which is the id of the container that will hold the video stream from our peer’s device. The `autoRequestMedia` flag dictates whether you want to automatically request permission to get the user’s media data (camera and microphone). Setting this flag to `true` forces a call to a wrapper method that will at some point call `getUserMedia`, which was one of the three important APIs I talked about earlier. The last parameter, `url`, is the URL of the signal server (we’ll talk about this later).
Now we are ready to roll, or, since we are talking about SimpleWebRTC, ready to simply roll.
At this point I had a big Kool-Aid smile on my face, because I knew it was officially ON! So let’s get to chatting.
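With the SimpleWebRTC instance we just configured, joining a room is one call. (The room name here is hypothetical; `readyToCall` is the event SimpleWebRTC fires once your local media is ready.)

```javascript
// Once local media is ready, join (or create) a chat room.
webrtc.on('readyToCall', function () {
  webrtc.joinRoom('my-chat-room'); // hypothetical room name -- pick your own
});
```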
Calling `joinRoom` creates an Offer SDP (Session Description Protocol). That Offer is sent to your signal server, which forwards it to the peers in the chat room. Those peers then return what is called an Answer SDP. This passing of metadata is what allows you to communicate directly with other peers in a WebRTC application.
The mobile client is very similar to the web client. This is where we will be using Google’s core WebRTC library, compiled using the Objective-C target. The great thing about SimpleWebRTC is that it comes with its own build of the WebRTC library, so we can be lazy and not do it ourselves.
Just like the web client, we need to configure SimpleWebRTC, establish a connection to the signal server, and then join a room. Under the hood everything is still the same: we request permission to get the user’s media data and connect to our peers through the Offer/Answer SDP exchange.
The signal server is the middleman. It handles transferring configuration metadata from one peer to another. One peer passes its Offer SDP to the signal server, the signal server passes that along to another peer, an Answer SDP comes back, and now you are ready to communicate directly with the other peer (over a UDP/TCP connection). After that, the signal server is no longer needed for the two peers to talk. Now, we could get into STUN/TURN servers, but I think we should save that for the real nerds. If you really want to know more about signal, STUN, and TURN servers, then check this out.
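To make the middleman role concrete, here’s a toy in-memory relay that mimics the Offer/Answer handoff described above. This is purely illustrative — real signal servers (like the one SimpleWebRTC talks to) relay these messages over WebSockets, and all the names here are made up.

```javascript
// Toy in-memory signal "server": it only relays metadata between peers.
// Once the Offer/Answer exchange is done, media flows peer-to-peer.
function SignalServer() {
  this.peers = {};
}
SignalServer.prototype.join = function (id, onMessage) {
  // Register a peer and the callback that receives relayed messages.
  this.peers[id] = onMessage;
};
SignalServer.prototype.send = function (from, to, message) {
  // Relay the metadata to the target peer.
  this.peers[to]({ from: from, payload: message });
};

// Usage: the laptop offers, the phone answers.
var server = new SignalServer();
var log = [];
server.join('laptop', function (msg) { log.push(msg); });
server.join('phone', function (msg) {
  log.push(msg);
  // On receiving an Offer, reply with an Answer through the server.
  server.send('phone', 'laptop', { type: 'answer', sdp: '...' });
});
server.send('laptop', 'phone', { type: 'offer', sdp: '...' });
// log now holds the offer (at the phone) and the answer (at the laptop)
```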
So what’s next?
Dude. Nothing is next. We basically have FaceTime right in front of us. Video chatting at its finest. But maybe you’re an overachiever. You want more, right? Well I’ll tell you what: try adding messaging to the application, or peer-to-peer file sharing. Maybe you’ll go big and start thinking about a large-scale project utilizing WebRTC, like realtime multiplayer gaming. The sky’s the limit!