My Rant: Who Are We Building RTCWEB/WebRTC For? Telephony Developers or Web Developers?

Yesterday morning I did something I haven't done in eons. Many years, probably. (I can't remember.) I fired off a "rant" on an IETF mailing list.

I've been a huge proponent of the "RTCWEB/WebRTC" work going on in the "RTCWEB" Working Group of the IETF and the "WebRTC" Working Group of the W3C. I've mentioned it in many of my presentations. I've advocated for people to join the mailing lists. I wrote about it a good bit on Voxeo's standards blog when I was at Voxeo.

We have an opportunity to make it easy for web developers to add "real-time communications" via voice, video, IM, etc., to web applications. We can make that work from directly within the browser.

Think of it... HTML5 with the ability to quickly add voice, video, chat... and without the need for a browser plugin or extension in Flash, Java, etc. (the limitation of all of today's proprietary options).

It's the opportunity to move real-time communications into the very fabric of the Web.

Awesome potential!

The work has been moving along quite rapidly in both the IETF and the W3C. Extremely active (high-volume!) mailing lists. Many Internet-Draft documents being created. Regular conference calls, interim meetings, face-to-face meetings. Some truly brilliant - and passionate - people involved. (Read the RTCWEB overview Internet-Draft for more background.)

But I still can't escape the feeling that the direction isn't quite right... as a friend said to me:

my feeling is that this is being approached as SIP 5.6.2 - a minor tweak on an established standard - not WebRTC 0.9 - a new dawn in a new world.

I honestly haven't had the time to read all the messages given the crazy amount of traffic (which is good - it shows people are passionate about the topic!), but the messages I have read have left me increasingly frustrated that we're collectively in the midst of developing something that few developers will actually use.

So I ranted. Will it do any good? Maybe. Maybe not.

What the rant really needs is to be backed up by people who have the time to join in the process and contribute suggestions for what an RTC API that would appeal to "web developers" would look like.

Care to help out? The mailing list is open to anyone to join.

Anyway here's the rant... (and yes, for the truly pedantic, I am very aware that I ended with a </rant> but did not start with a <rant>)...


I need to rant. I've been lurking on this list from the beginning but with a new job I haven't been able to really keep up with the volume of messages... and every time I get ready to reply I find that people like Hadriel, Tim, Neil, Tolga or others have made the points I was going to make...

But I find myself increasingly frustrated with the ongoing discussions and want to ask a fundamental question:

Who are we building RTCWEB for?

Is it for:

1. Telephony developers who are tired of writing code in traditional languages and want to do things in new web ways;

2. Web developers who want to add real-time comms (as in voice, video and chat) to their existing or new web applications;

3. Both 1 and 2.

If the answer is #1, then I think everything is going along just wonderfully. We can go ahead and use the SIP/SDP/etc. stuff that all of us in the RAI area are used to and understand just fine. Heck, let's just all end the discussions about a signalling protocol and agree on SIP... get the browser vendors to agree on baking a SIP UA into their browsers... and call it a day and go have a beer. Simple. Easy. Done.

And the only people who will ever use it will be people who work for RTC/UC/VoIP vendors and random other programmers who actually care about telephony, etc.

But that's okay, because the people who do use it (and their employers) will be really happy and life will be good.

If the answer is #2, then I think we need to step back and ask -


Here's the thing... in my experience...

99% of web developers don't care about telephony.

Never have. Never will. (In fact, I may be understating that. It may actually be 99.99999%.)

If they are with startups, they want to build nice bright shiny objects that people will chase and use. They want to make the next Twitter or FourSquare or (pick your cool service that everyone salivates over). If they are with more established companies, they want to create easy-to-use interfaces that expose data or information in new and interesting ways or allow users to interact with their web apps in new and useful ways.

And they want to do all this using the "languages of the web"... JavaScript, PHP, Ruby, Python, etc.

They want "easily consumable" APIs where they can just look at a web page of documentation and understand in a few minutes how they can add functionality to their app using simple REST calls or adding snippets of code to their web page. Their interaction with telephony is more along these lines:

"Wow, dude, all I have to do is get an authorization token and curl this URL with my token and a phone number and I can create a phone call!"
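To make that concrete, here is a rough sketch in Python of what such an "easily consumable" call might look like from the developer's side. Everything specific in it is an assumption made up for illustration: the endpoint URL, the bearer-token scheme, and the `to` field in the JSON body do not come from any particular vendor's API. The sketch only builds the request and does not send it.

```python
import json
import urllib.request

# Hypothetical endpoint, invented for illustration; not a real service.
API_URL = "https://api.example.com/v1/calls"


def build_call_request(token: str, phone_number: str) -> urllib.request.Request:
    """Build (but do not send) a POST request asking the service to place a call."""
    # Assumed request body: a JSON object with the number to dial.
    payload = json.dumps({"to": phone_number}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        method="POST",
        headers={
            # Assumed auth scheme: the token is passed as a bearer token.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    req = build_call_request("my-token", "+15555550100")
    print(req.get_method(), req.full_url)
    # Actually placing the call would be one more line:
    # urllib.request.urlopen(req)
```

A token, a URL and a phone number: that is roughly the level of effort a web developer is used to.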

And the thing is... they can do this **TODAY** with existing proprietary products and services. You can code it all up in Flex/Flash. You can write it in Java. You can use Voxeo's Phono. You could probably do it in Microsoft's Silverlight. I seem to recall Twilio having a web browser client. A bunch of the carriers/operators are starting to offer their own ways of doing this. On any given week there are probably a dozen new startups out there with their own ideas for a new proprietary, locked-in way of doing RTC via web browsers.

Web developers don't *NEED* this RTCWEB/WebRTC work to do real-time communications between browsers.

It can be done today. Now.

The drawback is that today you need to have some kind of applet/plugin/extension downloaded to the browser to allow access to the mic and speakers and make the RTC actually work. So you have to use some Flash or Java or something. AND... you are locked into some particular vendor's way of doing things and are reliant on that vendor being around.

THAT is what RTCWEB can overcome. Make it so that web developers can easily add RTC to their web apps without requiring any downloads, etc. Make it do-able in open standards that don't lock developers in to a specific product or vendor.

But if we are targeting "web developers", that is who we need to satisfy... and we need to understand that they *already* have ways to do what we are allowing them to do.

If we come out with something that is so "different" from what "web developers" are used to... that requires someone to, for instance, understand all of what SIP is about... that requires a whole bunch of lines of code, etc.... well...

... the web developers out there will NOT launch an "Occupy RTCWEB" movement claiming that they are the "99% who don't care about telephony"... they will simply... not... use.... RTCWEB!

They will continue to use proprietary products and services because those work in the ways that web developers are used to and they make it simple for a web developer to go add voice, video and chat to a web app. Sure... they will still require the dreaded plugin/extension, but so be it... the "open standard" way is far too complicated for them to look at.

And all the work and the zillions of hours of writing emails and I-Ds that this group has done will all be for nothing. Well, not nothing... some of the telephony-centric developers will use them. But the majority of the web developers out there may not because there are other simpler, easier ways to do what they need to do.

So I go back to the question - who are we building RTCWEB for?

Is the goal to enable the zillions of web developers out there to use real-time communications in new and innovative ways? Or is it solely to make it so that VoIP/UC/RTC vendors can make a softphone in the browser that calls into their call center software?

RTCWEB *can* enable both... but to me it's a question of where the priority is.

The question is - will the RTCWEB/WebRTC API/protocol/whatever be so simple and easy that web developers will choose to use it over Flash/Phono/Twilio/Java/whatever to add RTC functions to their web apps?

If the answer is yes, we win. Open standards win. Maybe we upgrade from having a beer to having champagne.

If the answer is no, what are we spending all this time for?



NOTE: And, as I suppose must be the case with any good rant, mine was not entirely accurate. As multiple people pointed out (one example), my ending where I ask about whether people would choose the RTCWEB/WebRTC API over Flash/Phono/Twilio/Java/whatever is not entirely on target. The question is really... will vendors creating libraries like Phono choose to use the RTCWEB/WebRTC protocols/APIs or will they continue to use their own proprietary solutions?

As people pointed out, there will be a hundred different JavaScript libraries created (like Phono) that will consume the RTCWEB work... and most web developers will use those libraries and not program directly with the RTCWEB/WebRTC APIs and protocols.

Fair enough... but the question remains - will the RTCWEB work make it easy enough for all those JavaScript libraries to blossom?

Others pointed out that I'm really talking about the web API that would be exposed via the W3C's work versus the low-level API coming out of the IETF. And yes, that is perhaps technically true... but the reality is that it is the same set of people working in two different mailing lists... and both efforts are contributing to the end result.

In the end, I want to see a result of all this RTCWEB/WebRTC work that developers will actually deploy and use!

I want open standards to win.
