Thursday, May 3, 2007

VOIP: What is CCXML?

What is CCXML?

CCXML - or Call Control XML - is the W3C standard markup language for controlling how phone calls are placed, answered, transferred, conferenced, and more. CCXML works hand-in-hand with VoiceXML to provide a 100% standards-based, XML-based solution for any telephony application.

Voxeo supports CCXML in all of our VoiceCenter and VoipCenter products. Voxeo also sells an OEM C++ source code CCXML engine. Our CCXML products have routed over 1 billion calls since commercial introduction in 2002.

Below you will find a great introduction to CCXML by RJ Auburn, Voxeo's CTO - who is also the editor and chair of the CCXML standard. In addition, you'll find a CCXML executive briefing that details the benefits of CCXML. Finally, you can review Voxeo's complete CCXML reference guide and tutorials here.

Introduction to CCXML

RJ Auburn
Chief Technology Officer, Voxeo Corporation
Editor and Chair, W3C CCXML working group

As editor of the new CCXML specification I am proud to report that the W3C has just released the first working draft of CCXML, a language to provide Call Control support for telephone applications.

What is Call Control?

Newton's Telecom Dictionary, 16th edition, defines Call Control as "the term used by the telephone industry to describe setting up, monitoring, and tearing down telephone calls". Fundamentally, Call Control is session control for telephone calls.

What is CCXML?

Traditionally, Call Control has required interaction with and understanding of telephony APIs, which often change from one platform to another.

CCXML is the "Call Control eXtensible Markup Language". It is an XML-based language that can control the setup, monitoring, and teardown of phone calls. CCXML allows the industry to leverage the strength of Web platforms and technologies to intelligently control calls on and off the telephone network. Additionally, CCXML will create a high-level industry standard for Call Control that can run over any telephony platform.

Why not build Call Control into VoiceXML?

VoiceXML was never designed to support advanced Call Control features - it was designed to be a dialog control language and it does that quite well. While VoiceXML does support basic Call Control features via the <transfer> tag, those features are too basic for many telephony applications. Because of this limitation, vendors looked at adding Call Control to VoiceXML using several different approaches. Some companies added robust Call Control support by extending VoiceXML, while others created their own Call Control languages, such as CallXML from Voxeo. While these solutions provided enhanced Call Control support, they were incompatible with each other and did not address all of the Call Control scenarios reviewed by the W3C.

VoiceXML controls the presentation of media inside of calls. It uses a model based on forms and transactions that occur in a linear fashion - a model that works very well for user-driven voice interfaces. Call Control uses a model based on events and commands that can occur at any time. The fundamental differences between these models made it very difficult - if not impossible - to deliver robust Call Control inside of VoiceXML itself.

Additionally, many W3C members and telephone industry leaders wanted a language that could be used outside of VoiceXML itself. While only some phone calls require automated voice interaction, every phone call requires Call Control. As a result, CCXML could end up being used and supported by everything from PBXs to the telephone switches that run the phone network itself. Many of these telephony platforms have no need or support for the things VoiceXML itself can do.

What does CCXML allow you to do?

There are a number of features that VoiceXML currently can't supply that CCXML will:

  • Support for multi-party conferencing, plus more advanced conference and audio control. Any telephone conferencing application requires such features.

  • The ability to give each active line in a voice application its own dedicated VoiceXML interpreter. Currently, many VoiceXML platforms initiate a second call or "call leg" to transfer a call from an automated VoiceXML platform to another telephone user. The second leg of a transferred call on these platforms lacks a VoiceXML interpreter of its own, limiting the scope of possible applications that can occur on that second leg.

  • Sophisticated multiple-call handling and control, including the ability to place outgoing calls at any time, initiated outside of the VoiceXML platform.

  • Handling for richer and more asynchronous events. Advanced telephony operations involve substantial signaling, status events, and message-passing. VoiceXML does not currently have a way to integrate these asynchronous "external" events into its event-processing model.

  • An ability to receive events and messages from systems outside of the CCXML or VoiceXML platform. Interaction with an outside call center platform, calls started asynchronously from the VoiceXML platform, and communication between multiple "clustered" VoiceXML or CCXML platforms all require event interaction from one platform to another.

CCXML allows developers to write advanced applications that require these features. Examples of such applications include:

  • "Follow me, Find me" applications that find the person you are trying to call by dialing their cell phone, home phone, and office phone in parallel.

  • Call center applications that intelligently gather information from the caller and then pass that information on to the call center agent.

The W3C Voice Browser Working Group decided to tackle Call Control and came up with a set of comprehensive requirements that address the Call Control needs of almost all voice applications. After those requirements were reviewed, several proposals were submitted. CCXML is the result of those proposals.

What does CCXML bring to VoiceXML?

CCXML adds robust Call Control support to VoiceXML. However, CCXML could also be used with other dialog systems, such as traditional IVR (Interactive Voice Response) platforms created before VoiceXML was available.

One critical thing to understand is that CCXML is not a media/dialog language like VoiceXML. It only provides support to move calls around and connect them to dialog resources. CCXML does not provide any dialog resources on its own. (Note: A dialog resource is anything that interacts with a caller via voice, such as a VoiceXML platform or even a second caller at another location.)

What does CCXML look like?

Let's create a CCXML application. The following example was written on the Voxeo CCXML platform implementation. You can access the Voxeo CCXML platform for free by signing up at

The First Step

Let's start with the equivalent of a "hello world" application that conditionally answers the phone based on your caller ID, plays a VoiceXML dialog, and then hangs up. Being able to conditionally answer a call is one of the new features that CCXML brings to VoiceXML applications.

To start off, we create an XML declaration and a <ccxml> tag for the document. These are required in all CCXML documents.
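As a rough sketch, the empty document described above might look like the following. The version value is an assumption for illustration, and the syntax follows the draft-era CCXML this article uses rather than any later revision.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Root element of the CCXML document.
     version="1.0" is an assumed value. -->
<ccxml version="1.0">
  <!-- event handlers will be added here -->
</ccxml>
```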

Event Handlers

CCXML is based on a state machine model.

In general, a state machine is any program that stores the status of something at a given time and operates on input (such as telephony events) to change the state of the "machine," optionally causing an action to occur. State machines are used to develop and describe specific device or program interactions.

To summarize, a state machine can be described as:

  • An initial state or record
  • A set of possible input events
  • A set of new states that may result from the input
  • A set of possible actions or output events that result from a new state

In their book Real-Time Object-Oriented Modeling, Bran Selic & Garth Gullekson view a state machine as:

  • A set of states
  • A description of the initial state
  • A set of input events
  • A set of output events
  • A function that maps states and input to output
  • A function that maps states and inputs to states, called a state "transition"

There are a number of ways to represent state machines, from simple tables, to C switch and case statements, to graphical design tools. CCXML uses XML tags to represent the state machine which will control one or more telephone calls.
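To make the mapping concrete, here is a hedged sketch of how the four parts of a state machine (initial state, input event, new state, action) could be expressed in CCXML markup. The statevariable and state attributes, the <var> and <assign> elements, and the quoting of state names are assumptions based on draft-era CCXML usage, and the state names themselves are made up for illustration; check them against your platform's documentation.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<ccxml version="1.0">
  <!-- Initial state: the state variable starts out as 'init' (assumed syntax) -->
  <var name="state0" expr="'init'"/>

  <eventhandler statevariable="state0">
    <!-- Input event: the call becomes connected while we are in 'init' -->
    <transition state="'init'" event="connection.CONNECTION_CONNECTED">
      <!-- New state: remember that we are now connected -->
      <assign name="state0" expr="'connected'"/>
      <!-- Action: start a dialog on the connected call -->
      <dialogstart src="'hello.vxml'"/>
    </transition>
  </eventhandler>
</ccxml>
```

The walkthrough that follows uses a single implicit state, so it leaves the state attributes out.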

The CCXML <eventhandler> tag contains all the event handlers, or "transitions," for our call-control application. In CCXML you write an event handler for your application, and the handler's <transition> tags then receive all the matching events that occur during a call.
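A minimal sketch of that structure, assuming the draft-era <eventhandler> and <transition> element names, could look like this. The event name is a placeholder, not a real CCXML event.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<ccxml version="1.0">
  <eventhandler>
    <!-- Each transition names the event it handles.
         "some.event.name" is a placeholder only. -->
    <transition event="some.event.name">
      <!-- actions to run when a matching event arrives -->
    </transition>
  </eventhandler>
</ccxml>
```

Later revisions of the W3C specification renamed this container to <eventprocessor>, so the exact element name depends on the platform version you target.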

We Are Now Connected

Next we add a <transition> tag for the call that we just answered. We are looking for the "connection.CONNECTION_CONNECTED" event. This event will only come into the CCXML platform once the call has been connected.
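Inside the event handler, a handler for this event might be sketched as follows. The element and attribute names are assumed from draft-era CCXML.

```xml
<!-- Runs once the platform reports that the call is connected -->
<transition event="connection.CONNECTION_CONNECTED">
  <!-- actions for the connected call go here -->
</transition>
```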

Running a Dialog

Let's start a VoiceXML dialog script from the connected-call event handler, on the call we are connected to. We do this with the <dialogstart> tag, specifying the source URL of the VoiceXML document we want to run. Once we do this, the CCXML platform will connect the call to a VoiceXML resource and play the script to the caller.
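With the dialog added, the connected handler could look like this sketch. It assumes that src is evaluated as an expression, which is why the file name carries its own inner quotes; your platform may expect a plain URL instead.

```xml
<transition event="connection.CONNECTION_CONNECTED">
  <!-- Start the VoiceXML dialog on the connected call.
       src is assumed to be an expression, hence the inner quotes. -->
  <dialogstart src="'hello.vxml'"/>
</transition>
```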

Here is the content of hello.vxml:

Hello World.
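A minimal VoiceXML document that speaks this prompt could look like the following sketch. It assumes VoiceXML 2.0 syntax and omits the DOCTYPE declaration, so it may differ from the original file.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form>
    <!-- A block runs once; its text is spoken to the caller -->
    <block>Hello World.</block>
  </form>
</vxml>
```

Once the form has nothing left to execute, the dialog ends, and that is the point at which the CCXML side would expect to see the dialog exit event.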

Ending a Dialog

We now add the <transition> tag to catch the event that indicates the VoiceXML dialog has ended. We do this by catching the "dialog.exit" event.

Next we add the <disconnect> tag to the dialog exit handler to disconnect the caller from the CCXML platform. We will also write to the log using Voxeo's logging tag.
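As a sketch, the dialog exit handler might look like the following. The <log expr="..."/> element is an assumed stand-in for the Voxeo logging tag, borrowed from the form the W3C specification later standardized, and the message text is made up.

```xml
<transition event="dialog.exit">
  <!-- <log> is an assumption standing in for the Voxeo logging tag -->
  <log expr="'Dialog finished; disconnecting the caller.'"/>
  <!-- Hang up the call now that the dialog is done -->
  <disconnect/>
</transition>
```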

Ending the Call

We are almost done with our first CCXML script. We only need to add some cleanup code to exit the CCXML interpreter. We do this by adding a <transition> tag to catch the "call.CALL_INVALID" event:


The End Is Nigh

Finally, we add a handler for the "call.CALL_INVALID" event that occurs when a call ends, including an <exit> tag to leave the CCXML platform:
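Putting the steps together, the finished document might look like this sketch. The alerting handler at the top illustrates the conditional answer mentioned at the start of the walkthrough; the "connection.CONNECTION_ALERTING" event name, the name="evt" attribute, the evt.callerid property, and the phone number are all assumptions based on draft-era CCXML usage and are not taken from the text.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<ccxml version="1.0">
  <eventhandler>
    <!-- Incoming call: refuse one caller ID, answer everyone else.
         Event name, evt.callerid and the number are assumptions. -->
    <transition event="connection.CONNECTION_ALERTING" name="evt">
      <if cond="evt.callerid == '4075550100'">
        <reject/>
      <else/>
        <accept/>
      </if>
    </transition>

    <!-- Call answered: hand the caller to the VoiceXML dialog -->
    <transition event="connection.CONNECTION_CONNECTED">
      <dialogstart src="'hello.vxml'"/>
    </transition>

    <!-- Dialog finished: hang up -->
    <transition event="dialog.exit">
      <disconnect/>
    </transition>

    <!-- Call is gone: leave the CCXML interpreter -->
    <transition event="call.CALL_INVALID">
      <exit/>
    </transition>
  </eventhandler>
</ccxml>
```

Each handler does one thing and relies on the next event to continue, which is the event-driven model described earlier.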

Congratulations, you have now written your very first CCXML application! If you would like to learn more about CCXML, there are a number of tutorials that will help get you up to speed, which you can access at

CCXML Executive Briefing

Companies are recognizing the important role that advanced call control functionality plays in effectively communicating with customers, employees, and partners. The challenge is finding a cost-effective method that leverages existing Web and data systems investments to delight customers and deliver significant returns. Using a band-aid approach to upgrade older-generation, proprietary Interactive Voice Response (IVR) and contact center systems is very expensive and ineffective in the long run. Widely popular, open-standards VoiceXML solutions are a big step in the right direction...