Have you ever taken a voice or video call over a software-based phone (a soft phone or VoIP phone)? If so, you may have been curious about how such phones are implemented, and which internet protocols, standards and technologies work together behind the scenes to make communication over IP a reality.
More interestingly, would you like to write a SIP soft phone of your own? If so, I would say it is not that hard. When I first heard the term SIP soft phone, I had no idea what it did or how to implement one. But I gradually built up my understanding by studying existing products and reading the related RFCs, which define the standards for communication over SIP. Starting from scratch, I ended up developing a working SIP soft phone client with one of my colleagues.
In this article, I will give a basic idea of what a SIP soft phone is and does, starting with some background.
Why communication over IP?
Communication technologies evolve very rapidly, and communication over IP has become increasingly popular, mainly because of the flexibility and affordability it offers. VoIP achieves bandwidth efficiency and low cost by routing phone calls over existing data networks, which avoids the need for two separate networks for voice and data. Another main reason for the popularity of soft phone calls is the possibility of integrating features such as audio/video conference calling, IVR, message or file exchange in parallel with the conversation, and secure calls using standardized protocols such as the Secure Real-time Transport Protocol (SRTP).
Soft Phone
The term soft phone refers to a software-based phone that is installed on the user's PC and that converts voice and video into IP packets, and vice versa, for voice over IP (VoIP) telephony service. The first VoIP soft phone was VocalTec's Internet Phone, introduced in 1995. Many VoIP applications are available today under both free and commercial licenses. Some of the popular soft phone clients are Skype, GTalk, Linphone, Yahoo Messenger, Cisco IP Communicator and Ekiga. They have been implemented in various ways using both proprietary and open protocols and standards.
SIP+soft phone
The word SIP comes into play when the signaling protocol built into a particular soft phone is the open standard SIP (Session Initiation Protocol). SIP, as specified in RFC 3261, defines an open standard for initiating sessions between two or more endpoints, such as SIP phone clients, or between an endpoint and a server. Examples of soft phone clients that use SIP as their signaling protocol are Sip Communicator, Yahoo Messenger, Mirial Softphone, Minisip and Ekiga. Some VoIP software, such as Skype, implements proprietary protocols for communication, so establishing a call requires a Skype client at the other end as well. In contrast, because SIP is an open standard, different VoIP products can, in principle, communicate with one another as long as they are all based on SIP.
SIP works in conjunction with other protocols such as:
· SDP - Session Description Protocol (for describing the media parameters that will be used during the session)
· RTP - Real-time Transport Protocol (for transporting real-time data such as voice and video)
· RTSP - Real Time Streaming Protocol (for controlling the delivery of media)
Together they build a complete multimedia architecture, and thereby SIP enables internet endpoints such as soft phones to discover one another and agree upon a session. It is important to note, however, that the basic functionality and operation of SIP do not depend on the other protocols it works with.
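To make the relationship between SIP and SDP a little more concrete, here is a minimal Python sketch that builds the text of a SIP INVITE request carrying an SDP offer in its body. It is only an illustration, not how any particular soft phone does it: the function names, the Call-ID, branch and tag values, the RTP port and the choice of a single G.711 (PCMU) audio stream are all assumptions made for the example. A real SIP stack would also generate those identifiers randomly and handle responses, retransmissions and authentication.

```python
def build_sdp_offer(user: str, local_ip: str, rtp_port: int) -> str:
    """Describe one audio stream (G.711 mu-law, RTP payload type 0) in SDP."""
    lines = [
        "v=0",                               # SDP protocol version
        f"o={user} 0 0 IN IP4 {local_ip}",   # origin: who created the session
        "s=-",                               # session name (unused here)
        f"c=IN IP4 {local_ip}",              # address where we want to receive media
        "t=0 0",                             # unbounded session time
        f"m=audio {rtp_port} RTP/AVP 0",     # audio over RTP, payload type 0
        "a=rtpmap:0 PCMU/8000",              # payload type 0 is PCMU at 8000 Hz
    ]
    return "\r\n".join(lines) + "\r\n"


def build_invite(user: str, local_ip: str, callee_uri: str, rtp_port: int,
                 call_id: str, branch: str, tag: str) -> str:
    """Build an INVITE whose body is the SDP offer (the signaling carries the media description)."""
    sdp = build_sdp_offer(user, local_ip, rtp_port)
    headers = [
        f"INVITE {callee_uri} SIP/2.0",
        f"Via: SIP/2.0/UDP {local_ip}:5060;branch={branch}",
        "Max-Forwards: 70",
        f"From: <sip:{user}@{local_ip}>;tag={tag}",
        f"To: <{callee_uri}>",
        f"Call-ID: {call_id}",
        "CSeq: 1 INVITE",
        f"Contact: <sip:{user}@{local_ip}:5060>",
        "Content-Type: application/sdp",
        f"Content-Length: {len(sdp.encode('utf-8'))}",
    ]
    # A blank line separates the SIP headers from the SDP body.
    return "\r\n".join(headers) + "\r\n\r\n" + sdp


if __name__ == "__main__":
    # All values below are made up for illustration.
    print(build_invite("kamal", "192.0.2.10", "sip:sama@atlanta.com", 40000,
                       call_id="example-call-1@192.0.2.10",
                       branch="z9hG4bK-example-1", tag="example-tag-1"))
```

Notice that SIP itself only carries the SDP text as an opaque body. The media description could be swapped for something else without changing the SIP headers, apart from Content-Type and Content-Length.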
SIP Identification: Just as your mobile phone has a unique phone number, a SIP client needs to be uniquely identified by a SIP URI (Uniform Resource Identifier), which takes the format sip:username@hostname. Here username is the ID with which the SIP phone user registers with a SIP server, and hostname is the domain of the user's SIP service provider or simply the IP address of the machine where the soft phone is installed. The latter form is used when setting up calls within the same IP space, such as a LAN, where no SIP server is involved in setting up the call.
Example SIP URIs: sip:kamal@biloxy.com, sip:sama@atlanta.com, sip:kamal@192.16.8.105.
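As a rough illustration of the sip:username@hostname format, the following Python sketch splits such a URI into its parts. The names SipUri and parse_sip_uri are made up for this example. It deliberately handles only the simple form described above plus an optional port, and it ignores the passwords, URI parameters and IPv6 addresses that the full SIP URI grammar allows.

```python
from typing import NamedTuple, Optional


class SipUri(NamedTuple):
    user: str
    host: str
    port: Optional[int]


def parse_sip_uri(uri: str) -> SipUri:
    """Split a basic sip:username@hostname[:port] URI into its parts."""
    scheme, sep, rest = uri.partition(":")
    if not sep or scheme.lower() != "sip":
        raise ValueError(f"not a SIP URI: {uri!r}")
    user, at, hostport = rest.partition("@")
    if not at or not user or not hostport:
        raise ValueError(f"expected sip:username@hostname, got {uri!r}")
    host, colon, port = hostport.partition(":")
    if not host:
        raise ValueError(f"missing hostname in {uri!r}")
    if colon:
        if not port.isdigit():
            raise ValueError(f"invalid port in {uri!r}")
        return SipUri(user, host, int(port))
    return SipUri(user, host, None)


if __name__ == "__main__":
    for example in ("sip:kamal@biloxy.com", "sip:kamal@192.16.8.105"):
        print(parse_sip_uri(example))
```

The hostname part can then be used either for a DNS lookup of the provider's SIP server or, in the LAN case, directly as the address to send the request to.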
There are basically two parts to the implementation of a SIP-based soft phone. One is the signaling module, which handles signaling, negotiation and the establishment of a call/media session, as well as terminating or rejecting it. This is known as the SIP stack. The other is the media module, which converts analog voice and video signals to digital form, compresses and packetizes them for transport over the IP network, and exchanges media between the participants of the call/session. It is usually called the media stack of the soft phone.
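On the media side, each chunk of encoded audio is typically wrapped in an RTP packet before it is sent. The sketch below shows, in Python, how the fixed 12-byte RTP header can be packed and unpacked. The function names are hypothetical, and the example assumes no padding, no header extension and no CSRC entries, which a real media stack would have to handle.

```python
import struct

RTP_VERSION = 2
RTP_HEADER = struct.Struct("!BBHII")  # flags, marker+payload type, seq, timestamp, SSRC


def pack_rtp(payload: bytes, seq: int, timestamp: int, ssrc: int,
             payload_type: int = 0, marker: bool = False) -> bytes:
    """Prefix an encoded media chunk with a fixed 12-byte RTP header."""
    byte0 = RTP_VERSION << 6                       # version 2; padding, extension, CSRC count all 0
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    header = RTP_HEADER.pack(byte0, byte1, seq & 0xFFFF,
                             timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)
    return header + payload


def unpack_rtp(packet: bytes) -> dict:
    """Read back the header fields; assumes no CSRC list or header extension."""
    byte0, byte1, seq, timestamp, ssrc = RTP_HEADER.unpack_from(packet)
    if byte0 >> 6 != RTP_VERSION:
        raise ValueError("not an RTP version 2 packet")
    return {
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,
        "seq": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
        "payload": packet[RTP_HEADER.size:],
    }


if __name__ == "__main__":
    # 160 bytes of PCMU correspond to 20 ms of audio at 8000 Hz (placeholder bytes here).
    packet = pack_rtp(b"\xff" * 160, seq=1, timestamp=160, ssrc=0x1234ABCD)
    print(len(packet), unpack_rtp(packet)["seq"])
```

The sequence number lets the receiver detect lost or reordered packets, and the timestamp lets it play the audio back at the right pace, which is why the media stack needs them even though the signaling stack never looks at them.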
VoIP technologies, and SIP itself, have several drawbacks. As a computer-based technology, VoIP is susceptible to attacks by hackers. Although there are security standards for VoIP, such as the Secure Real-time Transport Protocol (SRTP) and the newer ZRTP protocol, few VoIP solutions support them. It is therefore relatively easy to eavesdrop on VoIP calls if you are not on a secure network. Some vendors also use compression to make eavesdropping more difficult. However, real security requires encryption and cryptographic authentication, which are not widely supported.