What Is the SIP Protocol?

Published on April 15, 2016

If you want to know what the Session Initiation Protocol (SIP) is, you may also wonder how phone calls travel across the internet.

The answer, unfortunately, is somewhat more complex than this:


A series of tubes may be involved somehow—perhaps as shielding for some cable drops. However, you need to understand SIP in order to fully grasp how internet-based phone systems and network services such as SIP trunking work.

To help, we’ve created a two-part guide to answer your questions about SIP. In this first part, we’ll focus on the protocol itself.

The second part focuses on a service called SIP trunking—the primary usage of the protocol that most IT managers need to be familiar with.

Here are the questions we’ll cover in part one:

What Is a Protocol?
What Is SIP Used For?
How Does SIP Work in a VoIP Call?
Okay, So How Much Do I Really Need to Know About SIP?

What Is a Protocol?

“Protocol” puts the “P” in “SIP.” In order to better understand SIP, let’s look at what a protocol is first.

A protocol is a set of rules that defines how two or more computing devices (laptops, smartphones, routers, network switches, etc.) communicate with each other.

The internet isn’t based on a single protocol, but on a complex and diverse group of protocols collectively known as the “internet protocol suite.”

To keep things simple, we’re going to focus on the protocols involved in making and receiving phone calls over the internet. This is a technology known as Voice over Internet Protocol, or VoIP.

It’s important to remember that VoIP isn’t a protocol itself. Instead, VoIP is an umbrella term for all of the technologies involved in transporting voice information using Internet Protocol.

Communications between networked devices on the internet don’t just involve a single protocol at a time. Instead, multiple protocols work together at the same time by building on top of each other in layers, known as a “protocol stack.”

There are different models of how protocols layer on top of each other. The Open Systems Interconnection (OSI) model developed by the International Organization for Standardization (ISO) is the most common and the easiest to understand:

[Figure: The OSI Model, With Locations of Protocols Involved in VoIP Technology]

Let’s take a closer look at two layers in the OSI model:

  • The transport layer, which controls the reliability, speed and order of data exchange between endpoints. Data, including the voice data of a phone call, is broken into packets in order to be transported over the internet. Transport layer protocols can handle the ordering and retransmission of those packets; routing them across networks is the job of the network layer beneath.
  • The application layer, which specifies the protocols and interfaces that software applications use to communicate over a network connection.

SIP is an application-layer protocol, and it’s the foundation of modern interactive communications over the internet (voice calls, video calls, etc.).
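
As a rough sketch of where the protocols in this guide sit, the mapping below assigns each one to an OSI layer. The layer placement follows the descriptions in this guide; the dictionary and function names are hypothetical and exist only for illustration.

```python
# Illustrative only: which OSI layer each VoIP-related protocol is usually
# assigned to. The names below are made up for this sketch.
VOIP_PROTOCOL_LAYERS = {
    "SIP": "application",
    "SDP": "application",
    "RTP": "application",
    "RTCP": "application",
    "TCP": "transport",
    "UDP": "transport",
    "IP": "network",
}


def protocols_at(layer: str) -> list[str]:
    """Return the protocols placed at the given layer, sorted by name."""
    return sorted(
        name for name, placed in VOIP_PROTOCOL_LAYERS.items() if placed == layer
    )


print(protocols_at("transport"))
print(protocols_at("application"))
```

The application-layer protocols depend on the transport-layer ones beneath them, which in turn depend on IP.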

What Is SIP Used For?

The SIP protocol doesn’t encode audio information in a phone call, nor does it transport audio information.

Instead, the Session Initiation Protocol is just that: it initiates and terminates communications sessions, whether the session is a voice call between two people or a video conference between a whole team.

Gary Audin is president of Delphi Inc., an IT consultancy specializing in IP communications, and writer at No Jitter, a publication covering the IP communications industry. He explains:

“SIP is a media-independent protocol—it’s not voice, it’s not video, it’s not data—it could be anything. While it’s mostly applied to VoIP, it’s not a VoIP protocol.”

The job of SIP is to set up a call, conference or other interactive communication session and terminate it when it’s over.

SIP does this by sending messages between endpoints on the internet, each of which is identified by a “SIP address.” A SIP address can be linked to:

  • A physical SIP client, such as an IP desk phone.
  • Or, a software client, such as a computer application that allows you to make and receive calls (known as a softphone).

A SIP session begins with an INVITE, a SIP message used to request participation from another SIP client. In an INVITE, the chunks of text resembling email addresses are the participants’ SIP addresses.
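
To give a feel for what such a message looks like, here is a minimal sketch that builds a simplified INVITE as plain text. The addresses, host name, tag and branch values are placeholders, the helper name `build_invite` is hypothetical, and a real client would add further headers and a session description in the body.

```python
def build_invite(caller: str, callee: str, call_id: str) -> str:
    """Build a simplified, illustrative SIP INVITE request as text."""
    lines = [
        f"INVITE sip:{callee} SIP/2.0",
        # Placeholder host and branch value, for illustration only.
        "Via: SIP/2.0/UDP client.example.com;branch=z9hG4bK776asdhds",
        "Max-Forwards: 70",
        f"To: <sip:{callee}>",
        f"From: <sip:{caller}>;tag=1928301774",
        f"Call-ID: {call_id}",
        "CSeq: 1 INVITE",
        f"Contact: <sip:{caller}>",
        "Content-Length: 0",
    ]
    # SIP is a text protocol: each header line ends with CRLF, and a blank
    # line separates the headers from the (here empty) message body.
    return "\r\n".join(lines) + "\r\n\r\n"


print(build_invite("alice@example.com", "bob@example.com", "a84b4c76e66710"))
```

The first line names the message type and the SIP address being invited; the To and From headers carry the two participants’ addresses.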


SIP doesn’t do all that much during the session itself—its primary purpose is to establish the session and then end it.
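
The sketch below lists a typical sequence of SIP messages for a simple call between two clients, assuming no proxies in between and a call that is answered and later hung up by the caller. The variable and function names are hypothetical; the point is that SIP messages appear only at the start and end of the call.

```python
# A simplified call flow: (direction, message). The "(media)" entry marks
# the conversation itself, which is not carried by SIP.
CALL_FLOW = [
    ("caller -> callee", "INVITE"),
    ("callee -> caller", "180 Ringing"),
    ("callee -> caller", "200 OK"),
    ("caller -> callee", "ACK"),
    ("(media)", "audio flows outside SIP"),
    ("caller -> callee", "BYE"),
    ("callee -> caller", "200 OK"),
]


def sip_messages(flow):
    """Return only the SIP messages, leaving out the media phase."""
    return [message for direction, message in flow if direction != "(media)"]


for direction, message in CALL_FLOW:
    print(f"{direction:18} {message}")
```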

As Audin puts it, SIP “tells you the presence of the other party, makes a connection and lets you do whatever you want over the connection, but it has no idea of what’s going over the connection.”

This is why SIP can be used for video conferencing and instant messaging as well as making phone calls over the internet. We’ll leave the other usages of SIP aside for now and focus on how the protocol works in the context of a phone call.

How Does SIP Work in a VoIP Call?

Before voice information can be transported over the internet, it must be encoded with codecs that translate audio signals into data.

A variety of codecs are used for this purpose, but two of the most common are:

  • The G.711 codec, which is used for what is usually described as uncompressed digital voice, at 64 kbit/s. While audio quality is better than with more heavily compressed codecs, G.711 also uses more bandwidth.
  • The G.729 codec, which is commonly used for compressed voice, at about 8 kbit/s. It degrades audio quality in order to reduce the amount of data transmitted, thereby cutting the amount of bandwidth consumed by the call.
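
To make the bandwidth trade-off concrete, here is a rough per-call estimate. It assumes the codecs’ nominal bit rates (64 kbit/s for G.711, 8 kbit/s for G.729), one packet every 20 ms, and IPv4, UDP and RTP headers on every packet. Those figures are assumptions for this sketch rather than numbers from this guide, and link-layer overhead is ignored.

```python
HEADER_BYTES = 20 + 8 + 12  # IPv4 + UDP + RTP headers per packet


def call_bandwidth_kbps(codec_kbps: float, packet_ms: int = 20) -> float:
    """Estimate one-way bandwidth for a call, including packet headers."""
    payload_bytes = codec_kbps * 1000 / 8 * packet_ms / 1000
    packets_per_second = 1000 / packet_ms
    return (payload_bytes + HEADER_BYTES) * 8 * packets_per_second / 1000


print(call_bandwidth_kbps(64))  # G.711: 80.0
print(call_bandwidth_kbps(8))   # G.729: 24.0
```

Under these assumptions, the headers add the same 16 kbit/s to both codecs, so the compressed codec saves less bandwidth than the raw bit rates suggest.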

The encoded packets of audio data are then transported using the real-time transport protocol (RTP): a specialized application-layer protocol for transporting audio and video data when real-time streaming is necessary.
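
As an illustration of what RTP adds to each chunk of encoded audio, the sketch below packs the fixed 12-byte RTP header using Python’s standard `struct` module. The function name and the example values are hypothetical; payload type 0 is the static type for G.711 mu-law audio.

```python
import struct


def pack_rtp_header(payload_type, sequence, timestamp, ssrc, marker=False):
    """Pack the fixed 12-byte RTP header (no extensions, no CSRC list)."""
    first_byte = 2 << 6  # version 2, no padding, no extension, CSRC count 0
    second_byte = (int(marker) << 7) | (payload_type & 0x7F)
    return struct.pack(
        "!BBHII",  # network byte order: two bytes, a 16-bit and two 32-bit fields
        first_byte,
        second_byte,
        sequence & 0xFFFF,
        timestamp & 0xFFFFFFFF,
        ssrc & 0xFFFFFFFF,
    )


header = pack_rtp_header(payload_type=0, sequence=1, timestamp=160, ssrc=0x12345678)
print(len(header), header.hex())
```

The sequence number and timestamp are what let the receiver put audio back in order and play it at the right pace.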

The RTP control protocol (RTCP) works alongside RTP to provide information about RTP packet delivery, which is used in managing the quality of voice service.
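
One piece of delivery information that RTCP receiver reports carry is the fraction of packets lost since the previous report, expressed as an 8-bit fixed-point number (lost packets divided by expected packets, multiplied by 256). A minimal sketch of that calculation follows; the function names are hypothetical.

```python
def fraction_lost(expected: int, received: int) -> int:
    """Return the 8-bit loss fraction for one reporting interval."""
    lost = expected - received
    if expected <= 0 or lost <= 0:
        return 0  # nothing expected, or no loss (duplicates can make lost negative)
    return (lost << 8) // expected


def loss_percent(fraction: int) -> float:
    """Convert the 8-bit fixed-point fraction back to a percentage."""
    return fraction / 256 * 100


f = fraction_lost(expected=100, received=95)
print(f, loss_percent(f))
```

Because the value is truncated to a whole number of 256ths, the percentage recovered from it is slightly coarser than the true loss rate.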

The screenshot below shows the RTCP packets from a VoIP call I made to a colleague, captured with Wireshark, a popular open-source tool for analyzing network packets:

[Figure: RTCP Packets in a VoIP Call. Source: Wireshark.org]


RTP packets and SIP packets are themselves transported by protocols at the transport layer such as:

  • Transmission Control Protocol (TCP): A protocol designed to transmit packets in an ordered sequence and to retransmit any packets that get lost along the way. Packet headers specify the order of each packet in the sequence. If packets get jumbled up during transmission, they can be reordered at the receiving end.
  • User Datagram Protocol (UDP): A protocol designed to transmit data without retransmission of lost packets or detection of out-of-sequence packets.

UDP is generally a better fit than TCP for transporting the audio in VoIP calls. Lost and out-of-sequence packets can cause slight audio quality issues, but in many cases these aren’t detectable by the human ear. By contrast, the delay that TCP introduces while it reorders and retransmits packets can ultimately result in much worse audio quality problems and dropped calls.
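
Since UDP doesn’t reorder or retransmit anything, the receiving application does that work itself, using the sequence number carried in each RTP packet. The sketch below shows the idea in its simplest form: packets are put back in order and a missing one leaves a gap instead of stalling playback. The function name is hypothetical, and a real receiver would also deal with timing and sequence-number wraparound.

```python
def reassemble(received, first_seq, count):
    """Order payloads by sequence number; None marks a packet that never arrived.

    received: iterable of (sequence_number, payload) pairs in arrival order.
    """
    by_seq = dict(received)
    return [by_seq.get(seq) for seq in range(first_seq, first_seq + count)]


# Packets 2 and 3 arrive swapped, and packet 4 is lost in transit.
arrived = [(1, "a"), (3, "c"), (2, "b"), (5, "e")]
print(reassemble(arrived, first_seq=1, count=5))  # ['a', 'b', 'c', None, 'e']
```

The player can cover the `None` slot with silence or a repeat of the previous audio, which is usually less noticeable than pausing the call to wait for a retransmission.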

Finally, since SIP is media-independent, another application-layer protocol called the Session Description Protocol (SDP) works alongside SIP—it specifies which types of media the SIP clients involved in the session can actually support.
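
To illustrate, the sketch below parses a small example session description and lists the audio codecs a client offers. The description text is a made-up example with placeholder addresses, and `offered_codecs` is a hypothetical helper that only reads the `a=rtpmap` lines.

```python
EXAMPLE_SDP = """\
v=0
o=alice 2890844526 2890844526 IN IP4 client.example.com
s=-
c=IN IP4 192.0.2.10
t=0 0
m=audio 49170 RTP/AVP 0 18
a=rtpmap:0 PCMU/8000
a=rtpmap:18 G729/8000
"""


def offered_codecs(sdp: str) -> dict[int, str]:
    """Map each RTP payload type number to the codec named in its rtpmap line."""
    codecs = {}
    for line in sdp.splitlines():
        if line.startswith("a=rtpmap:"):
            number, encoding = line[len("a=rtpmap:"):].split(" ", 1)
            codecs[int(number)] = encoding
    return codecs


print(offered_codecs(EXAMPLE_SDP))  # {0: 'PCMU/8000', 18: 'G729/8000'}
```

Here the client offers G.711 (listed as PCMU) and G.729; the other side answers with the subset it supports, and the call proceeds with a codec both understand.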

At this point, you may be asking why SIP is so important if all it does is set up and tear down calls.

Audin explains that the telecommunications industry has standardized on SIP precisely because the protocol isn’t involved in the encoding and transmission of audio data.

“The biggest problem we’ve had over the years with Voice over IP is that the protocols that were written to support it were closed,” he points out.

“What that meant was that when you wanted to fix things up, you had to rewrite the protocol. SIP was designed the opposite way, in order to create a standard protocol where another standard defines the media you’re carrying—so you don’t have to keep rewriting the protocol over and over again.”

Okay, So How Much Do I Really Need to Know About SIP?

You’ll be happy to hear that this high-level overview of the protocols involved in a VoIP call should be enough for most IT managers. As a rule, only application developers at telecommunications companies need to understand the mechanics of each protocol and the relationships between them.

If you’re just deploying and administering a VoIP phone system, you likely won’t ever need to understand more about SIP than the details we’ve covered here.

What IT managers do need to understand, however, is a network service known as SIP trunking, which is central to the workings of most IP phone systems.

Next Steps

The second part of our guide covers SIP trunking in detail, but if you feel as though this first part contains as much detail as you need to get started, you can call us at (855) 998-8505. Our advisors can provide a short list of vendors that can help you transition to SIP-based phone service, for free. You can also get pricing details for IP phone systems.

If you do need to know more about SIP, you can explore the courses offered by the SIP School, an organization devoted to SIP training and certification. Once you’ve learned the basics, you can take a look at the SIP Forum, a nonprofit devoted to ensuring interoperability between SIP-based devices and software applications by standardizing how they implement the protocol.
