StartReady: UC Appliance Performance Test
About three months ago I started blogging about StartReady. StartReady recently ran a performance test of their UC Appliance, and in this post I’d like to share the results with you.
Challenges of Virtualizing the Mediation Server
As you probably all know, Microsoft doesn’t support virtualizing Microsoft OCS in any way. For most roles virtualization is not a real issue, but for the Mediation Server specifically it is. Handling SIP traffic is processor intensive and timing sensitive, and a virtualization layer can introduce performance and timing issues. Of course, this doesn’t happen immediately, but Microsoft is unsure of the exact limits, hence the non-support statement.
StartReady knew about this when they started building the Appliance, and so far they claim that they haven’t run into any issues. To address the non-support issue they currently offer a Scale-Out Appliance for the Mediation role. This Scale-Out scenario places the Mediation role on a separate server (through an automated and scheduled process). In this new environment, the Mediation role runs non-virtualized, in line with Microsoft’s support guidelines.
But to give customers a clear understanding of when this Scale-Out Appliance should be used, they needed the results of a clear and precise performance test. They arranged this with Mitel (Netherlands). Together they built a test environment to stress the virtualized Mediation Server.
For this test StartReady had three goals in mind:
1. Find out the maximum concurrent phone calls on the StartReady UC Appliance with the virtualized Mediation Server;
2. Stress the RTAudio codec;
3. Tune Hyper-V for the Mediation role.
The test environment is built from several components, see figure. First, there is the StartReady UC Appliance with System Center Essentials and the 3 OCS server roles, all virtualized on Windows Server 2008. Secondly, there is the Mitel 3300 IP Communications Platform. This is a highly scalable IP-PBX that provides robust call control and extensive features, and supports a wide range of desktop devices and applications for medium-to-large enterprises. To generate the calls to the Mitel platform they used a Linux server with Asterisk’s Call Generator. The Call Generator places the call and plays a .wav file to simulate a conversation.
To make the test environment realistic they virtualized 100 Windows XP clients with Office Communicator 2007 installed. They also developed a tool to automatically answer incoming calls in Communicator, so the complete call was scripted and automated. To make sure that the audio was really getting through the entire system, they first recorded the audio in Communicator and also logged the network traffic on the client side with Wireshark (a network protocol analyzer for Unix and Windows). With these baseline results, they were ready to stress the system.
The tests
They ran about 10 tests in total. They started with 10 concurrent calls and added 10 calls with each run, ending up with 100 concurrent calls on the UC Appliance.
Each test cycle consisted of the following steps: set up a new call every 3 seconds, automatically answer the call in Communicator, play the 4-minute .wav file to simulate audio, and place one additional ‘real’ call between a Mitel-attached device and a laptop running Communicator. That last call was made to confirm the quality of the call and its audio.
The grand finale was a burst test cycle. The time interval between calls was reduced to 0.5 seconds to stress the environment even more.
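To get a feel for how long each test cycle needs to reach its target load, the schedule described here can be sketched in a few lines of Python. This is only an illustration based on the numbers in this post (runs of 10 to 100 calls, one new call every 3 seconds, or every 0.5 seconds in the burst cycle). The function names are my own and are not part of any StartReady or Mitel tooling.

```python
def run_sizes(start=10, step=10, stop=100):
    """Target number of concurrent calls for each test run."""
    return list(range(start, stop + 1, step))


def placement_window(calls, interval_s):
    """Seconds between placing the first and the last call of a run."""
    return (calls - 1) * interval_s


if __name__ == "__main__":
    for calls in run_sizes():
        normal = placement_window(calls, 3.0)  # regular test cycle
        burst = placement_window(calls, 0.5)   # burst test cycle
        print(f"{calls:3d} calls: {normal:5.1f} s normal, {burst:4.1f} s burst")
```

For a run of 100 calls this gives roughly five minutes to place all calls at the normal pace, against roughly 50 seconds in the burst cycle.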
The results
During the tests they logged performance counters from the host machine and the virtual machines. They mainly focused on CPU performance.
Let’s discuss two tests, because these are the most relevant. The first one is a test run of 95 concurrent calls, see picture. The purple line displays the calls being placed. The bold blue line is the CPU load of the Mediation Server. You can see that with this load the Mediation Server peaks at about 30 to 40 percent and averages about 25 percent. The brown line indicates the CPU load of the Windows Server 2008 host environment.
The second result I would like to share is the burst test cycle. As already mentioned, a new call is placed every 0.5 seconds, so all calls (in this case 96) are placed within a time frame of about 50 seconds. This result also shows that the Mediation Server isn’t stressed at all. And what’s nice to see is that the OCS server (standard role), the green line, is busy at the beginning and at the end: the OCS server is setting up and disconnecting the calls.
Conclusions
With these results, StartReady claims they can support up to 100 concurrent calls with limited server load. The test results back this up and Mitel confirms it. To be on the safe side, StartReady advises customers to consider the Scale-Out Appliance if they reach 75 concurrent calls on a regular basis. I think this test makes sense and I agree with them.
What does this mean for customers?
Well, first of all, this test is only relevant for customers that use OCS to actually make phone calls (interop with a phone system). Customers that only use the PC-to-PC call functionality are not limited by it. Beyond that, a company that runs up to 75 concurrent calls through OCS generally has about 750 employees (a 1:10 ratio of concurrent calls to employees). That also assumes this 10% of people ALL use Communicator to place those calls; in ordinary scenarios people also use mobile and fixed phone lines for inbound and outbound calls. With this in mind, even customers with more than 1,000 employees might not need the Scale-Out Appliance. It’s still a complex calculation, but I think that most of the companies currently evaluating Microsoft OCS can safely invest in the appliance concept.
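The back-of-the-envelope sizing above can be written down as a small Python sketch. It assumes the 75-call threshold and the 1:10 ratio from this post; the ocs_share parameter (the fraction of concurrent calls that actually go through OCS instead of mobile or fixed lines) is my own illustrative assumption, not a measured value.

```python
def company_size_at_threshold(concurrent_calls=75, employees_per_call=10,
                              ocs_share=1.0):
    """Estimate the headcount at which the OCS call threshold is reached.

    ocs_share is an assumed fraction (0 < ocs_share <= 1) of all concurrent
    calls that are placed through OCS rather than other phone lines.
    """
    if not 0 < ocs_share <= 1:
        raise ValueError("ocs_share must be greater than 0 and at most 1")
    return round(concurrent_calls * employees_per_call / ocs_share)


if __name__ == "__main__":
    for share in (1.0, 0.75, 0.5):
        size = company_size_at_threshold(ocs_share=share)
        print(f"OCS share {share:.0%}: about {size} employees")
```

With every call going through OCS this returns 750 employees. If only three quarters of the calls go through OCS, the same threshold corresponds to 1,000 employees, which is why larger companies might still stay below it.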
For more information, check out their website at http://www.startready.com.