Microsoft announced the Cloud Platform System (CPS), a joint venture with Dell, at the TechEd keynote in Barcelona. At first I thought of something similar to the Nutanix boxes or VMware EVO:RAIL. But after visiting the Microsoft booth in the TechExpo I knew I was wrong.
Microsoft CPS is a ready-to-run, Azure-consistent cloud in your datacenter. It is a Microsoft-validated design, developed by Microsoft with standard components available from Dell. It comes pre-integrated and pre-deployed, based on Windows Server 2012 R2, System Center 2012 R2 and Microsoft Azure Pack. Microsoft is the single point of support: it opens a call with Dell for hardware-related issues and takes the necessary steps itself for software-related issues.
Later yesterday I attended a session with Vijay Tewari, who is the Group Program Manager for CPS at Microsoft.
I am really hard to impress, but the specs and the thought and experience that seem to have gone into this solution are just awesome. Even if you don't have the necessary money (a thing I will come to later in this post), you can take the specs and the concepts behind it as a blueprint for your own solution.
Microsoft has been working on CPS for the past 18 months and there are some really big customers like Capgemini using it already in production.
Concepts behind it
The experience gained from running Azure is meant to help in building a robust and effective solution. Therefore only proven ideas and concepts were used.
All management functions are virtualized. Efficient packing of storage and VMs is combined with state-of-the-art network offloads. Dynamic scaling is a key architectural component.
The whole appliance is built and delivered within days, because only that makes sense in terms of agile scaling.
The CPS Rack
Let's spend a few words on the specification of each rack and the hardware that is used. Again, all of the hardware components are standard components that anybody can order from Dell. You have to start with one rack.
Each rack has:
- 512 cores across 32 servers
- 8 TB of RAM with 256 GB per server
- 282 TB of usable storage
- 1360 Gb/s of internal rack connectivity
- 560 Gb/s of inter-rack connectivity
- Up to 60 Gb/s connectivity to the external world
- Weight: 2322 lbs
- Power: 16.6 kW at maximum
Each rack consists of the following hardware components:
- 5x Force 10 S4810P
- 1x Force 10 S55
Compute Scale Unit (32x Hyper-V hosts)
- Dell PowerEdge C6220 II, 4 nodes per 2U
- Dual-socket Ivy Bridge, E5-2650v2 @ 2.6 GHz
- 256 GB memory
- 2x 10 GbE Mellanox NICs (LBFO team, NVGRE offload)
- 2x 10 GbE Chelsio (iWARP/RDMA)
- 1 local SSD 200 GB (boot/paging)
Storage Scale Unit (4x file servers, 4x JBODs)
- Dell PowerEdge R620v2 servers (4 servers for the Scale-Out File Server)
- Dual-socket Ivy Bridge, E5-2650v2 @ 2.6 GHz
- 2x LSI 9207-8E SAS controllers (shared storage)
- 2x 10 GbE Chelsio T520 (iWARP/RDMA)
- PowerVault MD3060e JBODs (48 HDDs and 12 SSDs each)
- 4 TB HDDs and 800 GB SSDs
The maximum scale of four racks consists of:
- 128 Compute Nodes
- 256 Sockets
- 2048 Cores
- 32 TB Memory
- 768 HDDs
- 192 SSDs
- 504 TB Storage used for Backup
- 605 TB Storage available for Workloads
- 1.1 PB Total Storage
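As a quick sanity check, the four-rack totals can be derived from the per-rack figures given earlier. This is only a back-of-the-envelope sketch in Python; the constant and function names are my own and not part of CPS.

```python
# Per-rack figures as listed in this post.
SERVERS_PER_RACK = 32
SOCKETS_PER_SERVER = 2       # dual-socket hosts
CORES_PER_RACK = 512
RAM_GB_PER_SERVER = 256
HDDS_PER_RACK = 4 * 48       # 4 JBODs with 48 HDDs each
SSDS_PER_RACK = 4 * 12       # 4 JBODs with 12 SSDs each


def scale(racks: int) -> dict:
    """Multiply the per-rack figures by the number of racks."""
    nodes = racks * SERVERS_PER_RACK
    return {
        "compute_nodes": nodes,
        "sockets": nodes * SOCKETS_PER_SERVER,
        "cores": racks * CORES_PER_RACK,
        "memory_tb": nodes * RAM_GB_PER_SERVER // 1024,
        "hdds": racks * HDDS_PER_RACK,
        "ssds": racks * SSDS_PER_RACK,
    }


four_racks = scale(4)
for key, value in four_racks.items():
    print(f"{key}: {value}")

# Backup plus workload storage, in TB, as quoted in the list above.
total_storage_tb = 504 + 605
print(f"total_storage_tb: {total_storage_tb}")  # about 1.1 PB
```

The sockets, cores, memory and disk counts come out exactly as listed. The node count works out to 128 (32 per rack), which is what 256 sockets in dual-socket servers imply.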
A single rack can support up to 2000 VMs (2 vCPUs, 1.75 GB RAM and 50 GB disk each). You can scale up to 8000 VMs using the maximum of four racks. Of course these numbers vary with different machine sizes. All hardware components in the rack are hot-pluggable.
Those are impressive numbers.
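To get a feeling for what 2000 VMs of that size mean for a single rack, here is a rough calculation. It is my own sketch based only on the numbers quoted in this post; it ignores the management VMs and any other overhead, and it says nothing about how Microsoft actually arrived at the figure of 2000.

```python
# Figures quoted in this post for a single rack.
VMS_PER_RACK = 2000
VCPUS_PER_VM = 2
RAM_GB_PER_VM = 1.75
DISK_GB_PER_VM = 50

CORES_PER_RACK = 512
RAM_GB_PER_RACK = 32 * 256      # 32 servers with 256 GB each
USABLE_STORAGE_TB = 282

total_vcpus = VMS_PER_RACK * VCPUS_PER_VM
vcpus_per_core = total_vcpus / CORES_PER_RACK
total_vm_ram_gb = VMS_PER_RACK * RAM_GB_PER_VM
total_vm_disk_tb = VMS_PER_RACK * DISK_GB_PER_VM / 1000   # decimal TB

print(f"vCPUs: {total_vcpus} ({vcpus_per_core:.2f} per physical core)")
print(f"VM RAM: {total_vm_ram_gb:.0f} GB of {RAM_GB_PER_RACK} GB")
print(f"VM disk: {total_vm_disk_tb:.0f} TB of {USABLE_STORAGE_TB} TB usable")
```

So the reference VM size oversubscribes the physical cores by a factor of roughly eight, while memory and disk stay well below what the rack provides.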
Let's dig a little deeper into the clusters that are built into this solution:
- The network in the racks has been designed for performance, scale and reliability.
- Flat layer 3 physical network with Hyper-V Network Virtualization for tenant networks
- 40 Gb connections to aggregation layers utilizing equal cost multi-path (ECMP)
- 10 Gb x2 to host for tenant traffic
- 10 Gb x2 for storage (SMB over RDMA)
- Multiple devices for redundancy
- Teamed connections with LACP, or pairs, for 20 Gb/s on each physical network
VM-to-VM network throughput can reach up to 18 Gbps, Offtrack Forwarding up to 10 Gbps, Offtrack NAT up to 8 Gbps and Offtrack 25 up to 1.8 Gbps. All these numbers are achieved through LACP teaming, RSS enabled on host and guest, VMQ enabled and NVGRE offload enabled.
The management cluster is the heart of the rack. It consists of 30 VMs on a 6-node Hyper-V failover cluster. It runs:
- 2x Virtual Machine Manager
- 1x VMM Library for Service Templates
- 3x Operations Manager
- 2x Consoles
- 4x SQL Cluster (Enterprise) for VMM, OM, SPF, WSUS, SMA, WAP, DW and Analysis Services
- 3x AD, DNS, DHCP
- 2x ADFS
- 1x WDS
- 1x WSUS
- 2x IaaS RP (SPF)
- 2x WAP tenant
- 3x SMA
- 3x WAP Admin Portal, APL Service Reporting
- 1x DPM
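The list above can be tallied to confirm that it really adds up to the 30 VMs mentioned. A small Python sketch; the role names are shortened by me for readability.

```python
# VM counts per role, taken from the list above.
management_vms = {
    "Virtual Machine Manager": 2,
    "VMM Library": 1,
    "Operations Manager": 3,
    "Consoles": 2,
    "SQL Cluster": 4,
    "AD, DNS, DHCP": 3,
    "ADFS": 2,
    "WDS": 1,
    "WSUS": 1,
    "IaaS RP (SPF)": 2,
    "WAP tenant": 2,
    "SMA": 3,
    "WAP Admin Portal / Service Reporting": 3,
    "DPM": 1,
}

CLUSTER_NODES = 6

total_vms = sum(management_vms.values())
print(f"Management VMs: {total_vms}")
print(f"Average per cluster node: {total_vms / CLUSTER_NODES:.1f}")
```

That is an average of five management VMs per node on the six-node cluster.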
If you are asking yourself where the file servers are: they are the only components that are not virtualized.
Management Cluster Services
The management services are roughly divided into the following areas:
- Configure & Deploy
- Service Administration
- Disaster Recovery
- Patching & Upgrade
The Storage cluster
- Tenant: 152 terabytes
- Backup: 126 terabytes ( + deduplication)
- 192 HDDs @ 4 terabytes each
- 48 SSDs @ 800 gigabytes each
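Putting the raw disk capacity next to the tenant and backup capacity shows how much of the raw space ends up usable. This is my own arithmetic in Python; it assumes the 800 GB SSD size from the hardware list, and the session did not spell out the resiliency layout that explains the difference.

```python
# Disk counts and sizes for one rack, as listed in this post.
HDD_COUNT, HDD_TB = 192, 4
SSD_COUNT, SSD_GB = 48, 800      # 800 GB per SSD, per the hardware list
TENANT_TB, BACKUP_TB = 152, 126

raw_hdd_tb = HDD_COUNT * HDD_TB
raw_ssd_gb = SSD_COUNT * SSD_GB
usable_tb = TENANT_TB + BACKUP_TB
usable_share = usable_tb / raw_hdd_tb

print(f"Raw HDD capacity: {raw_hdd_tb} TB")
print(f"Raw SSD capacity: {raw_ssd_gb / 1000:.1f} TB")
print(f"Tenant + backup:  {usable_tb} TB ({usable_share:.0%} of raw HDD)")
```

Roughly a third of the raw HDD capacity is presented as tenant and backup space, before deduplication on the backup pool.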
Storage spaces configuration
4 File servers
- 128 GB memory
- 3x pools (2x tenant, 1x backup)
- 8x 10 GbE RDMA for SMB3
The Management, Edge, Compute and Storage host groups are spread over all available racks. However, there is only one instance of the Management host group, and it resides on a single rack.
Patching & Updating
The biggest operational cost in cloud environments is patching. Microsoft has therefore taken a big step forward in patching and updating infrastructures like this one. Hopefully this experience will flow into future versions of WSUS and/or SCCM.
The process is as follows:
- Patches are validated on Microsoft-internal stamps
- Then they are deployed on the internal DEV and TEST cloud
- Then the customer starts patching
- The customer first runs an inventory on its racks
- Then updates what is needed on the firmware, driver and BIOS side
- Then updates and validates the Windows Server systems
- Then updates and validates System Center
- And in the end updates and validates Windows Azure Pack
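The customer-side part of this sequence is essentially an ordered pipeline. The following Python sketch only illustrates the ordering; the stage names, the `run_patch_cycle` function and the stop-at-first-failure behaviour are my own assumptions, not how CPS actually implements it (CPS automates this with PowerShell).

```python
from typing import Callable, List, Tuple

# Customer-side stages in the order described above.
STAGES: List[str] = [
    "inventory racks",
    "update firmware, drivers and BIOS",
    "update and validate Windows Server",
    "update and validate System Center",
    "update and validate Windows Azure Pack",
]


def run_patch_cycle(run_stage: Callable[[str], bool]) -> Tuple[List[str], bool]:
    """Run the stages in order and stop at the first one that reports failure.

    Stopping on failure is an assumption made for this sketch.
    """
    completed: List[str] = []
    for stage in STAGES:
        if not run_stage(stage):
            return completed, False
        completed.append(stage)
    return completed, True


# Example: pretend the System Center stage fails its validation.
done, ok = run_patch_cycle(lambda stage: "System Center" not in stage)
print(done, ok)
```

The point of the strict order is that each layer is only touched once the layer below it has been updated and validated.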
Microsoft itself has an infrastructure for testing purposes where about 20,000 VMs are deployed every day. The tests are run by the Windows and System Center teams.
The primary focus is a wide range of use cases. I asked one of the guys at the booth whether they would also support VDI workloads. At that time they could not make any announcements. Whether you could use a CPS rack as an Azure Backup location was not confirmed either. But Microsoft said that every use case that has been tested successfully internally would later be officially supported.
The CPS systems can be ordered in November 2014 in US, Canada, Europe & South Africa. Additional countries will follow in 2015.
About the used operating systems
Server Core installations are used wherever possible to maximize performance. GUI versions are used wherever they are needed.
How is the system deployed?
The whole solution comes pre-deployed to the customer. Putting racks together isn't that interesting; what is interesting is the way the software is deployed.
VMM is deployed before all other systems because the infrastructure is built through images. PowerShell is used to automate tasks.
Private Cloud Simulator
During the development of the solution a private cloud simulator was built to automate the simulation of certain failures in the infrastructure. It would be great if Microsoft shared information about it or released it to the community someday.
How much is the fish?
The folks at TechEd didn't want to talk about prices. But I found a whitepaper on the Microsoft CPS site with some examples in US dollars. https://www.microsoft.com/en-us/server-cloud/products/cloud-platform-system/
The price for one rack is about $1,635,600 based on list prices from Dell.
The whitepaper concludes that an average VM costs roughly 4,300 dollars.
Some Tidbits from the presentations
- Automation is part of CPS!
- Worst thing to do in a private cloud is have orphaned resources
- The usage of network offloads is critical
- CPU and memory are the costly components in the cloud
- Dynamic scaling is a key feature because you are not service oriented if you cannot scale dynamically
- Cabling is optimized for access when replacing failed hardware components. All cables are labeled
- An Active/Active configuration for the network components is the better alternative to Active/Passive
- Choose the right amount of failover for the components
- Hard disks are the components that fail most often in private clouds
- Looking at IOPS alone is not sufficient in cloud environments. You also have to look at the storage latency!
- Flat networks are cheaper and easier to integrate
This is exactly what I have been preaching for years. These solutions can only work with a huge amount of automation, so that administrators and operators can focus on the things they really need to do every day.
The CPS solution, for example, runs automatic consistency checks after restoring data. The reset and rotation of passwords can run in fully automated mode or with an alert through SCOM.
Isn't that really cool?
What can we learn from this?
Microsoft themselves say that they hide all events arising in the environment that are not needed for normal operation, because they know what the infrastructure does. They show only the important messages related to the infrastructure.
That is exactly the point we need to reach: proactive systems management with full automation instead of manual, reactive systems management.