Private Cloud Architecture – Part 3: Concepts


In the first part of this series (here and here), I started with a discussion of the basic definitions that we will build upon to achieve the Private Cloud promises. In the second part (here and here), I discussed the Core Principles for Private Cloud.

In this post, I will discuss the main concepts behind Private Cloud. The concepts are guided by and directly support the principles we discussed previously.

Approaching Availability Holistically: Uptime has traditionally been the main measure of availability. The more nines you add, the more available your system is (for example, 99.999% is better than 99.99%).

Let’s define two more terms here: MTBF (Mean Time Between Failures) measures the time between service outages (reliability), and MTRS (Mean Time to Restore Service) measures how quickly service is restored after an outage (resiliency).
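To make the relationship between these two terms and the "nines" concrete, here is a minimal sketch using the common steady-state formula, availability = MTBF / (MTBF + MTRS). The function names and the sample figures (1000 hours between failures, 4 hours versus 6 minutes to restore) are my own illustrative assumptions, not numbers from the original document.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def availability(mtbf_hours: float, mtrs_hours: float) -> float:
    """Fraction of time the service is up, given MTBF and MTRS in hours."""
    return mtbf_hours / (mtbf_hours + mtrs_hours)


def downtime_hours_per_year(avail: float) -> float:
    """Expected hours of outage per year for a given availability."""
    return (1.0 - avail) * HOURS_PER_YEAR


# Hypothetical comparison: same reliability (MTBF), different resiliency (MTRS).
manual_restore = availability(mtbf_hours=1000, mtrs_hours=4)       # operator-driven recovery
automated_restore = availability(mtbf_hours=1000, mtrs_hours=0.1)  # automated recovery

print(f"manual:    {manual_restore:.5f}")
print(f"automated: {automated_restore:.5f}")
```

The point of the comparison is that availability can improve without touching MTBF at all: shortening the time to restore is enough.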

In traditional data centers, availability is achieved by throwing in more redundant hardware that can pick up the workload to provide more uptime. Examples include an Active/Passive SQL Server cluster, or two or more Web Front End (WFE) servers behind a load balancer in a SharePoint farm. Private cloud addresses availability mainly through two software approaches:

  • Virtualization: the traditional model of physical redundancy is replaced with a virtualization layer. This abstracts the service from the server layer and allows the workload to move or restart smoothly from a failed server to another one. To be fair, you remove the redundancy only from the compute layer; you still need it in the storage and network layers. That is still a significant cost saving while achieving the same or better availability.
  • Monitoring: automating the detection of and response to failures can reduce MTRS significantly.
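The two approaches above can be sketched together as a tiny automated recovery routine: detect an unhealthy host and restart its VMs on healthy hosts. This is only an illustration under my own assumptions; the `Host` class, the slot-based capacity model and the "most free slots" placement rule are hypothetical and do not represent any particular product's API.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    capacity: int                      # number of VM slots (hypothetical unit)
    healthy: bool = True
    vms: list = field(default_factory=list)

    def free_slots(self) -> int:
        return self.capacity - len(self.vms)


def restart_failed_workloads(hosts):
    """Move VMs off unhealthy hosts onto the healthy host with the most free slots.

    Returns a list of (vm, source_host, target_host) moves. VMs that cannot be
    placed because no healthy host has a free slot are left where they are.
    """
    moves = []
    for failed in (h for h in hosts if not h.healthy):
        for vm in list(failed.vms):    # iterate over a copy while mutating
            candidates = [h for h in hosts if h.healthy and h.free_slots() > 0]
            if not candidates:
                break
            target = max(candidates, key=lambda h: h.free_slots())
            failed.vms.remove(vm)
            target.vms.append(vm)
            moves.append((vm, failed.name, target.name))
    return moves


hosts = [
    Host("host-a", capacity=2, vms=["web1", "web2"]),
    Host("host-b", capacity=2, vms=["sql1"]),
    Host("host-c", capacity=2),
]
hosts[0].healthy = False               # simulated failure detected by monitoring
moves = restart_failed_workloads(hosts)
```

In a real fabric the detection would come from a monitoring system and the restart from the virtualization layer; the sketch only shows why automating both steps shortens MTRS compared with waiting for an operator.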

Using the same physical hardware: to drive predictability, the underlying infrastructure needs to provide a consistent experience to the workloads it hosts across the compute, storage and network layers. Private cloud provides that by defining the server stock keeping unit (SKU) at the logical level rather than the physical level. Once we reach that level of homogeneity on the compute layer (so all servers have the same processing power, the same RAM, and the same connections to storage and network resources), workloads can move transparently from a failed server to another one without any impact on service behavior.

Shared Pool of Resources: this is key to the success of Private Cloud. All resources (compute/storage/network) are grouped in a pool that creates a fabric that hosts the virtualized workloads.

Infrastructure Virtualization: to decrease or eliminate downtime, enhance portability, simplify management and be able to share resources, you need to virtualize all infrastructure components (compute/storage/network).

Fabric Management: the fabric is where all groups of compute, storage and network resources are connected to form the private cloud. Fabric management sits as a separate layer above virtualization and acts as an orchestration engine that manages the lifecycle of the consumer’s workloads, adding VMs to a workload or removing them according to need.

Elastic Infrastructure: this enables the perception of infinite capacity by allowing resources to be allocated and released based on demand. Scaling down (releasing resources when they are no longer needed) is a commonly forgotten practice. It is important to use a consumption-based pricing model to give consumers an incentive to take responsibility for scaling down when they do not need the resources.
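As a rough illustration of how a fabric manager could decide when to add or release VMs, here is a minimal sketch of a target-utilization rule. The function name, the 60% target and the min/max limits are assumptions I picked for the example; a real orchestration engine would use its own policies and metrics.

```python
def desired_vm_count(current_vms: int, avg_utilization_pct: int,
                     target_pct: int = 60, min_vms: int = 1, max_vms: int = 10) -> int:
    """Return how many VMs the workload should run.

    The rule keeps average utilization near target_pct: total load is
    current_vms * avg_utilization_pct, and we need enough VMs so that each
    one carries at most target_pct. Integer math avoids rounding surprises.
    """
    if current_vms < 1:
        raise ValueError("current_vms must be at least 1")
    total_load = current_vms * avg_utilization_pct
    needed = -(-total_load // target_pct)          # ceiling division
    return max(min_vms, min(max_vms, needed))


# Scale out under load, scale back in when demand drops (hypothetical readings).
busy = desired_vm_count(current_vms=4, avg_utilization_pct=90)
quiet = desired_vm_count(current_vms=4, avg_utilization_pct=15)
```

Note that the same rule handles both directions. Scaling down is not a separate feature; it is simply what the rule returns when utilization is low, which is why it should not be forgotten.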

Service Cost Transparency: this follows directly from taking a service provider’s approach to delivering infrastructure. It gives a more accurate picture of the true cost of using shared resources.

Pay per Consumption: based on the classification of services and on service cost transparency, the business pays per usage (similar to an electricity bill). The main aim is to encourage responsible use of resources: the business pays for what it needs rather than paying a large capital sum to own resources it does not actually need.
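To show what a utility-style bill might look like, here is a small sketch of a chargeback calculation. The meter names and unit rates are made up for the example and are not taken from the original document or from any real price list.

```python
from decimal import Decimal

# Hypothetical unit rates per meter (currency units per unit consumed).
RATES = {
    "vcpu_hours": Decimal("0.04"),
    "ram_gb_hours": Decimal("0.01"),
    "storage_gb_months": Decimal("0.10"),
}


def monthly_charge(usage: dict):
    """Return (per-meter charges, total) for one consumer's metered usage."""
    lines = {}
    for meter, quantity in usage.items():
        if meter not in RATES:
            raise KeyError(f"no rate defined for meter {meter!r}")
        lines[meter] = RATES[meter] * Decimal(quantity)
    return lines, sum(lines.values(), Decimal("0"))


# Same storage, but the second month the consumer halved its compute usage.
_, always_on = monthly_charge(
    {"vcpu_hours": 1000, "ram_gb_hours": 4000, "storage_gb_months": 500})
_, scaled_down = monthly_charge(
    {"vcpu_hours": 500, "ram_gb_hours": 2000, "storage_gb_months": 500})
```

Because every line of the bill traces back to a meter and a published rate, the consumer can see exactly which behavior drives the cost, and releasing unused resources shows up directly in the total.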

In the next blog I will discuss the patterns of implementing Private Cloud.

Credit and thanks go to the Microsoft Team (Kevin Sangwell, Laudon Williams and Monte Whitbeck), the authors of the original document, for allowing me to summarize and publish to the community.