Private Cloud Architecture – Part 4: Patterns

 

In the first part of this series (here and here), I started with a discussion of the basic definitions that we build upon to achieve the Private Cloud promises. In the second part (here and here), I discussed the Core Principles of Private Cloud. In the third part (here and here), I discussed the Core Concepts of Private Cloud.

In this final post, I will discuss the main patterns for Private Cloud, which provide solutions to common real-life problems and enable the concepts and principles discussed before.

Resource Pooling: by combining storage into a storage resource pool, and compute and network into a compute resource pool, you can divide the resources into partitions for management purposes. You can think of these partitions from three points of view:

1- Service management (separate resources by security or performance or customer).

2- Systems management (State of VMs: deployed, provisioned and failed).

3- Capacity management (total capacity of your private cloud).

Fault Domain (Physical): knowing how a physical fault will impact a resource pool tells you how it affects the resiliency of the VMs. A private cloud is resilient to small faults such as the failure of a VM or of Direct Attached Storage (DAS). But imagine that the private cloud consists of 20 racks, each containing 10 servers, with one UPS per rack. If a UPS fails, all 10 servers in that rack fail together, so the unit of the physical fault domain is 10 servers.
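As a rough illustration of the rack example above, the sketch below groups servers by the UPS they depend on and reports the size of the largest fault domain. The server and UPS names and the `fault_domains` helper are hypothetical, made up for this example only.

```python
from collections import defaultdict

def fault_domains(server_to_ups):
    """Group servers by the UPS they depend on (one group = one fault domain)."""
    domains = defaultdict(list)
    for server, ups in server_to_ups.items():
        domains[ups].append(server)
    return dict(domains)

# Hypothetical topology: 20 racks x 10 servers, one UPS per rack.
topology = {
    f"rack{r:02d}-srv{s:02d}": f"ups{r:02d}"
    for r in range(1, 21)
    for s in range(1, 11)
}

domains = fault_domains(topology)
largest = max(len(servers) for servers in domains.values())
print(len(domains), largest)  # 20 fault domains, 10 servers in the largest
```

The same grouping works for any shared dependency (a switch, a power feed), as long as you can map each server to it.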

Upgrade Domain: although VMs create an abstraction layer, you will still need to update or patch the underlying physical servers. An upgrade domain defines a group of servers from which you can migrate the workload away, upgrade the underlying physical servers, and then migrate the workload back, without disrupting the existing services.
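The drain, upgrade and restore cycle described above can be sketched as a simple simulation. This is only an illustration: the `rolling_upgrade` function, the domain names and the round-robin placement are assumptions of this example, and a real fabric would also check that the target domains have enough spare capacity.

```python
def rolling_upgrade(domains):
    """Simulate upgrading one upgrade domain at a time.

    domains: dict mapping upgrade-domain name -> list of VM names.
    Returns a log of the steps; VM placement is restored at the end.
    """
    log = []
    for name in list(domains):
        others = [d for d in domains if d != name]
        if not others:
            raise ValueError("need at least two upgrade domains")
        moved = domains[name]
        domains[name] = []
        placed = []
        # Drain: spread the VMs round-robin over the other domains.
        for i, vm in enumerate(moved):
            target = others[i % len(others)]
            domains[target].append(vm)
            placed.append((vm, target))
        log.append(f"drained {name}")
        # Upgrade: the physical servers are empty and can be patched.
        log.append(f"upgraded {name}")
        # Restore: migrate the workload back.
        for vm, target in placed:
            domains[target].remove(vm)
            domains[name].append(vm)
        log.append(f"restored {name}")
    return log

layout = {"ud1": ["vm-a", "vm-b"], "ud2": ["vm-c"]}
for step in rolling_upgrade(layout):
    print(step)
```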

Reserve Capacity: a homogeneous, resource-pool-based approach has the advantage that a VM can be moved from a faulty server to another server with the same capacity without a hit on performance. This means you will need to reserve some capacity to cater for the resource decay, fault domain and upgrade domain patterns. There is no single right answer for how many servers to hold in reserve. Available capacity is equal to the total capacity minus the reserve capacity.
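The relationship between total, reserve and available capacity is simple arithmetic, sketched below. The sizing rule in `reserve_for` (the larger of the fault domain and upgrade domain, plus an allowance for decayed servers) is only one possible policy, assumed here for illustration; as noted above, there is no single right answer.

```python
def reserve_for(fault_domain, upgrade_domain, decay_allowance):
    """One possible reserve policy (an assumption, not a rule):
    cover the larger of the two domains plus some decayed servers."""
    return max(fault_domain, upgrade_domain) + decay_allowance

def available_capacity(total_servers, reserve_servers):
    """Available capacity = total capacity - reserve capacity."""
    if not 0 <= reserve_servers <= total_servers:
        raise ValueError("reserve must be between 0 and the total capacity")
    return total_servers - reserve_servers

# Hypothetical numbers: 200 servers, 10-server fault and upgrade domains,
# and 4 servers set aside for resource decay.
reserve = reserve_for(fault_domain=10, upgrade_domain=10, decay_allowance=4)
print(reserve, available_capacity(200, reserve))  # 14 186
```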

Scale Unit: think of the scale unit pattern from two perspectives: the compute scale unit, which combines servers and network, and the storage scale unit. A scale unit is the standard increment that is added to the current capacity of the private cloud.

Capacity Plan: capacity planning builds on the patterns above (reserve capacity, scale unit, etc.). You will need to cater for the usual factors: peak capacity, normal growth and accelerated growth. You will also need to define the triggers for increasing capacity, based on factors such as the scale unit, the lead time to procure hardware, and the installation time.
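One way to express such a trigger is to project demand forward by the hardware lead time and order enough scale units to cover any shortfall. The sketch below assumes linear growth and uses hypothetical names and numbers; real plans would also factor in peaks and accelerated growth.

```python
import math

def scale_units_to_order(used, available, growth_per_week,
                         lead_time_weeks, scale_unit_size):
    """Return how many scale units to order now.

    Assumes linear growth: demand at the end of the lead time is
    used + growth_per_week * lead_time_weeks (all values in servers).
    """
    projected = used + growth_per_week * lead_time_weeks
    shortfall = projected - available
    if shortfall <= 0:
        return 0  # Trigger not reached: current capacity covers the lead time.
    return math.ceil(shortfall / scale_unit_size)

# Hypothetical: 150 of 186 available servers in use, growing by 5 per week,
# 8 weeks of lead and installation time, scale unit of 10 servers.
print(scale_units_to_order(150, 186, 5, 8, 10))  # 1
```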

Health Model: you will need to build a matrix that automatically detects when a hardware component has failed and which VMs have failed as a result. This also covers environmental factors such as power supply and cooling. When a fault occurs, moving workloads around the fabric is the responsibility of the health model.
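A minimal sketch of such a dependency matrix, assuming two hypothetical lookup tables (component to servers, and server to VMs), shows how a single component failure can be traced to the affected VMs:

```python
def impacted_vms(failed_component, component_to_servers, server_to_vms):
    """Return the VMs hosted on servers that depend on the failed component."""
    vms = []
    for server in component_to_servers.get(failed_component, []):
        vms.extend(server_to_vms.get(server, []))
    return sorted(vms)

# Hypothetical dependency data, for illustration only.
component_to_servers = {
    "ups01": ["srv01", "srv02"],
    "cooling-a": ["srv02", "srv03"],
}
server_to_vms = {
    "srv01": ["vm-web1"],
    "srv02": ["vm-db1", "vm-web2"],
    "srv03": ["vm-app1"],
}

print(impacted_vms("ups01", component_to_servers, server_to_vms))
# ['vm-db1', 'vm-web1', 'vm-web2']
```

The resulting list is what the fabric would need to move to healthy servers.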

Service Class: this describes how applications interact with the private cloud infrastructure. Applications can be designed to be stateless (the least costly option), where the application provides its own redundancy and resiliency and does not rely on the private cloud fabric for them, or stateful, where the application benefits from the fabric's redundancy and resiliency through live migration (moving a running VM from one physical server to another).

Cost Model: compute in a private cloud is provided as a utility under a consumption-based charge model (like electricity). In that case, the usage cost should account for deployment, operations and maintenance.
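As a hedged illustration of a consumption-based charge, the sketch below derives a rate per VM-hour from the three cost components and bills a consumer for the hours used. The function names, the VM-hour unit and all figures are assumptions made for this example.

```python
def rate_per_vm_hour(deployment, operations, maintenance, expected_vm_hours):
    """Spread the total cost over the expected consumption."""
    if expected_vm_hours <= 0:
        raise ValueError("expected_vm_hours must be positive")
    return (deployment + operations + maintenance) / expected_vm_hours

def usage_charge(vm_hours_used, rate):
    """Charge is proportional to consumption, like an electricity bill."""
    return round(vm_hours_used * rate, 2)

# Hypothetical yearly figures, in any currency.
rate = rate_per_vm_hour(deployment=50_000, operations=30_000,
                        maintenance=20_000, expected_vm_hours=1_000_000)
print(rate, usage_charge(720, rate))
```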

In conclusion, cloud computing promises a great deal for managing and utilizing infrastructure (compute, storage and network). It is not yet mature, but there is a great effort in the IT field today to shape the future based on that model.

Credit and thanks go to the Microsoft Team (Kevin Sangwell, Laudon Williams and Monte Whitbeck), the authors of the original document, for allowing me to summarize and publish to the community.