A VERSION CONTROLLED, SELF-ADJUSTING, AND QUALITY OF SERVICE-AWARE LOAD BALANCED CLUSTER

Awarded 2017

VEERABHADRA RAO CHANDAKANNA

Many widely used applications, such as Gmail, Twitter, and Facebook, are cloud-enabled and can scale dynamically. Typically, such applications are hosted on a load balanced cluster, with the application deployed on every member (server) of the cluster. The end user sends a request to a virtual address served by a load balancer, which acts as the entry point to the cluster services and assigns the request to one of the cluster members. Because the load balancer can assign a request to any member of the cluster, all the members must be consistent to produce correct results. A load balancing algorithm attempts to (i) optimize throughput, response time, or resource utilization, or (ii) prevent overloading any specific server. The services hosted on the cluster members go through many revisions during their life cycle. Current solutions mostly focus on automating dynamic scaling and largely ignore cluster version management. The key requirements for managing a load balanced cluster are auto-provisioning, consistency among the cluster members, a load balancer that can quickly adjust to a changing cluster environment, cluster capacity management, and Quality of Service (QoS) monitoring.

The proposed Self-Adjusting Clustering Framework (SACF) automates cluster version management and enforces consistency among the cluster members. The SACF automates the application management tasks (deploying, upgrading, and retiring) executed in a cluster environment. It allows users to plug in custom logic to be triggered at various points in managing an application. The framework requires every active cluster member to stay synchronized with the cluster at all times. It automatically detects inconsistency in a newly started cluster member and makes the corrections needed to bring that member in line with the current cluster state.
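The consistency check for a newly started member can be pictured with a minimal sketch. This is not the SACF implementation: the function names, the representation of cluster state as an application-to-version mapping, and the hook mechanism are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical action record: (task, application name, target version or None).
Action = Tuple[str, str, Optional[str]]


def plan_corrections(desired: Dict[str, str], installed: Dict[str, str]) -> List[Action]:
    """Compare a member's installed versions with the cluster state and
    list the deploy/upgrade/retire tasks needed to make them match."""
    actions: List[Action] = []
    for app, version in sorted(desired.items()):
        if app not in installed:
            actions.append(("deploy", app, version))
        elif installed[app] != version:
            actions.append(("upgrade", app, version))
    for app in sorted(installed):
        if app not in desired:
            actions.append(("retire", app, None))
    return actions


def apply_corrections(
    installed: Dict[str, str],
    actions: List[Action],
    hooks: Optional[Dict[str, Callable[[str, Optional[str]], None]]] = None,
) -> Dict[str, str]:
    """Apply the planned tasks to a copy of the member's state, calling any
    user-supplied hook registered for a task (assumed plug-in point)."""
    hooks = hooks or {}
    result = dict(installed)
    for task, app, version in actions:
        hook = hooks.get(task)
        if hook is not None:
            hook(app, version)
        if task == "retire":
            result.pop(app, None)
        else:
            result[app] = version
    return result
```

In this sketch, a member that starts with an outdated or extra application receives a plan of tasks, and applying the plan leaves it with exactly the versions recorded in the cluster state.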

A Sliding window based Self-learning and Adaptive Load balancer (SSAL) that can adapt to both stable and unstable cluster environments is proposed. The SSAL logically divides time into fixed-size intervals, assigns requests in batches, and makes corrections based on the servers’ observed performance in each interval. It discovers the initial capabilities of the servers and performs the incremental corrections needed in subsequent intervals. It achieves higher throughput than current models in both stable and dynamic cluster environments.
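The interval-based idea can be illustrated with a simplified sketch. It is not the SSAL algorithm itself: the class name, the window length, the use of average completed requests as a weight, and the largest-remainder rounding are all assumptions chosen to keep the example short.

```python
from collections import deque
from typing import Deque, Dict, Iterable


def split_batch(batch_size: int, weights: Dict[str, float]) -> Dict[str, int]:
    """Split a batch across servers in proportion to their weights,
    handing leftover requests to the largest fractional remainders."""
    total = sum(weights.values())
    if total <= 0:
        # No usable observations: fall back to an even split.
        weights = {s: 1.0 for s in weights}
        total = float(len(weights))
    raw = {s: batch_size * w / total for s, w in weights.items()}
    shares = {s: int(r) for s, r in raw.items()}
    leftover = batch_size - sum(shares.values())
    by_remainder = sorted(raw, key=lambda s: raw[s] - shares[s], reverse=True)
    for s in by_remainder[:leftover]:
        shares[s] += 1
    return shares


class SlidingWindowBalancer:
    """Hypothetical balancer that weights servers by recent observations."""

    def __init__(self, servers: Iterable[str], window: int = 3) -> None:
        # Keep only the last `window` intervals of observations per server.
        self.history: Dict[str, Deque[int]] = {s: deque(maxlen=window) for s in servers}

    def weights(self) -> Dict[str, float]:
        known = {s: sum(h) / len(h) for s, h in self.history.items() if h}
        # Unobserved servers get the mean known weight (an assumption).
        default = sum(known.values()) / len(known) if known else 1.0
        return {s: known.get(s, default) for s in self.history}

    def assign(self, batch_size: int) -> Dict[str, int]:
        """Decide how many requests of the next batch each server receives."""
        return split_batch(batch_size, self.weights())

    def observe(self, completed: Dict[str, int]) -> None:
        """Record how many requests each server completed in the last interval."""
        for server, count in completed.items():
            self.history[server].append(count)
```

With no history, the sketch splits a batch evenly, which plays the role of initial capability discovery; after each interval, `observe` shifts later batches toward the servers that completed more work. A real balancer would also need a way to test whether a server can handle more than it was given, which this sketch omits.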

A Quality of Service aware and Self-correcting observation based Load Balancer (QSLB) extends the SSAL to prevent the single point of failure problem, manage cluster capacity, and support QoS monitoring. The QSLB optimizes the throughput and allows (i) redundant QSLBs to collaborate and estimate the cluster members’ capabilities, (ii) newly started QSLBs to learn the cluster members’ capabilities quickly, (iii) the cluster capacity to be shared among different users in a pre-agreed ratio, and (iv) users to specify QoS benchmarks, monitor the QoS parameters, and receive recommended changes to meet the set QoS goals. Two cluster capacity estimation models (ARSM and NRPM) to improve the QoS are proposed. Multiple centralized and distributed algorithms to implement the QSLB are proposed and evaluated.

This thesis presents a comprehensive approach for managing load balanced clusters. The SACF provides cluster version management and enforces consistency among the cluster members. A new observation based load balancer (SSAL) that can produce optimal throughput in both stable and unstable cluster environments has been proposed. The SSAL is extended (as the QSLB) to prevent a single point of failure, manage the cluster capacity and QoS, and estimate the cluster capacity needed to improve the QoS.