Enterprise services running in the public cloud use containerization to exploit portability and agility at scale. To adapt resources automatically in response to changes in load, an autoscaler applies predefined heuristics that add or remove resources allocated to an application based on resource usage thresholds. However, it is not obvious how to configure an autoscaler effectively, specifically how to set the scaling thresholds and how to specify the quantity of resources to add or remove for an application. This is due to the complexity of understanding the cumulative impact on performance of several factors, such as the input load on the service and the heterogeneity of resources. In addition, the autoscaler heuristics must be configured for each individual service through deep performance analysis, which prevents the approach from scaling. Hence, the problem of resource under-utilization persists in the presence of autoscaling, leading to extremely high operating costs, energy expenditure, and resource inefficiency. In this paper, we formulate and apply model-based LQR to perform data-driven adaptive resource allocation in an online setting for cloud-based services. Our LQR agent can dynamically adapt and scale resources in an application-agnostic manner. We demonstrate on real datasets that our approach achieves significantly higher resource utilization than a heuristic-based autoscaler.
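To make the baseline concrete, the following is a minimal sketch of the kind of threshold-based heuristic that an autoscaler applies. It is illustrative only and is not the configuration evaluated in this paper: the names `ThresholdPolicy` and `next_replicas`, the thresholds, the step size, and the replica bounds are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class ThresholdPolicy:
    # All values below are illustrative assumptions, not taken from the paper.
    scale_up_at: float = 0.80    # add replicas above this average utilization
    scale_down_at: float = 0.30  # remove replicas below this average utilization
    step: int = 1                # replicas added or removed per decision
    min_replicas: int = 1
    max_replicas: int = 10


def next_replicas(current: int, utilization: float, policy: ThresholdPolicy) -> int:
    """Return the replica count after one threshold-based scaling decision."""
    if utilization > policy.scale_up_at:
        target = current + policy.step
    elif utilization < policy.scale_down_at:
        target = current - policy.step
    else:
        target = current
    # Clamp the result to the allowed replica range.
    return max(policy.min_replicas, min(policy.max_replicas, target))


if __name__ == "__main__":
    policy = ThresholdPolicy()
    replicas = 2
    for util in (0.85, 0.90, 0.50, 0.20):
        replicas = next_replicas(replicas, util, policy)
        print(f"utilization={util:.2f} -> replicas={replicas}")
```

Even in this small sketch, the two thresholds and the step size are free parameters that would have to be tuned per service, which is the configuration burden described above.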
AI & Machine Learning  Systems & Languages