Providing SLA Guarantees in Multi-tenant Serverless Computing Platforms
Serverless computing lets developers build more agile applications so they can innovate and respond to change faster, and all major cloud providers have invested heavily in it in the form of Function-as-a-Service (FaaS). In contrast to traditional cloud service architectures, the application logic written by developers runs in stateless compute containers that are event-triggered, ephemeral, and fully managed by the cloud provider. However, the unique characteristics of FaaS workloads make performance hard to predict. No cloud provider currently allows application owners to specify performance service-level objectives (SLOs), which hinders the adoption of serverless computing for latency-critical applications (e.g., Web services and machine learning model serving). This project aims to close that gap by providing performance SLO guarantees in a multi-tenant serverless computing platform. The results demonstrate that our proposed resource management framework achieves almost 2x better performance than the comparison baseline (i.e., OpenWhisk’s default resource manager), and thus comes closer to meeting the user-defined SLOs. We also implement a real-time web-based profiling dashboard that visualizes the performance of the serverless computing platform under different configuration options.