I'll talk about the challenges we encountered, and the solutions we found, while running on-demand environments at scale on Kubernetes:
- The need for speed: Unlocking lightning-fast environment provisioning
- Handling high pod churn
- No weak links: Protecting environments from single points of failure
- Guardrails for k8s controllers: With multiple environments and numerous Kubernetes objects to manage, we needed a reliable and scalable solution, since any downtime could render an entire environment inaccessible.
You'll learn how to:
- Optimize Kubernetes for faster on-demand environment provisioning
- Identify and resolve potential points of failure while running on-demand environments on Kubernetes at scale
- Run on-demand environments on spot instances with confidence