Our infrastructure must be highly scalable because traffic fluctuates throughout the day and year, especially during peak periods like Christmas, when it can increase by over 50%. With Google Cloud’s native autoscaler, we used to run overprovisioned nodes with over 40% headroom: resources our team never used but still had to pay for. This prompted the search for an autoscaling solution that dynamically adjusts the resources provisioned for clusters and workloads. That is how we found Cast AI.
The results at a glance
We use Cast AI’s autoscaling capabilities at the node and workload levels to scale cost-efficiently. Its automated scaling and bin-packing mechanisms provision just enough resources to ensure excellent performance without incurring a high cloud bill.
With automation at both the node and workload levels, we have been able to achieve:
🌟 30% cloud cost savings across the Dev and Pre-prod clusters.
🌟 Cost-efficient node autoscaling with minimal headroom.
🌟 Full control over automation mechanisms.
Bin-packing Spot VMs in the Dev and Pre-prod clusters for 30% savings
Since implementing Cast AI, Heureka has seen a 30% drop in compute costs in the Dev cluster. This was achieved by bin-packing Spot-friendly workloads (stateless workloads that tolerate interruptions) and quickly removing empty Spot nodes to drive down the number of provisioned CPUs.
The Cast AI autoscaler unlocked more savings by moving Spot-friendly workloads already deployed on Spot VMs to cheaper instance families while maintaining service uptime and application performance.
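The case study doesn’t include configuration details, but as a rough illustration, here is a minimal sketch (using Pulumi’s TypeScript SDK) of what a Spot-friendly, bin-packable workload can look like on GKE. The deployment name, container image, resource figures, and the GKE Spot label/taint keys are assumptions made for this sketch; nodes provisioned by the Cast AI autoscaler may use different labels and annotations.

```typescript
import * as k8s from "@pulumi/kubernetes";

// Illustrative stateless service that tolerates interruptions, making it a
// candidate for bin packing onto Spot VMs. The name, image, resource figures,
// and the GKE Spot label/taint keys below are assumptions for this sketch.
const app = new k8s.apps.v1.Deployment("spot-friendly-api", {
    spec: {
        replicas: 3,
        selector: { matchLabels: { app: "spot-friendly-api" } },
        template: {
            metadata: { labels: { app: "spot-friendly-api" } },
            spec: {
                // Ask the scheduler for Spot capacity...
                nodeSelector: { "cloud.google.com/gke-spot": "true" },
                // ...and tolerate the Spot taint so pods can land there.
                tolerations: [{
                    key: "cloud.google.com/gke-spot",
                    operator: "Equal",
                    value: "true",
                    effect: "NoSchedule",
                }],
                containers: [{
                    name: "api",
                    image: "gcr.io/example/api:latest", // placeholder image
                    resources: {
                        // Realistic requests are what make bin packing work:
                        // pods get packed onto as few nodes as possible, and
                        // nodes that end up empty can be removed quickly.
                        requests: { cpu: "250m", memory: "256Mi" },
                        limits: { cpu: "500m", memory: "512Mi" },
                    },
                }],
            },
        },
    },
});

export const deploymentName = app.metadata.name;
```

Keeping such workloads stateless and interruption-tolerant, with realistic resource requests, is what allows an autoscaler to consolidate them onto fewer, cheaper nodes without putting uptime at risk.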