Cost-optimization methods for infrastructure
Cloud technologies give businesses a high level of automation, flexibility, and convenience in managing IT infrastructure. However, the monthly cost of maintaining that infrastructure, especially for large projects, is often one of the main items in the expenditure budget, so every business has an interest in optimizing it. To understand how to reduce IT infrastructure costs effectively without harming business processes, start with an audit.
An audit most often focuses on detecting unevenly loaded cloud instances and checking whether configurations are optimal. The cloud environment lets you track the metrics of each service independently, so you can identify virtual machines, instances, or containers that run constantly but are used only occasionally, or that were started once and forgotten. However, an audit often reveals more complex issues, and a well-chosen combination of cloud options can yield significant savings. Let’s take a look at some examples.
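As a first illustration, the idle-resource check described above might look like the following minimal sketch. The instance names, the CPU samples, and the 5% threshold are all hypothetical; in a real audit the samples would come from your provider's monitoring service.

```python
from statistics import mean

# Hypothetical CPU utilization samples (percent) per instance,
# e.g. exported from a monitoring system over the audit period.
cpu_samples = {
    "vm-batch-01": [2.0, 1.5, 3.0, 2.5],
    "vm-api-01": [45.0, 60.0, 52.0, 48.0],
    "vm-forgotten-test": [0.5, 0.4, 0.6, 0.5],
}


def find_idle(samples, threshold_pct=5.0):
    """Return the names of instances whose average CPU use is below the threshold."""
    return sorted(
        name
        for name, values in samples.items()
        if values and mean(values) < threshold_pct
    )


if __name__ == "__main__":
    for name in find_idle(cpu_samples):
        print(f"candidate for shutdown or downsizing: {name}")
```

Average CPU alone is a crude signal; a real audit would likely also look at memory, network, and the time of day at which the load occurs.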
K8S operators and Terraform utilities for forecasting the cost of infrastructure configuration changes. If the infrastructure is managed by an orchestrator such as Kubernetes, you can use its operators to set up a Grafana dashboard that shows how changing a service's configuration will affect the cost of the resources it uses. Similar functions can be implemented with utilities for Terraform. Such functionality can also be integrated directly into CI, which lets you control and optimize infrastructure configurations and reduce costs at the development stage. By implementing such a check in dev environments, for example, you get the opportunity to optimize the configuration code before it is released to production.
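The idea of a CI cost check can be sketched in a few lines. This is not the API of any particular Terraform or Kubernetes tool; the price table, the instance type names, and the `check_change` helper are assumptions made for illustration only.

```python
HOURS_PER_MONTH = 730  # common approximation of hours in a month

# Hypothetical hourly prices per instance type (not real provider prices).
HOURLY_PRICE = {"small": 0.05, "medium": 0.10, "large": 0.20}


def monthly_cost(config):
    """Estimate monthly cost for a config mapping instance type -> count."""
    return sum(HOURLY_PRICE[kind] * count * HOURS_PER_MONTH
               for kind, count in config.items())


def check_change(current, proposed, max_increase):
    """Return (ok, delta): ok is False if the change raises cost above the limit."""
    delta = monthly_cost(proposed) - monthly_cost(current)
    return delta <= max_increase, delta


if __name__ == "__main__":
    current = {"small": 4, "medium": 2}
    proposed = {"small": 2, "medium": 2, "large": 1}
    ok, delta = check_change(current, proposed, max_increase=50.0)
    print(f"monthly cost change: {delta:+.2f}, within budget: {ok}")
```

In a pipeline, a failed check would typically block the merge or request a manual review, so that the cost impact is discussed before the change reaches production.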
Optimization of dev environments. In application development, it is often unnecessary to use standard instance types for dev environments. Unlike in prod environments, Spot instances can be used in some dev branches or for certain operations (for example, batch processing). Spot instances can be 3–5 times cheaper than standard ones, but they may not always be available. Also, they cannot be converted to the Standard type or configured to restart automatically when a host event occurs. However, in many cases these features are not critical for development environments; besides, Spot instances are not interrupted every day (how often depends on the size of the instances). With some cloud providers it is sometimes possible to configure a K8S operator and specify a reboot time window that is convenient for you. In practice, it has also been noticed that mounting a sufficiently large disk in a Spot instance reduces the likelihood of unavailability and of frequent or long pauses.
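To get a feel for the numbers, here is a small sketch of the blended cost when part of a dev workload moves to Spot instances. The function name and the example figures are illustrative assumptions; the price ratio of 1/3 to 1/5 simply restates the "3–5 times cheaper" range mentioned above.

```python
def blended_monthly_cost(standard_cost, spot_fraction, spot_price_ratio):
    """Estimate monthly cost after moving part of a workload to Spot.

    standard_cost:    monthly cost if everything ran on standard instances
    spot_fraction:    share of the workload moved to Spot, between 0 and 1
    spot_price_ratio: Spot price divided by standard price
                      (1/3 means "3 times cheaper")
    """
    if not 0.0 <= spot_fraction <= 1.0:
        raise ValueError("spot_fraction must be between 0 and 1")
    if spot_price_ratio <= 0:
        raise ValueError("spot_price_ratio must be positive")
    standard_part = standard_cost * (1.0 - spot_fraction)
    spot_part = standard_cost * spot_fraction * spot_price_ratio
    return standard_part + spot_part


if __name__ == "__main__":
    # Hypothetical dev environment costing 1000 per month on standard instances.
    for ratio in (1 / 3, 1 / 5):
        cost = blended_monthly_cost(1000.0, spot_fraction=0.8, spot_price_ratio=ratio)
        print(f"price ratio {ratio:.2f}: about {cost:.0f} per month")
```

The estimate ignores the cost of interruptions, such as re-running a batch job that was cut short, which is why it fits dev and batch workloads better than production.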
Migration of VMs to containers. The audit can reveal virtual machines that are worth migrating to containers, because containers do not carry the overhead of a full VM and are less resource-intensive. You may find that several services, together with their environments, can be moved from VMs into containers and placed on the spare capacity of your container cluster without affecting their performance. Migration to containers, especially combined with a well-planned use of Spot instances, can in some cases provide an additional level of cost optimization and significantly reduce monthly infrastructure costs.
Cost optimization due to the difference between on-demand and contracted resources. You can take advantage of cloud providers’ pricing plans. If you have done serious market research and settled on a specific cloud provider, you can get significant discounts by signing a long-term contract. Taking into account the difference in terms and costs between on-demand resources and contracted ones (managed services), you can gain additional benefits in the form of tariff discounts. In addition, by hosting databases and instances as managed services under the contract, you will not have to hire third-party specialists to manage and maintain them, which is especially convenient for early-stage startups or small development teams.
Also, some cloud providers offer startups credits (or grants) for their services; in some cases, this lets you use those services free of charge for several months or even years.
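Returning to the on-demand versus contract comparison, a rough break-even calculation can help decide whether a commitment pays off. The sketch below assumes a simplified model in which a contract is billed for every hour of the term at a discounted rate, while on-demand resources are billed only for the hours actually used. Real contract terms vary by provider, and the rates and discount here are hypothetical.

```python
def on_demand_cost(hourly_rate, hours_used):
    """Cost when paying the full rate only for the hours actually used."""
    return hourly_rate * hours_used


def committed_cost(hourly_rate, discount, hours_in_term):
    """Cost when paying a discounted rate for every hour of the term."""
    return hourly_rate * (1.0 - discount) * hours_in_term


def break_even_utilization(discount):
    """Share of the term a resource must run for the contract to cost the same."""
    return 1.0 - discount


if __name__ == "__main__":
    rate, discount, term_hours = 0.10, 0.40, 8760  # hypothetical one-year term
    used_hours = 0.5 * term_hours                  # resource runs half the time
    print(f"on-demand: {on_demand_cost(rate, used_hours):.2f}")
    print(f"committed: {committed_cost(rate, discount, term_hours):.2f}")
    print(f"break-even utilization: {break_even_utilization(discount):.0%}")
```

Under this model, a resource that runs for less of the term than the break-even share is cheaper on demand, so commitments suit the steady baseline load rather than occasional peaks.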
Applying runners for flexible management of CI/CD scalability. Runners are programs that work with a CI/CD system to run the jobs in a pipeline. They can be configured to scale up and down instances that would otherwise run all the time. Cloud services usually charge only for the resources consumed.
So, configuring runners to scale up on a separate pool of instances only when required (e.g. for builds and deploys) and to scale down when the process is complete lets you reduce CI/CD runner costs to zero or the bare minimum. This solution is especially well suited to testing/dev environments, where the additional time needed to launch the infrastructure is acceptable. Likewise, suppose there are many canary releases deployed as duplicated environments, and each of them is required only from time to time. In that case, constantly running instances add extra costs that you can still reduce.
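The saving from scale-to-zero runners can be estimated with simple arithmetic. In the sketch below, the job counts, durations, startup time, and hourly rate are hypothetical, and billing is assumed to be proportional to running time.

```python
HOURS_PER_MONTH = 730  # common approximation of hours in a month


def always_on_cost(runner_count, hourly_rate):
    """Monthly cost of runner instances that never shut down."""
    return runner_count * hourly_rate * HOURS_PER_MONTH


def scale_to_zero_cost(jobs_per_month, job_minutes, startup_minutes, hourly_rate):
    """Monthly cost when an instance exists only while a job is starting or running."""
    billed_hours = jobs_per_month * (job_minutes + startup_minutes) / 60.0
    return billed_hours * hourly_rate


if __name__ == "__main__":
    fixed = always_on_cost(runner_count=2, hourly_rate=0.10)
    scaled = scale_to_zero_cost(jobs_per_month=300, job_minutes=10,
                                startup_minutes=2, hourly_rate=0.10)
    print(f"always-on runners: {fixed:.2f} per month")
    print(f"scale-to-zero runners: {scaled:.2f} per month")
```

The `startup_minutes` term is the price of this approach: every job waits for an instance to boot, which is why it suits dev and test pipelines better than latency-sensitive ones.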
Many practical combinations of these methods can help save your budget, and a DevOps team with comprehensive experience can suggest the most effective solutions for your project.