This blog post covers one of the easiest ways to reduce the overall cost of an Amazon EKS cluster: an auto-scaling configuration.
Here are the steps to follow for a successful setup of Cluster Autoscaler in an AWS EKS environment.
Cluster Autoscaler is a tool that automatically adjusts the size of a Kubernetes cluster when one of the following conditions is true:
- There are pods that failed to run in the cluster due to insufficient resources.
- There are nodes in the cluster that have been underutilized for an extended period of time and their pods can be placed on other existing nodes.
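As a rough illustration of the two conditions above, the following Python sketch models them using CPU requests only. It is not how Cluster Autoscaler is implemented: the Node class, the function names, the 50% utilization threshold, and the 10-minute window are all assumptions chosen for this example, and real scheduling also weighs memory, affinity, and other constraints.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    cpu_capacity: float         # total CPU cores on the node (assumed > 0)
    cpu_requested: float        # CPU cores requested by pods running on it
    minutes_underutilized: int  # how long the node has been below the threshold


def should_scale_up(pending_pod_cpu, nodes):
    """True if at least one pending pod fits on no existing node."""
    for pod_cpu in pending_pod_cpu:
        if not any(n.cpu_capacity - n.cpu_requested >= pod_cpu for n in nodes):
            return True
    return False


def removable_nodes(nodes, utilization_threshold=0.5, min_minutes=10):
    """Names of nodes underutilized long enough whose load fits elsewhere.

    Simplification: spare capacity on the other nodes is summed, which
    ignores per-pod bin packing.
    """
    result = []
    for node in nodes:
        utilization = node.cpu_requested / node.cpu_capacity
        if utilization >= utilization_threshold:
            continue
        if node.minutes_underutilized < min_minutes:
            continue
        spare = sum(o.cpu_capacity - o.cpu_requested for o in nodes if o is not node)
        if spare >= node.cpu_requested:
            result.append(node.name)
    return result


# Hypothetical two-node cluster used only for illustration.
nodes = [Node("a", 4.0, 2.0, 0), Node("b", 4.0, 1.0, 15)]
print(should_scale_up([3.5], nodes))  # True: no node has 3.5 cores free
print(removable_nodes(nodes))         # ['b']
```

In this toy model, node "b" is a scale-down candidate because it sits at 25% utilization for 15 minutes and node "a" has enough free CPU to absorb its pods.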
The Cluster Autoscaler on AWS scales worker nodes within any specified Auto Scaling group and runs as a Deployment in your cluster.
Note: This post assumes that you have an active Amazon EKS cluster with associated worker nodes created by an AWS CloudFormation template. The example uses the auto-discovery setup. You can also configure Cluster Autoscaler by specifying one or more Auto Scaling groups explicitly.
The Setup Process
Set up Auto-Discovery
- Open the AWS CloudFormation console, select your stack, and then choose the Resources tab.
- To find the Auto Scaling group resource created by your stack, find the NodeGroup in the Logical ID column. For more information, see Launching Amazon EKS Worker Nodes.
- Open the Amazon EC2 console, choose Auto Scaling Groups from the navigation pane, and then select the Auto Scaling group that you found in the previous step.
- Choose the Tags tab, and then choose Add/Edit tags.
- In the Add/Edit Auto Scaling Group Tags window, enter the following tags, replacing $aws-k8s-ClusterName with the name of your EKS cluster. Then, choose Save.
Key: k8s.io/cluster-autoscaler/enabled Value: true
Key: k8s.io/cluster-autoscaler/$aws-k8s-ClusterName Value: owned
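To avoid typos when entering the two tags above, a small helper can generate them from the cluster name. This is a minimal sketch: the function name autodiscovery_tags and the cluster name "my-cluster" are hypothetical, and the helper only builds the key/value pairs; it does not call AWS.

```python
def autodiscovery_tags(cluster_name):
    """Return the two auto-discovery tags for the given EKS cluster name."""
    # Guard against passing the placeholder from the post instead of a real name.
    if not cluster_name or "$" in cluster_name:
        raise ValueError("pass the real cluster name, not a placeholder")
    return {
        "k8s.io/cluster-autoscaler/enabled": "true",
        f"k8s.io/cluster-autoscaler/{cluster_name}": "owned",
    }


# "my-cluster" is a made-up name for illustration.
for key, value in autodiscovery_tags("my-cluster").items():
    print(f"Key: {key} Value: {value}")
```

The printed lines can be copied into the Add/Edit Auto Scaling Group Tags window as they are.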
Create IAM Policy
1. Create an IAM policy called ClusterAutoScaler based on the following example to give the worker node running the Cluster Autoscaler access to required resources and actions.
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "autoscaling:DescribeAutoScalingGroups",
                "autoscaling:DescribeAutoScalingInstances",
                "autoscaling:DescribeLaunchConfigurations",
                "autoscaling:DescribeTags",
                "autoscaling:SetDesiredCapacity",
                "autoscaling:TerminateInstanceInAutoScalingGroup"
            ],
            "Resource": "*"
        }
    ]
}
Note: By adding this policy to the worker node’s role, you enable all pods or applications running on the respective EC2 instances to use the additional IAM permissions.
2. Attach the new policy to the instance role that’s attached to your Amazon EKS worker nodes.
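If you edit the policy by hand, a quick local check can confirm that none of the six actions listed above was dropped. The sketch below is an assumption-laden helper, not an AWS tool: missing_actions is a hypothetical name, it compares exact action names only, and it does not evaluate wildcards such as autoscaling:*, Deny statements, or the Resource element.

```python
import json

# The six actions from the example policy in this post.
REQUIRED_ACTIONS = {
    "autoscaling:DescribeAutoScalingGroups",
    "autoscaling:DescribeAutoScalingInstances",
    "autoscaling:DescribeLaunchConfigurations",
    "autoscaling:DescribeTags",
    "autoscaling:SetDesiredCapacity",
    "autoscaling:TerminateInstanceInAutoScalingGroup",
}


def missing_actions(policy_json):
    """Return the required actions that no Allow statement lists by name."""
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # IAM allows a single statement object
        statements = [statements]
    granted = set()
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):  # IAM allows a single action string
            actions = [actions]
        granted.update(actions)
    return REQUIRED_ACTIONS - granted


# Example: a policy that forgot the two write actions.
read_only = json.dumps({
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": [a for a in sorted(REQUIRED_ACTIONS) if ":Describe" in a],
        "Resource": "*",
    }],
})
print(sorted(missing_actions(read_only)))
```

An empty result means every action from the example policy is present by name.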
Deploy the AutoScaler
1. To download a deployment example file provided by the Cluster Autoscaler project on GitHub, run the following command:
wget https://raw.githubusercontent.com/kubernetes/autoscaler/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-autodiscover.yaml
2. Open the downloaded YAML file, and set the Cluster Autoscaler image tag to match the Kubernetes version of your cluster. In this example, we’re on Kubernetes 1.14.0.
spec:
  serviceAccountName: cluster-autoscaler
  containers:
    - image: k8s.gcr.io/cluster-autoscaler:v1.14.0
3. In the same file, set the EKS cluster name ($aws-k8s-ClusterName) and environment variable ($region) based on the following example. Then, save your changes.
---
command:
  - ./cluster-autoscaler
  - --v=4
  - --stderrthreshold=info
  - --cloud-provider=aws
  - --skip-nodes-with-local-storage=false
  - --expander=least-waste
  - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/$aws-k8s-ClusterName
  - --balance-similar-node-groups
  - --skip-nodes-with-system-pods=true
---
4. To create a Cluster Autoscaler deployment, run the following command:
kubectl apply -f cluster-autoscaler-autodiscover.yaml
5. To check the Cluster Autoscaler deployment logs for deployment errors, run the following command:
kubectl logs -f deployment/cluster-autoscaler -n kube-system
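A common cause of deployment errors is a placeholder from step 3 that was never replaced. The sketch below shows one way to build the auto-discovery argument from a cluster name and to scan the manifest text for leftover placeholders before applying it. The function names and the cluster name "my-cluster" are hypothetical, and the placeholder strings are the ones used in this post, which may differ from those in the file you downloaded.

```python
# Placeholder strings as written in this post (an assumption about your file).
PLACEHOLDERS = ("$aws-k8s-ClusterName", "$region")


def auto_discovery_flag(cluster_name):
    """Build the --node-group-auto-discovery argument for a cluster name."""
    tag_keys = [
        "k8s.io/cluster-autoscaler/enabled",
        f"k8s.io/cluster-autoscaler/{cluster_name}",
    ]
    return "--node-group-auto-discovery=asg:tag=" + ",".join(tag_keys)


def leftover_placeholders(manifest_text):
    """Return the placeholders that still appear in the manifest text."""
    return [p for p in PLACEHOLDERS if p in manifest_text]


# A shortened, made-up manifest fragment for illustration.
edited = "command:\n  - ./cluster-autoscaler\n  - " + auto_discovery_flag("my-cluster")
print(auto_discovery_flag("my-cluster"))
print(leftover_placeholders(edited))  # [] when everything was replaced
```

In practice you would read cluster-autoscaler-autodiscover.yaml from disk and pass its contents to leftover_placeholders before running kubectl apply.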
Summary
Cluster Autoscaler is a powerful tool that helps maintain the availability and uptime of an environment. It also has built-in logic to decommission nodes and reschedule their pods, which keeps overall costs low. Because it removes nodes that are no longer needed based on overall usage, it takes much of the guesswork out of sizing an environment for uncertain demand.
To optimize costs and maintain uptime, consider Cluster Autoscaler as an excellent first step toward keeping your EKS cluster appropriately sized.
If you have questions about how you can best leverage our expertise, or need help with your Liferay DXP implementation, please engage with us via comments on this blog post, or reach out to us here.
Additional Reading
You can continue to explore Kubernetes by checking out our Kubernetes, OpenShift, and the Cloud Native Enterprise blog post, or Demystifying Docker and Kubernetes. You can also read our AWS-related posts, such as DNS Forwarding in Route 53, or reach out to us to plan your Kubernetes implementation.