Allocating resources for an application running on Kubernetes can be very challenging. How many pods do you need? How much CPU and memory do you need to allocate to each of those pods to maintain uptime and optimal performance? How do you account for spikes in traffic during heavy usage when you provision your infrastructure?

Fortunately, Kubernetes includes Horizontal Pod Autoscaling (HPA), which automatically adds pods as load increases and removes them when the load falls again. Scaling decisions are based on key metrics such as CPU and memory consumption, as well as external metrics. This helps maintain uptime for your application without using unnecessary resources.

You can use metrics from New Relic to determine how and when your deployment autoscales. This post demonstrates how to autoscale a Kubernetes deployment with New Relic metrics by walking through an example using metrics reported by Pixie. This solution has the following advantages:

  • No need to manually add or remove pods based on your real-time demands.
  • Improved availability by scaling resources when load increases or performance degrades.
  • More cost-effective solution than allocating a fixed number of pods.

The New Relic metrics adapter lets you autoscale your Kubernetes deployments based on metrics gathered from your applications and infrastructure services, such as the number of requests received per second by a web server.

The metrics adapter gets the metric value from the New Relic NerdGraph API based on a NRQL query and submits this value to the Kubernetes external metrics API. The Horizontal Pod Autoscaler can then scale the deployments based on that external metric. The next graphic shows where the metrics adapter fits in.

Prerequisites

  • Kubernetes 1.16 or higher
  • The New Relic Kubernetes integration 
  • New Relic's user API key 
  • No other External Metrics Adapter installed in the cluster

1. Deploy minikube cluster 

Clone the following repo from GitHub and run the setup script:

$ git clone https://github.com/newrelic-experimental/pixie-lab-materials

$ cd pixie-lab-materials/main

$ ./setup.sh

 

The setup.sh script spins up a new minikube cluster using the hyperkit driver, which Pixie supports. It also configures your network, memory, and CPU for optimal performance with Pixie. Finally, it creates all the pods and services that make up the demo application.

In a new terminal window, open a minikube tunnel:

$ minikube tunnel -p minikube-pixie-lab

You should now have two terminal windows: 

  • One that contains your tunnel. This needs to remain open to access your demo application.
  • One for running the other commands in this tutorial.

2. Install Kubernetes Integration with Pixie and Metrics Adapter

Next, set up New Relic’s Kubernetes Integration with this guided install. Make sure to enable Pixie by checking the option "Instant service-level insights, full-body requests, and application profiles through Pixie". The other items shown as checked in the next image (Kube state metrics, Prometheus metrics, Kubernetes events, and Log data) are selected by default.

After you select Continue, New Relic generates a command. Copy and paste the command into your dev environment.

To install the New Relic Metrics Adapter, use the newrelic-k8s-metrics-adapter Helm chart, which is also included in the nri-bundle chart used to deploy all New Relic Kubernetes components.

helm upgrade --install newrelic newrelic/nri-bundle \
--namespace newrelic --create-namespace --reuse-values \
--set metrics-adapter.enabled=true \
--set newrelic-k8s-metrics-adapter.personalAPIKey=YOUR_NEW_RELIC_PERSONAL_API_KEY \
--set newrelic-k8s-metrics-adapter.config.accountID=YOUR_NEW_RELIC_ACCOUNT_ID \
--set newrelic-k8s-metrics-adapter.config.externalMetrics.manipulate_average_requests.query="FROM Metric SELECT average(http.server.duration) WHERE instrumentation.provider='pixie'"

Here is additional context on the flags in the previous command:

  • metrics-adapter.enabled: Must be set to true so the metrics adapter chart is installed.
  • newrelic-k8s-metrics-adapter.personalAPIKey: Must be set to your New Relic Personal API key.
  • newrelic-k8s-metrics-adapter.config.accountID: Must be set to the ID of the New Relic account that you will be fetching metrics from.
  • newrelic-k8s-metrics-adapter.config.externalMetrics.external_metric_name.query: Adds a new external metric with the following information:
    • external_metric_name: The metric name.
    • query: The base NRQL query for the metric.
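Because the NRQL query itself contains single quotes, shell quoting is easy to get wrong when the command is typed by hand. The following is a minimal sketch, not part of the original tutorial, that assembles the same Helm command in Python and lets `shlex.join` handle the quoting. The helper names `helm_set_args` and `build_helm_command` are hypothetical; the chart name and keys are copied from the command above.

```python
import shlex

# NRQL query from this tutorial; note the embedded single quotes.
NRQL = (
    "FROM Metric SELECT average(http.server.duration) "
    "WHERE instrumentation.provider='pixie'"
)


def helm_set_args(values):
    """Turn a {key: value} mapping into a flat list of --set arguments."""
    args = []
    for key, value in values.items():
        args.extend(["--set", f"{key}={value}"])
    return args


def build_helm_command(api_key, account_id, metric_name, nrql):
    """Assemble the helm upgrade command as an argument list (hypothetical helper)."""
    prefix = "newrelic-k8s-metrics-adapter"
    values = {
        "metrics-adapter.enabled": "true",
        f"{prefix}.personalAPIKey": api_key,
        f"{prefix}.config.accountID": account_id,
        f"{prefix}.config.externalMetrics.{metric_name}.query": nrql,
    }
    base = [
        "helm", "upgrade", "--install", "newrelic", "newrelic/nri-bundle",
        "--namespace", "newrelic", "--create-namespace", "--reuse-values",
    ]
    return base + helm_set_args(values)


command = build_helm_command(
    "YOUR_NEW_RELIC_PERSONAL_API_KEY",
    "YOUR_NEW_RELIC_ACCOUNT_ID",
    "manipulate_average_requests",
    NRQL,
)

# shlex.join quotes each argument so the shell passes it through unchanged.
print(shlex.join(command))
```

The printed line can be pasted into a terminal. One caveat: Helm's `--set` treats commas as separators, so a query that contains commas would need extra escaping or a values file instead.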


3. Test your NRQL query

Before you set the cluster to scale based on a metric from New Relic, you need to make sure the query is getting the right data. Test your NRQL query by selecting Query your Data. Then in the Query builder tab, copy and paste the following NRQL query:

FROM Metric SELECT average(http.server.duration) WHERE instrumentation.provider='pixie'
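The metrics adapter runs a query like this one through the NerdGraph API. As a rough illustration of that round trip, the sketch below builds a NerdGraph request body for a NRQL query and pulls a single number out of a response. It is an assumption-laden sketch, not the adapter's actual implementation: the function names are hypothetical, the GraphQL variable types (`Int!`, `Nrql!`) follow the public NerdGraph schema as I understand it, and the sample response key is illustrative only.

```python
# US-region NerdGraph endpoint; requests are POSTed as JSON with an "API-Key" header.
NERDGRAPH_URL = "https://api.newrelic.com/graphql"


def build_nrql_request(account_id, nrql):
    """Build the JSON body for a NerdGraph request that runs one NRQL query."""
    graphql = (
        "query($accountId: Int!, $nrql: Nrql!) {"
        " actor { account(id: $accountId) {"
        " nrql(query: $nrql) { results } } } }"
    )
    # Passing the NRQL as a variable avoids escaping its quotes by hand.
    return {"query": graphql, "variables": {"accountId": account_id, "nrql": nrql}}


def extract_metric_value(response):
    """Return the single numeric value in a NerdGraph NRQL response.

    Raises ValueError unless the query produced exactly one row with
    exactly one numeric column, which is what an external metric needs.
    """
    results = response["data"]["actor"]["account"]["nrql"]["results"]
    if len(results) != 1:
        raise ValueError(f"expected 1 result row, got {len(results)}")
    numeric = [
        value for value in results[0].values()
        if isinstance(value, (int, float)) and not isinstance(value, bool)
    ]
    if len(numeric) != 1:
        raise ValueError(f"expected 1 numeric column, got {len(numeric)}")
    return float(numeric[0])


body = build_nrql_request(
    1234567,  # placeholder account ID
    "FROM Metric SELECT average(http.server.duration) "
    "WHERE instrumentation.provider='pixie'",
)

# Hand-written sample response for illustration; not real data.
sample_response = {
    "data": {"actor": {"account": {"nrql": {
        "results": [{"average.http.server.duration": 142.5}]
    }}}}
}
print(extract_metric_value(sample_response))
```

If a query returns several rows or columns, `extract_metric_value` raises an error, which is a useful early check because the HPA needs a single value per metric.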

 

4. Configure your Horizontal Pod Autoscaler

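Create a new file called hpa.yaml in the pixie-lab-materials/main/kube directory with the following contents. Based on the HPA definition in the YAML file, the controller manager fetches the metrics from the external metrics API, which is served by the New Relic metrics adapter. The definition uses the autoscaling/v2beta2 API version; on Kubernetes 1.23 or later you can use autoscaling/v2 instead, and v2beta2 is no longer served as of Kubernetes 1.26.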

kind: HorizontalPodAutoscaler
apiVersion: autoscaling/v2beta2
metadata:
  name: manipulate-scaler
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: manipulation-service
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: External
      external:
        metric:
          name: manipulate_average_requests
        target:
          type: Value
          value: 100

This definition does the following:

  • The HPA is configured to autoscale the manipulation-service deployment.
  • The minimum number of replicas is 1 and the maximum is 10.
  • The HPA autoscales based on the metric manipulate_average_requests. This must be the same name as the external metric defined earlier in the Helm command.

Every 30 seconds, Kubernetes queries the New Relic API for the value of the metric and autoscales the manipulation-service deployment if necessary. You can also autoscale your Kubernetes deployments based on multiple metrics. In that case, the autoscaler calculates a replica count for each metric and selects the one that creates the most replicas. The frequency of fetching metrics can also be changed.
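To make the scaling decision concrete, here is a minimal sketch of the replica calculation described in the Kubernetes HPA documentation: desired replicas = ceil(current replicas × current value / target value), clamped to the configured minimum and maximum. The function names are hypothetical, and the sketch deliberately ignores details of the real controller such as stabilization windows and pod readiness. The 0.1 tolerance is the documented Kubernetes default.

```python
import math


def desired_replicas(current_replicas, metric_value, target_value,
                     min_replicas, max_replicas, tolerance=0.1):
    """Simplified HPA calculation for one metric with a Value target."""
    ratio = metric_value / target_value
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current replica count.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # Clamp to the minReplicas/maxReplicas bounds from the HPA spec.
    return max(min_replicas, min(max_replicas, desired))


def desired_replicas_for_metrics(current_replicas, metrics,
                                 min_replicas, max_replicas):
    """With several metrics, use the largest proposed replica count.

    `metrics` is a list of (metric_value, target_value) pairs.
    """
    return max(
        desired_replicas(current_replicas, value, target,
                         min_replicas, max_replicas)
        for value, target in metrics
    )


# Values below mirror hpa.yaml: target 100, minReplicas 1, maxReplicas 10.
print(desired_replicas(1, 250, 100, 1, 10))    # ceil(1 * 2.5) = 3
print(desired_replicas(3, 105, 100, 1, 10))    # within tolerance, stays at 3
print(desired_replicas(2, 1000, 100, 1, 10))   # ceil(20) clamped to 10
print(desired_replicas_for_metrics(2, [(250, 100), (120, 100)], 1, 10))  # max(5, 3) = 5
```

With the target of 100 from hpa.yaml, an average request duration of 250 on a single replica would therefore propose three replicas.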

Make sure to apply the new YAML file by running:

$ cd pixie-lab-materials/main/kube

$ kubectl apply -f hpa.yaml

5. Add load to trigger autoscaling

Find the URL of the deployed site by running:

$ kubectl get services

Then open the EXTERNAL-IP that you see for frontend-service in your browser.

Next, install hey (and Go v1.17) with the following command in your terminal:

$ brew install hey

After installing hey, confirm it’s installed correctly with the which hey command in your terminal.

Now you can send GET requests to the EXTERNAL-IP of the frontend with the following command:

$ hey -n 10 -c 2 -m GET http://<EXTERNAL-IP>

This opens two concurrent connections and sends a total of 10 requests. Each request is a GET request to frontend-service. Increase the values of -n and -c to generate more load.
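For readers who want to see what the -n and -c options amount to, the following sketch sends a fixed number of GET requests through a fixed-size pool of workers. It is only an illustration of the idea, not a replacement for hey and not how hey is implemented. The function names are hypothetical, and the `fetch` parameter exists so the logic can be tried without a live cluster.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch_status(url):
    """Send one GET request and return the HTTP status code.

    urlopen raises an exception for 4xx/5xx responses and network errors.
    """
    with urlopen(url, timeout=10) as response:
        return response.status


def run_load(url, total_requests, concurrency, fetch=fetch_status):
    """Send `total_requests` GET requests using `concurrency` worker threads."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(fetch, [url] * total_requests))


# Dry run with a stand-in fetch function, so no network access is needed.
statuses = run_load("http://EXTERNAL-IP", 10, 2, fetch=lambda url: 200)
print(len(statuses), "requests,", statuses.count(200), "returned 200")
```

To send real traffic, call `run_load` with the frontend URL and leave `fetch` at its default.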

You can see the HPA autoscaling by running:

$ watch kubectl get hpa

As you can see, the HPA increases the number of replicas as the average HTTP request time reported by New Relic rises. You can adjust the configuration for your own services so that New Relic One and HPA automatically help you autoscale as needed.