Automation has become a critical tool for organizations to manage their infrastructure effectively. Red Hat Ansible Automation Controller is a popular platform that allows you to define, operate, scale, and delegate automation across your enterprise. With its rich set of features, Ansible Automation Controller provides an enterprise-grade solution for automation and orchestration. But as infrastructure grows more complex and workflows become more intricate, it's crucial to monitor the platform to ensure your automation workflows run smoothly.

In this article, you'll learn the basics of monitoring Ansible Automation Controller, specifically:

  • Why you should monitor Ansible Automation Controller.
  • The benefits of monitoring Ansible Automation Controller.
  • What metrics to monitor.
  • How to monitor Ansible Automation Controller with New Relic.

The next image shows the Red Hat Ansible Automation Controller dashboard, which includes information about jobs, hosts, inventories, projects, and more.

Why monitor Ansible Automation Controller?

Red Hat Ansible Automation Controller (previously known as Ansible Tower) provides a centralized platform for managing Ansible playbooks, roles, and inventory, and allows you to define, orchestrate, and manage complex tasks across your enterprise. The platform also provides access controls, job scheduling, and notifications, making it easier to manage and secure Ansible automation.

But its rich set of features also comes with more complexity, and monitoring the platform becomes critical to ensure that automation tasks are running efficiently. Monitoring the platform helps you achieve:

  • Improved automation efficiency
  • Reduced infrastructure downtime
  • Better system resource allocation

Let’s take a closer look at these benefits.

Improved efficiency 

Monitoring Red Hat Ansible Automation Controller can help you identify any bottlenecks or inefficiencies in your automation tasks. With a clear view of your automation workflows, you can optimize your processes, automate repetitive tasks, and eliminate manual errors. This can lead to faster task completion times and greater productivity for your IT team.

Reduced downtime 

Downtime can be a significant issue for any organization, leading to lost revenue, decreased customer satisfaction, and damage to your brand reputation. By monitoring Red Hat Ansible Automation Controller, you get visibility into the status of automation processes, resource utilization, and service availability, helping you proactively identify and resolve issues before they cause downtime. This means that you can ensure the availability of your infrastructure and applications, keeping your business running smoothly.

Better resource allocation 

Red Hat Ansible Automation Controller is a powerful tool that can consume significant system resources. Monitoring resource usage can help you identify areas where you can optimize your resource allocation, improving performance and reducing costs. By keeping an eye on resource usage, you can ensure that you are making the most of your IT budget, while still delivering on performance requirements.

Overall, monitoring Red Hat Ansible Automation Controller is essential for ensuring the smooth operation of your automation workflows. By using a monitoring tool like the New Relic integration for Ansible Automation Controller, you can gain valuable insights into your platform's performance and optimize your processes for maximum efficiency.

Metrics to monitor in Red Hat Ansible Automation Controller

When monitoring Red Hat Ansible Automation Controller, there are several metrics that you should track:

  • Job status. Monitoring job status is essential to ensure that automation tasks are running as expected. For example, if a job is stuck in a pending state, it could indicate an issue with the job's dependencies or the infrastructure. By monitoring job status, you can quickly identify and troubleshoot issues that may be preventing your jobs from running correctly.
  • Inventory health. Keep an eye on your inventory health to ensure that your automation tasks have access to the resources they need. This includes monitoring inventory levels, network connectivity, and other factors that may impact the availability and performance of your infrastructure. For example, if you notice that a particular host is consistently offline, it may indicate a problem with the network or the host itself, which could impact the performance of your automation tasks.
  • Resource usage. Monitor resource usage to identify areas where you can optimize resource allocation. For example, if a resource is consistently underutilized, it may be possible to reallocate those resources to other tasks or to reduce overall resource consumption. By monitoring resource usage, you can ensure that your automation tasks are running efficiently and using resources effectively.
  • User activity. Keep track of user activity to ensure that the platform is being used efficiently and securely. For instance, if a user is making unauthorized changes or using the platform inefficiently, it can impact the performance of automation workflows or compromise system security. By monitoring user activity, you can quickly identify and address these issues, reducing the risk of security breaches and ensuring the efficient use of the platform.
  • Logs. Monitoring logs is critical to gain insight into the performance of your automation workflows, identify any issues that may be impacting them, and take action to resolve them. For example, if a log shows that a task failed due to a missing dependency, you can quickly identify and install the missing dependency, reducing the risk of downtime and ensuring the smooth operation of your automation workflows. Logs can also provide a detailed record of events, making it easier to troubleshoot issues and identify errors that may be impacting your automation tasks.

How to monitor Ansible Automation Controller with New Relic

The New Relic Red Hat Ansible Automation Controller integration helps you track and visualize your Ansible Automation Platform logs, keep an eye on your jobs, and monitor instance memory. Built with the New Relic infrastructure agent and Prometheus OpenMetrics integration, it provides a set of pre-built dashboards that display critical query data in one place. To start monitoring your platform you'll do two things:

  1. Install the New Relic infrastructure agent.
  2. Integrate Red Hat Ansible Automation Platform with New Relic.

Step 1: Install the infrastructure agent

Before you get your Ansible Automation Controller data into New Relic, you need to install the infrastructure agent. This agent serves as the foundation for collecting necessary data. There are two options for installing the agent:

  • Guided install: This is a CLI tool that inspects your system and installs the infrastructure agent alongside the application monitoring agent that's best suited for your system. You can learn more about the guided install process by checking out our Guided install overview.
  • Manual installation: If you prefer to install our infrastructure agent manually, you can follow a tutorial for Linux, Windows, or macOS.

Step 2: Integrate Red Hat Ansible Automation Platform with New Relic

Once you’ve successfully installed the infrastructure agent, the next step is to integrate your Red Hat Ansible Automation Platform with New Relic. This step consists of multiple subtasks. To integrate your platform you’ll:

  • Create a bearer token on the Ansible Automation Platform.
  • Configure the Prometheus OpenMetrics integration.
  • Forward Ansible controller logs to New Relic.
  • Restart your infrastructure agent.
  • Use NRQL to view your data.
  • See Ansible Automation Controller metrics in a dashboard.

Step 2.1: Create a bearer token on Ansible Automation Platform

Step 2.1.1

Log in to your Ansible Automation Platform, navigate to the Users section, and select admin as shown in the next image.

Next, add an access token to the user that the New Relic integration will use.

Copy the token and save it.

Step 2.1.2

Next, create a file named bearer_token_file in the directory of your choice.

touch /home/ansible-automation-platform/bearer_token_file

Step 2.1.3

Paste your token in the created file. This command uses nano but you can use another text editor, too.

nano /home/ansible-automation-platform/bearer_token_file
<paste your token>

Save and exit the editor.
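Because the token grants API access to your controller, it's worth making sure that only the account that owns the file can read it. The following is a minimal sketch of one way to do that, not part of the official integration steps. The function names `write_token_file` and `is_owner_only` are hypothetical, and the sketch assumes a Linux host and that the file is owned by (or otherwise readable by) the account that runs the infrastructure agent.

```python
import os
import stat


def write_token_file(path: str, token: str) -> None:
    """Write the token to `path` so that only the file owner can read or write it."""
    # Mode 0o600 is applied when the file is newly created.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        # Strip stray whitespace or newlines picked up when copying the token.
        handle.write(token.strip())
    # Tighten permissions explicitly in case the file already existed.
    os.chmod(path, 0o600)


def is_owner_only(path: str) -> bool:
    """Return True when group and other users have no access to the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & 0o077 == 0
```

For example, `write_token_file("/home/ansible-automation-platform/bearer_token_file", token)` would replace the touch and nano steps, and `is_owner_only(...)` lets you check a file you created by hand.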

Step 2.2: Configure Prometheus OpenMetrics integration

To configure the Prometheus OpenMetrics integration for your Red Hat Ansible Automation Platform, you’ll complete three tasks.

Step 2.2.1

Navigate to the path /etc/newrelic-infra/integrations.d/ and create a new file called nri-prometheus-config.yml.

Step 2.2.2

Use the provided configuration template to update the nri-prometheus-config.yml file with the necessary configurations.

Step 2.2.3

Update the following fields in the configuration file:

  • cluster_name: <YOUR_DESIRED_CLUSTER_NAME>
  • bearer_token_file: <BEARER_TOKEN_FILE_PATH>
  • urls: ["https://YOUR_HOST_IP:443/api/v2/metrics/?format=txt"]
  • insecure_skip_verify: true

Your nri-prometheus-config.yml file should look like this:

integrations:
  - name: nri-prometheus
    config:
      # When standalone is set to false nri-prometheus requires an infrastructure agent to work and send data. Defaults to true
      standalone: false

      # When running with infrastructure agent emitters will have to include infra-sdk
      emitters: infra-sdk

      # The name of your cluster. It's important to match other New Relic products to relate the data.
      cluster_name: "YOUR_DESIRED_CLUSTER_NAME"
      bearer_token_file: "BEARER_TOKEN_FILE_PATH"

      targets:
        - description: Metrics of Ansible Automation Platform can be seen on the below url
          urls: ["https://YOUR_HOST_IP:443/api/v2/metrics/?format=txt"]
          use_bearer: true
      #    tls_config:
      #      ca_file_path: "/etc/etcd/etcd-client-ca.crt"
      #      cert_file_path: "/etc/etcd/etcd-client.crt"
      #      key_file_path: "/etc/etcd/etcd-client.key"

      # Whether the integration should run in verbose mode or not. Defaults to false.
      verbose: false

      # Whether the integration should run in audit mode or not. Defaults to false.
      # Audit mode logs the uncompressed data sent to New Relic. Use this to log all data sent.
      # It does not include verbose mode. This can lead to a high log volume, use with care.
      audit: false

      # The HTTP client timeout when fetching data from endpoints. Defaults to 30s.
      # scrape_timeout: "30s"

      # Length in time to distribute the scraping from the endpoints.
      scrape_duration: "5s"

      # Number of worker threads used for scraping targets.
      # For large clusters with many (>400) endpoints, slowly increase until scrape
      # time falls between the desired `scrape_duration`.
      # Increasing this value too much will result in huge memory consumption if too
      # many metrics are being scraped.
      # Default: 4
      # worker_threads: 4

      # Whether the integration should skip TLS verification or not. Defaults to false.
      insecure_skip_verify: true

    timeout: 10s
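Before restarting the agent, it can help to confirm that the metrics URL and token from this configuration actually work. The sketch below is one way to check by hand, under a few assumptions: that the controller accepts the token in an Authorization: Bearer header (which is what use_bearer: true is expected to send), that controller.example.com stands in for YOUR_HOST_IP, and that the helper names `build_metrics_request` and `fetch_metrics` are hypothetical and not part of any New Relic or Red Hat library.

```python
import ssl
import urllib.request


def build_metrics_request(host: str, token: str) -> urllib.request.Request:
    """Build a GET request for the controller metrics endpoint used in the config."""
    url = f"https://{host}:443/api/v2/metrics/?format=txt"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def fetch_metrics(host: str, token_path: str, verify_tls: bool = False) -> str:
    """Fetch the raw metrics text using the token stored in the bearer token file."""
    with open(token_path, encoding="utf-8") as handle:
        token = handle.read().strip()

    context = ssl.create_default_context()
    if not verify_tls:
        # Mirrors insecure_skip_verify: true; only suitable for self-signed certificates.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE

    request = build_metrics_request(host, token)
    with urllib.request.urlopen(request, context=context, timeout=30) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_metrics("controller.example.com", "/home/ansible-automation-platform/bearer_token_file")` should return plain-text metrics if the token and URL are valid. An HTTP 401 error usually points to a problem with the token rather than with the New Relic configuration.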

Step 2.3: Forward Ansible controller logs to New Relic

You can use log forwarding to send Ansible Automation Platform logs to New Relic. On Linux, navigate to /etc/newrelic-infra/logging.d/ and look for the configuration file named logging.yml. If the file isn't in this directory, create it by following the log forwarding documentation provided by New Relic.

To forward the controller log, add the following entry to the logging.yml file. If the file already has a top-level logs key, append only the list item beneath it:

logs:
  - name: ansible-tower.log
    file: /var/log/tower/tower.log
    attributes:
      logtype: ansible_tower_log

Step 2.4: Restart your infrastructure agent

You will need to restart your infrastructure agent before you’ll see your data in New Relic. For Linux, ensure you use the correct command for your init system.

  • systemd (Amazon Linux 2, SLES 12, CentOS 7 or higher, Debian 8 or higher, RHEL 7 or higher, Ubuntu 15.04 or higher): sudo systemctl restart newrelic-infra
  • System V (Debian 7, SLES 11.4, RHEL 5): sudo /etc/init.d/newrelic-infra restart
  • Upstart (Amazon Linux, RHEL 6, Ubuntu 14.04 or lower): sudo initctl restart newrelic-infra

Step 2.5: Use NRQL to view your data

Now you can confirm that your Ansible Automation Platform data is being sent to New Relic. To do so, open the query builder in New Relic and run the following NRQL query:

FROM Metric SELECT latest(awx_pending_jobs_total) AS 'Pending Jobs', latest(awx_running_jobs_total) AS 'Running Jobs'
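If the query returns no data, it can be useful to compare it against what the controller itself reports. The sketch below parses metrics text in the Prometheus exposition format and pulls out the same two metrics used in the query. It is a simplified reader, not a full parser: it assumes each sample line ends with the value (no trailing timestamp), and the `SAMPLE` payload, including its HELP text, is illustrative rather than captured from a real controller. The function name `parse_metrics` is hypothetical.

```python
from typing import Dict


def parse_metrics(text: str) -> Dict[str, float]:
    """Map each series (metric name plus any labels) to its numeric value."""
    samples: Dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # The value is the last whitespace-separated token on the line.
        series, _, value = line.rpartition(" ")
        if not series:
            continue
        try:
            samples[series.strip()] = float(value)
        except ValueError:
            # Ignore lines whose last token is not a number.
            continue
    return samples


# Illustrative payload only; real output contains many more metrics.
SAMPLE = """\
# HELP awx_pending_jobs_total Number of pending jobs
# TYPE awx_pending_jobs_total gauge
awx_pending_jobs_total 2.0
# HELP awx_running_jobs_total Number of running jobs
# TYPE awx_running_jobs_total gauge
awx_running_jobs_total 5.0
"""

if __name__ == "__main__":
    metrics = parse_metrics(SAMPLE)
    print("Pending Jobs:", metrics.get("awx_pending_jobs_total"))
    print("Running Jobs:", metrics.get("awx_running_jobs_total"))
```

If the values read directly from the controller look right but the NRQL query stays empty, the issue is more likely in the integration configuration or the agent restart than in the controller.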

Step 2.6: See Ansible Automation Controller metrics in a dashboard

New Relic provides a dashboard that gives you insights into your system through various charts and graphs including system information, job status, infrastructure, and logs. This helps you gain greater visibility into your infrastructure and automation workflows. To install the default dashboards, see the Ansible Automation Controller quickstart.

Conclusion

Red Hat Ansible Automation Controller is a powerful platform for managing and monitoring automation tasks. But monitoring this platform is critical to ensure that your infrastructure is running smoothly and efficiently. The New Relic integration for Red Hat Ansible Automation Controller provides a set of pre-built dashboards that make it easy to monitor critical metrics in one place and help you improve efficiency, reduce downtime, and refine resource allocation.