As an observability company, New Relic creates and maintains multiple language- and technology-specific agents to collect telemetry data from our customers’ environments. When these agent teams release new updates manually, they conduct numerous verifications to ensure the process doesn’t introduce any regressions caused by human error. To reduce the time required to deploy a new release, the Kubernetes agent team fully automated the software agent release process. Reusable GitHub Actions workflows keep track of vulnerable dependencies, write documentation, and sync with partner programs such as Amazon Elastic Kubernetes Service (Amazon EKS) Anywhere. Previously, shipping updates to our Kubernetes integration was manual and took up to two weeks; now an automated process takes an hour per week.
Decreasing response time for security incidents
One of our challenges was our response to security vulnerabilities: we would react to a vulnerability only when a customer contacted Global Technical Support (GTS) with an escalation. That would lead to both customer frustration with our integrations and developer stress because we would need to stop planned work in favor of patching our software.
As part of our continuous integration (CI) pipeline, we enabled code-scanning tools to keep us informed of the latest vulnerabilities discovered in our codebase. We use CodeQL to scan our source code, and we use Trivy to ensure our Docker images do not include vulnerabilities inherited from the base image and the libraries included with it.
One of our common use cases is detecting vulnerabilities in our base image (Alpine). Scanning alerts us to current issues that need to be fixed. By combining vulnerability scanning with automatic dependency management, we’re able to automatically apply fixes to the codebase without human intervention. Our workflows run weekly, which means customers get a patched version within a week of a fix being available.
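To make the scanning step concrete, here is a minimal sketch of how a scan report could be filtered for issues that already have a fix. It assumes a report saved with Trivy's JSON output (`trivy image --format json`), where a top-level `Results` list holds `Vulnerabilities` entries with `VulnerabilityID`, `PkgName`, `InstalledVersion`, and `FixedVersion` fields. The function name and the sample report are hypothetical placeholders, not real scan output or our actual workflow code.

```python
import json


def fixable_vulnerabilities(report_json: str) -> list[dict]:
    """Return the vulnerabilities in a Trivy-style JSON report that already have a fix."""
    report = json.loads(report_json)
    fixable = []
    for result in report.get("Results", []):
        # "Vulnerabilities" may be missing or null when a target is clean.
        for vuln in result.get("Vulnerabilities") or []:
            if vuln.get("FixedVersion"):
                fixable.append(
                    {
                        "target": result.get("Target", ""),
                        "id": vuln.get("VulnerabilityID", ""),
                        "package": vuln.get("PkgName", ""),
                        "installed": vuln.get("InstalledVersion", ""),
                        "fixed": vuln["FixedVersion"],
                    }
                )
    return fixable


# Hypothetical report with placeholder IDs and versions; not real scan output.
SAMPLE_REPORT = json.dumps(
    {
        "Results": [
            {
                "Target": "example-image (alpine)",
                "Vulnerabilities": [
                    {
                        "VulnerabilityID": "CVE-0000-0001",
                        "PkgName": "example-lib",
                        "InstalledVersion": "1.0.0-r0",
                        "FixedVersion": "1.0.1-r0",
                        "Severity": "HIGH",
                    },
                    {
                        "VulnerabilityID": "CVE-0000-0002",
                        "PkgName": "other-lib",
                        "InstalledVersion": "2.3.0-r1",
                        "FixedVersion": "",
                        "Severity": "MEDIUM",
                    },
                ],
            },
            {"Target": "clean-layer", "Vulnerabilities": None},
        ]
    }
)

if __name__ == "__main__":
    for vuln in fixable_vulnerabilities(SAMPLE_REPORT):
        print(f'{vuln["id"]}: {vuln["package"]} {vuln["installed"]} -> {vuln["fixed"]}')
```

Only entries with a non-empty `FixedVersion` are returned, since those are the ones a dependency update can actually resolve; the rest stay flagged until upstream publishes a fix.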
As a concrete example of our faster response time, our security dashboard alerted us to the following security vulnerabilities in alpine:3.18.4 (the image we were using in the nri-kubernetes integration at the time):
Those vulnerabilities were fixed in alpine:3.18.5, released November 30, 2023, and alpine:3.19.0, released December 7, 2023. Renovate, our universal dependency management tool, created pull requests for both releases the same day they were published, and both were included in our release on December 8, 2023, just one day after the release of alpine:3.19.0.
All three Alpine images mentioned above have two more vulnerabilities that were detected later:
- CVE-2023-6129 on January 9, 2024
- CVE-2023-6237 on January 15, 2024
Those two pending vulnerabilities are currently flagged in our security dashboard.
As soon as a fix is released by Alpine, our customers can expect a fixed release version from our integrations within a week.
Provide cutting-edge support for the latest version of Kubernetes
Supporting new versions of Kubernetes involves updating third-party testing tools and performing extensive conformance testing to ensure a new Kubernetes version doesn’t introduce breaking changes to our integrations. One common issue that we need to validate is the behavior of Kubernetes APIs that are still in alpha or beta, since they can change without notice.
With our fully automated dependency management tooling, we have access to updated testing tools as soon as they are released. Also, because our conformance testing is fully automated, we can speed up validation time, allowing us to be on the cutting edge of Kubernetes support.
When a new release of Kubernetes becomes available, it's crucial to update our testing workflows to incorporate the latest version. Renovate, our dependency management tool, automatically opens a pull request (PR) to update minikube to the latest version. We use minikube to quickly spin up a cluster and run end-to-end tests, and we run the full battery of tests against each Kubernetes version that we support. Once minikube is updated with the latest Kubernetes version, we enable testing for that version in our testing framework. If the tests confirm that the integration is working as expected, we can declare support for the latest Kubernetes version. Because of our automated workflows, we updated our test suite to use the latest version of Kubernetes the same day that minikube announced the release. This allowed us to test for compatibility within one day and communicate that our Kubernetes integration was compatible with the latest release seven days after the minikube release.
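The per-version test loop can be sketched as a small plan builder. This is an illustration under stated assumptions, not our actual testing framework: the version list, the `./run-e2e-tests.sh` command, and the profile naming are hypothetical, while `minikube start --kubernetes-version` and `--profile` are standard minikube flags. The sketch only builds and prints the commands; it does not start any cluster.

```python
import shlex

# Hypothetical version list and test command, for illustration only.
SUPPORTED_KUBERNETES_VERSIONS = ["v1.27.0", "v1.28.0", "v1.29.0"]
E2E_TEST_COMMAND = ["./run-e2e-tests.sh"]


def minikube_start_command(k8s_version: str, profile: str) -> list[str]:
    """Build the command that starts a minikube cluster pinned to one Kubernetes version."""
    return ["minikube", "start", f"--kubernetes-version={k8s_version}", "--profile", profile]


def build_test_plan(versions: list[str], test_command: list[str]) -> list[list[list[str]]]:
    """For each version, list the steps: start a cluster, run the tests, delete the cluster."""
    plan = []
    for version in versions:
        # One profile per version keeps the clusters isolated from each other.
        profile = "e2e-" + version.lstrip("v").replace(".", "-")
        plan.append(
            [
                minikube_start_command(version, profile),
                list(test_command),
                ["minikube", "delete", "--profile", profile],
            ]
        )
    return plan


if __name__ == "__main__":
    for steps in build_test_plan(SUPPORTED_KUBERNETES_VERSIONS, E2E_TEST_COMMAND):
        for step in steps:
            print(shlex.join(step))
```

Enabling a newly released Kubernetes version then amounts to appending one entry to the version list once the minikube update has been merged.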
Sync the latest agent release with Amazon EKS Anywhere add-ons
New Relic supports the Amazon EKS Anywhere Partner program by offering our Kubernetes agent as an out-of-the-box add-on for Amazon EKS Anywhere clusters. We developed a GitHub Actions workflow to automatically open a pull request when we cut a new release of our agent. This keeps our end users of Amazon EKS Anywhere up to date with the latest agent releases, ensuring that New Relic continues to pass the latest conformance testing and remains an Amazon EKS Anywhere validated partner.
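As a rough sketch of what such a workflow step might do, the snippet below builds a request for GitHub's REST endpoint for creating a pull request (`POST /repos/{owner}/{repo}/pulls`, with `title`, `head`, `base`, and `body` fields). The organization, repository, branch names, and PR wording are hypothetical, and the request is only constructed here, never sent.

```python
import json
import urllib.request


def build_pull_request(
    owner: str, repo: str, version: str, head_branch: str, base_branch: str, token: str
) -> urllib.request.Request:
    """Build (but do not send) a request that opens a PR announcing a new agent release."""
    payload = {
        "title": f"Update Kubernetes agent add-on to {version}",
        "head": head_branch,  # branch that carries the version bump
        "base": base_branch,  # branch the PR should merge into
        "body": f"Automated update of the agent add-on to release {version}.",
    }
    return urllib.request.Request(
        url=f"https://api.github.com/repos/{owner}/{repo}/pulls",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # All values below are placeholders.
    request = build_pull_request(
        owner="example-org",
        repo="example-addons",
        version="v0.0.0",
        head_branch="bump-agent-v0.0.0",
        base_branch="main",
        token="<token>",
    )
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

In a real workflow the token would come from the CI secret store, and the request would be sent with `urllib.request.urlopen(request)` or handled by an existing pull-request action.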
Automate the changelog, communications, and documentation
Creating and updating documentation is a big time sink. To sustain a weekly release cadence, we needed to update all communication channels for our internal stakeholders, external customers, and external partners. To automate communications to all of these stakeholders, we created reusable workflows that run every week and automatically update release notes, post Slack messages about the latest releases in internal New Relic stakeholder channels, and update developer documentation.
Then, our own GitHub workflow compiles the documentation and release notes, and sends communications through all of the channels. To better communicate with internal stakeholders, we created our own K8s Agent bot linked to our GitHub Actions workflow, so our customers and partners can be automatically notified of release updates.
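The compile-and-notify step can be pictured with a small sketch. It assumes changelog entries are already grouped by section and that the notification goes out as a Slack incoming-webhook style JSON body with a `text` field; the section names, version, and message wording are hypothetical, and nothing is sent.

```python
import json


def render_release_notes(version: str, changes: dict[str, list[str]]) -> str:
    """Render grouped changelog entries as plain-text release notes, skipping empty sections."""
    lines = [f"Release {version}", ""]
    for section, entries in changes.items():
        if not entries:
            continue
        lines.append(section)
        lines.extend(f"- {entry}" for entry in entries)
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


def slack_payload(version: str, notes: str) -> str:
    """Build the JSON body for a Slack incoming-webhook style message."""
    return json.dumps({"text": f"New Kubernetes agent release {version}\n{notes}"})


if __name__ == "__main__":
    # Placeholder changelog content.
    notes = render_release_notes(
        "v0.0.0",
        {"Security": ["Bump base image"], "Bug fixes": []},
    )
    print(notes)
    print(slack_payload("v0.0.0", notes))
```

Generating the release notes and the chat message from the same changelog data keeps every channel consistent with what actually shipped.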
Conclusion
A CI pipeline provides more benefits than simply making life easier for developers—customers get access to more secure, well-documented, and cutting-edge software.
An effective CI pipeline is more than just creating automation in the agent release process; it involves improving testing to guarantee no regressions happen in the process, detecting security vulnerabilities quickly and addressing them promptly, and communicating with all stakeholders clearly and effectively.
Next steps
Check out our open source Kubernetes integration, and read our docs to start getting visibility into your Kubernetes clusters!
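The views expressed on this blog are those of the author and do not necessarily reflect the views of New Relic. Any solutions offered by the author are environment-specific and not part of the commercial solutions or support offered by New Relic. Please join us exclusively at the Explorers Hub (discuss.newrelic.com) for questions and support related to this blog post. This blog may contain links to content on third-party sites. By providing such links, New Relic does not adopt, guarantee, approve, or endorse the information, views, or products available on such sites.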