Feature flags and dynamic configuration are among the most powerful tools in software delivery, enabling teams to release features gradually, test in production, and minimize risk. But even with careful planning, feature releases can go wrong. Error rates spike, latency increases, or unexpected behaviors emerge. When that happens, teams need to respond fast.

Today, we're announcing a new integration between AWS AppConfig and New Relic Workflow Automation that makes feature flag deployments safer by enabling automatic, intelligent rollbacks powered by real-time observability data.

The Feature Flag Safety Gap

AWS AppConfig is a service that helps teams move faster and more safely. Feature flags and dynamic configuration allow you to adjust software behavior in production without a code deployment. Feature flags let builders control how they release new features and capabilities, separating a launch from a code deployment. In addition, AWS AppConfig’s kill switches allow instant rollback if something unexpected goes wrong. AWS AppConfig is used at scale inside Amazon, and its feature set reflects how AWS thinks about the proper use of feature flags.

Unfortunately, teams can underestimate the danger of changing configuration or feature flags. Much like an engineer saying, “I am just changing one line of code, it’s no big deal,” a feature flag change can appear innocuous. And 99.99% of the time, the change is not a problem. But sometimes those small changes can trigger a massive outage. Many recent well-publicized outages were caused by configuration changes. Builders therefore need the right safety guardrails, backed by automation, to avoid these unexpected impacts.

Gradual deployments and monitoring are among these key safety guardrails. One of the essential features of AWS AppConfig is the ability to deploy feature flags safely using gradual deployments. You might release a feature to 10% of users, then 20%, then 50% over several minutes or hours, monitoring system health at each stage. This approach dramatically reduces the scope of potential impact compared to an all-at-once release. But monitoring during this gradual release is also essential. While AWS AppConfig provides powerful gradual deployment and rollback capabilities, and New Relic excels at observability, connecting these two systems in an automated workflow has traditionally required custom code and infrastructure. Note that AWS AppConfig already supported automated rollback based on Amazon CloudWatch alarms, but for New Relic customers, rollbacks were manual before this partnership.
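To make the staged exposure concrete, here is a minimal Python sketch of a linear rollout schedule. It is a simplified model under stated assumptions: exposure grows by a fixed percentage at equal intervals, and the names `RolloutStep` and `linear_schedule` are hypothetical. It does not reproduce how AWS AppConfig times its deployment steps internally.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RolloutStep:
    minute: float         # minutes since the deployment started
    percent_exposed: int  # share of targets receiving the new flag value


def linear_schedule(step_percent: int, duration_minutes: int) -> List[RolloutStep]:
    """Return evenly spaced exposure steps that end at 100%.

    Simplified model: exposure grows by `step_percent` at equal intervals
    across `duration_minutes`. A real service may time its steps differently.
    """
    if not 0 < step_percent <= 100:
        raise ValueError("step_percent must be in (0, 100]")
    # Number of steps needed to reach 100% (ceiling division).
    steps = -(-100 // step_percent)
    interval = duration_minutes / steps
    return [
        RolloutStep(minute=i * interval,
                    percent_exposed=min(100, (i + 1) * step_percent))
        for i in range(steps)
    ]


# Example: 20% increments spread over a 2-hour window.
for step in linear_schedule(step_percent=20, duration_minutes=120):
    print(f"t+{step.minute:.0f} min: {step.percent_exposed}% exposed")
```

Each step is a natural checkpoint: if health degrades while only 20% of targets have the new value, the remaining 80% never see it.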

What happens when something goes wrong during that gradual rollout?

Traditionally, teams needed to:

  1. Notice degraded metrics in their monitoring tool
  2. Correlate the timing with the feature flag deployment
  3. Manually trigger a rollback in AWS AppConfig
  4. Wait for the rollback to propagate

Even with good tooling and clear runbooks, this process can take minutes or longer, and every minute counts in production: an outage that lasts less than a minute has much less operational and reputational impact than a ten-minute one.

Closing the Loop: From Detection to Automatic Rollback

The new integration between AWS AppConfig and New Relic Workflow Automation eliminates this manual response cycle. Here's how it works:

During Feature Flag or Dynamic Configuration Deployment

When you deploy a feature flag using AWS AppConfig's gradual deployment strategy, New Relic Workflow Automation monitors your application's health continuously throughout the rollout window. The workflow evaluates your application state against configured alert conditions, including error rates, latency, alert severity, and any custom metrics you define using New Relic's unified telemetry data.
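The health check described above comes down to comparing current metrics against thresholds. The Python sketch below illustrates that logic only. In practice these conditions are configured as New Relic alert conditions rather than written by hand, and the class names, field names, and threshold values here are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class HealthSnapshot:
    error_rate: float      # fraction of failed requests, e.g. 0.025 for 2.5%
    p95_latency_ms: float  # 95th percentile latency in milliseconds


@dataclass(frozen=True)
class AlertThresholds:
    max_error_rate: float
    max_p95_latency_ms: float


def violated_conditions(snapshot: HealthSnapshot,
                        thresholds: AlertThresholds) -> List[str]:
    """Return the names of breached conditions (an empty list means healthy)."""
    violations = []
    if snapshot.error_rate > thresholds.max_error_rate:
        violations.append("error_rate")
    if snapshot.p95_latency_ms > thresholds.max_p95_latency_ms:
        violations.append("p95_latency_ms")
    return violations


# Illustrative thresholds: at most 1% errors and 800 ms p95 latency.
thresholds = AlertThresholds(max_error_rate=0.01, max_p95_latency_ms=800)
during_rollout = HealthSnapshot(error_rate=0.025, p95_latency_ms=420)
print(violated_conditions(during_rollout, thresholds))
```

Returning the list of breached conditions, rather than a bare boolean, keeps the reason for a rollback available for the audit trail.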

When Issues Are Detected

When your application's alert conditions are triggered during the deployment window, indicating that the feature flag may be causing problems, the New Relic workflow immediately sends a notification message to your Amazon Simple Queue Service (Amazon SQS) queue.

The Result – Automated Deployment Safety

What used to require manual detection, correlation, and response now happens automatically in seconds. The feature flag is rolled back before most users are impacted, and your team gets a complete audit trail of what happened and why.

Architecture: How the Integration Works

The integration uses a straightforward, secure architecture:

  1. AWS AppConfig deploys your feature flag using a gradual deployment strategy (e.g., linear 20% over 2 hours).
  2. New Relic Workflow Automation continuously evaluates your application health against defined alert conditions during the deployment window. This monitoring uses New Relic's unified telemetry platform, providing complete system context, including application health, infrastructure metrics, and user experience data. When alert conditions are met (indicating degraded health), the workflow triggers automatically.
  3. Conditional logic in the workflow evaluates whether the application remains healthy or has entered a degraded state.
  4. Amazon SQS receives a notification message from the workflow if health degrades.
  5. AWS Lambda, invoked by AWS AppConfig every 15 seconds during the deployment, checks the SQS queue for messages and signals AWS AppConfig to roll back if a message is present.
  6. AWS AppConfig executes the rollback, immediately reverting the feature flag to its previous state.
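Step 5 above can be sketched in Python as follows. This is an illustration under stated assumptions, not the integration's actual function: `pending_alert`, the `ALERT_QUEUE_URL` environment variable, and the handler's return value are hypothetical, and the response contract AWS AppConfig expects from the function is not described in this post. The `receive_message` and `delete_message` calls are standard boto3 SQS client methods. Deleting the message after reading it is a design choice made for this sketch.

```python
import json
import os


def pending_alert(sqs_client, queue_url):
    """Return the body of one waiting alert message, or None if the queue is empty.

    `sqs_client` is expected to behave like boto3.client("sqs").
    """
    response = sqs_client.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=1,
        WaitTimeSeconds=0,  # short poll: the function runs again on the next tick
    )
    messages = response.get("Messages", [])
    if not messages:
        return None
    message = messages[0]
    # Assumption of this sketch: remove the message once read so that a later
    # deployment does not react to a stale alert.
    sqs_client.delete_message(
        QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"]
    )
    try:
        return json.loads(message["Body"])
    except json.JSONDecodeError:
        # A non-JSON body still counts: the presence of a message is the signal.
        return {"raw": message["Body"]}


def lambda_handler(event, context):
    # Imported here so pending_alert can be exercised without AWS access.
    import boto3

    alert = pending_alert(boto3.client("sqs"), os.environ["ALERT_QUEUE_URL"])
    # Placeholder return shape (hypothetical): replace with the response that
    # AWS AppConfig actually expects from the function.
    if alert is None:
        return {"rollback": False}
    return {"rollback": True, "reason": alert}
```

Because the function is invoked on a short fixed interval, a non-blocking poll is enough: an alert that arrives just after one check is picked up on the next.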

Authentication is handled securely using AWS IAM roles with appropriate permissions scoped to only the necessary actions.
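As an illustration of what narrowly scoped permissions can look like, the Python sketch below builds two IAM policy documents for a single queue: one for the side that sends alerts and one for the side that reads them. The queue ARN is a placeholder and `queue_policy` is a hypothetical helper. The exact permissions the integration requires are not listed in this post, so treat the action lists as assumptions and confirm them against the official documentation.

```python
import json


def queue_policy(queue_arn, actions):
    """Build an IAM policy document allowing only `actions` on one SQS queue."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": sorted(actions), "Resource": queue_arn}
        ],
    }


# Placeholder ARN for illustration only.
QUEUE_ARN = "arn:aws:sqs:us-east-1:123456789012:appconfig-rollback-alerts"

# The role that sends alerts only needs to write to the queue.
producer_policy = queue_policy(QUEUE_ARN, ["sqs:SendMessage"])

# The role that reads alerts only needs to read and remove messages.
consumer_policy = queue_policy(QUEUE_ARN, ["sqs:ReceiveMessage", "sqs:DeleteMessage"])

print(json.dumps(consumer_policy, indent=2))
```

Keeping the producer and consumer policies separate means neither side can do the other's job: the sender cannot read or delete alerts, and the reader cannot inject them.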

 

Figure: AWS AppConfig and New Relic Workflow Automation architecture diagram

Real-World Use Case: E-Commerce

Consider a scenario: You're releasing a new checkout flow feature flag to your e-commerce platform. You use AWS AppConfig to gradually roll it out over 2 hours, starting with 5% of traffic. During this time, you want to monitor key metrics to make sure the results are what you expect, and no adverse impact occurs.

Without automation: 45 minutes into the rollout (at 20% traffic), error rates spike from 0.1% to 2.5%. Someone notices in Slack 10 minutes later. Another 5 minutes to investigate and confirm it's related to the feature flag. Another 2 minutes to find the AWS AppConfig deployment and trigger the rollback. That’s over 17 minutes of elevated errors affecting 20% of customers. This may result in significant revenue or reputational impact.

With the integration: 45 minutes into the rollout, error rates spike from 0.1% to 2.5%. In under a minute, the workflow detects the degraded health, sends SQS a notification, and AWS AppConfig begins rolling back automatically. That's less than 2 minutes of elevated errors, and most customers never see the problem.

That’s roughly 15 fewer minutes of elevated errors, which protects revenue, customer trust, and your engineering team's time.
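The comparison is simple arithmetic, spelled out here in Python using only the figures from the scenario above. The automated case uses the stated upper bound of 2 minutes, so the saving shown is a lower bound.

```python
# Minutes taken by each manual step in the scenario.
manual_steps = {
    "notice the spike in Slack": 10,
    "investigate and confirm the cause": 5,
    "find the deployment and trigger rollback": 2,
}
manual_minutes = sum(manual_steps.values())

# Upper bound stated for the automated path.
automated_minutes_upper_bound = 2

saved_minutes = manual_minutes - automated_minutes_upper_bound
print(f"manual: {manual_minutes} min")
print(f"automated: under {automated_minutes_upper_bound} min")
print(f"saved: at least {saved_minutes} min")
```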

Beyond Feature Flags: The Broader Picture

This integration demonstrates what becomes possible when you close the loop between observability and action. AWS AppConfig feature flags are just one use case. New Relic Workflow Automation supports a growing ecosystem of integrations:

  • Cloud Providers: AWS (EC2, ECS, Lambda, SQS, and more)
  • Incident Management: PagerDuty, Jira
  • Communication: Slack
  • Custom: Any HTTP API

Monitor with complete observability context, detect issues intelligently, trigger remediation automatically. This applies to auto-scaling infrastructure, restarting failed services, triggering chaos experiments, and countless other operational scenarios.

This is just the beginning of our partnership. We're exploring additional ways to integrate AWS services with New Relic's Intelligent Observability platform, making it easier for mutual customers to build reliable, resilient systems that respond intelligently to changing conditions.
