At CareerVillage.org, we connect over 135,000 professionals to support millions of students of all ages in career entry and advancement. We are now the go-to source for real-world advice on education and career advancement. We recently expanded our offerings to include an AI career coach. We use a range of technologies to manage our product, like Django and React, with multiple environments for staging and production. We've got data stores and supporting services of various kinds: mostly Postgres, in addition to Redis, Solr, and Celery. We've got data pipelines to populate our warehouse. And we have some machine learning services that help with things like content moderation and content tagging.

When we started, we used New Relic to help us with server monitoring. Twelve years ago, our main issue was that when we went into schools with hundreds of students and teachers, they couldn't all log in at the same time. We needed New Relic to help us understand what was going on. It showed us where our bottlenecks were so we could chip away at them. It helped us close gaps in our observability and introduce the right technologies, like load balancing. With larger groups, we used to ask students to log in alphabetically: first, all students with names starting with “a” logged in; then, after a minute, students with names beginning with “b”; and so on. New Relic helped us identify the server and load balancing issues, so we could grow to support 700 students logging in concurrently.

Once the login issue was solved, we were able to look at improving user experience. One of our challenges is content moderation. It's not something our internal team can do alone for the number of students on our platform, so we rely on community volunteers. Their ability to use the platform concurrently is vital to keeping a safe crowdsourced space available. Having a tool like New Relic is essential for our efficiency. It helps us focus on what we can do to improve the user experience from an infrastructure point of view.

Now we can monitor our dev, staging, and production environments. We also monitor a separate load testing environment, in addition to tracking uptime and reducing bottlenecks on things like concurrent writes to our databases. We track mean time to recovery (MTTR) and empty-cache page load time, which is essential when so much of our traffic comes from search engines. Using New Relic, we were able to get this load time down from 300 milliseconds to under 100 milliseconds.
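For readers less familiar with MTTR, here is a minimal sketch of how the metric can be computed: the average time between an incident being detected and being resolved. The function name and the incident timestamps below are hypothetical and purely illustrative. They are not our real data, and this is not how New Relic calculates the metric internally.

```python
from datetime import datetime, timedelta


def mean_time_to_recovery(incidents):
    """Return the average (resolved - detected) duration across incidents.

    `incidents` is a list of (detected, resolved) datetime pairs.
    """
    if not incidents:
        raise ValueError("need at least one incident")
    # Sum the individual recovery durations, starting from a zero timedelta.
    total = sum(
        (resolved - detected for detected, resolved in incidents),
        timedelta(),
    )
    return total / len(incidents)


# Hypothetical incident log: (detected, resolved) pairs.
incidents = [
    (datetime(2024, 1, 5, 9, 0), datetime(2024, 1, 5, 9, 30)),      # 30 min
    (datetime(2024, 2, 10, 14, 0), datetime(2024, 2, 10, 14, 10)),  # 10 min
    (datetime(2024, 3, 1, 22, 0), datetime(2024, 3, 1, 22, 20)),    # 20 min
]

mttr = mean_time_to_recovery(incidents)
print(mttr)  # 0:20:00
```

A lower MTTR means outages are shorter on average, which is the trend we want to see over time.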

Because we are a small team, New Relic helps us build out monitoring for our crowdsourced platform. It has grown with us to support our phenomenal student and user growth, and it allows our engineering team to focus on building out new services and features. New Relic is our core support for observability: it is our team member focused on supporting our DevOps functions. We don't want to hire more engineers to focus just on observability; that is the role of New Relic in our organization.

Over the past year, we have invested in building an AI career coach. We need to be a nimble organization, especially on the engineering side. With New Relic, we have been able to focus new engineers on our AI features, and not on our infrastructure monitoring. There are a lot of technologies involved: a frontend React application, a backend Django application running in the cloud, LangChain as an AI orchestrator, agent management libraries, a combination of third-party large language models (LLMs) and our own trained and self-hosted LLM, retrieval-augmented generation tools, and Pinecone DB as the vector database that supports the retrieval.

Our AI career coach also represents a shift in our business. The goal as always is to help young people, but now we can also support adult learners. We're able to serve adults who are looking to change jobs or who want to understand what they might need to grow their careers. The goal of the AI career coach is not only to answer questions from our crowdsourced data, but also to run exercises like mock interviews, reviewing a job description against your qualifications, or having a conversation about your networking abilities. These are less like question-answer features and more about building a learning experience. We're already deploying it in schools, nonprofits, and workforce development training programs. The AI career coach needs to be reliable, consistent, and predictable. There’s not much tolerance for bugs or problems.

Our new AI tool connects to New Relic as well, where we monitor its uptime, performance, and response times. Because New Relic works so well, it has made our entry into these new domains possible: our engineering team is growing from 3 core engineers to 7, but our new engineers are not focused on monitoring. It's keeping costs down by allowing our developers to focus on the user-facing product features that matter.