12 minute read · Published January 19, 2024

How a dark launch can help you better validate product ideas and release new features

Hugo Pegley Content and SEO, CommandBar

Latest Update July 20, 2024

For startups, they say that if you're not embarrassed by the first version of your product, you launched too late. If you already have users and customers, that's no longer true: Launch something embarrassing to paying customers and you'll only get churn.

That's why most product managers would love to do deep user research, run extensive user testing, and gather lots of data before every launch. In reality, it's hard to do all of that and still move fast, particularly at startups. With continuous deployment, product managers and engineering teams often feel pressure to get features out quickly, especially in highly competitive markets.

However, there's no need to despair. There's a way to get high-quality feedback and data without risking huge blowback.

It’s called a dark launch.

A dark launch allows you to release a new feature or update to a small segment of users without risking the entire product’s reputation or functionality. You're able to get feedback and assess product usage by this small segment, and then decide whether you're going to deploy more widely or not. This used to be a highly intensive process, but now many tools allow you to do this with the click of a button using feature flags.

Let's dive into dark launches and how you can successfully use them in your SaaS product.

Breaking down a dark launch

To help illustrate how dark launches work, let's run through an example.

Let's say you are a product manager at a task management SaaS company. You and your team have created a new feature that will allow users to place each of their to-do items into a new system you call “Buckets.” Anecdotally in conversations, and through a bit of user research, you found that users like the idea of putting things into “buckets” and then seeing their progress visually.

Your team has begun to build out the buckets feature because it doesn't require too much engineering lift, but you are a little nervous it will not get the reception you expect, and your boss is a bit skeptical as well. The buckets system would represent a sizeable change from your traditional product, and while you think it has a high potential to increase engagement, you're just not sure yet.

This is a great opportunity to use a dark launch. Why?

Why you should do a dark launch

Using a dark launch can be effective if you're looking for the following things:

Real-world testing and performance assessment: you're going to be able to see how real users interact with buckets, and collect cold hard data as well as qualitative feedback.

Reduced risk and user impact: you're not risking your product's reputation with one new launch, as you're limiting this to a small cohort of users.

Feedback collection: you can complement your dark launch with surveys and in-app feedback mechanisms, which let you improve the product and connect with your users more generally.

As you can see, a dark launch can give you a great opportunity for testing and feedback with reduced risk.

So how would you execute this?

Executing a dark launch

So you've decided you want to do the dark launch. Your team has a solution in place to execute these, like LaunchDarkly. You have a clear vision of why you are running the dark launch test, and you have a desired end state. Now, it’s time to build it.

The feature flag

You first wrap your new code in a feature flag managed by your dark launch tool. A feature flag lets you toggle a section of code on and off, so that it is active only for the cohort you selected, not the entire user base. Once you've created the flag, you can toggle it on and off whenever you like.
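To make the idea concrete, here is a minimal sketch of a feature flag with a percentage rollout. It is not how LaunchDarkly or any specific tool works internally; the `FLAGS` store, the `buckets` flag name, and the helper functions are all hypothetical names chosen for illustration.

```python
import hashlib

# Hypothetical in-memory flag store. A real tool would serve this from its own service.
FLAGS = {
    "buckets": {"enabled": True, "rollout_percent": 5, "allow_users": {"user-42"}},
}


def bucket_for(flag_name: str, user_id: str) -> int:
    """Map a user to a stable number from 0 to 99 for a given flag."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100


def is_enabled(flag_name: str, user_id: str) -> bool:
    """Return True if this user should see the flagged code path."""
    flag = FLAGS.get(flag_name)
    if flag is None or not flag["enabled"]:
        return False  # unknown or switched-off flags fall back to the old behavior
    if user_id in flag["allow_users"]:
        return True  # hand-picked users always get the feature
    return bucket_for(flag_name, user_id) < flag["rollout_percent"]


def render_todo_view(user_id: str) -> str:
    """Both code paths ship to production; the flag decides which one runs."""
    if is_enabled("buckets", user_id):
        return "buckets view"
    return "classic list view"
```

Hashing the user ID, rather than picking randomly on every request, means a given user keeps seeing the same version for as long as the rollout percentage stays the same.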

Launching and collecting

You toggle the feature flag on, and you begin to see product usage data roll in (or not!). You continually monitor this data to ensure collection is happening as you intended.
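One way to check that collection is working is to tag every usage event with the flag state the user saw, then confirm that both groups are actually producing events. The sketch below assumes a simple list of event dictionaries; the field names (`buckets_on`, `name`) and function names are hypothetical, not part of any particular analytics product.

```python
import time
from collections import Counter


def make_event(user_id, name, flag_on, ts=None):
    """Build a usage event that records which side of the flag the user was on."""
    return {
        "user_id": user_id,
        "name": name,
        "buckets_on": flag_on,
        "ts": time.time() if ts is None else ts,
    }


def variant_of(event):
    return "dark_launch" if event["buckets_on"] else "control"


def count_by_variant(events):
    """Count events per (variant, event name) pair."""
    counts = Counter()
    for event in events:
        counts[(variant_of(event), event["name"])] += 1
    return counts


def silent_variants(events):
    """Return the variants that have produced no events at all."""
    seen = {variant_of(event) for event in events}
    return {"dark_launch", "control"} - seen
```

If `silent_variants` ever reports `dark_launch`, either nobody in the cohort has touched the feature yet or the tracking call is missing from the new code path, and it is worth finding out which before reading anything into the numbers.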

Pro tip: add in-app feedback requests and surveys using CommandBar to strengthen your dark launch messaging and qualitative feedback collection.

Analyzing your dark launch data

You've got a lot of data at your fingertips, which is great! But it can also be hard to know what to focus on. There's no one right answer, because the most important data points will vary with your business, user base, and use case, but a few general areas are key.

User Engagement and Activity: Track how users interact with the new features. This includes metrics like time spent, click-through rates, and frequency of use. Understanding user engagement can help you gauge the feature's appeal and usability.
Performance Metrics: Monitor the performance of your system with the new features in place. This includes load times, response times, and resource usage. Ensuring that your system maintains high performance is crucial for user satisfaction.
Error Rates and Bugs: Keep a close eye on any increase in error rates or the emergence of new bugs. This will help you address issues before a full rollout.
User Feedback: Collect direct feedback from users. This can be through surveys, interviews, or feedback forms. User feedback is invaluable for understanding how your features are received from a customer perspective.
Adoption Rates: Measure how quickly and widely the new features are being adopted by your users. Low adoption rates might indicate a lack of interest or awareness, which could require more marketing efforts or modifications to the features.
Conversion Metrics: If applicable, monitor how the new features impact conversions. This could be in terms of new sign-ups, upgrades, or other conversion goals relevant to your SaaS.
System Reliability: Ensure that the introduction of new features does not affect the overall reliability of your system. This includes monitoring uptime and quickly addressing any service interruptions.
A/B Testing Results: If you're running A/B tests, analyze the results to understand which variations of your features perform better in terms of user engagement, satisfaction, and business metrics.
Customer Support Queries: An increase in customer support queries related to the new features can indicate confusion or problems that need to be addressed.
Analytics on Feature Usage: Use analytics tools to gather data on how specific features are being used. This can help in understanding which aspects of the new features are most valuable to your users.
Churn Rate: Monitor if there is any increase in churn rate that could be attributed to the new features. This is critical for understanding if the new features negatively impact user retention.
Revenue Impact: Evaluate the impact of the new features on your revenue. This could be direct (through new subscriptions or upsells) or indirect (through improved retention or reduced churn).
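A few of the metrics above (adoption, frequency of use, and error rate) can be computed from raw events with very little code. This is a rough sketch under stated assumptions: the event shape (`type`, `user_id`, `ok`) and the `summarize` function are hypothetical, and a real analytics tool would do this for you.

```python
def summarize(exposed_users, events):
    """Compute adoption, usage frequency, and error rate for a dark launch cohort.

    exposed_users: set of user IDs who had the flag switched on.
    events: list of dicts, either {"type": "feature_used", "user_id": ...}
            or {"type": "request", "user_id": ..., "ok": bool}.
    """
    uses = [
        e for e in events
        if e["type"] == "feature_used" and e["user_id"] in exposed_users
    ]
    adopters = {e["user_id"] for e in uses}
    requests = [e for e in events if e["type"] == "request"]
    failed = [e for e in requests if not e["ok"]]
    return {
        # share of exposed users who used the feature at least once
        "adoption_rate": len(adopters) / len(exposed_users) if exposed_users else 0.0,
        # average number of uses among those who adopted it
        "uses_per_adopter": len(uses) / len(adopters) if adopters else 0.0,
        # share of requests that failed
        "error_rate": len(failed) / len(requests) if requests else 0.0,
    }
```

Running the same function over the control group's requests gives a baseline error rate to compare against, which matters more than the absolute number.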

After a dark launch — what now?

So you executed your dark launch and collected a lot of data. You saw a pretty good user engagement rate, but you also noted that there were a lot of questions about the buckets and how to use them most effectively. Plus, the buckets interfered with another small part of your product.

Your team should get together and discuss whether the data you've collected indicates a path forward for this feature, or whether to kill it. A dark launch can give you a good taste of how things might go in reality with your entire user base, but it's not a perfect picture, so you'll want to ensure that even if you do move forward you continue to test and gather feedback constantly.

“Once you’re in this state where you’ve rolled things out partially, it’s easy to be somewhat complacent. The trickiest part may still be ahead, but in some sense you’ve already ‘shipped’ the feature and customers are getting value from it, so it’s easy to linger here if you know what’s ahead.”

Enumerating up-front what things you’d need to see before you decide “Go vs. No-go” I think is really key.

Why NOT to dark launch

Dark launches using feature flags are super helpful, and there's a reason why they have become so popular.

That's not to say that they're without their issues.

The biggest issues fall into two camps: technical and non-technical.

Dark launch technical blockers and issues

On the technical side, even though dark launches have become easier and easier to execute, there are still definitely headaches that come with managing a bunch of feature flags, experiments, and data flows. You'll need to ensure that your team has great communication and clarity on who is in charge of what for each dark launch. If you're not careful, technical debt can become an issue as you accumulate more and more feature flags.
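One lightweight guard against flag debt is to record an owner and a review date for every flag and to list the overdue ones regularly. The sketch below is an assumption about how a team might do this, not a feature of any specific tool; `FlagRecord` and `stale_flags` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FlagRecord:
    name: str
    owner: str        # who decides what happens to this flag
    created: date
    review_by: date   # date by which the flag should be shipped or removed


def stale_flags(registry, today):
    """Return flags whose review date has passed, oldest first."""
    overdue = [flag for flag in registry if flag.review_by < today]
    return sorted(overdue, key=lambda flag: flag.review_by)


registry = [
    FlagRecord("buckets", "pm-team", date(2024, 1, 8), date(2024, 3, 1)),
    FlagRecord("new-targeting", "platform", date(2024, 2, 1), date(2024, 6, 1)),
]

for flag in stale_flags(registry, today=date(2024, 4, 1)):
    print(f"{flag.name} is overdue for review (owner: {flag.owner})")
```

A check like this can run in a weekly job or in CI, so that a forgotten flag turns into a visible reminder rather than a surprise.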

Dark launch non-technical blockers and issues

On the non-technical side, the inherently limited nature of the feedback and product usage data you are getting can potentially be misleading. If you don't create your cohorts correctly, you might get inaccurate signals about the demand for a feature, only to find when you launch it that it's a dud and you've wasted time, money, and resources. Additionally, if done incorrectly, your dark launch could run into privacy or security issues, or simply just be frustrating or confusing.

You'll want to review this list of potential roadblocks before you go forward with any dark launch:

Limited User Feedback: Since dark launches are typically done with a small subset of users, the feedback obtained may not be representative of your entire user base. This can lead to skewed perceptions of how the new feature will be received by the larger audience.
Performance Impact: Introducing new features, even to a limited user group, can impact the performance of your system. This includes increased load on servers, potential for new bugs, and overall system stability. Monitoring and managing these risks is crucial.
Complexity in Implementation: Dark launches require sophisticated feature flagging, routing, and monitoring systems. Setting up and managing these systems can be complex and resource-intensive.
Data Segregation Issues: Ensuring that data from the dark launch does not corrupt or improperly influence your existing datasets can be challenging. This is particularly important for analytics and machine learning systems.
Security Risks: Introducing new code always carries a risk of security vulnerabilities. Since dark launches involve real users, any security flaw can have immediate and significant consequences. To mitigate these risks, it's advisable to run the code through an automated vulnerability scanning tool to catch known classes of vulnerabilities before it reaches users.
Inaccurate Scaling Predictions: The behavior of features under a dark launch may not accurately predict how they'll scale once fully released. This can lead to underestimating the infrastructure needed for full deployment.
Regulatory and Compliance Concerns: Depending on your industry, deploying features to a subset of users without their explicit knowledge might raise privacy, consent, or compliance issues.
Difficulties in Measuring Impact: Accurately assessing the impact of new features on key metrics like user engagement, conversion rates, and overall satisfaction can be more challenging in a limited-scope dark launch.
Resource Allocation: The resources needed for a dark launch, including developer time, support, and infrastructure, can be significant and might detract from other important projects.
Rollback Challenges: If issues are detected, rolling back features can be complex, especially if data migrations or significant changes are involved.
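The first roadblock in the list above, limited feedback from a small cohort, can be put into numbers. The sketch below uses the Wilson score interval, a standard way to put a confidence range around a proportion; the choice of this method and the function name are assumptions for illustration, not something a dark launch tool necessarily reports.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% confidence interval for a proportion (Wilson score)."""
    if n == 0:
        return (0.0, 1.0)  # no data: the true rate could be anything
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return (center - half_width, center + half_width)


# Same observed adoption rate (30%), very different certainty.
small_cohort = wilson_interval(12, 40)
large_cohort = wilson_interval(1200, 4000)
print(f"40 users:   {small_cohort[0]:.2f} to {small_cohort[1]:.2f}")
print(f"4000 users: {large_cohort[0]:.2f} to {large_cohort[1]:.2f}")
```

With only 40 users, an observed 30% adoption rate is compatible with a true rate anywhere from under 20% to over 45%, which is a good reason to treat small-cohort results as a signal rather than a forecast.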

When to dark launch?

Naturally, there's also the simple question of whether a dark launch will actually help you make a decision or get information. It's not the case that you should dark launch everything, and in fact it would be counterproductive to do so.

You can think of dark launches as one tool in your PM toolkit. It’s best suited for:

Getting feedback on a new and/or specific feature: when you want to understand how users are going to react to a specific feature change or launch, dark launching that with the proper cohort can be very insightful.
Testing your product’s technical performance: sometimes your engineers are going to want to calm their nerves about a new feature’s impact on your product, particularly around speed and the number of bugs or errors. Dark launches can be a great way to gauge these numbers in a limited fashion.
A/B testing: while folks often use dedicated platforms for A/B testing, using a dark launch can also be a cool way to see how two feature sets perform compared to one another.
User insight: if you're interested in learning more about your users’ interests and behaviors, you can dark launch with the intent of collecting qualitative UX data.

Before you launch, you want to make sure everyone is aligned. As my colleague Paul said:

“If there isn’t consensus upfront about when the “Go vs. No-go” decision should be made, it’s easy to sit in ‘Beta’ and just wait for some strong signal that says to ship it to everyone or to unship it.”

But the reality is that there won't always be overwhelming data in either direction, and you'll need to make an informed decision anyway. Setting criteria beforehand can help head off disagreements between team members after the fact.
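Writing the criteria down can be as simple as a short list of metrics and thresholds that everyone signs off on before the flag is switched on. Here is a minimal sketch; the metric names and threshold values are made up for the Buckets example, and `Criterion` and `decide` are hypothetical helpers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    metric: str
    threshold: float
    higher_is_better: bool

    def passes(self, value):
        if self.higher_is_better:
            return value >= self.threshold
        return value <= self.threshold


def decide(criteria, observed):
    """Return ("go" or "no-go", list of metrics that failed or are missing)."""
    failures = []
    for criterion in criteria:
        value = observed.get(criterion.metric)
        if value is None or not criterion.passes(value):
            failures.append(criterion.metric)
    return ("go" if not failures else "no-go", failures)


# Agreed before launch (illustrative numbers only).
CRITERIA = [
    Criterion("adoption_rate", 0.25, higher_is_better=True),
    Criterion("error_rate", 0.01, higher_is_better=False),
    Criterion("support_tickets_per_100_users", 3.0, higher_is_better=False),
]
```

Treating a missing metric as a failure is a deliberate choice in this sketch: if nobody measured something the team agreed was a launch condition, the answer should not default to "go".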

CommandBar goes dark

I was talking with our CTO Vinay about our dark launch process the other day, so I thought I'd include a live example from our product. Vinay and his team were recently using feature flags for a change in targeting rules and conditions for our in-app messaging. They rolled out two versions, and then monitored for errors through Sentry to help tweak and improve the logic. They continued this until they whittled the error rate down to zero with updated versions, and then flipped the switch on the new logic for the entire user base. It was an effective way to improve our targeting with minimal overhead and engineering costs.
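The general pattern in that story (expose the change, watch the error rate, widen only once it is clean) can be sketched in a few lines. To be clear, this is not CommandBar's actual code and it does not call Sentry; the ramp steps, thresholds, and function name are assumptions for illustration, and the error counts would come from whatever monitoring you use.

```python
RAMP_STEPS = [1, 5, 25, 100]  # rollout percentages, smallest first


def next_rollout_percent(current, errors, requests,
                         max_error_rate=0.0, min_requests=100):
    """Decide the next rollout percentage from observed errors.

    Holds when there is too little traffic to judge, drops to 0 when the
    error rate is above the limit, and otherwise moves to the next step.
    """
    if requests == 0 or requests < min_requests:
        return current  # not enough data yet: hold
    if errors / requests > max_error_rate:
        return 0  # kill switch: turn the new logic off and investigate
    for step in RAMP_STEPS:
        if step > current:
            return step
    return current  # already at the final step
```

For example, `next_rollout_percent(5, errors=0, requests=500)` moves the rollout to 25%, while a single error in the same window sends it back to 0 under the default zero-error limit.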

Increased insight vs. the risk of a perpetual beta — a CommandBar engineering perspective

I was also chatting with one of our engineers, Paul, and he had some interesting takes on dark launches.

Positively, Paul said:

“I think another interesting piece is that it allowed us to catch and fix errors that we would never have thought of beforehand. There are always weird edge cases that you just can’t enumerate up-front.”

“I think the lesson there is that you don’t know how your users are going to use something until you give it to them. They will definitely surprise you: which for engineers means they will break things you thought were fool-proof 😅”

On the other hand, Paul warned of the curse of the “perpetual beta”:

“One other pitfall I’ve seen in the past is what I’ll call the ‘Perpetual Beta’. You might have a big and complex new feature that you decide to ship out to a group of friendly users to try and work out the kinks. Eventually, you get things working, but as soon as you want to roll it out to everyone it starts to feel extremely daunting (complex migration, backwards compatibility, etc.). Because of this, the feature can end up staying in this perpetual ‘Beta’ state, even for months or a year.”

“I think the lesson here is that dark launches don’t always make the hard parts easier - if you need to do a complex migration, it’s going to be a challenge regardless.”

Scared of the dark? Try these instead!

While we're pretty big fans of dark launching here at CommandBar, there are certainly other great options to get user feedback.

Canary testing

Imagine sending a tiny, chirpy bird (our canary) down a digital mine to ensure it's safe. In software terms, a canary test involves releasing a new feature to a small group of users before rolling it out widely. This approach allows developers to monitor the feature's performance and gather user feedback in a controlled environment.

The main difference is that users know that they are part of the test group!

Now, let's contrast this with a dark launch, which is more like a magic trick performed in a dimly lit room. In a dark launch, features are released to a cohort of users, but without their explicit knowledge. This stealth mode enables developers to test the feature's impact on system performance without those users being any the wiser.

Painted / fake door testing

Running a painted door test can be an excellent way to gather data about the viability of a certain product feature. You can read our in-depth article on this topic, but essentially, in a painted or fake door test you create a mock landing page, pop-up, or nudge about the potential new feature, then gauge the click-through rate and collect qualitative feedback from interested users.

Formal and informal beta testing

Beta testing is one of the pillars of software development, for good reason. It can certainly be worthwhile to run formal beta programs or to give certain users beta access ad hoc. Either way, beta testing gets more eyeballs on your product earlier, which is great.

The night is always darkest before the dawn

Dark launches can be a very powerful way to meaningfully test and evaluate features with reduced risk and resource cost. They are best suited for evaluating and validating product ideas and technical changes, and they are best managed through feature flags by highly communicative and organized teams. When you use feature flags effectively and coordinate well across your team, a dark launch can be a great way to get high-quality user feedback and data with reduced risk and effort.

Hugo Pegley With over 6 years as a technology writer, entrepreneur, and marketer, Hugo Pegley brings his diverse skill set to CommandBar to help users learn about all things product and UX. Hugo’s experience across a range of startups helps him to produce engaging content which aims to spark conversation and discovery. When he's not standing at his desk writing, you can find Hugo enjoying a cup of green tea or watching the Warriors play basketball.