The ABC’s of A/B Testing

Startup for Startup


14 min read

Imagine you have a product that you want to tweak and improve, but you’re not sure which changes would upgrade it and which would harm it. What if there was a way to pursue multiple paths towards improvement without committing to a specific one? That way, you could determine the best route before changing your product.

With A/B testing, you can do just that by dividing users into test groups that interact with different versions of the same product.

For example, you have a button, and you want to increase the number of clicks. One group of users sees the original button, while another group sees a variation of it.

The test compares how many users selected which button so that you can change your product accordingly. In this way, A/B tests allow you to create the product your users actually want and not what you think they want.

The company always conducted A/B tests, but as the product grew, the R&D Team had less time to devote to tests. Thus, the growth team, like the A-team of A/B testing, was assembled. They’re responsible for continually running A/B tests to generate the data that directs the company towards a better product. Today’s guest, Aviram Gabay, is a developer on the company’s growth team.

Houston, We Have A Problem Opportunity

A/B testing turns “problems,” such as users getting lost on the product’s on-boarding page, into growth opportunities. The challenge is first to determine which problems are opportunities in disguise, and then to uncover the source of the “problem.”

To uncover the root of the “problem,” Aviram and the growth team create a hypothesis and run tests on what they think is the “problem’s” source. They hypothesized that “the number of buttons on their landing page confused users.” After running the test, and limiting the number of buttons, they discovered users were still lost.

After multiple tests and hypotheses, they uncovered that the buttons cloaked the real cause: a lack of clear direction during the on-boarding process.

With each hypothesis comes a long string of code needed to implement the tests. Approximately 90% of the growth team’s code is never used in the platform.

You might say, A/B testing is not for the sentimental.

1 Product, 1000+ Versions, Domino Effect Not Included

The growth team runs six to seven A/B tests at any given time, while the rest of the company runs an additional four. At any given moment, at least ten tests are running, and since each test splits users into two variants, that creates 2^10 or 1024 versions of the product!
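To see where the 1024 comes from, here is a minimal Python sketch. It assumes exactly ten tests with two variants each (A or B); the test names are placeholders, not the company’s real tests.

```python
from itertools import product

# Hypothetical test names; the article only says about ten tests run at once.
tests = [f"test_{i}" for i in range(1, 11)]

# Each test has two variants, so a "version" of the product is one
# choice of A or B for every test.
versions = list(product("AB", repeat=len(tests)))

print(len(versions))    # number of distinct combinations
print(2 ** len(tests))  # the same number, computed directly
```

Every extra two-variant test doubles the count, which is why the number of versions grows so quickly.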

But how can you identify which test ultimately created the outcome?

The trick to A/B testing is running each test on a different part of the product, without allowing any overlap. If there’s any overlap, it’s impossible to know what made the difference. For example, if the button is altered in size and changed to red, you won’t know which caused the increase of users clicking on it.

But what about a domino effect? How do they know that no other test affected their users and that this specific test is the cause?

The answer is statistics. Every user receives a slightly different version of the product. They may or may not receive a red button, and they may or may not see a one-minute video. Each variable is not dependent on another, so users receive versions of the product at random. This allows you to isolate the data and see what changes led to positive results.
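One common way to get this kind of independent assignment is to hash the user ID together with the test name. The Python sketch below illustrates the idea and is not a description of the growth team’s actual code; the function `variant`, the test names, and the user IDs are all made up for the example.

```python
import hashlib

def variant(user_id: str, test_name: str) -> str:
    """Deterministically assign a user to 'A' or 'B' for one test.

    Hashing the test name together with the user ID means the result
    for one test tells you nothing about the result for another.
    """
    digest = hashlib.sha256(f"{test_name}:{user_id}".encode()).digest()
    return "B" if digest[0] % 2 else "A"

# Hypothetical tests and users, for illustration only.
tests = ["red_button", "intro_video"]
users = [f"user-{i}" for i in range(10_000)]
assignments = {u: {t: variant(u, t) for t in tests} for u in users}

# Among users who got the red button, about half should also get the video.
red_b = [u for u in users if assignments[u]["red_button"] == "B"]
share = sum(assignments[u]["intro_video"] == "B" for u in red_b) / len(red_b)
print(f"Share of red-button users who also see the video: {share:.2f}")
```

Because the video is spread evenly across both button groups, its effect shows up equally on both sides of the button comparison and does not distort it.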

The Growth Team’s Growth Spurt 

In our opinion, the most powerful example of the growth team’s impact was perhaps the simplest. In the past, when users entered the website, they landed on the product’s homepage. There was no introduction or tutorial on how to use the product. The growth team identified this issue, ran some tests, and today there is a video explaining the product’s value.

This one-minute video created a 25% increase in paid users and a 33% increase in monthly recurring revenue (MRR) because users were immediately presented with the product’s value. A modest hypothesis and a few tests led to a significant increase in revenue. Not all victories are that grand, but it’s hardly the growth team’s only knockout.

Getting Ghosted by Your Customers? 

Not all hypotheses and tests are intuitive. Like many startups, the company faced the challenge of deciphering why users enroll in a free trial period and then disappear. Users would register, confirm their email, and go through the signup process only to never use the product.

Did these users not understand the value the product provides?

Maybe they enjoy giving out their email address to strangers?

Even though they constantly tweaked their on-boarding process to increase the number of active users, nothing affected these numbers. Ultimately, the test that brought the best results was the least intuitive. The team added another stage to the on-boarding process and showed users how different departments within one company could use the product.

By showing different potential templates, the company emphasized their product’s value and diverse use cases without forcing them upon their users. This simple change launched a 20% increase in product use. Instead of dragging them through a tutorial, the company gave users the motivation to try the product themselves.

Sample templates for users during an on-boarding process

Two Steps Forward, One Step Back Into A/A Testing

The company ran an A/B test and saw a significant increase in one week and a dramatic decrease in the following week. They were moments away from permanently changing elements of the product when they spotted the decline. Suddenly, they had no idea if the change’s impact was positive or negative.

That’s when they discovered A/A tests, testing two identical “variables” to receive more accurate data.

Over time, the data from the two “blue buttons” equalizes, allowing for a more definitive answer to the question “is this change for the good?”

Moreover, if so, how good is it?

A/A tests proved to them that A/B tests can’t always be conclusive on their own. Sometimes, they require additional testing. Today, the company focuses on running tests that can reach definitive results in what they consider a reasonable amount of time.

Along with implementing A/A tests came the realization that they had previously run A/B tests and immediately made changes to the platform based on inaccurate data. Potentially, something that they thought caused an increase may have done the opposite in the long term, and vice versa. This led them to create a company policy that any A/B test needs to be run on 25,000 users or accounts, depending on the test’s scale.
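A quick simulation shows why small samples can mislead. The Python sketch below runs a pretend A/A test: both groups see the identical product, so any gap between them is pure noise. The 10% conversion rate and the smaller group sizes are assumptions chosen for illustration; only the 25,000 figure comes from the policy above.

```python
import random

def observed_rate(n_users: int, true_rate: float, rng: random.Random) -> float:
    """Simulate n_users visits and return the observed conversion rate."""
    conversions = sum(rng.random() < true_rate for _ in range(n_users))
    return conversions / n_users

rng = random.Random(42)  # fixed seed so the run is repeatable
TRUE_RATE = 0.10         # assumed: both groups share the same real rate

gaps = {}
for n in (250, 2_500, 25_000):
    rate_a1 = observed_rate(n, TRUE_RATE, rng)
    rate_a2 = observed_rate(n, TRUE_RATE, rng)
    gaps[n] = abs(rate_a1 - rate_a2)
    print(f"{n:>6} users per group: A1={rate_a1:.3f} A2={rate_a2:.3f} gap={gaps[n]:.3f}")
```

With only a few hundred users per group, two identical versions can look noticeably different by chance. On average, the gap shrinks as the groups grow, which is the “equalizing” described above.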

Don’t Listen to The Data (Listen to Your Heart)

Sometimes, the company receives concrete data that specific changes would increase critical measurements, but then they don’t implement them. 

These instances are rare, but they do exist. One example is when they learned that creating a blank board increased user engagement. On the other hand, they saw that the users were communicating less amongst themselves. They greatly value users’ communication, and as a result, they decided not to implement this change. Sometimes data contradicts company values, and when that happens, the company’s values remain the priority.

A sample of a blank board

If At First You Don’t Succeed, Try, Try, Try Again 

If A/B testing were an idiom, it would be “trial and error.” Even today, when the growth team starts working on a new hypothesis, they’re usually incorrect. It can take time to understand what to look for, let alone find it.

You’re probably thinking A/B testing sounds like a grueling and tedious process. So, why does the company do it, and why do we think you should too?

The growth team recently showed that in three months, the changes they implemented resulted in ten million dollars of annual revenue!
So, when Aviram promises that a great success makes up for at least twenty failures, we tend to believe him.

Tips for Creating a Successful A/B Test

  • Your hypothesis must be structured so that it can be conclusively proven or disproven by a test. Therefore, a test’s main focus must be generating meaningful data that will prove or disprove your hypothesis.
  • Their A/B tests mostly focus on how to increase the conversion from free trial users into paid subscribers. The company’s free trial period runs for two weeks, so their tests run for two to three weeks. Identify where your opportunities lie and time your tests accordingly.
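To make the first tip concrete, a hypothesis such as “version B converts more trial users into paid subscribers than version A” can be checked with a standard two-proportion z-test. The Python sketch below shows that general statistical technique, which the article does not say the growth team uses, and every number in it is made up.

```python
from math import erf, sqrt

def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Standard normal CDF expressed through the error function.
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    return z, 2 * (1 - cdf)

# Made-up results: 12,500 trial users per group.
z, p = two_proportion_z_test(conv_a=1_000, n_a=12_500, conv_b=1_125, n_b=12_500)
print(f"z = {z:.2f}, p-value = {p:.4f}")
```

A small p-value suggests the difference is unlikely to be noise. A large one means the test has not proven the hypothesis, so the change should not ship on that evidence alone.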
