Metrics

Metrics are what your experiments are trying to improve (or at least not hurt). GrowthBook has a very flexible and powerful way to define metrics.

Conversion Types

Metrics can have different units and statistical distributions. Below are the ones GrowthBook supports:

Conversion Type	Description	Example
binomial	A simple yes/no conversion	Created Account
count	Counts multiple conversions per user	Pages per Visit
duration	How much time something takes on average	Time on Site
revenue	The revenue gained/lost on average	Revenue per User

Need a metric type we don't support yet? Let us know!

Query settings

For metrics to work, you need to tell GrowthBook how to query the data from your data source. There are a few ways to do this depending on your data source.

1. SQL (recommended)

If your data source supports SQL, this is the preferred way to define metrics. You can use joins, subselects, or anything else supported by your SQL dialect.

Your SELECT statement should return one row per "conversion event". This may be a page view, a purchase, a session, or something else. The end result should look like this:

user_id	timestamp	value
123	2021-08-23 12:45:14	10
456	2021-08-23 12:45:15	5.25

Metrics can support one or more types of identifiers. The above example assumes the metric only supports a single id type called user_id, but you would add additional columns if you need to support other ones.

Binomial metrics don't need a value column (the existence of a row means the user converted). For other metric types, the value represents the count, duration, or revenue from that single conversion event. In the case of multiple rows for a single user, the values will be summed together.

If you use Segment to populate your data warehouse, the SQL for a Revenue per User metric might look like this:

SELECT
  -- Assuming you support 2 identifier types - 'user_id' and 'anonymous_id'
  user_id as user_id,
  anon_id as anonymous_id,
  received_at as timestamp,
  grand_total as value
FROM
  purchases

SQL Template Variables

Within your queries, there are several placeholder variables you can use. These will be replaced with strings before being run. This can be useful for giving hints to SQL optimization engines to improve query performance.

The variables are:

startDate - YYYY-MM-DD HH:mm:ss of the earliest data that needs to be included
startYear - Just the YYYY of the startDate
startMonth - Just the MM of the startDate
startDay - Just the DD of the startDate
startDateUnix - Unix timestamp of the startDate (seconds since Jan 1, 1970)
endDate - YYYY-MM-DD HH:mm:ss of the latest data that needs to be included
endYear - Just the YYYY of the endDate
endMonth - Just the MM of the endDate
endDay - Just the DD of the endDate
endDateUnix - Unix timestamp of the endDate (seconds since Jan 1, 1970)
experimentId - Either a specific experiment id OR % if you should include all experiments

For example:

SELECT
  user_id as user_id,
  received_at as timestamp
FROM
  registrations
WHERE
  received_at BETWEEN '{{ startDate }}' AND '{{ endDate }}'

Note: The inserted values do not have surrounding quotes, so you must add those yourself (e.g. use '{{ startDate }}' instead of just {{ startDate }})

Denominator (Ratio / Funnel Metrics)

By default, metrics are evaluated against all users in an experiment: (# users who converted) / (# users in experiment)

You can instead choose another metric to use as the denominator.

Funnel Metrics

When the denominator is a simple binomial (conversion) metric, then it acts just like an "activation metric" in an experiment. It filters the users who are included in the analysis to those who first convert on this denominator metric.

For example, if you want to look at what percent of users checkout after viewing a cart, it can be described as % checkout / % viewed cart. This requires creating two metrics:

Viewed Cart - selects all users who viewed a cart
Viewed Cart -> Checkout - selects all users who checked out and picks Viewed Cart as the denominator.

Ratio Metrics

When the denominator is a count metric, then things are a little different. Instead of acting like a filter, we calculate both metrics and treat the value as a ratio.

The mean is sum(metric) / sum(denominator)
The standard deviation is calculated using the Delta method

For example, if you want to look at the Average Order Value (AOV), what you're really looking for is total revenue / number of orders. This also requires creating two metrics:

Orders per User - selects the count of orders for each user
AOV - selects total revenue per user and picks Orders per User as the denominator.

2. Javascript (Mixpanel only)

We query Mixpanel data sources using their proprietary JQL language based on Javascript. This allows for extreme flexibility when defining metrics.

All metrics at minimum need to specify an Event Name which must exactly match what is used in Mixpanel. You can use OR to match against multiple events. For example viewed_cart OR purchased

You can optionally add Conditions which filters the events further based on properties. For example, if Event Name is Page view, you can add a condition path = "/blog".

Count, duration, and revenue metrics have two additional steps. We first extract all event values for a user into an array and then reduce that array down to a single number, which is the final metric value for the user.

Conditions

Conditions in Mixpanel are very powerful. They consist of a Property, an Operator, and a Value. Multiple conditions are joined with an AND.

The Property can either be the name of an event property or a javascript expression. Some examples:

amount, equivalent to event.properties.amount
event.time
event.properties.city + ", " + event.properties.country

The Operator is one of the following:

equals
does not equal
is greater than
is greater than or equal to
is less than
is less than or equal to
matches the regex
does not match the regex
custom javascript

The custom javascript operator is special. The Value is a javascript expression that evaluates to either true or false (access the property value with the value variable). It lets you do arbitrarily complex filtering. For example:

value < 5 || value > 10;

For all other operators, the condition should read like an English sentence. For example:

// property operator value
country equals US

// Will render as
event.properties.country == "US"

Event Value

For count, revenue, and duration metrics, we need to know what the "value" of the event is.

The Event Value is a Javascript expression to extract a value from a raw Mixpanel event. If you are just extracting a single property as-is, you can just enter the property name as a shortcut. Otherwise, you can reference the event variable in your expression.

Here are some example Event Value expressions:

grand_total, equivalent to event.properties.grand_total
1 (hard-code the value to a specific number)
(event.properties.endTime - event.properties.startTime) / (60 * 60) (difference in hours between two unix timestamps)
new Date(event.time).toISOString().substr(0, 10) (event timestamp in YYYY-MM-DD format)

For count metrics, you can leave Event Value blank and it will default to hard-coding the value to 1, which is perfect for when you just want to count the number of events and don't care about specific properties.

User Value Aggregation

For count, revenue, and duration metrics, we need to know how to aggregate the event values together, in case a single user has multiple matching events.

The User Value Aggregation is another Javascript expression that reduces an array of Event Values to a single number (or null if the user did not convert). Reference the variable values in your expression. There are a few built-in helper functions:

count(values)
countDistinct(values)
sum(values)
min(values)
max(values)
avg(values)
median(values)
percentile(values, p) (p is a number between 0 and 100)

You can use your own custom expression too if you want. For example, this is the equivalent of sum(values):

values.reduce((sum, n) => sum + n, 0);

If the aggregation is left blank, we do sum(values) by default.

3. SQL Query Builder (legacy)

The query builder prompts you for things such as table/column names and constructs a SQL query behind the scenes.

We only recommend this for extremely simple metrics. Inputting raw SQL is far more flexible.

Behavior

The behavior tab lets you tweak how the metric is used in experiments. Depending on the metric type and datasource you chose, some or all of the following will be available:

What is the Goal?

For the vast majority of metrics, the goal is to increase the value. But for some metrics like "Bounce Rate" and "Page Load Time", lower is actually better.

Setting this to "decrease" basically inverts the "Chance to Beat Control" value in experiment results so that "beating" the control means decreasing the value. This will also reverse the red and green coloring on graphs.

Capped Value

Large outliers can have an outsized effect on experiment results. For example, if your normal order size is $10 and someone happens to make a $5000 order, whatever variation that person is in will automatically "win" any experiment even if it had no effect on their behavior.

If set above zero, all values will be capped at this value. So in the above example, if you set the cap to $100, the $5000 purchase will still be counted, but only as $100 and will have a much smaller effect on the results. It will still give a boost to whatever variation the person is in, but it won't completely dominate all of the other orders and is unlikely to make a winner just on its own.

Conversion Delay

Conversions within the first X hours of being put into an experiment are ignored (default = 0). This is useful for metrics like "day 2 retention". In that case, you would set conversion delay to 24 hours..

Negative conversion delays

The conversion delay can also be negative to include some conversions before a user is put into an experiment. For example, a value of -2 would mean conversions up to 2 hours before will be included. You might be wondering when this would ever be useful.

Imagine the average person stays on your site for 60 seconds and your experiment can trigger at any time.

If you just look at the average time spent after the experiment, the numbers will lose a lot of meaning. A value of 20 seconds might be horrible if it happened to someone after only 5 seconds on your site since they are staying a lot less time than average. But, that same 20 seconds might be great if it happened to someone after 55 seconds since their visit is a lot longer than usual. Over time, these things will average out and you can eventually see patterns, but you need an enormous amount of data to get to that point.

If you set the conversion delay to something negative, say -0.5 (30 minutes), you can reduce the amount of data you need to see patterns. For example, you may see your average go from 60 seconds to 65 seconds.

Keep in mind, these two things are answering slightly different questions. How much longer do people stay after viewing the experiment? vs How much longer is an average session that includes the experiment?. The first question is more direct and often a more strict test of your hypothesis, but it may not be worth the extra running time.

Conversion Window

The number of hours the user has to convert after the conversion delay (default = 72). Any conversions that happens after this time will be ignored. As always there's a tradeoff here.

The lower you set this, the more you can be sure that your experiment directly contributed to the metric conversion. For example, a user who views your new checkout page and then completes the purchase right away is a much stronger signal than someone viewing your checkout page and then finally completing the purchase 7 days later.

However, setting this too low can exclude valid conversions. In the above example, maybe your new checkout page is more memorable and makes people more likely to return later. With a short conversion window you wouldn't be able to capture this data.

If you return multiple rows for the same user/experiment, we unique the list (ie: only counts as one user in the experiment) and count the earliest timestamp. From that point, the user has 72 hours (or whatever you change the window to) to convert.

Converted Users Only

This setting is deprecated and has been replaced by "Denominator" above.