What is SUTVA and What to Do When It’s Violated in Practice

Imagine we are running an A/B test. For user i, Wi=0 indicates control and Wi=1 indicates treatment, so user 1, for example, has two potential outcomes: Y10 (under control) and Y11 (under treatment).

We can write the experiment outcomes as the table below only if the Stable Unit Treatment Value Assumption (SUTVA) holds: each user’s outcome depends only on the treatment that user was assigned, not on the treatments assigned to other users. This is why the columns for user i do not include Wi’ for any other user i’≠i.
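As a minimal sketch of what SUTVA buys us, the snippet below stores two potential outcomes per user and reads off the observed one from that user's own assignment only. The numbers, the `potential_outcomes` list, and the `observed_outcome` helper are made up for illustration and are not taken from any real experiment.

```python
# Hypothetical potential outcomes: y0 under control (W=0), y1 under treatment (W=1).
potential_outcomes = [
    {"user": 1, "y0": 10.0, "y1": 12.0},
    {"user": 2, "y0": 8.0, "y1": 9.0},
    {"user": 3, "y0": 11.0, "y1": 11.0},
]

# Hypothetical assignment W_i for each user.
assignments = {1: 1, 2: 0, 3: 1}


def observed_outcome(row, assignments):
    """Under SUTVA, only the user's own assignment selects the observed outcome."""
    w = assignments[row["user"]]
    return row["y1"] if w == 1 else row["y0"]


observed = {row["user"]: observed_outcome(row, assignments) for row in potential_outcomes}

# Flip user 2's assignment: under SUTVA, user 1's observed outcome cannot change.
flipped = dict(assignments)
flipped[2] = 1
observed_after_flip = {
    row["user"]: observed_outcome(row, flipped) for row in potential_outcomes
}

print(observed)
print(observed_after_flip)
```

If SUTVA failed, each user would need a separate potential outcome for every combination of everyone's assignments, and a two-column table per user would no longer be enough.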

In practice, this assumption is easily violated in a two-sided marketplace. For example, at a food delivery company, a treatment that improves one user’s experience might make a control user’s experience worse if the two share the same delivery person. This makes the treatment group look better than it really is, and the apparent effect can shrink or disappear once the treatment is fully rolled out, because treated users then compete with one another rather than with the control group.
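The toy simulation below illustrates this bias under strong simplifying assumptions: a fixed number of "fast" courier slots, a treatment that only moves treated orders to the front of the dispatch queue, and made-up delivery times. None of these names or numbers come from a real system.

```python
import random

FAST_MINUTES = 20.0  # hypothetical delivery time when a fast courier slot is available
SLOW_MINUTES = 40.0  # hypothetical delivery time otherwise


def simulate_delivery_times(assignments, n_fast_slots, seed=0):
    """Treated orders (1) are dispatched before control orders (0); ties are random."""
    rng = random.Random(seed)
    tie_break = [rng.random() for _ in assignments]
    dispatch_order = sorted(
        range(len(assignments)), key=lambda i: (-assignments[i], tie_break[i])
    )
    times = [SLOW_MINUTES] * len(assignments)
    for i in dispatch_order[:n_fast_slots]:
        times[i] = FAST_MINUTES
    return times


def mean(values):
    return sum(values) / len(values)


n_users, n_fast_slots = 1000, 300

# 50/50 user-level A/B test: treated users take fast slots away from control users.
ab_assignments = [1] * (n_users // 2) + [0] * (n_users // 2)
random.Random(42).shuffle(ab_assignments)
ab_times = simulate_delivery_times(ab_assignments, n_fast_slots)
treated_mean = mean([t for t, w in zip(ab_times, ab_assignments) if w == 1])
control_mean = mean([t for t, w in zip(ab_times, ab_assignments) if w == 0])
naive_effect = treated_mean - control_mean

# Full rollout vs. no rollout: the same couriers serve everyone either way.
all_treated_mean = mean(simulate_delivery_times([1] * n_users, n_fast_slots))
all_control_mean = mean(simulate_delivery_times([0] * n_users, n_fast_slots))
rollout_effect = all_treated_mean - all_control_mean

print(f"naive A/B estimate: {naive_effect:+.1f} minutes")
print(f"effect at full rollout: {rollout_effect:+.1f} minutes")
```

In this setup the user-level test reports a large improvement in delivery time, while the full-rollout comparison shows none, because the treatment only reshuffles who gets the scarce fast slots.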

To address this, companies like DoorDash and Uber use switchback tests. All users are assigned to one variant for a set amount of time, then everyone is switched to the other variant for the next period, and this is repeated multiple times.
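A switchback assignment can be sketched as a function of time (and, optionally, region) rather than of the user. The version below simply alternates variants every fixed window and picks a random starting variant per region; the window length, region names, and function names are assumptions for illustration, not a description of how DoorDash or Uber implement it.

```python
import random
from datetime import datetime, timedelta

VARIANTS = ("control", "treatment")


def pick_first_variants(regions, seed=0):
    """Randomly choose which variant each region starts with (an assumption of this sketch)."""
    rng = random.Random(seed)
    return {region: rng.randrange(2) for region in regions}


def variant_at(first_variants, region, ts, start, window_minutes):
    """Return (variant, window_index) for a request in `region` at time `ts`."""
    elapsed = ts - start
    if elapsed < timedelta(0):
        raise ValueError("timestamp is before the experiment start")
    window_index = int(elapsed.total_seconds() // (window_minutes * 60))
    # Alternate variants from one window to the next.
    variant = VARIANTS[(first_variants[region] + window_index) % 2]
    return variant, window_index


start = datetime(2024, 1, 1, 0, 0)
first_variants = pick_first_variants(["north", "south"], seed=7)

# Two requests in the same region and window share a variant; the next window flips it.
v_a, w_a = variant_at(first_variants, "north", datetime(2024, 1, 1, 0, 5), start, 30)
v_b, w_b = variant_at(first_variants, "north", datetime(2024, 1, 1, 0, 25), start, 30)
v_c, w_c = variant_at(first_variants, "north", datetime(2024, 1, 1, 0, 35), start, 30)
print(v_a, w_a, v_b, w_b, v_c, w_c)
```

Because every user in a region sees the same variant within a window, users no longer compete across variants for the same couriers during that window.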

Two things to highlight for the Switchback test method:

To minimize carryover effects from one period to the next, ignore the data collected immediately after each switch, until the system has returned to equilibrium.
In the analysis, the randomization unit is not the individual user but the time unit (or the region-time unit).
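Both points can be sketched in a few lines: drop the orders that fall inside a burn-in period after each switch, collapse the rest to one mean per window, and compare variants using windows as the unit of analysis. The record fields, the burn-in length, and the simple two-sample standard error are assumptions for illustration; a production analysis may use a more careful variance estimate.

```python
from math import sqrt
from statistics import mean, stdev


def analyze_switchback(orders, burn_in_minutes):
    """Estimate the effect with one observation per window, after dropping burn-in data."""
    per_window = {}
    for order in orders:
        if order["minutes_since_switch"] < burn_in_minutes:
            continue  # discard data collected right after a switch
        key = (order["window_id"], order["variant"])
        per_window.setdefault(key, []).append(order["outcome"])

    # One mean per window: the window, not the order, is the randomization unit.
    window_means = {"control": [], "treatment": []}
    for (_, variant), outcomes in per_window.items():
        window_means[variant].append(mean(outcomes))

    t, c = window_means["treatment"], window_means["control"]
    effect = mean(t) - mean(c)
    std_error = sqrt(stdev(t) ** 2 / len(t) + stdev(c) ** 2 / len(c))
    return effect, std_error, len(t), len(c)


# Hypothetical orders: (window_id, variant, minutes_since_switch, outcome in minutes).
raw = [
    (0, "control", 2, 100.0), (0, "control", 20, 30.0), (0, "control", 40, 32.0),
    (1, "treatment", 5, 100.0), (1, "treatment", 20, 26.0), (1, "treatment", 40, 28.0),
    (2, "control", 3, 100.0), (2, "control", 20, 34.0), (2, "control", 40, 36.0),
    (3, "treatment", 4, 100.0), (3, "treatment", 20, 30.0), (3, "treatment", 40, 32.0),
]
orders = [
    {"window_id": w, "variant": v, "minutes_since_switch": m, "outcome": y}
    for w, v, m, y in raw
]

effect, std_error, n_treatment_windows, n_control_windows = analyze_switchback(
    orders, burn_in_minutes=10
)
print(effect, std_error, n_treatment_windows, n_control_windows)
```

Note that the sample size here is the number of windows, not the number of orders, so a switchback test typically needs many windows to reach useful precision.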

Switch Back Test (DoorDash)

Related Readings: