Sample Size and Power Analysis for MaxDiff, Parts 1 and 2

Keith Chrzan

Last updated: 10 Oct 2019

Imagine that we need to estimate the sample size required for a MaxDiff experiment.

Part I – How Large of a Utility CAN We Detect?

Step One: Use Lighthouse Studio to make the design. Then, in the Test menu, choose Generate Data and ask the software to generate 400 respondents. Once the respondents have been generated, select Get Data and then go to the Analysis Manager to run Logit analysis to estimate the utilities. Imagine that the results show standard errors of 0.060, on average, for the items. We’ll use this estimate of the standard error in just a bit.

Step Two: Choose the Z statistic that matches the confidence and power we want. For a short but very clear online treatment of where the following numbers come from, visit: http://sphweb.bumc.bu.edu/otlt/MPH-Modules/BS/BS704_Power/BS704_Power_print.html. If we want 95% confidence and 50% power (i.e. a 50/50 chance of finding a significant difference) then we use 1.96 (the Z statistic for 95% confidence; the Z value corresponding to 50% power is 0, so it adds nothing). If we want 95% confidence and 80% power, we use a combined Z statistic of 2.80 (which is the 1.96 from 95% confidence plus 0.84, the Z value corresponding to 80% power). By similar math, if we want 95% confidence and 70% power, we use the combined Z of 2.485 (1.96 from confidence plus 0.525 from power). Academic researchers usually advise having 70% or 80% power.
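As a rough check on the Step Two numbers, the combined Z statistic can be computed from the standard normal distribution. The sketch below uses only the Python standard library; the function name combined_z is our own illustration, not something from Lighthouse Studio. It assumes a two-sided test, which is what the 1.96 value for 95% confidence implies. Exact values differ slightly in the third decimal place from the rounded figures quoted above.

```python
from statistics import NormalDist


def combined_z(confidence: float, power: float) -> float:
    """Return the two-sided critical Z for `confidence` plus the Z for `power`.

    Hypothetical helper for illustration; assumes a two-sided test.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    z_confidence = std_normal.inv_cdf(1 - (1 - confidence) / 2)
    z_power = std_normal.inv_cdf(power)
    return z_confidence + z_power


for power in (0.50, 0.70, 0.80):
    print(f"95% confidence, {power:.0%} power: Z = {combined_z(0.95, power):.3f}")
```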

Step Three: Now we multiply the results from Step One and Step Two together. For our MaxDiff with 400 respondents and a 95% confidence level, we have an 80% chance of detecting as significant utilities with an absolute value of 0.168 (that is, 0.060 times 2.80) or greater. We have a 70% chance of detecting utilities with an absolute value of 2.485 * 0.060 = 0.149 or greater. And so on.
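The Step Three multiplication is simple enough to do by hand, but a small sketch makes it easy to try other standard errors or power levels. The function name min_detectable_utility is hypothetical, and the inputs are the rounded values from the text (a standard error of 0.060 and combined Z values of 2.80 and 2.485).

```python
def min_detectable_utility(std_error: float, z_combined: float) -> float:
    """Smallest absolute utility detectable at the confidence and power behind z_combined."""
    return std_error * z_combined


SE_AT_400 = 0.060  # average standard error from the 400 generated respondents

# (power, combined Z at 95% confidence), using the article's rounded values
for power, z in ((0.80, 2.80), (0.70, 2.485)):
    utility = min_detectable_utility(SE_AT_400, z)
    print(f"{power:.0%} power: detectable utility = {utility:.3f}")
```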

Step Four: Given the results of Steps One through Three, we know that we can detect, with 95% confidence and 80% power, utilities of 0.168 or larger. Because standard errors shrink in proportion to the square root of the sample size, we can make an estimate for any number of respondents by multiplying that 0.168 by the square root of 400/n, where n is the sample size. So, for a sample size of 1,200, we get SQRT(400/1200) = 0.577 and we multiply this by our detectable utility from Step Three: 0.577 * 0.168 = 0.097 (we have an 80% chance of detecting, at 95% confidence, utilities with an absolute value of 0.097 or larger). Similarly, if we increase our sample size to 4,000, then we have 0.168 * SQRT(400/4000) = 0.053. Increasing the sample size gives us a more precise experiment, one that enables us to detect smaller utilities as significant.
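The Step Four rescaling can be sketched as follows. The function name detectable_at_n is our own, and the sketch assumes, as the text does, that the standard error scales with the square root of the ratio of sample sizes.

```python
import math


def detectable_at_n(baseline_utility: float, baseline_n: int, n: int) -> float:
    """Rescale a detectable utility from baseline_n respondents to n respondents.

    Assumes the standard error is proportional to 1 / sqrt(sample size).
    """
    return baseline_utility * math.sqrt(baseline_n / n)


BASELINE_UTILITY = 0.168  # detectable at n = 400, 95% confidence, 80% power

for n in (400, 1200, 4000):
    print(f"n = {n}: detectable utility = {detectable_at_n(BASELINE_UTILITY, 400, n):.3f}")
```

Running this for 1,200 and 4,000 respondents gives the 0.097 and 0.053 figures worked out above.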

Part II – How Large of a Utility SHOULD We Detect?

When we think about how large a utility we want to be able to detect, we’re thinking about what’s called effect size. For logit models like those we use to estimate MaxDiff utilities, we typically express effect sizes in terms of choice probabilities. For example, maybe we want to be able to detect when an item is 25% more likely to be selected than an average item. Since the average item on a zero-centered (“raw”) scale has a utility of 0.00, by the logit choice rule we’d exponentiate the 0.00 to get 1.00, and an item 25% more likely to be chosen would have an exponentiated utility of 1.25, or a raw utility of ln(1.25) = 0.223. In this case we want to be able to detect a utility of 0.223, and the original sample size of 400 in the example above would suffice (because it can detect, with 80% power, utilities of 0.168 or larger). In fact, we could get by with a sample smaller than 400: we need 0.168 * SQRT(400/n) to be no larger than 0.223, and solving for n gives a sample size of 227.
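The effect-size conversion and the solve for n can be sketched in a few lines. Rearranging 0.168 * SQRT(400/n) = target gives n = 400 * (0.168 / target)^2, rounded up to a whole respondent. The function name required_sample_size is hypothetical, and the sketch uses the unrounded ln(1.25) rather than 0.223.

```python
import math


def required_sample_size(baseline_utility: float, baseline_n: int,
                         target_utility: float) -> int:
    """Smallest n at which the detectable utility is no larger than target_utility.

    Inverts detectable = baseline_utility * sqrt(baseline_n / n) for n.
    """
    return math.ceil(baseline_n * (baseline_utility / target_utility) ** 2)


# An item 25% more likely to be chosen than the average item (utility 0.00)
target = math.log(1.25)  # raw utility, about 0.223

n_needed = required_sample_size(0.168, 400, target)
print(f"target utility = {target:.3f}, required sample size = {n_needed}")
```

With the values from the example this reproduces the sample size of 227. Changing 1.25 to another ratio shows how quickly the required sample grows as the effect we want to detect shrinks.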

Summary

Knowing how to think about effect size and how to use Lighthouse Studio to test a MaxDiff design allows you to calculate the sample size for your MaxDiff experiments.