Can I Compare Utilities Across Different Studies?

Last Updated: 01 Oct 2014

I ran two studies in the past year. Is it appropriate to compare the utilities from study A to study B?

The answer to this question, like many in conjoint analysis, is: it depends. If the two studies have the same design, meaning the attributes and levels presented to respondents are the same, then it is appropriate to compare utilities across studies, just as it would be appropriate to compare utilities across respondents within a study. Keep in mind that rescaled utilities, such as zero-centered diffs, are the appropriate choice when making comparisons across respondents. For additional reading about utilities, please see our article on Interpreting the Output of Conjoint Analysis.
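As a rough illustration of what a rescaling like zero-centered diffs does, here is a minimal Python sketch. It assumes the usual description of the method: each attribute's part-worths are centered on zero, then all of them are multiplied by one factor so that the best-minus-worst ranges sum to 100 times the number of attributes. The function name, the input layout, and the raw numbers are hypothetical and chosen only for this example; this is not the code used by any particular software package.

```python
def zero_centered_diffs(raw):
    """Rescale one respondent's raw part-worths (hypothetical helper).

    raw: dict mapping attribute name -> list of raw utilities, one per level.
    """
    # Step 1: center each attribute's levels so they sum to zero.
    centered = {}
    for attribute, utils in raw.items():
        mean = sum(utils) / len(utils)
        centered[attribute] = [u - mean for u in utils]

    # Step 2: add up the best-minus-worst range of every attribute.
    total_range = sum(max(utils) - min(utils) for utils in centered.values())
    if total_range == 0:
        return centered  # a completely flat respondent cannot be rescaled

    # Step 3: one multiplier so the ranges sum to 100 * number of attributes.
    scale = 100 * len(centered) / total_range
    return {a: [u * scale for u in utils] for a, utils in centered.items()}


# Made-up raw utilities for one respondent (illustration only).
raw_utils = {"brand": [0.2, 0.4, 1.6, 2.2], "price": [1.0, 0.0, -1.0]}
rescaled = zero_centered_diffs(raw_utils)

# The same respondent with every raw utility tripled (a different scale factor).
tripled = {a: [3 * u for u in utils] for a, utils in raw_utils.items()}
rescaled_tripled = zero_centered_diffs(tripled)
```

Under these assumptions, rescaled and rescaled_tripled come out the same (up to rounding), which is the reason a rescaling of this kind is preferred when comparing respondents: differences in raw scale between respondents are removed before the comparison.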

If, however, your designs differ across studies, it would not be appropriate to make comparisons, since utility scores reflect the relative preference for, and the importance of the differences among, the levels included in that particular study. Here are two examples to illustrate.

Example 1: A study is run on automobile preferences without a pricing attribute. You might come up with a set of utilities for a respondent like this:

Kia	Ford	BMW	Ferrari
-67	-59	35	91

If you then field the same study and include a price attribute, that same respondent might not ever pick a Ferrari in the study, and their utilities might change to something like this:

Kia	Ford	BMW	Ferrari
24	-32	38	-30

The respondent hasn't changed, but the situation has. The respondent would probably still prefer to drive a Ferrari, but when price enters into the picture, we (probably correctly) model that respondent as being very unlikely to purchase a Ferrari.

Example 2: A similar example can be created by changing levels within an attribute, rather than changing attributes as above. Perhaps a study was done on ice cream and some traditional flavors were included, such as Vanilla, Chocolate, and Strawberry. The utility estimation would reflect these flavors having utility relative to each other. If a new study was done that included some potentially terrible options, like avocado bubblegum or broccoli-spinach marmalade, the utility estimates for Vanilla, Chocolate, and Strawberry would now be relative to these terrible options, and their values would shift up.
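The upward shift in the ice cream example can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the raw preference values are invented, the respondent's underlying liking for the traditional flavors is held fixed across both studies, and the only rescaling applied is zero-centering within the flavor attribute.

```python
def zero_center(utils):
    """Center one attribute's utilities so they average to zero."""
    mean = sum(utils.values()) / len(utils)
    return {level: u - mean for level, u in utils.items()}


# Hypothetical raw utilities for the traditional flavors (same in both studies).
study_a_raw = {"Vanilla": 1.0, "Chocolate": 1.2, "Strawberry": 0.8}

# The second study adds two strongly disliked levels to the same attribute.
study_b_raw = {
    **study_a_raw,
    "Avocado bubblegum": -3.0,
    "Broccoli-spinach marmalade": -3.5,
}

study_a = zero_center(study_a_raw)
study_b = zero_center(study_b_raw)

for flavor in study_a_raw:
    print(flavor, round(study_a[flavor], 2), round(study_b[flavor], 2))
```

With these made-up numbers, Vanilla moves from about 0 in the first study to about 1.7 in the second, even though nothing about the respondent's taste for Vanilla changed. The gaps between the traditional flavors stay the same in this sketch; it is their position relative to zero that depends on which other levels were in the design.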

In both examples, it would not be defensible to use the analysis to judge whether the brand image of Ferrari has changed over time, or whether the samples used for the two ice cream studies differed in how much they liked Strawberry.