Segmenting With Sparse MaxDiff Data: How Data Sparseness Impacts Market Research Accuracy

Last updated: 20 Aug 2025


What is MaxDiff and Why Does Data Sparseness Matter?

In some MaxDiff studies we have so many items that we can't easily fit enough questions to show each item the recommended three to four times to each respondent. In those cases, we often opt for a "sparse" MaxDiff, one in which we show each respondent only enough questions to see each item once or twice.

While even a sparse design can recover a sample's true mean utilities with great fidelity, we know that our ability to recover individual respondents' true utilities degrades as the MaxDiff design becomes increasingly sparse (Chrzan 2015; Chrzan and Peitz 2019). This should have implications for segmenting with MaxDiff data, but I had never quantified how much sparseness degrades our ability to recover segments. Recently a client raised exactly this question, so I decided to look into it.

Research Methodology: Testing MaxDiff Segmentation Accuracy

Experimental Setup

I took an existing MaxDiff data set with a large number of items (36) for which I had previously run a four-segment solution using latent class MNL. I took the mean MNL utilities for each segment and treated them as the known (true) utilities for every respondent in that segment. I then copied each segment's set of utilities into 200 rows of data, giving me 200 respondents in each of four segments: 800 respondents in total, whose four unique sets of utilities and whose segment memberships I know with certainty.
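For readers who want to replicate this kind of setup outside our software, here is a minimal sketch in Python of how the synthetic respondent file could be assembled. The file name segment_mean_utilities.csv and the column names are my own illustrative assumptions, not part of the actual study files.

```python
import numpy as np
import pandas as pd

# Assumed input: a 4 x 36 array of segment mean utilities (one row per segment,
# one column per MaxDiff item), exported from the earlier latent class run.
segment_means = np.loadtxt("segment_mean_utilities.csv", delimiter=",")

N_PER_SEGMENT = 200
rows = []
for seg_id, utilities in enumerate(segment_means, start=1):
    for _ in range(N_PER_SEGMENT):
        rows.append({"segment": seg_id,
                     **{f"item_{i + 1}": u for i, u in enumerate(utilities)}})

respondents = pd.DataFrame(rows)  # 800 rows: known utilities plus known segment
respondents.insert(0, "resp_id", range(1, len(respondents) + 1))
```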

MaxDiff Experiment Design

To create the MaxDiff data files for analysis, I programmed three MaxDiff experiments into our Lighthouse Studio software:

  • Experiment 1: 100 versions of 27 sets of quads (each item seen three times)
  • Experiment 2: 100 versions of 18 sets of quads (each item seen twice)
  • Experiment 3: 100 versions of 9 sets of quads (each item seen once)

Respondents in these three experiments thus see each item three times, twice, and just once, respectively. Using the data generator functionality in our software, I had each of my 800 artificial respondents answer all three MaxDiff experiments, using their assigned utilities plus a theoretically appropriate amount of random Gumbel response error. The result was three sets of responses from each respondent, one per experiment.
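The data generation itself happened inside Lighthouse Studio, but the underlying logic is simple enough to sketch. Below is an illustrative Python function showing one way to simulate a single best/worst answer from known utilities with Gumbel error, under a sequential best-then-worst logit assumption; the function name and the unit Gumbel scale are my own choices, not the software's internal code.

```python
import numpy as np

rng = np.random.default_rng(42)

def answer_quad(utilities, item_ids, rng):
    # Simulate one MaxDiff question: add fresh Gumbel(0, 1) error to each shown
    # item's utility and take the maximum as the "best" pick; then, with the best
    # removed and new error draws, pick as "worst" the item whose *negative*
    # utility plus error is largest (the standard logit model for worst choices).
    shown = np.asarray(item_ids)
    u = utilities[shown]
    best = shown[np.argmax(u + rng.gumbel(size=shown.size))]
    remaining = shown[shown != best]
    worst = remaining[np.argmax(-utilities[remaining] + rng.gumbel(size=remaining.size))]
    return best, worst

# Example: a respondent with 36 known utilities answers a quad of items 3, 10, 22, 31.
utilities = rng.normal(size=36)
print(answer_quad(utilities, [3, 10, 22, 31], rng))
```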

Analysis Approach

With both the experimental designs and the response data in hand, I ran latent class MNL on each of the three experiments to produce segments. In all three cases the BIC fit statistic correctly identified a four-segment solution, which was encouraging. When we compare known segment membership to the segment membership estimated from these three analyses, however, we expect some degradation from sparseness: respondents who see each item the recommended minimum of three times should fall into their known segment more often than respondents who see each item only twice or once.
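For readers who want to reproduce the model-selection step, the BIC comparison is just the usual penalized log-likelihood. The sketch below shows the formula and one common way to count parameters for a latent class MNL on effects-coded MaxDiff items; the parameter-counting convention and the choice of n for the penalty are my assumptions, not a description of our software's exact computation.

```python
import numpy as np

def bic(log_likelihood, n_parameters, n_observations):
    # BIC = -2*LL + k*ln(n); among candidate solutions, the lowest BIC wins.
    return -2.0 * log_likelihood + n_parameters * np.log(n_observations)

def n_params(n_segments, n_items=36):
    # A common count for latent class MNL with effects-coded MaxDiff utilities:
    # (n_items - 1) utilities per class plus (n_segments - 1) class-size terms.
    return n_segments * (n_items - 1) + (n_segments - 1)

# With fitted log-likelihoods per candidate number of segments, e.g.
# fitted = {2: ll2, 3: ll3, 4: ll4, 5: ll5}, the preferred solution is
# min(fitted, key=lambda g: bic(fitted[g], n_params(g), n_respondents)).
```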

Key Findings: How Sparse Data Affects Segmentation Accuracy

Adjusted Rand Index Results

And this is exactly what happens. I measured the accuracy of segment assignments using a standard metric, the Adjusted Rand Index (ARI), and found this pattern:

[Table: Adjusted Rand Index (ARI) by number of item views per respondent]

As expected, the accuracy of segment assignments falls as sparseness gets worse.
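The ARI itself is a one-line computation once the known and estimated labels are in hand; here is a minimal sketch using scikit-learn, where true_segment and estimated_segment stand in for the 800 known and latent-class-assigned labels.

```python
import numpy as np
from sklearn.metrics import adjusted_rand_score

# ARI is 1.0 for perfect agreement and roughly 0.0 for chance-level agreement,
# and it does not care how the segment labels themselves are numbered.
true_segment = np.repeat([1, 2, 3, 4], 200)    # the known memberships
estimated_segment = true_segment.copy()        # replace with the latent class assignments
print(adjusted_rand_score(true_segment, estimated_segment))  # 1.0 in this toy case
```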

Segmentation Accuracy Breakdown

A more intuitive way to report this might be to count how often each method puts respondents in the right segments, say by crosstabbing true and estimated segment membership. For example, for the standard MaxDiff, where each respondent sees each item three times, we get this crosstab:

[Crosstab: true vs. estimated segment membership when each item is seen three times]

I've highlighted the cells where the two segmentations match up. Summing the highlighted numbers and dividing by the total sample size of 800 shows that latent class MNL put 92.25% of the respondents into the correct segments.
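Because latent class labels are arbitrary (estimated segment 2 might correspond to true segment 4), the percent-correct calculation first needs the estimated labels matched to the true ones. Here is a sketch of that bookkeeping; the Hungarian matching via SciPy is simply a convenient way to do in code what I did by eye on the crosstab, and the helper name percent_correct is my own.

```python
import numpy as np
import pandas as pd
from scipy.optimize import linear_sum_assignment

def percent_correct(true_segment, estimated_segment):
    # Crosstab of known vs. estimated membership (labels on each axis are arbitrary).
    xtab = pd.crosstab(pd.Series(true_segment), pd.Series(estimated_segment))
    # Match each estimated segment to the true segment it overlaps most, then
    # report the share of respondents falling on that matched diagonal.
    rows, cols = linear_sum_assignment(xtab.values, maximize=True)
    return xtab.values[rows, cols].sum() / xtab.values.sum()

# Toy check: labels permuted but otherwise perfect assignment -> 100% correct.
true = np.repeat([1, 2, 3, 4], 200)
print(f"{percent_correct(true, 5 - true):.2%}")
```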

Doing the same for the other two experiments we can see the deterioration in accuracy of segment assignments that comes from sparseness:

[Table: percentage of respondents correctly assigned, by number of item views per respondent]

Sure enough, as we show each MaxDiff item less often to each respondent, our ability to accurately segment respondents decreases.

Implications for Market Research and Choice Modeling

Study Limitations and Considerations

Of course, this analysis uses robotic respondents, but we have no reason to believe we'd have greater success with human respondents (and in any case we never know the true segment membership of human respondents).

Also, this is just a single study, with one particular pattern of between-segment differences, equal-sized segments, and respondents programmed to answer with equal amounts of response error.

Future Research Directions

This research could usefully be expanded to include:

  • Experiments with different numbers of items
  • Designs with different amounts of sparseness
  • Populations with different numbers of segments
  • Segments with different patterns of utilities
  • Segments of differing sizes
  • Respondents with different amounts of response error

But those will be jobs for another day.

Conclusion: Balancing Practicality with Accuracy in MaxDiff Studies

This research demonstrates the trade-off between practical constraints and segmentation accuracy in MaxDiff studies. While sparse designs can still provide valuable insights, researchers should be aware that reducing item exposure frequency will impact the precision of segment assignments.


References

Chrzan, K. (2015) "A parameter recovery experiment for two methods of MaxDiff with many items," Sawtooth Software research paper, available at https://sawtoothsoftware.com/resources/technical-papers/a-parameter-recovery-experiment-for-two-methods-of-maxdiff-with-many-items

Chrzan, K. and M. Peitz (2019) "Best-worst scaling with many items," Journal of Choice Modelling, 30: 61-72.