Techniques and tools
CHAID Analysis

A very common objective of research is to find subgroups in a market that differ markedly on some observed measure, such as frequency of service use (daily, several times a week, weekly, less than once a week).

The usual response to this type of question is to produce reams of charts breaking down such measures by different demographic variables. As well as adding to the project budget (producing and checking this many charts takes time), this often fails to give decision makers adequate guidance, as the following example illustrates.

The key point is to realise that, in the real world, differences in behaviour are explained by higher-order combinations of demographic and other variables.

For example, working females aged 18-24, from large households, who are heavy users of text messaging may use the service most frequently, whereas non-working males aged 45+, with children and without mobile phones, may use it least frequently. This is a fairly simple illustration, but it is easy to see that the same approach can yield a great deal of insight when the number of profiling variables being combined is large.

By focusing only on the “main effects” of variables (e.g. gender in the whole population), we gain no information on how their explanatory impact differs within particular population subgroups. For instance, although text messaging is one of the best variables for defining the heaviest users, we might find that in the population as a whole its impact is insignificant. Ironically, by failing to drill down into the data in a way that takes account of these types of interaction, we fail to see the big picture.
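The point about main effects can be made concrete with a small worked example. The counts below are entirely hypothetical (they are not survey results), and the names `cells`, `under_25` and `texter` are illustrative assumptions. The sketch shows a variable whose effect vanishes in the whole population yet is large within each age group.

```python
# Hypothetical counts, invented for illustration only:
# (under_25, texter) -> (heavy users, respondents)
cells = {
    (True, True): (80, 100),
    (True, False): (40, 100),
    (False, True): (20, 100),
    (False, False): (60, 100),
}

def heavy_rate(keep):
    """Share of heavy users among the cells selected by keep(under_25, texter)."""
    heavy = sum(h for (u, t), (h, n) in cells.items() if keep(u, t))
    total = sum(n for (u, t), (h, n) in cells.items() if keep(u, t))
    return heavy / total

# Main effect of texting in the whole population: no difference at all.
overall_texters = heavy_rate(lambda u, t: t)          # 100 / 200
overall_non_texters = heavy_rate(lambda u, t: not t)  # 100 / 200

# The same variable within the under-25 subgroup: a large difference.
young_texters = heavy_rate(lambda u, t: u and t)          # 80 / 100
young_non_texters = heavy_rate(lambda u, t: u and not t)  # 40 / 100

print(overall_texters, overall_non_texters)  # 0.5 0.5
print(young_texters, young_non_texters)      # 0.8 0.4
```

In this made-up population, a chart of heavy use by texting alone would show two identical bars, even though texting doubles the heavy-use rate among the under-25s and reverses direction among older respondents.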

The type of drill-down described can be achieved with a technique called CHAID (Chi-squared Automatic Interaction Detection) Analysis. The analysis produces a tree-like structure, in which each node (point) represents a particular subgroup whose members share similar characteristics with regard to the dependent variable (frequency of use in our example). Each new node in the tree is split further into subgroups using the predictor variable found to discriminate most on the dependent variable. The process stops once some minimum subgroup sample size has been reached, or when no remaining variable discriminates well on the dependent variable. The branches of the final tree tell us which combinations of predictor variables are associated with the highest and the lowest values of our dependent variable. The highest and lowest extreme groups are shown for our example in the chart below.
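The splitting procedure described above can be sketched in a few lines of Python. This is a simplified illustration rather than a full CHAID implementation: it assumes binary predictors and a binary dependent variable (so every test has one degree of freedom and the raw chi-square statistics are comparable), and it omits the category merging and Bonferroni-adjusted p-values that full CHAID uses. The function names, the `min_size` and `threshold` parameters, and the data are all hypothetical.

```python
from collections import Counter

def chi_square(rows, predictor, target):
    """Pearson chi-square statistic for the predictor x target cross-tab."""
    n = len(rows)
    cell = Counter((r[predictor], r[target]) for r in rows)
    row_tot = Counter(r[predictor] for r in rows)
    col_tot = Counter(r[target] for r in rows)
    stat = 0.0
    for a in row_tot:
        for b in col_tot:
            expected = row_tot[a] * col_tot[b] / n
            stat += (cell[(a, b)] - expected) ** 2 / expected
    return stat

def grow_tree(rows, predictors, target, min_size=20, threshold=3.841):
    """Split each node on the predictor with the largest chi-square statistic.

    threshold=3.841 is the 5% critical value for 1 degree of freedom, so this
    sketch is only valid for binary predictors and a binary (0/1) target.
    """
    node = {"n": len(rows),
            "heavy_rate": sum(r[target] for r in rows) / len(rows),
            "split_on": None, "children": {}}
    if len(rows) < 2 * min_size or not predictors:
        return node                      # too small to split, or nothing left
    scores = {p: chi_square(rows, p, target) for p in predictors}
    best = max(scores, key=scores.get)
    if scores[best] < threshold:
        return node                      # no predictor discriminates well
    groups = {}
    for r in rows:
        groups.setdefault(r[best], []).append(r)
    if min(len(g) for g in groups.values()) < min_size:
        return node                      # a subgroup would be too small
    node["split_on"] = best
    remaining = [p for p in predictors if p != best]
    for value, group in groups.items():
        node["children"][value] = grow_tree(group, remaining, target,
                                            min_size, threshold)
    return node

def leaves(node, path=()):
    """List (path, size, heavy_rate) for every terminal node of the tree."""
    if node["split_on"] is None:
        return [(path, node["n"], node["heavy_rate"])]
    found = []
    for value, child in node["children"].items():
        found += leaves(child, path + ((node["split_on"], value),))
    return found

def make_rows(female, texter, heavy, light):
    """Build hypothetical respondents for one demographic cell."""
    base = {"female": female, "texter": texter}
    return [dict(base, heavy=1)] * heavy + [dict(base, heavy=0)] * light

# Invented data: 100 respondents per cell, heavy=1 marks a heavy user.
rows = (make_rows(1, 1, 80, 20) + make_rows(1, 0, 20, 80)
        + make_rows(0, 1, 40, 60) + make_rows(0, 0, 40, 60))

tree = grow_tree(rows, ["female", "texter"], "heavy")
segments = sorted(leaves(tree), key=lambda leaf: leaf[2], reverse=True)
for path, size, rate in segments:
    print(path, size, rate)
```

On this invented data the root splits on `texter` (chi-square of about 36.4, against about 4.0 for `female`), and each branch then splits on `female`. The extreme groups are female texters (80% heavy users) and female non-texters (20%), a contrast that the weak overall gender effect would not suggest.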

Along with this output, we obtain a “gains chart”, which illustrates, at any level of aggregation, the potential return from the subgroups identified (based on the behaviour we are interested in or some related target score) versus their proportion of the population. Pareto’s Law states that “80% of returns (sales in this case) can be gained from only 20% of the market”. This technique, in combination with others (e.g. Latent Class Cluster Analysis), helps policy makers address the question: which 20%?
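The figures behind a gains chart are straightforward to compute once the subgroups are known. The sketch below is one plausible way to do it, not necessarily how any particular package builds the chart: segments are ranked by response rate, and the cumulative share of the population is set against the cumulative share of returns. The segment labels and counts are hypothetical, and `gains_table` is an illustrative name.

```python
def gains_table(segments):
    """segments: list of (label, size, responders).

    Returns rows of (label, cumulative share of population,
    cumulative share of responders), ordered from the highest
    to the lowest response rate.
    """
    ordered = sorted(segments, key=lambda s: s[2] / s[1], reverse=True)
    total_size = sum(s[1] for s in ordered)
    total_resp = sum(s[2] for s in ordered)
    rows, cum_size, cum_resp = [], 0, 0
    for label, size, resp in ordered:
        cum_size += size
        cum_resp += resp
        rows.append((label, cum_size / total_size, cum_resp / total_resp))
    return rows

# Invented segments: (label, respondents, heavy users)
segments = [
    ("non-texter, female", 100, 20),
    ("texter, male", 100, 40),
    ("texter, female", 100, 80),
    ("non-texter, male", 100, 40),
]

for label, pop_share, return_share in gains_table(segments):
    print(f"{label:20s} {pop_share:6.1%} of population  {return_share:6.1%} of heavy users")
```

Plotting the second column against the third gives the gains curve. In this made-up case the top-ranked quarter of the population accounts for 80 of the 180 heavy users, roughly 44% of the return.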

 

Please contact Gary Bennett for further information (garyb@logitresearch.com).


 
    © Logit Research 2004 Site by Ocean