The usual response to these types of questions is to produce reams of charts breaking down such factors by different demographic variables. As well as adding to the project budget (producing and checking this many charts takes time), this often fails to provide adequate guidance to decision makers, as the following example illustrates.
The key point is that, in the real world, differences in behaviour are explained by higher-order combinations of demographic and other variables. For example, working females aged 18-24, from large households, who are heavy users of text messaging, may use the service most frequently, whereas males aged 45+, not working, without mobile phones, with children, may use it least frequently. This example is fairly simple, but it is easy to see that the same approach can yield a great deal of insight when the number of profiling variables being combined is large.
By focusing only on the “main effects” of variables (e.g. gender in the whole population) we gain no information on how their explanatory impact differs within particular population subgroups. For instance, although text messaging is one of the best variables for defining the heaviest users, we might find that in the population as a whole its impact is insignificant. Ironically, by failing to drill down into the data in a way that takes account of these types of interaction, we fail to see the big picture.
The type of drill-down described can be achieved with a technique called CHAID (Chi-squared Automatic Interaction Detection) analysis. The analysis produces a tree-like structure, where each node (point) in the tree represents a particular subgroup whose members share similar characteristics with regard to the dependent variable (frequency of use in our example). Each new node is split further into subgroups using the predictor variable found to discriminate most on the dependent variable. The process stops once some minimum subgroup sample size has been reached, or when no remaining variable discriminates well on the dependent variable. The branches of the final tree tell us which combinations of predictor variables discriminate most and least on our dependent variable. The highest and lowest extreme groups for our example are shown in the accompanying chart.
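The splitting procedure described above can be sketched in a few lines of Python. This is a simplified illustration, not a full CHAID implementation: it assumes every predictor and the dependent variable are binary (so each chi-square test has one degree of freedom), and it omits the category merging and p-value adjustment that CHAID proper performs. The data, the variable names (`young`, `texter`, `heavy`) and the thresholds `min_size` and `alpha` are all hypothetical.

```python
import math
from collections import Counter


def chi2_p_value(rows, predictor, target):
    """Pearson chi-square test on a 2x2 table (assumes binary variables, df = 1)."""
    counts = Counter((r[predictor], r[target]) for r in rows)
    xs = sorted({k[0] for k in counts})
    ys = sorted({k[1] for k in counts})
    if len(xs) < 2 or len(ys) < 2:
        return 1.0  # no variation in one of the variables, nothing to test
    n = len(rows)
    chi2 = 0.0
    for x in xs:
        row_total = sum(counts[(x, y)] for y in ys)
        for y in ys:
            col_total = sum(counts[(x2, y)] for x2 in xs)
            expected = row_total * col_total / n
            chi2 += (counts[(x, y)] - expected) ** 2 / expected
    # For df = 1 the chi-square tail probability equals erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


def grow_tree(rows, predictors, target, min_size=20, alpha=0.05):
    """Recursively split each node on the most significant remaining predictor."""
    node = {"n": len(rows), "rate": sum(r[target] for r in rows) / len(rows)}
    best = min(predictors, key=lambda p: chi2_p_value(rows, p, target), default=None)
    if best is None or chi2_p_value(rows, best, target) > alpha:
        return node  # no remaining predictor discriminates well enough
    groups = {}
    for r in rows:
        groups.setdefault(r[best], []).append(r)
    if min(len(g) for g in groups.values()) < min_size:
        return node  # a subgroup would fall below the minimum sample size
    node["split"] = best
    remaining = [p for p in predictors if p != best]
    node["children"] = {
        value: grow_tree(group, remaining, target, min_size, alpha)
        for value, group in groups.items()
    }
    return node


def make_rows(young, texter, n, n_heavy):
    """Hypothetical respondents: n people in a cell, n_heavy of them heavy users."""
    return [{"young": young, "texter": texter, "heavy": int(i < n_heavy)}
            for i in range(n)]


# Invented data in which texting has no main effect but a strong interaction with age.
rows = (make_rows(1, 1, 100, 80) + make_rows(1, 0, 100, 40)
        + make_rows(0, 1, 100, 10) + make_rows(0, 0, 100, 50))
tree = grow_tree(rows, ["young", "texter"], "heavy")
```

In this invented data set, texters and non-texters are equally likely to be heavy users in the population as a whole, so a chart of the main effect would show nothing. Once the tree has split on age, text messaging separates heavy from light users within both age groups, which is the kind of interaction the main-effects view misses.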
Along with this output, we obtain a “gains chart” which illustrates, at any level of aggregation, the potential return from the subgroups identified (based on the behaviour we are interested in or some related target score) versus their proportion in the population. Pareto’s Law states that “80% of returns (sales in this case) can be gained from only 20% of the market”. This technique, in combination with other techniques (e.g. Latent Class Cluster Analysis), helps policy makers address the question: which 20%?
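The figures behind a gains chart can be computed roughly as follows. This is a minimal sketch under stated assumptions: the segment labels, sizes and responder counts are invented for illustration, and `gains_table` is a hypothetical helper rather than part of any particular package. Segments are ranked by response rate, and the cumulative share of the population is set against the cumulative share of the return.

```python
def gains_table(segments):
    """segments: list of (label, size, responders) tuples, one per subgroup.

    Returns (label, cumulative share of population, cumulative share of
    responders) for the subgroups ranked from highest to lowest response rate.
    """
    ranked = sorted(segments, key=lambda s: s[2] / s[1], reverse=True)
    total_n = sum(size for _, size, _ in ranked)
    total_r = sum(responders for _, _, responders in ranked)
    table, cum_n, cum_r = [], 0, 0
    for label, size, responders in ranked:
        cum_n += size
        cum_r += responders
        table.append((label, cum_n / total_n, cum_r / total_r))
    return table


# Hypothetical end nodes of a tree: (label, people in subgroup, heavy users).
segments = [
    ("young texters", 100, 80),
    ("young non-texters", 100, 40),
    ("older texters", 100, 10),
    ("older non-texters", 100, 50),
]
table = gains_table(segments)
for label, pop_share, return_share in table:
    print(f"{label:20s} {pop_share:6.1%} of population, {return_share:6.1%} of heavy users")
```

Reading down the table shows how much of the return is captured by targeting only the top-ranked subgroups, which is how the chart helps answer the "which 20%?" question.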
Please contact Gary Bennett for further information (garyb@logitresearch.com).
© Logit Research 2004