In everyday data mining practice, the availability of data is often a serious problem. The amount of customer data is generally growing fast, but it is typically scattered across a large number of sources. In database marketing, for instance, basic customer information resides in customer databases, while market survey data is only available for a subset of those customers, or even for a different sample altogether.

Conventional solution

A widely accepted approach within database marketing is to buy external socio-demographic data that has been collected at a regional level. All customers living in a single region, for instance in the same zip code area, receive identical values. However, the kind of information that can be acquired this way is relatively limited. Furthermore, the underlying assumption that all customers within a region are alike is questionable at best.
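
As an illustration of this regional approach, the following minimal Python sketch copies zip-code-level values onto each customer. The field names (`zip`, `avg_income`, `urbanization`) and all values are hypothetical and not taken from any real data supplier. Note how customers 1 and 2 end up with exactly the same values, which is the limitation described above.

```python
# Hypothetical regional data: one set of socio-demographic values per zip code.
regional = {
    "1011": {"avg_income": 31000, "urbanization": "high"},
    "9712": {"avg_income": 27000, "urbanization": "medium"},
}

# Hypothetical customer records; only the zip code links them to regional data.
customers = [
    {"id": 1, "zip": "1011"},
    {"id": 2, "zip": "1011"},
    {"id": 3, "zip": "9712"},
]


def enrich_by_region(customers, regional):
    """Copy the regional values onto every customer living in that region."""
    enriched = []
    for customer in customers:
        row = dict(customer)
        # Customers in an unknown zip code simply get no extra fields.
        row.update(regional.get(customer["zip"], {}))
        enriched.append(row)
    return enriched


enriched = enrich_by_region(customers, regional)
for row in enriched:
    print(row)
```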

Sentient’s solution

Data fusion provides a way out by combining information from different sources for each customer, using all available overlapping information (not just the region). Rather than survey information being available only for a small sample, a virtual interview is created for every single customer by predicting the answers they would have given to the survey questions, e.g. which newspapers they read, their income, their profession and their interests. DataDetective provides the unique technology to make this possible. The resulting enriched data set can be used for all kinds of data mining and database marketing analyses. It approximates the value of a real survey of the entire customer base at a fraction of the cost.
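
The idea of a virtual interview can be sketched with a simple nearest-neighbour match. This is a generic illustration of data fusion, not a description of how DataDetective works internally. It assumes that survey respondents ("donors") and the remaining customers ("recipients") share a few overlapping variables; the variable names (`age`, `orders_per_year`, `newspaper`), the records and the choice of k = 3 are all hypothetical.

```python
from collections import Counter

# Hypothetical survey respondents ("donors"): overlapping variables plus an answer.
donors = [
    {"age": 25, "orders_per_year": 12, "newspaper": "A"},
    {"age": 30, "orders_per_year": 10, "newspaper": "A"},
    {"age": 58, "orders_per_year": 2, "newspaper": "B"},
    {"age": 62, "orders_per_year": 3, "newspaper": "B"},
    {"age": 45, "orders_per_year": 6, "newspaper": "B"},
]

# Hypothetical customers without survey data ("recipients").
recipients = [
    {"id": 101, "age": 27, "orders_per_year": 11},
    {"id": 102, "age": 60, "orders_per_year": 2},
]

# Variables present in both sources (an assumption of this sketch).
COMMON = ["age", "orders_per_year"]


def value_spans(rows, keys):
    """Return max - min per variable, so each variable contributes comparably."""
    spans = {}
    for key in keys:
        values = [row[key] for row in rows]
        spans[key] = (max(values) - min(values)) or 1  # avoid division by zero
    return spans


def distance(a, b, spans):
    """Euclidean distance on the range-scaled overlapping variables."""
    return sum(((a[k] - b[k]) / spans[k]) ** 2 for k in COMMON) ** 0.5


def fuse(recipients, donors, target, k=3):
    """Predict `target` for each recipient by majority vote of its k nearest donors."""
    spans = value_spans(donors + recipients, COMMON)
    fused = []
    for recipient in recipients:
        nearest = sorted(donors, key=lambda d: distance(recipient, d, spans))[:k]
        votes = Counter(d[target] for d in nearest)
        row = dict(recipient)
        row[target] = votes.most_common(1)[0][0]
        fused.append(row)
    return fused


fused = fuse(recipients, donors, "newspaper")
for row in fused:
    print(row)
```

In this toy data, customer 101 resembles the younger, frequently ordering respondents and is assigned newspaper "A", while customer 102 is assigned "B". A real application would use many more overlapping variables and would validate the predicted answers against held-out survey respondents.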

