
Data Reliability

No classification of cultural origin is 100% accurate and, other than through DNA analysis, a precise definition of cultural origin is elusive. Most definitions tend to reflect country of birth, language spoken at home, religion, or ancestry. Each has its own limitations, usually associated with the method of collection, the response rate, or the need to develop a classification that suits a range of purposes. Even where census data or other collected data is available, it may not assist with customer-level analysis, or with developing a detailed understanding of the drivers of customer attitudes and behaviour.

Collecting data from consumers on cultural origin is rarely successful because of inaccuracy, non-response, or the absence of an apparent purpose. Where it does occur, collection usually relies on self-completion or self-perception, or both. In the case of census data, reporting is usually aggregated by geography, or by classification hierarchy, or both, thereby sacrificing granularity for reasons of practicality and confidentiality.

The view of OriginsInfo is that a person's name is a very good surrogate for cultural identity, particularly where the analysis makes use of the substantial body of research into name patterns and their meaning. More importantly, measurement using Origins generates evidence of differences in behaviours and attitudes that are correlated with cultural origin.


Origins achieves a coding rate of more than 99% of records from reasonable-quality customer files containing personal and family names. This is achieved through the vast accumulation of names and associated information obtained by OriginsInfo.

The uncoded residue is made up of a small number of unrecognised and extremely rare names, and names that are ambiguous or 'too close to call'.

Validation and Accuracy

The proportion within each cultural group derived from name analysis is remarkably consistent with nationally aggregated census data. This is supported by comparative mapping of census data and Origins categories that have a very good correspondence with the census classifications.

OriginsInfo commits considerable resource to continuously checking the fit between known origin and the software allocation. Results from this work, and from two separate pieces of validation research on databases where cultural background was known, have confirmed that individual-level accuracy in coding the Australian population to Origins Types comfortably exceeds 85%.

From a statistical point of view, individual-level errors do not materially affect the insight that Origins delivers or the ability of users to implement targeted communications and actions. However, it is appropriate to share the OriginsInfo understanding of some areas where the error rate may influence the content and style of communication messages.

The accuracy rate does vary from one code to another.  Names of Islamic, Chinese, Vietnamese, Indian and British (Anglo-Saxon and Celtic) origin achieve accuracy rates in excess of 90% at the Origins Type level.

Accuracy rates for southern and eastern Europeans, and Armenians, are around 90%, while Hispanic coding achieves accuracy in the 80-90% range. Slightly lower levels occur with names originating from northern Europe and France. The weakest performance is 50-80% for Aboriginal and Torres Strait Islanders, and members of the Black Caribbean and Jewish communities, where there is a greater tendency to adopt Anglo-Celtic name styles.

The confidence score generated by Origins allows users to define cut-offs to exclude those matches that are below an acceptable threshold for use in targeting communications to individuals based on the most likely cultural origin. This is an effective and flexible way of screening out individuals who are less likely to be in the target group.
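As a minimal sketch of how such a cut-off might be applied, the Python below filters coded records by a minimum confidence. The record layout, the field names (`origin_type`, `confidence`), the 0 to 1 scale and the 0.80 threshold are all assumptions made for illustration; they are not the actual Origins output format.

```python
from dataclasses import dataclass


@dataclass
class CodedRecord:
    # Hypothetical record layout; field names and the 0-1 scale are assumptions.
    name: str
    origin_type: str
    confidence: float


def select_targets(records, target_type, min_confidence):
    """Keep records coded to target_type whose confidence meets the cut-off."""
    return [
        r for r in records
        if r.origin_type == target_type and r.confidence >= min_confidence
    ]


# Illustrative records only, not real data.
records = [
    CodedRecord("Record A", "Italian", 0.95),
    CodedRecord("Record B", "Italian", 0.55),
    CodedRecord("Record C", "Greek", 0.90),
]

selected = select_targets(records, "Italian", min_confidence=0.80)
print([r.name for r in selected])
```

Raising the cut-off trades reach for precision: fewer individuals are selected, but those who remain are more likely to belong to the target group.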

There are also some specific cases where individual level accuracy may be impaired:

  • Adoption of partner’s family name in cross-cultural marriage. The instances where this diminishes accuracy are relatively few, partly because cross-cultural marriages are still less common and partly because of the trend for females to retain their original family name. In aggregated population data this is less of an issue because there is often a counter-balancing flow between cultures, which reduces the statistical error in analysis and segmentation. For example, the incidence of Greek females marrying Anglo males is partly compensated by Anglo females marrying Greek males.

  • Transliteration from non-Roman scripts may produce a name that is common in another part of the world. The family name 'Lee' is a case in point: it is common in Britain, China and Korea. Taking account of the personal name (or the middle name) usually minimises the risk of misallocation, but a small number of names will be assigned to an incorrect code.

  • Offspring from long-term migrant families.  Some people have distinctive names even though they may be second, third or more generations removed from their original migrant ancestors.  The extent to which such people retain behavioural characteristics that are relevant in contemporary marketing and business operations is debatable and may vary by cultural group and business context.  Attitudes to finance products, propensity to experience particular health conditions, demand for particular cosmetic products, travel and telephony preferences, and the influence of religion on consumer behaviour may all persist for many generations.  To mitigate the impact and allow users to take this into account, Origins reduces the confidence level assigned to a person with, for example, an Italian family name but a personal name that suggests a clear Australian or Anglo affiliation.
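The counter-balancing effect described in the first case above can be illustrated with simple arithmetic. The sketch below uses entirely hypothetical counts (they are not Origins or census figures) to show why name changes in both directions produce many individual-level errors but only a small net error in the aggregated count for a group.

```python
def net_count_error(outflow, inflow):
    """Net error in a group's name-based count.

    outflow: group members who adopted a family name from another culture
    inflow:  people from other cultures who adopted this group's family name
    """
    return inflow - outflow


def individual_errors(outflow, inflow):
    """Every person in either flow is miscoded at the individual level."""
    return outflow + inflow


# Hypothetical figures for illustration only.
greek_to_anglo_names = 40   # Greek women taking an Anglo family name
anglo_to_greek_names = 30   # Anglo women taking a Greek family name

net = net_count_error(greek_to_anglo_names, anglo_to_greek_names)
miscoded = individual_errors(greek_to_anglo_names, anglo_to_greek_names)
print(f"Net error in Greek count: {net}, individuals miscoded: {miscoded}")
```

Under these assumed figures, 70 individuals are miscoded, yet the Greek total is understated by only 10, because the two flows largely offset each other.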


