Over more than ten years we’ve heard many questions about Origins.  Here, they are collated to provide ready answers.  Need more detail?  Please don’t hesitate to contact us.

Why should I measure cultural diversity?

A substantial body of research has established that cultural background is a key driver of consumer behaviour – whether for commercial organisations or government services.

Given Australia’s well-documented and dynamic multicultural demography, businesses can no longer afford to ignore how they interact with the 49% of Australians whose heritage lies beyond these shores.

If you can’t measure, you won’t have the evidence to understand, manage or monitor.

For Commercial Organisations and Not-for-profits

Insight into cultural diversity assists businesses to:

  • Understand how customers and prospects reflect the market
  • Assess market penetration by cultural segment
  • Detail which groups are profitable, and which present opportunity or risk
  • Dissect engagement through product use, churn, channel preference
  • Support a business case for action and deployment

All of which leads to a competitive edge in achieving better service delivery, higher sales and increased profitability.

For Government and Public Sector

Insight into cultural diversity assists agencies to:

  • Understand the cultural diversity of clients
  • Report on participation in and satisfaction with essential services
  • Have an evidence-base to support resource allocation
  • Ensure staff reflect the client base (workplace mutuality – see this short HealthWest video)

These contribute to the achievement of organisational goals and a clear indication of delivery on the principles of access and equity in service delivery.

For further information, please see Why Use Origins and Business Benefits.

How can I measure cultural diversity?

There are several ways to measure cultural diversity.  The best way for you depends on several factors:

  • Do you own a large dataset?
  • Are you focused on customers, prospects, employees or geography?
  • Do you have expertise and sufficient resource to administer a survey – including design, promotion, management, collation and reporting?
  • Do you have in-house analytical resource, or do you prefer an out-sourced solution?

The options are summarised below, with the pros and cons of each data source:

General Survey – ABS Census

Pros:

  • High coverage
  • Inexpensive
  • Excellent for national and regional area reporting and analysis

Cons:

  • Aggregated area-based data; not applicable to individual
  • Relies on Country of Birth
  • Ancestry data is ambiguous
  • No nuance

Context-specific Survey – Self-Reported Data

Pros:

  • Mostly accurate
  • Allows for nuances eg parent/grandparent ancestry, granular definitions of ethnicity, world view, language skills
  • Good for small numbers
  • “Narrow and Deep”

Cons:

  • Not practical for large databases
  • Response rate and response quality vary by context, empathy with purpose, quality of ‘sell’
  • Comparing with census data as a base to represent market view is not valid
  • Variable participant commitment and diligence
  • Privacy concerns; suspicion
  • High administrative overhead for users

Surname Tables

Pros:

  • Less aggregated than Census
  • Simple (but error-prone)

Cons:

  • Credibility depends on substantial globally-sourced data resources
  • Broad brush
  • Inference-based

First and family names with geography

Pros:

  • High coverage – 99.5%+
  • High accuracy of name origin; very good correlation with cultural background
  • Ambiguous names use census-based probability
  • More accurate than Surname Tables
  • No privacy concerns
  • Highly practical
  • Cost-effective for medium/high volume turnover applications
  • “Broad and Indicative”

Cons:

  • Not best suited for ATSIC
  • Probabilistic
  • One-dimensional; single code overall, and for each of first and family name

How does Origins work?

You supply up to four pieces of information, arranged in separate fields in a CSV or tab-delimited file:

  • A unique identifier. This lets you link the coded output back to your behavioural data to reveal deeper insight
  • A personal name (first or given)
  • A family name (surname)
  • A geographical indicator of usual residence eg a postcode or Suburb/State. This is optional, but when available, it activates the Enhanced Neighbourhood Insight feature to confirm and improve accuracy
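As a minimal sketch of what such an input file might look like, the Python below checks a small delimited file before it is submitted for coding. The column names (`record_id`, `first_name`, `family_name`, `postcode`) and the validation rules are assumptions for illustration only; the actual field names and layout should be confirmed with OriginsInfo.

```python
import csv
import io

# Hypothetical column names: the FAQ only describes the four kinds of field.
REQUIRED = ["record_id", "first_name", "family_name"]
OPTIONAL = ["postcode"]


def read_input(text, delimiter=","):
    """Parse delimited text and return (valid_rows, problems)."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    missing = [c for c in REQUIRED if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing required columns: {missing}")

    rows, problems, seen = [], [], set()
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        rid = (row["record_id"] or "").strip()
        first = (row["first_name"] or "").strip()
        family = (row["family_name"] or "").strip()
        if not rid:
            problems.append((line_no, "empty identifier"))
        elif rid in seen:
            problems.append((line_no, f"duplicate identifier {rid}"))
        elif not first and not family:
            problems.append((line_no, "no name supplied"))
        else:
            seen.add(rid)
            # Keep only the expected fields; a blank postcode is allowed.
            rows.append({k: (row.get(k) or "").strip() for k in REQUIRED + OPTIONAL})
    return rows, problems


sample = """record_id,first_name,family_name,postcode
1001,Maria,Pereira,2000
1002,John,Smith,
1001,Ana,Gutierrez,3000
"""
rows, problems = read_input(sample)
```

For a tab-delimited file, pass `delimiter="\t"`. Here the third data row is rejected because its identifier repeats, which would otherwise make it impossible to link the code back to a single customer record.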

Origins separately evaluates the personal and family names.  These are matched to Origins databases of over 3 million unique, globally-sourced names.  The software then uses the geographical indicator to confirm or modify the code by referencing ABS census data.  Finally, the software allocates each record to one of 260 detailed Origins CEL codes reflecting the most likely cultural and ancestral origin of that name combination.

Reports are produced summarising the information into broad cultural groupings to give an overview of the cultural mix of your data set.
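To illustrate how the unique identifier lets you link codes back to your own data, here is a rough sketch that joins coded output to behavioural records and summarises by broad grouping. The code labels (`CODE_A` and so on), the group names, the `churned` field and the mapping are all placeholders invented for this example; they are not the real Origins codes or report format.

```python
from collections import Counter

# Hypothetical coded output: identifier -> detailed code (placeholder labels).
coded = {"1001": "CODE_A", "1002": "CODE_C", "1003": "CODE_B", "1004": "CODE_C"}

# Hypothetical roll-up from detailed codes to broad groupings.
BROAD_GROUP = {"CODE_A": "Group 1", "CODE_B": "Group 1", "CODE_C": "Group 2"}

# Your own behavioural data, keyed by the same identifier.
behaviour = {
    "1001": {"churned": False},
    "1002": {"churned": True},
    "1003": {"churned": False},
    "1004": {"churned": False},
}


def summarise(coded, behaviour):
    """Return record count, share of file and churn rate per broad group."""
    totals, churned = Counter(), Counter()
    for rid, code in coded.items():
        group = BROAD_GROUP.get(code, "Unclassified")
        totals[group] += 1
        if behaviour.get(rid, {}).get("churned"):
            churned[group] += 1
    n_all = sum(totals.values())
    return {
        g: {"records": n, "share": n / n_all, "churn_rate": churned[g] / n}
        for g, n in totals.items()
    }


summary = summarise(coded, behaviour)
```

The same pattern extends to any behavioural measure you hold against the identifier, such as product use or channel preference.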

The Origins database has been developed from comprehensive data sources covering around 1.2 billion individuals from around the world.  This is augmented with data gathered from more than fifteen years of genealogical, academic and local history research into the meaning and origin of names, for major regions across the globe.  The resulting output file allocates over 99.5 percent of records to the category reflecting their most likely ancestral origin.  For more, see Coverage, Accuracy & Privacy.

What is the Enhanced Neighbourhood Insight feature?

This unique feature improves the accuracy of coding.  Enhanced Neighbourhood Insight uses the latest census data to check that some names are allocated to the most likely code.  For example, we use the latest census data to help sort out whether “Pereira” is more likely to be Portuguese or Sri Lankan.  Or whether “Gutierrez” is more likely to be Spanish, Mexican, South American or Filipino.
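The general idea of using area data to resolve an ambiguous name can be sketched as follows. This is not the Origins algorithm, which is not described here; it is a simple illustration that weights each candidate origin by its share of the local population. All weights and shares below are made-up numbers, not census figures.

```python
def disambiguate(candidates, area_shares):
    """Combine name-based weights with local population shares.

    candidates:  {origin: weight from the name alone}
    area_shares: {origin: share of the local population} (hypothetical values)
    Returns normalised scores that sum to 1.
    """
    scores = {o: w * area_shares.get(o, 0.0) for o, w in candidates.items()}
    total = sum(scores.values())
    if total == 0:
        # No area evidence for any candidate: fall back to the name alone.
        scores = dict(candidates)
        total = sum(scores.values())
    return {o: s / total for o, s in scores.items()}


# "Pereira" treated as equally likely to be either origin on the name alone.
name_candidates = {"Portuguese": 0.5, "Sri Lankan": 0.5}

# Illustrative (invented) population shares for two different areas.
area_a = {"Portuguese": 0.04, "Sri Lankan": 0.01}
area_b = {"Portuguese": 0.01, "Sri Lankan": 0.03}

result_a = disambiguate(name_candidates, area_a)
result_b = disambiguate(name_candidates, area_b)
best_a = max(result_a, key=result_a.get)
best_b = max(result_b, key=result_b.get)
```

With these invented figures the same name resolves to a different most likely origin in each area, which is the behaviour the feature is designed to deliver.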

How does Origins handle persons of mixed, or ‘hyphenated’ ancestry?

Origins can identify persons whose names reflect more than one ancestry – for example a person with an English or an Australian personal name and an Italian family name.

The confidence score given to each name combination is used to select or deselect people who are most likely to be of mixed ancestry.  Restricting a communication to names with high confidence scores is an effective way of avoiding communicating with individuals who are least likely to belong to the target group.
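A selection of this kind might look like the sketch below. The field names, the code labels, the 0 to 1 confidence scale and the 0.8 threshold are assumptions for illustration; check the actual output layout and score range before relying on them.

```python
# Hypothetical coded records; labels and scores are invented for the example.
records = [
    {"record_id": "1", "code": "ITALIAN", "first_name_code": "ITALIAN",
     "family_name_code": "ITALIAN", "confidence": 0.95},
    {"record_id": "2", "code": "ITALIAN", "first_name_code": "ENGLISH",
     "family_name_code": "ITALIAN", "confidence": 0.55},
    {"record_id": "3", "code": "ENGLISH", "first_name_code": "ENGLISH",
     "family_name_code": "ENGLISH", "confidence": 0.90},
]


def select_targets(records, target_code, min_confidence=0.8):
    """Identifiers in the target group whose score meets the threshold."""
    return [
        r["record_id"]
        for r in records
        if r["code"] == target_code and r["confidence"] >= min_confidence
    ]


def possible_mixed(records):
    """Identifiers whose first-name and family-name codes differ."""
    return [
        r["record_id"]
        for r in records
        if r["first_name_code"] != r["family_name_code"]
    ]


targets = select_targets(records, "ITALIAN")
mixed = possible_mixed(records)
```

Raising `min_confidence` narrows the audience to the names most likely to belong to the target group; lowering it trades precision for reach.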

How can my business benefit from the use of Origins?

Typically, initial use of Origins generates insight and evidence about the cultural dimensions of customer behaviour.  This insight can then be used in many ways.  Here are four suggestions:

  1. To review how well the images and messages used by the brand connect with Australia’s multicultural consumer and labour markets.
  2. To provide a quantified evidence-base for setting goals and policies to improve engagement with multicultural Australia.
  3. To deploy in models where Origins is demonstrated to improve model performance and business outcomes.
  4. To provide benchmarks for monitoring over time. For example, to what extent do strategies, campaigns, and staff training promote change?

For more information see Business Benefits.

Can Origins be used as an element of my predictive models?

Absolutely!  In our experience, Origins works well with internal data, and other external data, to lift the performance of predictive models.  There are few cases where Origins does not contribute to positive model performance.

How do I access Origins?

Licensed software and data. A self-serve coding engine, delivered on one or two platforms to suit your requirements:

  • Desktop installation on one or more PCs, for batch processing
  • API linked to our Sydney-based server, providing the option of real-time or batch processing.

Licensed use of data. We do the coding, send you the codes, you do the analysis.

Geographical table. Origins by your choice of geographical unit.  For trade-area analysis, mapping or as an input to geographical decisioning.

Consulting. We package a service to meet your needs and objectives.

See Origins Products & Services for more information or Contact us to discuss which is right for you.

What does ‘licensing’ mean?

We enter into an arrangement where we allow you to use Origins software and data for a specified term and for certain applications.

At the conclusion of the licence, you will be required to delete Origins data, and any derivatives from that data, from all your systems.

How is Origins priced?

Software and Data Licensing: In-house Use

Pricing for in-house software and data licensing is quoted on a case-by-case basis to meet the specification of each client’s requirements.  Things determining the price include:

  • Value – as measured by the number of customers (depth), and the range of products, services and touchpoints (breadth)
  • Scope – use as a research/insight tool vs unrestricted use including deployment in business operations; range of use across divisions or affiliates within large companies
  • Length of Licence – three-year commitments attract the best annual fee

As an indication, annual licence fees start at A$17,500 plus GST and range up to A$50,000 plus GST for Australia’s largest organisations.

Pricing for the use of an API connected to our Sydney-based server is similarly configured with an option for volume-based pricing where real-time coding is required.

Origins Geographical Data: In-house Use

Pricing for customised geographical tables derived from the Origins base file is quoted on a case-by-case basis to your requirements.  As with Software and Data Licensing, pricing for licensed use of Geographical Data is determined by Value, Scope and Length of Licence.

As an indication, annual licence fees start at A$5,000 plus GST and range up to A$25,000 plus GST for Australia’s largest organisations.

Appended Data: Out-sourced Coding

In cases where you out-source Origins coding to OriginsInfo on a ‘per occasion’ basis, the price uses the same determinants as Software and Data Licensing – ie Value, Scope, and Length of Licence.  The data we supply is subject to the same licence restrictions that apply to In-house use – ie you will be permitted to use the data for an agreed time.


For consulting assignments where we supply you with reports and extensive engagement/support, but no Origins data, pricing indicatively varies between A$7,500 plus GST and A$30,000 plus GST, subject to context, scope and specification of requirements.  Concessionary rates are available to support genuine post-graduate level academic research.

Also see Origins Products & Services.

Is Origins data privacy-compliant?

Absolutely.  The data set from which the product is derived is compiled from a range of globally-sourced, publicly-available sources.  The Origins product and Origins base file are de-identified and contain no personally identifiable information.  See also Coverage, Accuracy & Privacy.

Is my data compliant with privacy legislation after I append an Origins code to a customer record?

Yes.  The Origins code itself is not personal information.  And, when it becomes part of a record which is personal information, it is not considered to be ‘sensitive information’.

This statement is based on legal advice, industry association advice, and the Australian Privacy Principles guidelines produced by the Australian Information Commissioner (see APP Guidelines issued by the OAIC, April 2015, Paragraph B139).  These principles are consistent across all Australian jurisdictions.

Of course, for research purposes, organisations may choose to de-identify a data-set containing Origins codes to derive rich insight into the cultural dimensions of consumer behaviour at population, or population sub-set level.  See also Coverage, Accuracy & Privacy.

If I decide to use the Origins code to vary my message or offer (eg for web enquirers or callers to a call centre) am I in breach of the privacy legislation?

No.  Because Origins data appended to a customer record is not considered by the OAIC guidelines to be ‘sensitive information’, it is not subject to privacy legislation.

However, the bigger risk is reputational, arising from poor marketing design or execution.  Organisations that design content, messages and deployment that are positive, creative and inoffensive to individuals are the winners.  It is important to remember that the Origins code is a probabilistic indicator of cultural background.

Can I create a list of customers or prospects where Origins is used either as a component of the selection criteria (eg through the use of a model) or as a selection criterion in its own right?

Yes.  There is no legal impediment to doing so.  As with the previous case, well designed and well executed marketing to cultural segments will determine the success of any campaign or customer management initiative.