Is bad data worse than no data?



Love it or hate it, data processing is now an integral part of every modern workplace – including arts organisations. You track everything – from footfall to clicks on your website – to analyse your performance, adapt your programming and inform funding applications.

Lots of us are afraid of getting it wrong, sometimes to the point where we avoid collecting any data and have nothing at all to work with. Worse still, some of us collect bad data – data that’s incorrect, incomplete, irrelevant, duplicated, non-compliant with GDPR, or more than one of the above. Here’s why that matters, and how to avoid those pitfalls.

What makes bad data?
Let’s suppose that you make a “phone number” contact field a mandatory part of the checkout process. If there’s a good, obvious reason why that information is necessary – you have the kind of events that are frequently cancelled and need to contact customers at short notice, for example – you’ll likely get good data.

However, if the reason is unclear (or non-existent), it’s quite possible that the customer will enter a random number just so they can get to the next part of the form. This makes life difficult because just about any eleven-digit number that starts with 0 looks like a valid UK phone number, so it’s not obvious when the data is bad – you’ll only find out at the point where you do need to contact that customer.

Worse yet, numbers like “01234567890” might be used by multiple different customers, causing the deduplication function to flag false positives and potentially losing you valid data in the process. Here, no information is much better than incorrect information.
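If you have someone technical on hand, a rough check like the one below can help you spot placeholder numbers before they reach your deduplication tool. It is only a sketch: the function names, the `max_shared` threshold and the sample records are all made up for illustration, and the rules are simple guesses at what a placeholder looks like, not a test of whether a number is real.

```python
import re
from collections import Counter


def normalise(number: str) -> str:
    """Strip spaces, dashes and brackets so equivalent entries compare equal."""
    return re.sub(r"\D", "", number)


def looks_like_placeholder(digits: str) -> bool:
    """Heuristic only: one repeated digit, or a simple ascending run."""
    if not digits:
        return False
    if len(set(digits)) == 1:  # e.g. 00000000000
        return True
    return digits in "01234567890123456789"  # e.g. 01234567890


def suspicious_numbers(records, max_shared=3):
    """Return numbers that look like placeholders or are shared by many customers.

    `records` is an iterable of (customer_id, phone) pairs; `max_shared` is an
    assumed threshold that you would tune to your own data.
    """
    counts = Counter(normalise(phone) for _, phone in records)
    return {
        digits
        for digits, count in counts.items()
        if digits and (looks_like_placeholder(digits) or count > max_shared)
    }


# Hypothetical sample data, not taken from any real system.
records = [
    ("a", "01234 567890"),
    ("b", "0123-456-7890"),
    ("c", "01632 960123"),
]
print(suspicious_numbers(records))  # {'01234567890'}
```

Numbers flagged this way are better left out of duplicate matching than treated as proof that two customers are the same person.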

Similarly, including a mandatory postcode field you don’t need in a checkout process is going to result in made-up postcodes, which will skew important data about where your customers are coming from and how far they travel to reach you. This can undermine funding applications that rely on those figures.

How do we get good data?
The best (and most GDPR-compliant) basic datasets come from specific questions that are carefully chosen, easy to interpret and quick to answer. Start with something like:

  • Where are we getting our data?
    Hard facts, like ticket sales and attendance reports, are your baseline. You can find out a lot about your audience by looking for patterns in the sales data without having to ask intrusive demographic questions. Avoid privileging just one type of data.
  • Will we be able to analyse it in a digestible way?
    It’s quicker and more practical to analyse figures, postcodes and one-word answers than paragraphs of text, so make sure you can present your findings on a graph or data table. Save the anecdotes for your website.
  • Do we understand why we need it?
    Why are “title”, “age” and “gender” mandatory fields on your form? Will you ever use that information? If you’re not interested in the answer, don’t ask the question.
  • Do our customers understand why we need it?
    There’s a world of difference between a mandatory field labelled “Phone number*” and “Phone number* (to contact you in case of a cancellation)”. Being transparent about how you’ll use information – particularly contact information or more sensitive data like equality and diversity monitoring forms – makes it more likely that what you collect will be accurate.
  • What can we do to avoid and mitigate duplication?
    Make sure your staff are familiar with any deduplication tools that are available to them, and discourage them from doing things like entering random information to avoid leaving fields blank – you can do this by giving serious thought to which fields are mandatory.
  • Would I personally bother completing this form as a condition for a purchase? Would my older relatives understand how to do it?
    If your customer sees data collection as an obstacle to checking out – be it because they can’t understand why you need the information or because it’s overly complicated – they won’t give you the data OR complete their purchase.

Good data can tell you who you’re selling to, who you need to work harder to reach, how well you’ve been able to forecast sales and attendance, and how you can improve. It can help you identify patterns and target your marketing campaigns more effectively, or adjust your targets if you’re persistently falling short. Bad data will leave you with all the same questions, but no answers.

How can Monad help you collect good data?

  • Our native segmentation tools help you narrow down your data sets into more precise categories and combine them where relevant.
  • Our HotJar integration allows you to add feedback forms, polls and surveys to your site, and helps you identify usability problems using heat maps and recordings of people interacting with the site.
  • Our MailChimp integration helps you evaluate the reach and effectiveness of your email marketing campaigns, making sure enough people are reading relevant content.
  • Our Audience Finder integration allows you to compare your findings with data collected from a national pool of arts organisations (and contribute yours to their database). Look out for more on Audience Finder in a forthcoming post.
  • Our ArenaMetrix integration lets you set up dashboards that can help you track your performance and KPIs easily and produces sales and attendance forecasts based on historical data.