How to Understand Data Quality and Mitigate Bias

Understanding High Quality Data

Obtaining high-quality data isn’t always an easy task. Even some of the most highly trusted platforms, including Search Console and Facebook, have made catastrophic mistakes that led marketers to make bad decisions and hurt businesses overall. For example, in April 2019, Search Console contained a bug that de-indexed pages for about 12 days and removed a critical manual action feature. As a result, pages did not appear in the SERPs, and marketers could not rely on data collected while the bug was active. Another example occurred in February 2021, when Facebook was generating inflated ad metrics. Facebook ignored these problems because the inflated numbers were bringing in more money from marketers.

In the world of SEO and the organic marketing channel, you must determine whether the data you are receiving is high quality. After assessing the quality of the data, you must then avoid bias, or preconceived notions, when making marketing decisions. Choosing your key performance indicators (KPIs) carefully and pinpointing any bias within the data is critical to avoid losing money in eCommerce conversions.

Seven Types of Bias

Below we’ll discuss seven types of bias to be aware of so that you don’t make decisions based on misleading information:

  • Selection Bias
  • Algorithmic Bias
  • Interaction Bias
  • Click Bias
  • Bias Due to System Drift
  • Omitted Variable Bias
  • Social Bias

Selection Bias

Selection bias happens when data from participants differs from data of the population of interest. In marketing, it often appears when an authoritative platform reports data that seems trustworthy but leads users and marketers to misguided decisions. This type of bias is tough to prevent because most marketers look to platforms like Facebook and Search Console to build their marketing strategies. As marketers, what can we do? Always keep up with the latest marketing news so you can respond quickly before there is a disaster. It is our job as marketers to be proactive instead of reactive. Don’t let the client find out about problems from others. Rather, stay aware and have a response ready for when the client asks.
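
The gap between a sample and the population of interest can be made concrete with a small sketch. Everything below is an assumption for illustration: the channel names, the order values, and the helper average_order_value are hypothetical and do not come from any real platform.

```python
from statistics import mean

# Hypothetical order values (in dollars) for every customer, tagged by the
# channel they came from. All numbers are invented for illustration.
population = [
    ("organic", 40), ("organic", 55), ("organic", 60), ("organic", 45),
    ("paid", 120), ("paid", 110),
    ("email", 30), ("email", 35),
]


def average_order_value(rows):
    """Mean order value over (channel, value) rows."""
    return mean(value for _, value in rows)


# A sample that only contains customers reported by one platform.
paid_only_sample = [row for row in population if row[0] == "paid"]

population_avg = average_order_value(population)
sample_avg = average_order_value(paid_only_sample)
selection_gap = sample_avg - population_avg

print(f"population average: {population_avg:.2f}")
print(f"paid-only sample average: {sample_avg:.2f}")
print(f"gap caused by the skewed sample: {selection_gap:.2f}")
```

If decisions were based only on the paid-only sample, the typical customer would look far more valuable than the full customer base suggests.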

Algorithmic Bias

Algorithmic bias arises when a machine learning model that is exposed to massive amounts of data makes skewed judgments and predictions about the information it has processed. Search engines are one example. The underlying problem is that these algorithms work well in certain situations but can produce disastrous results in more complex scenarios. The algorithm then needs to be updated to fix these issues so that it doesn’t keep showing biased search results.

Cartoon by xkcd.com, creative commons license (https://xkcd.com/license.html)

Interaction Bias

Interaction bias comes from the way users interact with an algorithm-driven system. Users usually have a specific agenda for what they want to accomplish when they search the internet, and each user’s interactions and inclination to purchase a product or service can be very different. It’s not accurate to base decisions solely on behavior-tracking products such as Microsoft Clarity. Their heatmaps only record what a user is doing, not the user’s intention. The information obtained is not absolute. You can’t predict with 100% certainty what all users will do. Anyone who says they can would be contributing to interaction bias.

Click Bias

Click bias is the assumption that any page appearing on the first page of Google’s search engine results pages (SERPs) will automatically translate into users clicking the link, visiting the website, and converting. The truth is, even if a page appears in the #1 position, there are multiple other areas that the user might click instead. Clicks go not only to paid ads but also to Google My Business (GMB) and Google Maps. A related assumption is that if a person doesn’t click the first result, that page must be poorly optimized. The reality is that being the first result on the SERPs doesn’t guarantee a click, though your odds do drastically increase.

Bias Due to System Drift

When the user interaction model changes based on how users interact with the system, it is called bias due to system drift. Search engines constantly experience system drift when they add new features, like Google Maps for non-branded keyword queries. The same thing happened with the introduction of Google My Business for branded keywords. Changing the layout of the search engine results pages forces users to interact with the results differently than they previously did. Thus, Google has its own agenda to promote both Google Maps and Google My Business, and that implementation forces a system drift.

Omitted Variable Bias

A type of bias that happens more frequently than many employees would like to recognize is omitted variable bias. It occurs when relevant variables are left out of a data set, often because employees are given large data sets to enter into Excel or other spreadsheets by hand. Manually entered data sheets, unfortunately, tend to contain human errors, including the omission of variables. When data like this is used for hospital or medical purposes, for example, one minor miscalculation could cause serious problems and, in some cases, even death.
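
One way to catch omissions in manually entered data is to scan each row for blank required fields before the data is used. The sketch below is a minimal example under stated assumptions: the CSV contents, the column names, and the helper find_incomplete_rows are all hypothetical.

```python
import csv
import io

# Hypothetical export of a manually maintained spreadsheet.
# The column names and values are invented for illustration.
RAW_CSV = """page,clicks,impressions,device
/home,120,2400,mobile
/pricing,,1800,desktop
/blog,45,900,
/contact,12,300,mobile
"""

REQUIRED_COLUMNS = ["page", "clicks", "impressions", "device"]


def find_incomplete_rows(csv_text, required):
    """Return (line_number, missing_columns) for each row with blank fields."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    # start=2 because line 1 of the file is the header row.
    for line_number, row in enumerate(reader, start=2):
        missing = [col for col in required if not (row.get(col) or "").strip()]
        if missing:
            problems.append((line_number, missing))
    return problems


incomplete = find_incomplete_rows(RAW_CSV, REQUIRED_COLUMNS)
for line_number, missing in incomplete:
    print(f"line {line_number}: missing {', '.join(missing)}")
```

A check like this only finds blank cells. It cannot tell you that an entire column should have been collected in the first place, so it complements a human review and does not replace one.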

Social Bias

Social bias is a positive or negative preference for or against individuals and groups based on their social identities, including race and gender. Such preferences are based on opinions, not facts. When a system uses these stereotypes and prejudices, individuals start acting on their biases, which becomes discrimination. If a social platform allows stereotypes and prejudices, it’s unfair to say that its data is accurate, because that data is based on opinion. Beware of data sets such as these because they can have negative impacts if you choose to support data based on opinions instead of facts.

Preventing Data Bias

Now that we are familiar with all the problems and types of data bias, what can we do as marketers to prevent data from being biased? Omnichannel data analytics, also known as omnichannel analysis, is key. Incorporating omnichannel data analytics involves using various sources to obtain data and enhance the user experience (UX).

Owned first-party data

First-party data is important. Companies that collect data from their own online and offline sources, such as their website, CRM, applications, social media, and surveys, can use this information as a marketing advantage to target their niche. Collecting this data is vital, but keep in mind that it may include selection bias and social bias if the person collecting it is not careful.

Earned data from the organic channel

Earned data is very important as well. Watching platforms like Google Analytics and Google Search Console can help any SEO professional make better decisions and see success from the results. However, this earned data can also be skewed by selection bias, algorithmic bias, interaction bias, and click bias. There is nothing you can do preventively to stop this from happening, but being able to quickly make changes to the channel in-house or with your agency can make all the difference for disaster recovery.
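
Reacting quickly is easier when two independent sources are compared against each other, so that a reporting problem in one of them stands out. The following is a rough sketch, not a definitive method: the daily figures are invented, the dictionary keys are hypothetical names rather than fields from any real export, and the 25% tolerance is an arbitrary assumption.

```python
from statistics import median

# Hypothetical daily totals from two independent sources. All values invented.
daily_counts = {
    "2024-03-01": {"search_console_clicks": 980, "analytics_organic_sessions": 1010},
    "2024-03-02": {"search_console_clicks": 1005, "analytics_organic_sessions": 1040},
    "2024-03-03": {"search_console_clicks": 400, "analytics_organic_sessions": 1020},
    "2024-03-04": {"search_console_clicks": 990, "analytics_organic_sessions": 1000},
    "2024-03-05": {"search_console_clicks": 1010, "analytics_organic_sessions": 1050},
}


def flag_divergent_days(counts, tolerance=0.25):
    """Flag days whose clicks-to-sessions ratio strays from the median ratio.

    Assumes every day has a non-zero session count.
    """
    ratios = {
        day: c["search_console_clicks"] / c["analytics_organic_sessions"]
        for day, c in counts.items()
    }
    typical = median(ratios.values())
    return [
        day
        for day, ratio in ratios.items()
        if abs(ratio - typical) / typical > tolerance
    ]


suspect_days = flag_divergent_days(daily_counts)
print("days to investigate:", suspect_days)
```

A flagged day does not say which source is wrong. It only marks where the two disagree enough to be worth a closer look before the numbers reach a client report.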

Paid Performance Marketing

Much like organic traffic, PPC and ad campaigns can also suffer from biased information, especially selection bias. Google Ads that are placed incorrectly or shown to the wrong audience can waste a large share of your budget, even if you build the perfect campaign. Paid advertising also deals with the same problems of algorithmic bias, interaction bias, and click bias.

Other ways to market, like TV advertising, print, and direct mail

Marketing created in traditional advertising spaces such as television, print media, and direct mail is often biased. In most cases, companies place ads during certain TV shows to target a specific demographic. Similarly, direct mail could target only homeowners with an offer to refinance their houses. A print ad could be a poster at a bus stop aimed at people looking to purchase their first car. These types of ads are usually, but not always, biased. For example, TV commercials that reach larger, more diverse demographics, such as ads during the Super Bowl, are seen by everyone who watches the big game. That audience includes not only people who love football but also those tuning in just to see the new, creative commercials. The point is that you wouldn’t base your entire website on the response to such an ad, because you know it was made to target a larger audience. For the same reason, you shouldn’t make all of your decisions based on biased data.

Experts in Understanding Data Quality and Bias

If you are ready to understand the data surrounding your target audience and you want to avoid bias in your efforts, the digital marketing experts at Tandem Buzz are here to help. Our busy bees know how to avoid bias in search engine optimization (SEO), pay-per-click (PPC) marketing, landing page optimization, and more! Contact Tandem today and have your business buzzing.