How to fix discrepancies in your web analytics data
Google Analytics, like every web analytics tool, does not deliver accurate data. Here's how you can fix that.
Google Analytics, like every web analytics tool, does not deliver accurate data.
There are many reasons behind this including:
There are certain actions though that can be exactly measured, as they are recorded in back end systems.
We know exactly how many transactions are placed, leads are generated, contact forms submitted, etc. For these actions, we can audit the accuracy of the web analytics and identify errors in the tracking.
This is critically important as these actions are nearly always macro conversions and, if they are not tracking correctly, we cannot evaluate the performance of marketing campaigns or the impact of website features.
These actions can never be recorded 100% accurately in any web analytics tool (you should not try to report your revenue to the tax office using web analytics data), but the web analytics figure should be within 2% to 3% of the back end figure.
If the difference is more than 5% (with a large enough sample size), you have an issue somewhere in your tracking. The code works, or no data would be collected at all, but it is not correctly submitting measurements to the web analytics tool in all cases.
To simplify the language, the remainder of this post uses one example: transactions on a Magento ecommerce website compared with Google Analytics data.
The first step is to check if there is a discrepancy.
Extract daily orders for 8 to 12 weeks for both Magento and Google Analytics and then compare performance at a daily level. As long as order volumes are high enough, the discrepancy should be fairly consistent for each day.
Ideally Magento should report transactions slightly higher than Google Analytics but no more than 5%. If the difference is more than that, you have an issue.
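As a rough illustration of the daily comparison described above, here is a minimal Python sketch. The order counts are made-up sample data, the function name `daily_discrepancy` is hypothetical, and in practice the two dictionaries would be loaded from your own Magento and Google Analytics exports.

```python
from datetime import date


def daily_discrepancy(backend_orders, analytics_orders):
    """Return {day: percentage of back end orders missing from analytics}."""
    result = {}
    for day, backend_count in backend_orders.items():
        if backend_count == 0:
            continue  # no orders that day, so a percentage is undefined
        analytics_count = analytics_orders.get(day, 0)
        result[day] = (backend_count - analytics_count) / backend_count * 100
    return result


# Hypothetical daily order counts (illustrative data only).
magento = {
    date(2024, 1, 1): 200,
    date(2024, 1, 2): 180,
    date(2024, 1, 3): 220,
}
google_analytics = {
    date(2024, 1, 1): 194,
    date(2024, 1, 2): 160,
    date(2024, 1, 3): 215,
}

THRESHOLD_PCT = 5.0  # the 5% limit used in this post

discrepancies = daily_discrepancy(magento, google_analytics)
flagged = [day for day, pct in discrepancies.items() if pct > THRESHOLD_PCT]

for day, pct in discrepancies.items():
    marker = "CHECK" if day in flagged else "ok"
    print(f"{day}  {pct:5.1f}%  {marker}")
```

A negative percentage means Google Analytics recorded more orders than Magento on that day, which is worth investigating separately.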
The key reasons for a difference in transactions recorded within Google Analytics (or any analytics tool) and Magento are:
The challenge is to identify which one (or more) of these apply to your business. The first two reasons can be identified through an internal investigation into what data is being recorded in each of Magento and Google Analytics.
For the second, check which filters are applied to the View, or create a new Google Analytics View with no filters applied and see whether the numbers change.
The third reason requires some analysis within Google Analytics.
Check the conversion rate for each device and browser version. If it is 0% for a particular device or browser (with a decent number of sessions), you may have identified the culprit. Dig further into the data, or place a test transaction on that device or browser, to confirm that the transaction code isn’t firing correctly.
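The device and browser check can be sketched in a few lines of Python. This assumes you have exported sessions and transactions per device and browser version from Google Analytics. The rows, the function name `zero_conversion_segments` and the 500-session cut-off are illustrative assumptions, not values from Google Analytics.

```python
MIN_SESSIONS = 500  # assumed cut-off for "a decent number of sessions"


def zero_conversion_segments(rows, min_sessions=MIN_SESSIONS):
    """Return (device, browser) pairs with enough sessions but no transactions."""
    suspects = []
    for row in rows:
        if row["sessions"] >= min_sessions and row["transactions"] == 0:
            suspects.append((row["device"], row["browser"]))
    return suspects


# Hypothetical export rows (illustrative data only).
rows = [
    {"device": "desktop", "browser": "Chrome 120", "sessions": 12000, "transactions": 240},
    {"device": "mobile", "browser": "Safari 15", "sessions": 3000, "transactions": 0},
    {"device": "tablet", "browser": "Opera 60", "sessions": 40, "transactions": 0},
]

suspects = zero_conversion_segments(rows)
for device, browser in suspects:
    print(f"No transactions recorded for {device} / {browser}")
```

The tablet row is ignored because 40 sessions is too small a sample to conclude anything from a 0% conversion rate.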
For reasons four to six, extract a list of the transactions from Magento and Google Analytics for three non-sequential days during the previous period (make sure these days contain the typical discrepancy) including the transaction ID. Compare the two lists using the transaction IDs and identify the transactions which are not recorded within Google Analytics.
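Comparing the two lists is a simple set difference on the transaction IDs. The sketch below uses made-up IDs and a hypothetical function name, `missing_from_analytics`. It assumes both exports use the same ID format, so strip any prefixes or whitespace first if they do not.

```python
def missing_from_analytics(backend_ids, analytics_ids):
    """Return the back end transaction IDs that analytics never recorded."""
    return sorted(set(backend_ids) - set(analytics_ids))


# Hypothetical transaction IDs for one day (illustrative data only).
magento_ids = ["100001", "100002", "100003", "100004", "100005"]
ga_ids = ["100001", "100003", "100004"]

missing = missing_from_analytics(magento_ids, ga_ids)
print(f"{len(missing)} of {len(magento_ids)} transactions missing: {missing}")
```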
Review these transactions for patterns of payment methods, particular products or just very large transactions. The challenge is that some missing transactions were just not recorded while others should fit the pattern of one or more of the above reasons.
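One way to look for patterns is to calculate, for each payment method (or product, or order value band), the share of transactions that went missing. The following is a minimal sketch with made-up orders. The field names `id` and `payment_method` and the function name `missing_rate_by` are assumptions about how your export might look, not Magento field names.

```python
from collections import Counter


def missing_rate_by(orders, missing_ids, field):
    """Return {field value: share of its orders that are missing from analytics}."""
    totals = Counter()
    missing = Counter()
    for order in orders:
        key = order[field]
        totals[key] += 1
        if order["id"] in missing_ids:
            missing[key] += 1
    return {key: missing[key] / totals[key] for key in totals}


# Hypothetical back end orders (illustrative data only).
orders = [
    {"id": "100001", "payment_method": "card"},
    {"id": "100002", "payment_method": "paypal"},
    {"id": "100003", "payment_method": "card"},
    {"id": "100004", "payment_method": "card"},
    {"id": "100005", "payment_method": "paypal"},
    {"id": "100006", "payment_method": "card"},
]
missing_ids = {"100002", "100005", "100006"}

rates = missing_rate_by(orders, missing_ids, "payment_method")
for method, rate in sorted(rates.items(), key=lambda item: item[1], reverse=True):
    print(f"{method}: {rate:.0%} missing")
```

A segment where nearly every transaction is missing points to a systematic cause, while a low rate spread evenly across all segments is more likely to be the ordinary unrecorded transactions.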
For the final reason, check where the Google Analytics code sits on the page. If it is placed any lower than immediately below the <body> tag, that could be the cause, because some visitors may leave the page before the code has run.
Once the cause of the discrepancy has been identified, it should naturally suggest the solution. These solutions include:
Once you apply these fixes, the discrepancy should immediately reduce. Continue checking and making improvements until the discrepancy between your back end numbers and your web analytics numbers reduces to under 5%.
One final note: web analytics data can also be higher than the figures recorded in back end systems. This happens if duplicates are recorded in GA but automatically excluded by the back end systems (e.g. duplicate transactions), or if records have since been removed from the back end systems (e.g. cancelled orders, fake leads).
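To check for duplicates, count how often each transaction ID appears in the Google Analytics export. This is a minimal sketch with made-up IDs and a hypothetical function name, `duplicated_ids`. It assumes the export has one row per recorded transaction hit.

```python
from collections import Counter


def duplicated_ids(analytics_ids):
    """Return {transaction ID: count} for IDs recorded more than once."""
    counts = Counter(analytics_ids)
    return {tid: n for tid, n in counts.items() if n > 1}


# Hypothetical Google Analytics rows (illustrative data only).
ga_rows = ["100001", "100003", "100003", "100004", "100003"]

duplicates = duplicated_ids(ga_rows)
print(duplicates)
```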