Data Discrepancies: GSC vs SPA

Modified on Wed, 15 Feb, 2023 at 3:21 PM


Google Search Console (GSC) has its own limitations in how it reports data. The main ones are:

  • To protect user privacy, GSC doesn't show all data. For example, Google might not track queries that are made only a very small number of times or that contain personal or sensitive information.
  • Google processes its source data (for example, to eliminate duplicates and visits from robots), which can cause these stats to differ from stats listed in other sources. However, these differences should not be significant.
  • The tools differ technically. SPA uses the Search Console API, which processes data differently from the Search Console UI.


In the following examples, the “View” column lists the data splits available in the GSC UI. The “Overall” view, presented in the GSC report below, differs from the other views. In both examples, the “Pages” view shows more clicks and impressions than any other breakdown (overall, queries, countries, devices). This illustrates the extent of the data discrepancies.


The data presented in SPA reflects the “Pages” metrics more closely than the “Overall” metrics.


Example # 1: Bushnell, May 2020


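Google Search Console Performance Report

| View              | Clicks | Impressions | Average CTR | Average Position |
|-------------------|--------|-------------|-------------|------------------|
| Overall           | 159K   | 2.22M       | 7.1%        | 15.8             |
| Queries           | 79879  | 822368      | 27.11%      | 2.87             |
| Pages             | 167849 | 4495602     | 6.28%       | 13.17            |
| Countries         | 158834 | 2222789     | 5.21%       | 21.9             |
| Devices           | 52944  | 740929      | 8.23%       | 13.2             |
| Search Appearance | 78371  | 966534      | 7.41%       | 9                |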




[Screenshot: Schema Performance Analytics (SPA)]



Example # 2: Bushnell, June 2020



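Google Search Console Performance Report

| View              | Clicks | Impressions | Average CTR | Average Position |
|-------------------|--------|-------------|-------------|------------------|
| Overall           | 122K   | 1.57M       | 7.8%        | 17               |
| Queries           | 66995  | 551814      | 24.74%      | 2.87             |
| Pages             | 126985 | 3164315     | 7.01%       | 13.42            |
| Countries         | 122489 | 1572727     | 6.20%       | 24.8             |
| Devices           | 122489 | 1572727     | 8.93%       | 13.80            |
| Search Appearance | 55247  | 698801      | 6.17%       | 12.08            |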



[Screenshot: Schema Performance Analytics (SPA)]


From the samples above, we can see that the GSC metrics show the most clicks and impressions for the “Pages” view, even though the number of pages listed is limited to 1,000 in the UI. Google is aware of these data discrepancies and explains the reasons for them here: https://support.google.com/webmasters/answer/6155685?hl=en#groupingdata
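As a rough illustration of this comparison, the following Python sketch expresses each view's clicks and impressions from Example # 1 as a ratio of the “Overall” view. The names EXAMPLE_1, relative_to_overall, and view_with_most_clicks are hypothetical and are not part of SPA or GSC. The “Overall” figures are the rounded values shown in the GSC UI (159K and 2.22M), so the ratios are approximate.

```python
# (clicks, impressions) per GSC view, copied from Example # 1 (Bushnell, May 2020).
# "Overall" is only available rounded (159K / 2.22M), so ratios are approximate.
EXAMPLE_1 = {
    "Overall": (159_000, 2_220_000),
    "Queries": (79_879, 822_368),
    "Pages": (167_849, 4_495_602),
    "Countries": (158_834, 2_222_789),
    "Devices": (52_944, 740_929),
    "Search Appearance": (78_371, 966_534),
}


def relative_to_overall(views):
    """Return each view's (clicks, impressions) as a ratio of the Overall view."""
    overall_clicks, overall_impressions = views["Overall"]
    return {
        name: (clicks / overall_clicks, impressions / overall_impressions)
        for name, (clicks, impressions) in views.items()
    }


def view_with_most_clicks(views):
    """Return the name of the view that reports the most clicks."""
    return max(views, key=lambda name: views[name][0])


ratios = relative_to_overall(EXAMPLE_1)
for name, (click_ratio, impression_ratio) in ratios.items():
    print(f"{name}: {click_ratio:.2f}x clicks, {impression_ratio:.2f}x impressions")

print("Most clicks:", view_with_most_clicks(EXAMPLE_1))
```

A ratio above 1 means the view reports more than “Overall”, and a ratio below 1 means part of the traffic is not attributed in that breakdown.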


SPA uses the GSC API to collect data on a daily basis. The API has the advantage of returning more than 1,000 rows. However, when metrics such as clicks or impressions are pulled through the API and broken down by a dimension (page, search appearance, or query), the totals are not the same as when no breakdown is applied. For example, brand query results (branded and non-branded) do not add up to the overall, unfiltered results. The overall results aggregate all queries, while brand query results aggregate only the tracked queries and omit the performance (clicks, impressions, CTR, position) of untracked queries: anonymized queries are omitted, and data is truncated because of Google's serving limitations. The same reasoning applies to data inconsistencies for search appearances, scopes, and other measures in SPA.

Therefore, to pull data for the different dimensions, SPA first pulls one row per URL with the total clicks, impressions, and CTR. A separate stream then pulls the breakdown by query and search appearance. The following scheme is used both when pulling data from the API and when presenting the visualizations in the SPA dashboard:

  • Overall Results

    • All Features + All Queries

  • Specific Feature

    • Feature + All Queries

  • Specific Feature Brand Query (Branded/Non-Branded)

    • Feature + set of tracked queries

  • Overall Brand Query (Branded/Non-Branded)

    • All Features + set of tracked queries

Dashboard users don't need to memorize or apply this scheme when extracting and filtering data. It is provided only to clarify the gaps between the different data segments.
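The two pull streams and the feature scope described above can be sketched as request bodies for the Search Console API's Search Analytics query. This is a minimal sketch, not SPA's actual implementation: the function names are hypothetical, and the search appearance value passed to feature_request is a placeholder supplied by the caller. The field names (startDate, endDate, dimensions, rowLimit, startRow, dimensionFilterGroups) are those of the Search Analytics query request.

```python
from datetime import date

# The Search Analytics API accepts a rowLimit of up to 25,000 rows per request.
MAX_ROWS_PER_REQUEST = 25_000


def page_totals_request(day: date, start_row: int = 0) -> dict:
    """Stream 1: one row per URL with its total clicks, impressions, and CTR."""
    return {
        "startDate": day.isoformat(),
        "endDate": day.isoformat(),
        "dimensions": ["page"],
        "rowLimit": MAX_ROWS_PER_REQUEST,
        "startRow": start_row,  # increase by rowLimit to page through results
    }


def page_query_request(day: date, start_row: int = 0) -> dict:
    """Stream 2: the same day broken down by page and query."""
    body = page_totals_request(day, start_row)
    body["dimensions"] = ["page", "query"]
    return body


def feature_request(day: date, search_appearance: str, start_row: int = 0) -> dict:
    """Per-URL totals restricted to one search appearance (feature).

    `search_appearance` is a caller-supplied value; this sketch does not
    assume any particular feature name.
    """
    body = page_totals_request(day, start_row)
    body["dimensionFilterGroups"] = [
        {
            "filters": [
                {
                    "dimension": "searchAppearance",
                    "operator": "equals",
                    "expression": search_appearance,
                }
            ]
        }
    ]
    return body
```

With the google-api-python-client library, a body like these would typically be sent with service.searchanalytics().query(siteUrl=..., body=...).execute(). Because the breakdown request returns only rows that Google attributes to a query, its totals can be lower than the per-URL totals from the first stream.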


Why do I see gaps in reporting for “No Rich Results”?

You may come across graphs in reporting that appear to show missing data. This is expected behaviour for tracking “No Rich Results”. Google doesn't report on this category, so Schema App computes these metrics with the formula:

All clicks - Sum of Search Appearance Clicks = No Rich Result Clicks


[Figure: A graph showing data gaps for the "No Rich Results" data.]


Ideally, the "No Rich Results" value should never be negative. However, Google sometimes reports a sum of search appearance clicks that is greater than the overall clicks. In that case the formula produces a negative value, which shows up as missing data in the visualization.
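The calculation and the resulting gap can be sketched in Python as follows. The function name no_rich_result_clicks is hypothetical, and returning None for a negative result is an assumption made here to represent a missing data point; it is not a description of how Schema App stores the value.

```python
from typing import Iterable, Optional


def no_rich_result_clicks(
    all_clicks: int, search_appearance_clicks: Iterable[int]
) -> Optional[int]:
    """All clicks - sum of search appearance clicks = No Rich Result clicks.

    Returns None (a gap in the graph) when Google reports more search
    appearance clicks than overall clicks, which would make the result negative.
    """
    remainder = all_clicks - sum(search_appearance_clicks)
    return remainder if remainder >= 0 else None


# Normal day: search appearance clicks are a subset of all clicks.
print(no_rich_result_clicks(100, [30, 20]))

# Day where the search appearance sum exceeds the overall clicks.
print(no_rich_result_clicks(100, [80, 40]))
```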
