The Open Banking Advantage

A Data-Backed Comparison of Data Quality with Screen Scraping

Dr Zhitao Xiong, Head of Data Science (Frollo)

Open Banking revolutionises financial data sharing, empowering institutions to create personalised experiences and innovative products. While screen scraping remains an option, it doesn’t come close to the data quality offered by Open Banking.

This article dives into a two-month study (November 2023 – January 2024) analysing anonymised data from 9.7 million CDR transactions and 1.3 million screen-scraped transactions in the Frollo money management app to unveil the clear advantages of Open Banking.

Sample overview for data quality comparison between CDR and screen scraping

Richer data for deeper insights

Screen scraping inherently limits data because it relies on what’s visually accessible. Key fields like merchant category codes, biller codes, and transaction types, all crucial for understanding financial behaviour, are often missing. For example, CDR data provides biller codes like “75556” (Australian Taxation Office), enabling accurate categorisation by Frollo’s enrichment service, IDEaS. Even merchant names, though sometimes present in screen-scraped data, lack consistency. Our analysis revealed that CDR data offers merchant names in 52.3% of transactions compared to just 31.7% for screen scraping: a gap of 20.6 percentage points, or a relative advantage of roughly 65%.

CDR data quality analysis - Merchant names in transaction data
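The merchant-name comparison above mixes two kinds of difference, an absolute gap in percentage points and a relative uplift. The short Python sketch below shows the arithmetic using only the two rounded coverage rates quoted in this article; the function and variable names are illustrative and are not part of any Frollo tooling. Because the inputs are rounded, the result may differ slightly from a figure computed on the underlying data.

```python
def relative_uplift(new_rate: float, base_rate: float) -> float:
    """Return how much larger new_rate is than base_rate, as a fraction of base_rate."""
    if base_rate <= 0:
        raise ValueError("base_rate must be positive")
    return (new_rate - base_rate) / base_rate


# Rounded coverage rates quoted in the article (share of transactions with a merchant name).
cdr_rate = 0.523
scraped_rate = 0.317

absolute_gap = cdr_rate - scraped_rate          # difference in percentage points
uplift = relative_uplift(cdr_rate, scraped_rate)  # difference relative to screen scraping

print(f"Absolute gap: {absolute_gap * 100:.1f} percentage points")
print(f"Relative uplift: {uplift * 100:.1f}%")
```

Reporting both numbers helps avoid ambiguity: the percentage-point gap describes how many more transactions carry a merchant name, while the relative uplift describes how much better CDR performs against the screen-scraping baseline.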

Cleaner data for smoother operations

Large datasets can contain inconsistencies like meaningless symbols or text fragments, which pollute data and hinder categorisation by advanced, AI-powered enrichment engines like Frollo’s IDEaS service. Imagine financial data as a conversation: the actual transaction details are the “signal” you want to understand, and the meaningless symbols are the “noise” that gets in the way. IDEaS relies on “meaningful” words and numbers, such as merchant identifiers or postcodes, to extract clear signals from the data.

Analysing word frequency across both datasets revealed a stark difference. In CDR data, only 14% of words were irrelevant, compared to 34% in screen-scraped data. This “dirty” data translates into extra user effort: Frollo users re-categorise screen-scraped transactions 30% more often than CDR transactions, reflecting the impact of data quality.

CDR data quality analysis - Signal vs noise in transaction data
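To make the signal-versus-noise idea concrete, here is a minimal Python sketch of how a noise share could be measured across transaction descriptions. It is not how IDEaS works or how the study was run: the rule for what counts as a “meaningful” token (a word of three or more letters, or a number of four or more digits such as a postcode or biller code) is an assumption chosen for illustration, and the sample descriptions are invented.

```python
import re
from typing import Iterable, List

# Assumed rule (hypothetical): a token is "signal" if it is a word of 3+ letters
# or a number of 4+ digits, e.g. a postcode or biller code. Everything else is "noise".
SIGNAL_PATTERN = re.compile(r"[A-Za-z]{3,}|\d{4,}")


def tokenise(description: str) -> List[str]:
    """Split a transaction description on whitespace."""
    return description.split()


def is_signal(token: str) -> bool:
    """Return True if the whole token matches the assumed signal rule."""
    return SIGNAL_PATTERN.fullmatch(token) is not None


def noise_share(descriptions: Iterable[str]) -> float:
    """Return the fraction of tokens across all descriptions classed as noise."""
    total = 0
    noise = 0
    for description in descriptions:
        for token in tokenise(description):
            total += 1
            if not is_signal(token):
                noise += 1
    return noise / total if total else 0.0


# Invented sample descriptions, for illustration only.
sample = [
    "WOOLWORTHS 1234 SYDNEY",
    "##* PAYMENT x9 //",
    "BPAY 75556 TAX OFFICE",
]

print(f"Noise share: {noise_share(sample):.1%}")
```

Running the same function over a CDR sample and a screen-scraped sample would give two comparable noise shares. A real enrichment pipeline would need a far richer definition of “meaningful” than a single regular expression.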

Open Banking – The clear choice

Open Banking offers a secure, consistent, and standardised method for data sharing. It delivers cleaner and more insightful data than screen scraping, resulting in a better user experience and return on investment.

As a pioneer in Open Banking, Frollo has witnessed CDR data become more robust over time. Upcoming improvements to data standards in July this year promise even better data points and structure. 

With continuous advancements, Open Banking presents an extraordinary opportunity for financial institutions to transform operations, personalise services, and unlock data-driven innovation.

Get in touch to discuss how Frollo can help your business deliver a more streamlined, personalised customer experience with Open Banking.
