Steps Ahead

Building data analytics?
Start here… Normalisation

Data analytics holds huge promise for financial market participants – from artificial intelligence platforms which can guide traders towards opportunity, to advanced quantitative analytics, to driving back-office efficiencies. However, before any of this potential can be realised, argues Matthew Hodgson, CEO and founder of Mosaic Smart Data, firms need to start with a solid foundation of properly normalised and enriched transaction and market data.

Without this key building block, analytics can only deliver partial results. With it, firms can maximise the value of the data flowing through their organisation and transform it into a powerful resource for enhancing efficiency and profitability.

The Covid-19 pandemic has necessitated some of the biggest changes in working practices for financial participants in decades. Firms have been forced to adapt to remote working for what might be the first time, as travel restrictions and social distancing have made usual working practices difficult. This has greatly increased the need for technology solutions that can keep operations running smoothly and teams communicating effectively from any location.

Among other things, the need to work remotely has greatly accelerated the trend towards digital transformation in the capital markets. Where firms already recognise that investing in ways to extract more value from their data is important, the need to develop new digital tools to facilitate and manage a distributed workforce has transformed this into a critical priority.

This, however, has created a new set of challenges. Many firms that now realise there is an urgent need for digital tools such as data analytics simply haven't yet built the data foundations necessary to make these tools available and effective.

Take, for example, artificial intelligence (AI). For many, AI in the financial markets has underdelivered. Just a few short years ago, vendors and institutions alike were predicting a tidal wave of transformation in the markets as AI began helping firms to do everything from speeding up compliance to managing low-touch sales relationships.

This enthusiasm was not misplaced. Artificial intelligence, along with other forms of algorithm-driven analytics, has an important role to play in the future of financial market operations. However, the reason many of these programmes are taking longer to deliver is that they have not yet been built on the right data foundations. To quote Google's Research Director, Peter Norvig: "More data beats clever algorithms, but better data beats more data."

No matter their purpose, all analytics programmes require solid data foundations in order to be effective. The very fabric of a firm's data must be integrated and used in a way that is frictionless for it to be truly valuable. As such, data sets must be harmonised and standardised into one consistent format and cover as wide a set of relevant transaction and market data as possible. Furthermore, each data entry should be as comprehensive as possible, with all relevant fields captured for every entry.

This may sound obvious; however, achieving such a unified, cleansed and enriched data set in the financial markets is often far from straightforward. Within market-facing firms, trades are executed across myriad electronic trading venues, including bilateral liquidity streams, as well as via traditional over-the-counter protocols (i.e. voice). Within the FICC markets, each trading network adheres to its own messaging language for passing and recording trades, and there is often wide variation in the fields captured for a given trade. To add to the complexity, data which firms bring in from external sources will have been processed in a way unique to that data provider and cannot simply be added to this new unified data set.
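To make the problem concrete, here is a minimal sketch of what mapping venue-specific records onto one canonical schema involves. The schema, venue names and field mappings are illustrative assumptions for this example, not Mosaic Smart Data's actual data model:

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical canonical trade record; field names are illustrative only.
@dataclass
class NormalisedTrade:
    venue: str
    instrument: str
    side: str                      # canonical "BUY" or "SELL"
    quantity: float
    price: float
    trader: Optional[str] = None   # not every venue reports this field

# Per-venue field maps: each venue names the same concepts differently.
VENUE_FIELD_MAPS = {
    "venue_a": {"instrument": "symbol", "side": "direction",
                "quantity": "qty", "price": "px", "trader": "trader_id"},
    "venue_b": {"instrument": "isin", "side": "buy_sell",
                "quantity": "nominal", "price": "rate"},   # no trader field
}

# Venues also encode the same value differently ("B", "BUY", "1", ...).
SIDE_ALIASES = {"B": "BUY", "BUY": "BUY", "1": "BUY",
                "S": "SELL", "SELL": "SELL", "2": "SELL"}

def normalise(venue: str, raw: dict) -> NormalisedTrade:
    """Map a raw, venue-specific record onto the canonical schema."""
    fmap = VENUE_FIELD_MAPS[venue]
    return NormalisedTrade(
        venue=venue,
        instrument=str(raw[fmap["instrument"]]),
        side=SIDE_ALIASES[str(raw[fmap["side"]]).upper()],
        quantity=float(raw[fmap["quantity"]]),
        price=float(raw[fmap["price"]]),
        trader=raw.get(fmap["trader"]) if "trader" in fmap else None,
    )

trade = normalise("venue_b", {"isin": "GB00B16NNR78", "buy_sell": "S",
                              "nominal": "1000000", "rate": "101.25"})
```

Even this toy version shows why fields can end up missing: `venue_b` never reports a trader, so the canonical record must tolerate the gap rather than silently drop the trade.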

In my previous roles as Head of Trading, not having a clear view of the transaction data we were producing across all our global locations and all of our trading channels with clients was a significant disadvantage. The inability to transform it into one standardised format, delivering a coherent view of what was really happening in the fixed income and FX markets, was a barrier to our growth and profitability. Knowing how important this normalised data stream is was one of the reasons I founded Mosaic Smart Data. The company now facilitates best-in-class data streams at both the micro and macro levels, providing institutions with an invaluable asset that enables users to understand their activity comprehensively and to deliver a more tailored service to their customers.

Without data normalisation, each trading channel taken in isolation provides only a partial impression of market activity. Basing trading decisions on such partial and narrowly applicable information flows will, at best, lead to compromised outcomes and severely limit the value that can be derived. Furthermore, specific analytics projects may simply be impossible if data sets do not include the fields required for the particular use case.

In response, many firms have resorted to buying in data sets from vendors who have done some of the work to aggregate data across venues. While this data no doubt has its uses, such an approach can be extremely expensive. A recent paper by research firm GreySpark Partners found that the beneficial cost savings of applying the latest technology innovation, which includes access to cloud infrastructure, open-source software and APIs, have been entirely eroded by the rising costs of data purchases.

The sad irony is that the purpose of initiating data analytics and AI projects is to make a firm more profitable and efficient; however, the costs of acquiring the raw materials to implement them – i.e. data – render the project compromised from the outset.

An altogether different response is therefore required. Instead of relying on purchased, third-party data, firms need effective data normalisation and enrichment programmes which can realise the value of their internal transaction data. By leveraging software specifically designed for this purpose, institutions can begin the process of building a solid data foundation which is flexible enough to serve all their analytics requirements including client analytics and performance, compliance, surveillance, regulation reporting, risk and prediction. The enormous benefits of transforming transaction data from a disparate set of individual sources into one, flexible and highly valuable asset immediately become clear.

This process has three distinct stages.

Firstly, it is vital to understand what the data needs and analytics ambitions of the firm are. It isn’t simply a question of building as large a data lake as possible. Instead, firms need to begin a data normalisation project with an understanding of what key questions users wish to answer from the data. Without this important first step, the institution might well discover that crucial data fields for their analytics ambitions have been left unrecorded. Firms, therefore, need to audit their existing data assets and thereafter define their core KPIs and build a roadmap which establishes their starting point and their analytics goals. Such a process can be done with internal resources, or it can be done with the support of an experienced third party which can bring in broader expertise and share best practice.

Once the source systems have been identified and the parameters and ambitions established, the second stage of the process, aggregating and normalising data, can be undertaken. This is the task of bringing together relevant transaction data sets from across the organisation and reformatting them so that data fields are consistent. This seemingly small step quickly translates into a giant leap forward for the organisation: the data is hygiene-checked, remediated, enriched and can be made available via API to all consumers within the organisation. As an example, it would mean that analytics aimed at understanding a firm's global exposure to risks associated with GBP could be seen in real time across all channels of activity. When it comes to risk exposure, such a consolidated global view is far more powerful than individual measures analysing different platforms.

There is an important third and final step which is to enrich this normalised data set with additional fields of data that are missing from the transaction record. This is where external data sets can be employed to ‘plug the gaps’ in a firm’s data. This includes enhancements such as using market data to enable market impact comparisons between the firm’s activity and the markets as a whole, but it can also include far more complex additions to the data set, such as introducing risk calculations onto the data record for cash or derivative trades. When a firm is considering its position for any instrument – spot, forwards, swaps, futures, and more – the need to really understand what is happening in the markets is an imperative.
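A hedged sketch of one such enrichment, attaching a market mid price and a slippage field to a normalised record, might look as follows. The lookup table stands in for a real market data feed, and all names and prices are assumptions for the example:

```python
# Stand-in for a market data service: mid price keyed by
# (instrument, minute). In practice this would be a feed or API call.
market_mids = {("GBPUSD", "2024-01-02T10:00"): 1.2705}

def enrich(trade: dict) -> dict:
    """Return a copy of the trade with mid-price and slippage fields added."""
    mid = market_mids.get((trade["instrument"], trade["minute"]))
    out = dict(trade)
    out["mid"] = mid
    if mid is not None:
        # Positive slippage = executed worse than mid for the given side.
        signed = (trade["price"] - mid if trade["side"] == "BUY"
                  else mid - trade["price"])
        out["slippage"] = round(signed, 6)
    return out

enriched = enrich({"instrument": "GBPUSD", "minute": "2024-01-02T10:00",
                   "side": "BUY", "price": 1.2708})
```

Risk enrichments for derivatives follow the same pattern but with heavier inputs: the joined field is a calculated sensitivity rather than a quoted mid, and the "lookup" becomes a pricing model.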

Adding to the complexity of this process is the need to deliver aggregated, normalised and enriched data in near real time. After all, data analytics insights are only useful when they are current. Technology in the financial markets has to contend with extremely high message frequencies and data volumes, and software not designed specifically for this speed and scale may be ill-equipped to handle the complexity of the environment. As such, when firms are evaluating a technology solution to handle their normalisation programme, it is vital to consider whether the vendor has the capability to succeed.

By engaging a specialist data normalisation provider, a firm benefits from the most relevant and up-to-date thinking on combining transaction, market and reference data. In doing so, it shortens the time to market for value-added analytics projects, and at a significantly lower cost than an internal build.

With this three-stage process complete, the firm now has the building blocks of a solid data foundation on which firm-wide analytics projects can be launched with confidence. Those firms which undertake this important ‘subterranean’ groundwork before embarking on high profile analytics or artificial intelligence projects can expect to find that the route to value is smoother, quicker and offers vastly improved outcomes. On the other hand, those firms which attempt to skip this fundamental stage of data preparation will find that projects take far longer to deliver and, ultimately, provide limited and compromised value.

An increasingly digital world urgently requires companies to be forward-thinking in their use of technology. As success in the financial markets becomes ever more reliant on the ability to derive actionable insights from data, those firms which begin with solid data foundations will likely find themselves with a significant and sustainable competitive advantage.

FXLIQUIDITY Live Sample Spread Analysis:

*FXLiquidity is a new free analytics service that provides weekly updates on liquidity, volume and spread changes in 12 of the world’s Major, Latam, APAC and EM currencies.



Mosaic Smart Data is transforming the financial industry, empowering market participants with analytics tools to find and retain a competitive edge. In an increasingly data-driven world, we provide finance professionals with the ability to quickly harness available information to drive the right business decisions.


Get in touch to find out more.