Steps Ahead

Tackling data health to enable analytics in front office investment banking

A study by Mosaic Smart Data

In recent years, investment banks (IBs) have invested vast sums of money and resources in data management and analytics initiatives. However, evidence suggests that little ROI is typically achieved in one of the most critical divisions of the investment bank: the capital markets front office. Often, the problem lies in failing to aggregate, standardise and enrich data before implementing an analytics programme.

This paper highlights the challenges that front office divisions face in terms of laying the foundations for data and analytics based on a representative sample of IBs, explores the drivers behind these challenges, and suggests a set of recommended actions to achieve maximum ROI when implementing a technology solution.

Key findings:

When front office staff (mostly FICC[1] traders, salespeople and quants) across 11 global and regional investment banks were surveyed with a questionnaire designed to assess data and analytics maturity, the Mosaic Smart Data study found that:

  • 66% of investment banks struggle with data quality and integrity
  • 83% of investment banks have no real-time access to data and/or data analytics
  • 66% of investment banks’ staff struggle to access their data
  • 50% of investment banks find their reference data is unfit for purpose


A decade ago, analytics solutions only provided historical analysis. As the amount of data generated increased, technology vendors started to develop predictive analytics. As artificial intelligence (AI) evolves, data analytics solutions are rapidly evolving to become much more sophisticated.

Data analytics in investment banking emerged over the last decade, initially focused on regulatory reporting, with visualisations and descriptive analytics on platforms such as Tableau, Power BI and Excel.

Driven by regulatory requirements following the 2008 market crash, investment banks made large investments in data management during this period: circa USD 88 million per year, of which USD 36.4 million was spent on data centre systems in 2023 alone.[2]

Fast forward to today, and forward-thinking banks are deploying prescriptive and predictive analytics and automation solutions powered by AI and machine learning. As AI continues to evolve, so does the sophistication of the data analytics solutions it underpins, but before banks can take advantage of this enormous power they must first address the ‘state’ of their data. The deluge of data requires a permanent, automated solution to aggregate, normalise and ensure ‘data health’ across the organisation.

In the current economic environment, ROI remains front of mind for all banks considering a data analytics solution to improve the efficiency, productivity and profitability of their front office.

Methodology and sample

The findings in the study were based on a number of sources:

  • Circa 6 million[3] FICC transactions across FX, fixed income, credit and multi-asset instruments including cash and derivatives
  • Analysis on data sourced from investment banks (tiers 1 to 3) in U.S., Canada, UK, EU, Japan, South Africa, South America and Australia
  • FICC data and analytics maturity self-assessment (proprietary Mosaic data questionnaire) completed by front office users and some data/tech SMEs to gauge the state of their data and analytics capabilities. This questionnaire also helped build the data and architecture maturity models used for industry peer group comparison.
  • Interviews with IB stakeholders in FICC trading, sales and data management


The results of the questionnaire identified four consistent pain points:

  • Data quality and integrity is a critical challenge
    66% of banks struggle with data quality, including gaps in important data points (e.g., customer static data), with some flows not captured at all (most prevalent with derivatives and voice trade data).
  • Banks want real-time analytics capabilities
    83% of banks have no real-time access to data and/or data analytics, largely due to the lack of a central repository for FICC transactions data on top of which any real-time APIs or tools can be built.
  • Banks struggle to access their data
    66% of banks said that the data they find most useful for their analytics is challenging to access, either because it is fragmented across disparate systems or because they have no access at all and rely on tech/data SMEs to compile reports for them.
  • Reference data is unfit for purpose
    50% of banks have reference data with no unified counterparty identifier (discrepancies in naming across CRM systems, regions and product teams), especially for client static data. In a number of cases the data was altogether missing.

On average, the results did not identify any strong trends in data maturity levels, either across regions or IB tiering. In general, when implementing data strategies, larger institutions have bigger budgets but also more complexity, bureaucracy and constraints, whereas smaller institutions have smaller budgets but they tend to be leaner, faster to implement change and deal with less complexity.

Unsurprisingly, some contradiction was seen between front office participant and data/tech SME responses; the data/tech SMEs tend to find data accessibility much easier than front office users who sometimes stated that they had no access at all.

Key challenges

Data fragmentation
Many of the participating institutions struggle with data fragmentation, with multiple data sources and reports being generated across the front office. This can produce inefficiencies and may lead to data inconsistencies across different reports or teams. On average, 5-10 data sources need to be joined to get the full picture.

Data fragmentation also extends to front-end visualisation: front office participants use circa 2-3 systems on average to get all the information they need at any given point in time. These disparate front-end systems usually have multiple pages to navigate, which adds to the complexity of accessing data.

Voice data
This is often the most persistent and pervasive data challenge for institutions. There are two main challenges seen in collecting and managing voice data:

  • Missed voice trades: most institutions do not capture missed voice inquiry data, largely because they do not have a proper implementation of a sales-to-trader (STW) ticketing system to systematically capture the client → sales → trader workflow.
  • Bloomberg chats: messages contain useful trade/inquiry data, but the data is unstructured and can be challenging to draw insights from. Extracting useful transactional data from Bloomberg chat can be complex, and there are few successful and efficient technologies in the market for this.

Instrument static data (average based on Mosaic clients’ data)

  • Interest Rate Swaps (IRS)
    On average, IRS account for roughly 6% of the total flow and 35% of the total notional. However, Mosaic Smart Data highlighted that in more than 50% of the transactions there was at least one of the following dimensions missing from source data:

    • Delta / DV01 (i.e., Dollar Value of 1 bp, which is the key interest rate risk measure for interest rate swaps)
    • Floating rate index
    • Interest rate payment frequency
    • Interest rate payment day count convention

These are significant gaps: missing information about the traded Interest Rate Swap (e.g., the floating leg index) prevents any insightful, risk-based (e.g., DV01) analytics on IRS flows.
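
As a minimal sketch of how such gaps could be measured, the Python snippet below flags IRS records where at least one of the four dimensions is missing and reports the share of records affected. The field names (dv01, floating_rate_index, payment_frequency, day_count) and the sample records are hypothetical and are not taken from Mosaic's data model or from the study's data.

```python
from typing import Any, List, Mapping, Sequence

# Hypothetical field names for the four dimensions listed above.
REQUIRED_IRS_FIELDS = ("dv01", "floating_rate_index", "payment_frequency", "day_count")


def missing_fields(trade: Mapping[str, Any]) -> List[str]:
    """Return the required IRS fields that are absent, None or blank."""
    missing = []
    for field in REQUIRED_IRS_FIELDS:
        value = trade.get(field)
        if value is None or (isinstance(value, str) and not value.strip()):
            missing.append(field)
    return missing


def incomplete_share(trades: Sequence[Mapping[str, Any]]) -> float:
    """Fraction of trades with at least one required field missing."""
    if not trades:
        return 0.0
    incomplete = sum(1 for trade in trades if missing_fields(trade))
    return incomplete / len(trades)


# Illustrative records only, not real transactions.
sample_trades = [
    {"trade_id": "T1", "dv01": 4500.0, "floating_rate_index": "SOFR",
     "payment_frequency": "3M", "day_count": "ACT/360"},
    {"trade_id": "T2", "dv01": None, "floating_rate_index": "EURIBOR",
     "payment_frequency": "6M", "day_count": "30/360"},
    {"trade_id": "T3", "dv01": 1200.0, "floating_rate_index": "",
     "payment_frequency": "3M"},
]

for trade in sample_trades:
    print(trade["trade_id"], missing_fields(trade))
print("share incomplete:", incomplete_share(sample_trades))
```

Running a check of this kind at ingestion, rather than at reporting time, would make it possible to route incomplete trades to an enrichment step before any risk-based analytics are attempted.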

  • Credit Redeemable Bonds
    Redeemable Bonds (also known as Callable or Puttable bonds, depending on the optionality embedded in the security) account for an increasing percentage of the bonds issued today in the Credit market, and therefore for a growing portion of the investment bank’s revenue. Based on a Mosaic Smart Data analysis, Callable Bonds account on average for almost 50% of Credit transactions, with a combined average notional of about 50% of the overall flow.
    However, missing or incomplete Callable Bond data is a significant obstacle to producing the risk-based metrics that help maximise a business’ portfolio potential. Specifically, there are widespread issues with systematically capturing:

    • List of the Redemption dates of the callable/puttable bonds
    • Redemption price of the callable/puttable bonds
    • Yield to worst of the callable/puttable bonds

These data gaps have a significant impact on the risk profiling of callable/puttable bonds, and therefore on the ability of investment banks to internalise and/or affect the risk in a cost-effective way.
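
To illustrate why the redemption schedule matters, here is a minimal sketch of a yield-to-worst calculation for a callable bond. It rests on simplifying assumptions that are not from the study: annual coupons, whole years to each date, prices quoted per 100 of face value, redemption at 100 at maturity, and a plain bisection solver. All function and parameter names are hypothetical.

```python
from typing import Sequence, Tuple


def price_for_yield(y: float, coupon: float, years: int, redemption: float) -> float:
    """Present value of annual coupons plus the redemption amount at yield y."""
    coupons = sum(coupon / (1.0 + y) ** t for t in range(1, years + 1))
    return coupons + redemption / (1.0 + y) ** years


def yield_to_date(price: float, coupon: float, years: int, redemption: float) -> float:
    """Solve price_for_yield(y) == price by bisection (price falls as y rises)."""
    lo, hi = -0.9, 10.0  # assumed search bounds for the yield
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if price_for_yield(mid, coupon, years, redemption) > price:
            lo = mid  # model price too high, so the yield must be higher
        else:
            hi = mid
        if hi - lo < 1e-12:
            break
    return (lo + hi) / 2.0


def yield_to_worst(
    price: float,
    coupon: float,
    maturity_years: int,
    call_schedule: Sequence[Tuple[int, float]],
) -> float:
    """Lowest of the yield to maturity and the yield to each call date.

    call_schedule holds (years_to_call, call_price) pairs.
    """
    candidates = [yield_to_date(price, coupon, maturity_years, 100.0)]
    candidates += [
        yield_to_date(price, coupon, years, call_price)
        for years, call_price in call_schedule
    ]
    return min(candidates)


# Illustrative bond: 5% annual coupon, 10 years to maturity, trading at 105.
ytm = yield_to_worst(105.0, 5.0, 10, [])            # no schedule: yield to maturity only
ytw = yield_to_worst(105.0, 5.0, 10, [(3, 100.0)])  # callable at par in 3 years
print(f"yield to maturity: {ytm:.4%}, yield to worst: {ytw:.4%}")
```

When the redemption dates and prices are missing, only the yield to maturity can be computed, which overstates the yield of a premium bond that may be called early.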

Counterparty static data
This is one of the biggest challenges and pain points for most institutions when it comes to data capture and management. In most cases, Counterparty static data can be highly fragmented and not normalised across the several systems involved in the IB Front Office (e.g. Client Relationship Management systems – aka CRMs, electronic trading systems, pricing systems, trade booking systems).

Some of the key challenges with Counterparty static data include:

  • Lack of consistent unique identifier mapping across systems
  • Poor to non-existent Counterparty hierarchy i.e., differentiation of parent vs child legal entities
  • Little granularity for Counterparty segmentation / sector categorisation
  • Mis-categorisation of counterparties

Missing or poor client static data can hinder any useful Counterparty-specific analytics or insights, including but not limited to Counterparty profitability/toxicity, Counterparty portfolio risk analysis and Counterparty sentiment analysis.

It is in an investment bank’s best interest to upgrade its client static data management and client sentiment analysis. Doing so helps the front office better understand its clients’ appetite, anticipate which products they may want and make the most appropriate recommendations, with the aim of driving customer loyalty.
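
A minimal sketch of the first two challenges, assuming a simple name-normalisation rule and a hand-maintained master table: differently spelled counterparty names from separate systems are mapped to one unified identifier, and each identifier carries a link to its parent entity. The identifiers, entity names, suffix list and function names are all hypothetical; a production mapping would more likely key on a standard identifier such as the LEI where one is available.

```python
import re
from typing import Dict, NamedTuple, Optional

# Assumed list of legal-form suffixes to ignore when comparing names.
LEGAL_SUFFIXES = {"ltd", "limited", "plc", "llc", "inc", "sa", "ag", "lp"}


class Counterparty(NamedTuple):
    unified_id: str
    parent_id: Optional[str]  # None for a top-level (parent) entity


def normalise_name(raw: str) -> str:
    """Lowercase, drop punctuation and legal suffixes, collapse whitespace."""
    cleaned = raw.lower().replace(".", "")         # "S.A." -> "sa", "Ltd." -> "ltd"
    tokens = re.sub(r"[^a-z0-9 ]+", " ", cleaned).split()
    return " ".join(t for t in tokens if t not in LEGAL_SUFFIXES)


# Hypothetical master table keyed by normalised name.
MASTER: Dict[str, Counterparty] = {
    "acme asset management": Counterparty("CP-001", None),
    "acme asset management europe": Counterparty("CP-002", "CP-001"),
}


def resolve(raw_name: str) -> Optional[Counterparty]:
    """Map a system-specific name to the unified record, or None if unknown."""
    return MASTER.get(normalise_name(raw_name))


# The same client as it might appear in a CRM, a trading venue and a booking system.
for name in ("ACME Asset Management Ltd.",
             "Acme Asset-Management, LLC",
             "Acme Asset Management (Europe) S.A.",
             "Unknown Fund"):
    print(name, "->", resolve(name))
```

Names that resolve to None are the ones that would need manual review or enrichment before Counterparty-level analytics can be trusted.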

Looking ahead: What is the front office asking for?

  • Consolidated view for all FICC analytics
    This addresses the challenge of fragmented data sources and visualisation tools. Front office participants are looking for a one-stop shop for all FICC analytics, covering not just transactional data but also other key data (e.g., Positions, Axes).

There has also been an increase in front office participants asking for the above-mentioned data to be overlaid with, or viewed adjacent to, available market transactions such as MiFID II APA and settlement data. Gauging market share is one key use case.

  • Real-time analytics
    There is an increase in appetite for more predictive analytics and AI and machine learning-driven automation and insights, including but not limited to:

    • Profitability insights – hedging recommendations to automatically reduce risk
    • Client defection alerting
    • Pricing analysis/yield clustering alerting

The road to progress

Achieving better ROI when implementing data analytics solutions hinges on the following:

  • Tailoring the data management initiatives and deliveries of data and analytics to front office use-cases. In data maturity self-assessments for example, although tech/data SMEs believe data is accessible and of good integrity, the front office believes and experiences the complete opposite, revealing a gap that needs to be bridged.
  • Focusing on the right ROI insights by measuring the impact of front office analytics, e.g., through the Hit Ratio.
  • Working on building a unified target data model across all FICC data, to allow the consolidation of disparate front office data systems and easier access to analytics across products and regions.
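
The Hit Ratio mentioned above is not defined in this paper. The sketch below assumes a common reading, namely the share of client inquiries that result in a trade, computed per client. The input layout and function name are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def hit_ratio_by_client(inquiries: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Hit ratio per client from (client, traded) pairs.

    Assumed definition: traded inquiries divided by all inquiries received.
    """
    totals: Dict[str, int] = defaultdict(int)
    wins: Dict[str, int] = defaultdict(int)
    for client, traded in inquiries:
        totals[client] += 1
        if traded:
            wins[client] += 1
    return {client: wins[client] / totals[client] for client in totals}


# Illustrative inquiries only: (client, did the inquiry result in a trade?)
sample_inquiries = [
    ("Client A", True),
    ("Client A", False),
    ("Client A", True),
    ("Client B", False),
]

print(hit_ratio_by_client(sample_inquiries))
```

Under this definition, the measure is only as reliable as the inquiry capture behind it: if missed voice inquiries are not recorded, the denominator is understated and the Hit Ratio is overstated.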


  • No powerful analytics or AI can fully function if data is not normalised and maintained in an orderly fashion, and if gaps in the data are not enriched. This must be addressed as a first-order priority before an analytics programme can progress successfully.
  • Now is a crucial time to invest in data and analytics to revolutionise the front office of investment banks, but ROI must be guaranteed in the current economic climate.
  • Making the right changes today can deliver significant long-term returns. Smart data and smart analytics can certainly be used to achieve a profitable improvement in productivity in the immediate term, but equally importantly they can also serve to future-proof the business.
  • Mosaic Smart Data offers an award-winning solution, designed specifically for capital markets, by a team with deep domain expertise and years of experience. The product is in daily use by hundreds of people around the world at leading global investment banks and is setting a new category-defining standard across the industry by maximising our clients’ ROI.
  • Following a recent pilot with Mosaic, a tier 1 bank reported a 20% increase in call volumes, 22% increase in call duration, 18% increase in inquiries – and 100% of its users wanted to move to production immediately given the tangible benefits enjoyed.

Capital Market Data Expert

Mosaic Smart Data Italy – Author:

Federico Balestri
Technical Account Manager / Head of Professional Services

[1] FICC = Fixed Income, Currencies and Commodities

[2] Source:

[3] The 6M transactions are based on sample extracts received from 3 investment banks (two tier 1 and one tier 2) and have been used for the analysis outlined in section: Key Challenges – Instrument Static Data.