Data Analytics Driven Automation

Revolutionising Financial Data Analysis

Investment Bank Equity Research

Harnessing Generative AI for Automated Equity Research

Investment Bank (IB) Equity Research partnered with Calimere Point to conduct a proof of concept (POC) exercise. The goal was to assess the potential of Generative AI models in automating the extraction of key financial metrics from unstructured company filing data, transforming it into a structured Excel output for use in IB's research models.

The source documents were the unstructured quarterly reports of major insurance companies, where extracting key metrics traditionally requires hours of manual work.

While not without challenges, the results were promising: the exercise demonstrated that AI could indeed automate this process, potentially revolutionising the speed and efficiency of equity research. By testing the limits of current AI technology in handling real-world financial data complexity, this project has implications that could reshape how financial analysis is conducted in the future.

AI in Equity Research: Navigating complexity for greater efficiency

The main challenge in applying Generative AI to equity research is the variability in financial report formats and structures.

This variability ranges from semi-structured tabular data to highly unstructured information in charts and narratives. The financial industry is actively exploring AI-driven solutions to automate data extraction, assist in analysis, and improve overall efficiency in equity research. Approaches include pure AI models, hybrid systems combining traditional analytics with AI, and sequential AI processes. These efforts aim to standardise data handling and accelerate research workflows.

It's important to note that while there's a clear trend towards automation and AI-assisted research, the complexities highlighted in this Use Case suggest that human expertise remains crucial in interpreting and validating AI-generated insights in equity research.

Background

This POC exercise was designed to explore the frontier of Generative AI applications in financial analysis, specifically focusing on the automation of data extraction from complex financial reports.

Scope

The POC focused on a carefully selected group of 5 insurance companies, each representing different geographical markets and reporting structures:

Legal and General (UK): A leading provider of life insurance, pensions, and investment management services.
Sampo Group (Finland and Nordics): A major player in the Nordic and Baltic insurance markets.
Hannover Re (Germany): One of the world's largest reinsurance groups.
Just Group (UK): A specialist in retirement income products and services.
Allianz (Germany): A global leader in insurance and asset management.

This diverse selection allowed the team to test the AI models against a variety of reporting styles and regulatory frameworks.

Project Structure

The POC was structured to maximise efficiency and minimise resource allocation while still providing robust insights.

Technology Stack

The project leveraged cutting-edge Generative AI technologies, complemented by traditional data analytics approaches. This hybrid approach aimed to combine the strengths of both paradigms.

Resource Allocation

Two members of Calimere Point's data science team were dedicated to the project, bringing specialised expertise in AI and financial data analysis.

Timeline

The development phase was time-bounded to 2.5 weeks, necessitating rapid iteration and focused problem-solving.

Infrastructure

The entire POC was conducted on Calimere Point's proprietary data analytics architecture (CPRA Cloud), providing a secure and powerful environment for AI model training and testing.

Cost Efficiency

Remarkably, this was executed as a cost-effective POC exercise, valued at $25k, demonstrating the potential for high-impact results with a modest financial investment.

Industry: Investment Banking

Key Challenges in Equity Research

In the pursuit of automating financial data extraction through Generative AI, the team encountered several significant challenges. These obstacles highlight the complexity of working with real-world financial data and the nuances involved in applying AI to such tasks.

1. Variability in Report Formats and Structures

The primary challenge stemmed from the high degree of variability in how different companies present their financial information. This variability manifested in two main forms:

Inter-company Variability:
Each of the five insurance companies in the study had its own unique way of structuring and presenting financial data. This inconsistency made it difficult to create a one-size-fits-all AI solution.

Intra-company Variability:
Even within the same company, the format and structure of reports could change from one quarter to another. This temporal inconsistency added another layer of complexity to the AI's task.

2. Types of Data Encountered

The team categorised the data they encountered into two main types, each presenting its own set of challenges:

Structured Unstructured Data (Lower Complexity)

This category primarily consisted of tabular data, often presented in PDF format.

Challenges:

  • Extracting data accurately from PDF tables
  • Handling slight variations in table structures across reports

Opportunities:

  • More amenable to programmatic approaches
  • Higher potential for accurate data extraction
  • Improved accuracy when data structures remain consistent across periods

Unstructured Unstructured Data (Higher Complexity)

This category included non-tabular data in PDFs, narrative text, and information embedded in charts or graphics.

Challenges:

  • Extracting precise numerical data from charts or graphics
  • Interpreting contextual information from narrative text
  • Dealing with inconsistent presentation of key metrics

Example:
Sampo's Solvency II ratio was only available in PowerPoint charts in their quarterly reporting, making it particularly challenging to extract accurately.

3. Balancing Accuracy and Automation

Another key challenge was finding the right balance between the level of automation and the accuracy of extracted data:

  • Highly prescriptive AI models could achieve good accuracy but required significant manual setup and were vulnerable to changes in report formats.
  • More flexible, general-purpose AI models offered greater adaptability but sometimes sacrificed accuracy or required more manual verification.

4. Handling Edge Cases and Exceptions

Financial reports often contain exceptions, footnotes, and special cases that are crucial for accurate analysis but challenging for AI to interpret correctly. Ensuring that the AI models could identify and handle these edge cases appropriately was a significant challenge.

5. Maintaining Consistency Across Different AI Approaches

As the team experimented with different AI models and approaches, ensuring consistency in results across these various methods became a challenge. This was particularly important for comparing the effectiveness of different AI strategies.

By addressing these challenges, the team not only advanced their understanding of applying Generative AI to financial data extraction but also gained valuable insights into the broader implications of AI in financial analysis. These learnings form the foundation for future developments in this field, potentially transforming how equity research is conducted.

Insights and Discoveries

These insights provide a nuanced view of the current state of Generative AI in financial data extraction. They reveal both the significant progress made and the challenges that lie ahead in fully realising the potential of AI in Equity Research.

Effectiveness of AI Assistants

Key Finding: AI Assistants demonstrated the capability to generate accurate and automated outputs for the 5 insurance companies in the POC.

This success suggests that AI Assistants have significant potential in automating routine data extraction tasks in equity research.

However, they are more programmatic in nature, so their performance is contingent on carefully defined underlying logic and a high degree of consistency in input data formats.

Potential of sequential AI approaches

Key Finding: Experiments with sequential AI approaches showed promise in balancing flexibility and accuracy.

By combining different AI models in sequence, the team found potential pathways to reduce reliance on prescriptive logic while maintaining accuracy. This approach could offer a middle ground between highly structured and purely generative methods.

API vs. User Interface Discrepancies

Key Finding: The team observed variance in behaviour between user interface queries and API queries.

This discovery highlights the importance of consistent performance across different interaction modes with AI systems. It also suggests that the method of querying AI models can significantly impact results, a crucial consideration for real-world applications.

Data format variability impact

Key Finding: The high variability in report formats and data structures significantly influenced AI performance.

This insight underscores the need for robust, adaptable AI systems capable of handling diverse financial reporting styles. It also highlights a potential area for standardisation in financial reporting to facilitate AI-driven analysis.

Experimentation and Approach

The POC employed a multi-faceted experimental approach to thoroughly explore the potential of Generative AI in financial data extraction. Three primary methodologies were investigated, each offering unique advantages and challenges. This comprehensive experimentation strategy allowed the team to gain a nuanced understanding of how different AI approaches perform in the context of financial data extraction. The insights gained from these experiments formed the basis for the project's conclusions and recommendations for future development in this field.

Summary of Approaches

1. Pure Generative AI

In our exploration of pure Generative AI models, we leveraged cutting-edge LLMs like Claude 3 Sonnet and ChatGPT4. This approach offered flexibility in handling diverse document formats and showed promising accuracy at 80%. While it required no prescriptive coding, the process still relied on manual data sourcing and provision. The models demonstrated an impressive ability to understand financial contexts, though they struggled with certain file formats like Excel. As LLM technology rapidly evolves, we anticipate significant improvements in accuracy and versatility, making this a promising avenue for future development in automated financial data extraction.

2. AI Assistants Models

Our AI Assistants approach combined Generative AI with traditional data analytics, resulting in a highly accurate and automated system. This hybrid method achieved 100% accuracy in data extraction, with both sourcing and provision fully automated. The assistants excelled in structured data extraction and incorporated domain-specific financial knowledge. However, the system's reliance on prescriptive code made it vulnerable to changes in input data formats. Despite this limitation, the AI Assistants model stands out as a powerful tool for consistent, high-accuracy financial data processing, particularly in stable reporting environments.
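To illustrate why prescriptive extraction logic is both accurate and fragile, here is a minimal Python sketch. It is not Calimere Point's implementation: the metric names, the regular expressions, and the sample text are all hypothetical, chosen only to show the trade-off described above.

```python
import re

# Hypothetical metric labels and patterns (assumptions, not from the POC).
PATTERNS = {
    "solvency_ii_ratio_pct": re.compile(r"Solvency II ratio\s+(\d+(?:\.\d+)?)\s*%"),
    "operating_profit_m": re.compile(r"Operating profit\s+([\d,]+(?:\.\d+)?)"),
}


def extract_metrics(text: str) -> dict[str, float]:
    """Apply each prescriptive pattern and fail loudly if a label is missing."""
    results = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        if match is None:
            # A change in report wording or layout surfaces here
            # instead of producing a silent gap in the output.
            raise ValueError(f"metric not found: {name}")
        results[name] = float(match.group(1).replace(",", ""))
    return results


# Made-up sample text, not taken from any real filing.
sample = "Operating profit 1,234.5\nSolvency II ratio 201 %"
metrics = extract_metrics(sample)
```

When the text matches the expected layout, the rules return exact values. If a report renames a label, the same rules raise an error and need manual adjustment, which mirrors the vulnerability to input format changes noted above.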

3. Sequential LLM / Generative AI Models

The Sequential LLM approach represented our most innovative experiment, utilising multiple AI models in a predefined sequence via API infrastructure. This method achieved a high degree of automation in both data sourcing and provision. However, it unexpectedly resulted in lower accuracy (40%) compared to other approaches, particularly when using API calls instead of user interfaces. While this method shows potential for handling complex financial data structures, the significant accuracy drop highlights the need for further investigation and refinement. The sequential approach offers valuable insights into the challenges of creating more sophisticated, multi-model AI systems for financial data extraction.
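A sequential approach can be sketched as a chain of stages where each stage consumes the previous stage's output. The Python below is an illustrative assumption only: the two stages are simple stand-in functions, whereas in a real pipeline each would wrap an API call to a model. The function names and the sample text are hypothetical.

```python
from typing import Callable

Stage = Callable[[str], str]


def run_sequence(stages: list[Stage], document: str) -> list[str]:
    """Feed each stage's output into the next and keep every intermediate result."""
    outputs = []
    current = document
    for stage in stages:
        current = stage(current)
        outputs.append(current)
    return outputs


# Stand-in stages (hypothetical); real stages would call a model via its API.
def locate_section(text: str) -> str:
    # Keep only the lines that mention the metric of interest.
    return "\n".join(line for line in text.splitlines() if "ratio" in line.lower())


def extract_value(text: str) -> str:
    # Return the last whitespace-separated token of the first line.
    return text.splitlines()[0].split()[-1] if text else ""


# Made-up input, not taken from any real filing.
trace = run_sequence([locate_section, extract_value], "Revenue 10\nSolvency II ratio 201%")
```

Keeping the intermediate outputs in `trace` is a deliberate choice: when end-to-end accuracy drops, it shows which stage introduced the error.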

Overall Conclusion and Analysis

Summary

1. Pure Generative AI
Accuracy: Medium (80%)
Strengths: No prescriptive coding required (excluding query refinement); LLM accuracy increasing rapidly
Weaknesses: Fully manual data provisioning process; Medium current level of accuracy; Doesn’t accept all document formats (PDF: OK, Excel: not OK)

2. AI Assistants Models
Accuracy: High (100%)
Strengths: Highly accurate outputs; High degree of automation
Weaknesses: Prescriptive underlying code infrastructure; Vulnerable to input data format changes

3. Sequential LLM / Generative AI Models
Accuracy: Low (40%)
Strengths: Higher degree of process automation than via pure generative AI approaches
Weaknesses: Unexplained drop in accuracy when querying via API compared with the user interface used in Approach 1
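The source does not state how the accuracy percentages were calculated. One plausible definition, shown here purely as an assumption, is the share of reference metrics for which the extracted value matches an analyst-verified value within a small tolerance. The metric names and numbers below are placeholders, not POC data.

```python
import math
from typing import Optional


def accuracy(
    extracted: dict[str, Optional[float]],
    reference: dict[str, float],
    rel_tol: float = 1e-6,
) -> float:
    """Share of reference metrics whose extracted value matches within tolerance."""
    if not reference:
        raise ValueError("reference must not be empty")
    hits = 0
    for name, expected in reference.items():
        value = extracted.get(name)
        # A missing or None value counts as a miss.
        if value is not None and math.isclose(value, expected, rel_tol=rel_tol):
            hits += 1
    return hits / len(reference)


# Placeholder values for illustration only.
reference = {"m1": 100.0, "m2": 200.0, "m3": 300.0, "m4": 400.0, "m5": 500.0}
extracted = {"m1": 100.0, "m2": 200.0, "m3": 300.0, "m4": 400.0, "m5": 505.0}
score = accuracy(extracted, reference)
```

With four of five placeholder metrics matching, `score` is 0.8. Under this definition, a metric the model fails to return is scored the same as a wrong value.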

Additional Experimentation

The team also conducted:

  • Comparative assessments of different methods
  • Stability testing of outputs
  • Exploration of ways to reduce prescriptive logic while maintaining accuracy
  • Investigation of discrepancies between user interface and API query results
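Stability testing of outputs can be approximated by issuing the same query several times and measuring how often the answers agree. The sketch below is one simple way to do that and is an assumption about method, not a description of the team's test harness. The run outputs are made-up values.

```python
from collections import Counter


def stability(runs: list[str]) -> tuple[str, float]:
    """Return the most common answer and the share of runs that agree with it."""
    if not runs:
        raise ValueError("need at least one run")
    answer, count = Counter(runs).most_common(1)[0]
    return answer, count / len(runs)


# Made-up outputs from four repeated runs of the same query.
modal, agreement = stability(["201%", "201%", "210%", "201%"])
```

Here three of the four runs agree, so `agreement` is 0.75. The same function could be applied separately to user interface and API outputs to quantify any discrepancy between the two.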

This comprehensive experimentation strategy provided a nuanced understanding of how different AI approaches perform in financial data extraction. The varying results across accuracy, automation, and flexibility highlight the complexities involved in applying AI to real-world financial analysis tasks. These insights form the basis for future development and refinement of AI-assisted financial data processing systems.

Unlock the future of Financial Data Analysis with Generative AI

Learn how we can help you harness the power of Generative AI for your organisation. Let's redefine what's possible.

  • Explore Our Use Case: Delve into the details of our POC and see how AI can automate the extraction of key financial metrics from unstructured reports.
  • Leverage Our Expertise: With a strong background in machine learning and data analytics, Calimere Point is uniquely positioned to support your AI integration journey.
  • Maximise Efficiency: Learn how our hybrid AI models can streamline your data handling and accelerate research workflows.

Calimere Point, a strong advocate of AI’s capabilities, is uniquely positioned to assist organisations in leveraging Generative AI.

For additional insights, view our Use Case on Automating Counterparty Credit Risk to learn how Calimere Point's data science expertise and deep real-world experience in using data analytics can help your organisation define the art of the possible.

Discover the potential of your Data with Calimere Point

With our extensive background in machine learning and natural language processing, we support organisations through every stage of AI integration. Our Hybrid model is live and currently in use. Get in touch with us to explore how we can assist you in adopting AI.