Data Analytics Driven Automation
Revolutionising Financial Data Analysis
Investment Bank Equity Research
Harnessing Generative AI for Automated Equity Research
Investment Bank (IB) Equity Research partnered with Calimere Point to conduct a proof of concept (POC) exercise. The goal was to assess whether Generative AI models could automate the extraction of key financial metrics from unstructured company filings, specifically the quarterly reports of major insurance companies, a task that traditionally requires hours of manual work. The extracted metrics were to be transformed into structured Excel output for use in IB's research models.
While not without challenges, the results were promising: the exercise demonstrated that AI could indeed automate this process, potentially revolutionising the speed and efficiency of equity research. By testing the limits of current AI technology in handling real-world financial data complexity, this project has implications that could reshape how financial analysis is conducted in the future.
AI in Equity Research: Navigating complexity for greater efficiency
The main challenge in applying Generative AI to equity research is the variability in financial report formats and structures.
This variability ranges from semi-structured tabular data to highly unstructured information in charts and narratives. The financial industry is actively exploring AI-driven solutions to automate data extraction, assist in analysis, and improve overall efficiency in equity research. Approaches include pure AI models, hybrid systems combining traditional analytics with AI, and sequential AI processes. These efforts aim to standardise data handling and accelerate research workflows.
It's important to note that while there's a clear trend towards automation and AI-assisted research, the complexities highlighted in this Use Case suggest that human expertise remains crucial in interpreting and validating AI-generated insights in equity research.
Background
Scope
The POC focused on a carefully selected group of 5 insurance companies, each representing different geographical markets and reporting structures:
Legal and General (UK): A leading provider of life insurance, pensions, and investment management services.
Sampo Group (Finland and Nordics): A major player in the Nordic and Baltic insurance markets.
Hannover Re (Germany): One of the world's largest reinsurance groups.
Just Group (UK): A specialist in retirement income products and services.
Allianz (Germany): A global leader in insurance and asset management.
This diverse selection allowed the team to test the AI models against a variety of reporting styles and regulatory frameworks.
Project Structure
The project leveraged cutting-edge Generative AI technologies, complemented by traditional data analytics approaches. This hybrid approach aimed to combine the strengths of both paradigms.
Two members of Calimere Point's data science team were dedicated to the project, bringing specialised expertise in AI and financial data analysis.
The development phase was time-bounded to 2.5 weeks, necessitating rapid iteration and focused problem-solving.
The entire POC was conducted on Calimere Point's proprietary data analytics architecture (CPRA Cloud), providing a secure and powerful environment for AI model training and testing.
Remarkably, this was executed as a cost-effective POC exercise, valued at $25k, demonstrating the potential for high-impact results with a modest financial investment.
Industry: Investment Banking
Key Challenges in Equity Research
1. Variability in Report Formats and Structures
The primary challenge stemmed from the high degree of variability in how different companies present their financial information. This variability manifested in two main forms:
Inter-company Variability:
Each of the five insurance companies in the study had its own unique way of structuring and presenting financial data. This inconsistency made it difficult to create a one-size-fits-all AI solution.
Intra-company Variability:
Even within the same company, the format and structure of reports could change from one quarter to another. This temporal inconsistency added another layer of complexity to the AI's task.
2. Types of Data Encountered
Semi-structured Data:
This category primarily consisted of tabular data, often presented in PDF format.
Challenges:
- Extracting data accurately from PDF tables
- Handling slight variations in table structures across reports
Opportunities:
- More amenable to programmatic approaches
- Higher potential for accurate data extraction
- Improved accuracy when data structures remain consistent across periods
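To illustrate why tabular data is more amenable to programmatic approaches, here is a minimal sketch of mapping already-extracted table rows onto a fixed set of metrics. It assumes the PDF table has been converted to (label, value) text pairs by some upstream step. The alias list, function names, and sample rows are hypothetical and are not taken from the POC or from any filing.

```python
import re

# Hypothetical alias map: labels are illustrative, not taken from any real report.
METRIC_ALIASES = {
    "solvency_ii_ratio": ["solvency ii ratio", "solvency ii coverage ratio"],
    "operating_profit": ["operating profit", "operating result"],
}

def normalise_label(label: str) -> str:
    """Lower-case, replace punctuation and footnote marks with spaces, collapse whitespace."""
    label = re.sub(r"[^a-z0-9 ]+", " ", label.lower())
    return re.sub(r"\s+", " ", label).strip()

def parse_number(text: str) -> float:
    """Parse '1,234.5', '212%' or '(120)' (accounting-style negative) into a float."""
    cleaned = text.strip().replace(",", "").rstrip("%")
    negative = cleaned.startswith("(") and cleaned.endswith(")")
    value = float(cleaned.strip("()"))
    return -value if negative else value

def map_rows(rows):
    """Return (metrics found, labels that matched no alias)."""
    lookup = {
        alias: metric
        for metric, aliases in METRIC_ALIASES.items()
        for alias in aliases
    }
    found, unmatched = {}, []
    for label, value in rows:
        metric = lookup.get(normalise_label(label))
        if metric is None:
            unmatched.append(label)  # surfaced for manual review, not guessed
        else:
            found[metric] = parse_number(value)
    return found, unmatched

# Made-up sample rows showing slight label variation.
sample = [("Solvency II Ratio*", "212%"), ("Operating result", "1,234.5"), ("Net flows", "(120)")]
metrics, leftovers = map_rows(sample)
```

A lookup like this works only while labels stay within the known aliases, which is consistent with the point that accuracy improves when data structures remain stable across periods. Unmatched labels are returned rather than dropped so that a change in report format is visible.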
Unstructured Data:
This category included non-tabular data in PDFs, narrative text, and information embedded in charts or graphics.
Challenges:
- Extracting precise numerical data from charts or graphics
- Interpreting contextual information from narrative text
- Dealing with inconsistent presentation of key metrics
Example:
Sampo's Solvency II ratio was only available in PowerPoint charts in their quarterly reporting, making it particularly challenging to extract accurately.
3. Balancing Accuracy and Automation
Another key challenge was finding the right balance between the level of automation and the accuracy of extracted data:
- Highly prescriptive AI models could achieve good accuracy but required significant manual setup and were vulnerable to changes in report formats.
- More flexible, general-purpose AI models offered greater adaptability but sometimes sacrificed accuracy or required more manual verification.
4. Handling Edge Cases and Exceptions
A related challenge was maintaining consistency across the different AI approaches.
Insights and Discoveries
Effectiveness of AI Assistants
Key Finding: AI Assistants demonstrated the capability to generate accurate and automated outputs for the 5 insurance companies in the POC.
This success suggests that AI Assistants have significant potential in automating routine data extraction tasks in equity research.
However, they are more programmatic in nature, so their success is contingent on carefully defined underlying logic and a high degree of consistency in input data formats.
Potential of Sequential AI Approaches
Key Finding: Experiments with sequential AI approaches showed promise in balancing flexibility and accuracy.
By combining different AI models in sequence, the team found potential pathways to reduce reliance on prescriptive logic while maintaining accuracy. This approach could offer a middle ground between highly structured and purely generative methods.
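The idea of chaining stages can be sketched as follows. This is a structural illustration only: the source does not describe the actual models or prompts, so each model call is replaced here by a simple deterministic stand-in. All function names, the state layout, and the sample sentences are assumptions made for the example.

```python
import re
from typing import Callable, Dict, List

Step = Callable[[Dict], Dict]

def locate_passage(state: Dict) -> Dict:
    """Stage 1 stand-in: a model would select the passage that mentions the metric."""
    hits = [p for p in state["passages"] if state["metric"].lower() in p.lower()]
    return {**state, "passage": hits[0] if hits else None}

def extract_value(state: Dict) -> Dict:
    """Stage 2 stand-in: a model would read the figure out of the chosen passage."""
    match = re.search(r"(\d+(?:\.\d+)?)\s*%", state["passage"] or "")
    return {**state, "value": float(match.group(1)) if match else None}

def validate(state: Dict) -> Dict:
    """Stage 3: a deterministic check that flags implausible values for human review."""
    low, high = state["plausible_range"]
    ok = state["value"] is not None and low <= state["value"] <= high
    return {**state, "needs_review": not ok}

def run_sequence(steps: List[Step], state: Dict) -> Dict:
    """Feed the output of each stage into the next."""
    for step in steps:
        state = step(state)
    return state

# Made-up input text, not taken from any filing.
result = run_sequence(
    [locate_passage, extract_value, validate],
    {
        "metric": "Solvency II ratio",
        "plausible_range": (100.0, 300.0),
        "passages": [
            "Premiums grew 4% year on year.",
            "The Solvency II ratio stood at 185% at quarter end.",
        ],
    },
)
```

Splitting the task this way keeps each stage narrow. The final validation step stays deterministic, which is one way to retain a check on accuracy while the earlier stages are left flexible.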
API vs. User Interface Discrepancies
Key Finding: The team observed variance in behaviour between user interface queries and API queries.
This discovery highlights the importance of consistent performance across different interaction modes with AI systems. It also suggests that the method of querying AI models can significantly impact results, a crucial consideration for real-world applications.
Data Format Variability Impact
Key Finding: The high variability in report formats and data structures significantly influenced AI performance.
This insight underscores the need for robust, adaptable AI systems capable of handling diverse financial reporting styles. It also highlights a potential area for standardisation in financial reporting to facilitate AI-driven analysis.
Experimentation and Approach
Summary of Approaches
1. Pure Generative AI
2. AI Assistants Models
3. Sequential LLM / Generative AI Models
Overall Conclusion and Analysis
Summary
1. Pure Generative AI
Accuracy: Medium (80%)
Strengths: No prescriptive coding required, apart from query refinement; LLM accuracy is increasing rapidly
Weaknesses: Fully manual data provisioning process; medium current level of accuracy; does not accept all document formats (PDF is accepted, Excel is not)
2. AI Assistants Models
Accuracy: High (100%)
Strengths: Highly accurate outputs; High degree of automation
Weaknesses: Prescriptive underlying code infrastructure; Vulnerable to input data format changes
3. Sequential LLM / Generative AI Models
Accuracy: Low (40%)
Strengths: Higher degree of process automation than via pure generative AI approaches
Weaknesses: Unexplained drop in accuracy when querying via the API compared with the user interface used in Approach 1
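The source does not state how the accuracy percentages above were measured. One plausible reading is field-level accuracy: the share of extracted values that match an analyst-prepared reference within a small tolerance. The sketch below shows that calculation under this assumption. The function name, tolerance, and sample values are hypothetical and do not reproduce the POC's data.

```python
def field_accuracy(extracted: dict, reference: dict, rel_tol: float = 0.005) -> float:
    """Share of reference fields whose extracted value is within rel_tol of the reference."""
    if not reference:
        raise ValueError("reference must contain at least one field")
    correct = 0
    for key, expected in reference.items():
        got = extracted.get(key)  # a missing field counts as incorrect
        if got is not None and abs(got - expected) <= rel_tol * abs(expected):
            correct += 1
    return correct / len(reference)

# Made-up values: five reference fields, one of which was extracted incorrectly.
reference = {"metric_a": 212.0, "metric_b": 1234.5, "metric_c": 95.1, "metric_d": 40.0, "metric_e": 7.3}
extracted = {"metric_a": 212.0, "metric_b": 1234.5, "metric_c": 59.1, "metric_d": 40.0, "metric_e": 7.3}
score = field_accuracy(extracted, reference)  # 4 of 5 fields match
```

Counting a missing field as an error keeps the measure conservative: an approach cannot raise its score by declining to answer.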
Additional Experimentation
The team also conducted:
- Comparative assessments of different methods
- Stability testing of outputs
- Exploration of ways to reduce prescriptive logic while maintaining accuracy
- Investigation of discrepancies between user interface and API query results
This comprehensive experimentation strategy provided a nuanced understanding of how different AI approaches perform in financial data extraction. The varying results across accuracy, automation, and flexibility highlight the complexities involved in applying AI to real-world financial analysis tasks. These insights form the basis for future development and refinement of AI-assisted financial data processing systems.
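Two of the checks listed above, stability testing and the comparison of user interface and API results, can be sketched as simple comparisons over extracted outputs. This assumes each run produces a dictionary of metric names to values. The function names and sample values are illustrative and are not the team's actual tooling.

```python
from collections import Counter

def stability(runs):
    """For each field, the share of runs that returned the most common value."""
    fields = set().union(*(r.keys() for r in runs))
    report = {}
    for field in sorted(fields):
        values = [r.get(field) for r in runs]  # a missing field is recorded as None
        _, count = Counter(values).most_common(1)[0]
        report[field] = count / len(runs)
    return report

def diff_outputs(a, b):
    """Fields whose values differ between two channels, e.g. user interface vs API."""
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }

# Made-up outputs from three repeated runs of the same query.
runs = [
    {"metric_a": 212.0, "metric_b": 40.0},
    {"metric_a": 212.0, "metric_b": 41.5},
    {"metric_a": 212.0, "metric_b": 40.0},
]
report = stability(runs)
mismatches = diff_outputs(runs[0], runs[1])
```

A field with a stability score below 1.0, or any entry in the mismatch report, marks a value that would need manual verification before it enters a research model.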
Unlock the future of Financial Data Analysis with Generative AI
Learn how we can help you harness the power of Generative AI for your organisation. Let's redefine what's possible.
- Explore Our Use Case: Delve into the details of our POC and see how AI can automate the extraction of key financial metrics from unstructured reports.
- Leverage Our Expertise: With a strong background in machine learning and data analytics, Calimere Point is uniquely positioned to support your AI integration journey.
- Maximise Efficiency: Learn how our hybrid AI models can streamline your data handling and accelerate research workflows.
Calimere Point, a strong advocate of AI’s capabilities, is uniquely positioned to assist organisations in leveraging Generative AI.
For additional insights, view our Use Case on Automating Counterparty Credit Risk to learn how Calimere Point's data science expertise and deep real-world experience in using data analytics can help your organisation define the art of the possible.