Lingua Custodia’s Generative AI Multi-Document Analyser


Multi-Document and Multi-Lingual Data Extraction 

Lingua Custodia’s Multi-Document Analyser is now in production and can be accessed via the secure platform or through the API.
This means that a group of documents can be uploaded at the same time, with key information extracted within seconds. As this technology is also integrated with our machine translation engines, it is possible to load your documents in different languages and extract your data using a different language prompt.

As an example, French, Chinese and Spanish documents can be uploaded and you can ask your queries in English or any other language which is supported by the platform.

You can upload up to 15 PDF documents together. The technology has been optimised to read and extract information in tables and to respond swiftly to queries.

What are the use cases for the Multi-Document Analyser?

The use cases for the Multi-Document Analyser include financial product and regulation queries, client support queries, requests for proposals and due diligence questionnaires. For example, you can upload a group of Key Investment Documents and extract the risk performance indicators and other details.

Lingua Custodia focuses on innovation

Lingua Custodia is very proud of this highly innovative technology, which is part of a suite of financial document processing services it provides. The company was set up by two finance professionals in 2011, originally to provide specialised machine translation services, having identified a clear use case for this service. Since 2020, new technologies, such as speech to text and data extraction, have been added progressively.

Its aim is to be the market leader in financial document processing for financial institutions. Lingua Custodia is distinguished by its focus on data security, as it recognises that this is a priority for its clients. Data is stored on bare metal servers in Europe.


Why AI will not be replacing humans anytime soon!


The last 18 months have seen dramatic developments in the arena of Artificial Intelligence (AI). The emergence of large language models such as ChatGPT, which can analyse, respond to and generate text, was a major event. This then led to the rapid emergence of other models, focused on sentiment analysis and on image and voice recognition.

This has understandably led to concerns about the impact of these innovations on the human workforce. Will AI innovations make humans redundant?!

At Lingua Custodia, we feel strongly that the response is no. These technologies will boost productivity and create new job opportunities.  AI is to be embraced rather than feared!

AI and humans learn differently

Large language models use mathematical formulas to process and identify patterns in a huge volume of data. A query, or prompt, is then converted into a text output based on those patterns.

These models learn by correlation: for example, they can link two variables, such as studying and grades. A human brain, by contrast, learns by causation, understanding that a change in one variable can affect the other. If you study, you might get better grades, whereas if you do not study, your grades might suffer.

So, AI and human brains do not learn in the same way. AI may well be able to process huge volumes of data faster than a human brain, but the human brain can identify causation and add layers of creative thought, consciousness and ethics.
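To make the idea of learning by correlation concrete, here is a minimal Python sketch that computes a Pearson correlation coefficient between study hours and grades. The data and the function name are invented for illustration and are not part of any Lingua Custodia technology. A coefficient close to 1 shows only that the two variables move together; it does not, on its own, show that studying causes better grades.

```python
from math import sqrt


def pearson_correlation(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances (unnormalised; the factors cancel in the ratio).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: weekly study hours and exam grades for five students.
study_hours = [2, 4, 6, 8, 10]
grades = [52, 60, 65, 74, 80]

r = pearson_correlation(study_hours, grades)
print(f"correlation between study hours and grades: {r:.2f}")
```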

AI should be used to boost productivity

Future job roles will use AI as a tool to boost productivity. So, an engineer might use AI to check their code for potential errors. In the financial industry, which is championing AI, it can be used to identify risk, rapidly analyse investment opportunities and optimise client services through the use of chatbots.

The Lingua Custodia platform, which is specialised for the financial services sector, contains several AI technologies which are all focused on adding value for our clients. Our secure platform allows the rapid communication, extraction and analysis of data in different languages. For example, our machine translation technology translates documents and text within seconds, while our Document Analyser rapidly extracts key data from large PDF documents.

Lingua Custodia features on the Wavestone radar for French Generative AI Startups 2023

Lingua Custodia was delighted to feature on the Wavestone Radar for French Generative AI startups in 2023.

What is the Wavestone Radar?

Wavestone is an international consulting company, which has a startup accelerator focusing on emerging trends in the startup ecosystem. It shares its market insights through the publication of Wavestone Startup Radars.

What is Generative AI?

Generative AI is a type of model trained to spot patterns in data, which enables it to then generate new content based on those patterns. So, for example, in the finance industry, Generative AI models can be used to analyse trading and investment data, identifying patterns to generate trading opportunities.

Lingua Custodia

Lingua Custodia is included on the Generative AI radar within the ‘Gestion de la Connaissance’ category, because of its focus on financial document processing and its range of technologies for data extraction and analysis.

It is a huge achievement to feature on this radar.  Lingua Custodia was initially created in 2011 by finance professionals to offer specialised machine translation.

Leveraging its state-of-the-art NLP expertise, the company now offers a growing range of financial document processing solutions in addition to its initial machine translation technology.

Lingua Custodia’s document analyser uses Generative AI applied to a large language model (LLM) to search for specific information in confidential documents, extracting and then summarising the information.

The key advantage of our document analyser is the rapid extraction of the relevant data in response to a series of queries. The document analyser is multilingual and available in 10 languages. This allows you to query a document in a different language to the one it is written in. The use cases for the document analyser include requests for proposals and regulatory, compliance, research and security documents.

The source references are also included which helps with verifying and checking the accuracy of the responses.  The ability to query several documents at the same time will be developed and live on the platform by the end of Q1 2023.

The EU AI Act – Supporting innovation and building trust across the financial services industry.

The EU AI Act was agreed by the European Parliament in December 2023 and the final text is likely to be published in early 2024.

This Act will apply to all industries across the European Union and is aimed at continuing to foster innovation while ensuring the protection of individuals’ rights, through stricter regulation of high-risk AI technologies and the promotion of transparency and trust across AI technologies.

It recognises that innovation is essential for competitiveness, so this Act also includes the creation of regulatory sandboxes to facilitate the development, testing, and validation of innovative AI systems under strict regulatory oversight.

The Act establishes rules and obligations for AI technology, based on the potential risk to the user and society. Four risk levels are defined (unacceptable, high, limited and minimal), with stricter obligations for technologies deemed to be at higher risk.

Technologies with an unacceptable risk are banned, such as systems aimed at exploiting vulnerabilities or manipulating behaviour. Technologies which are deemed high risk, with the potential to impact on fundamental rights, democracy and health and safety, will be required to carry out extensive governance activities to ensure they are compliant with the Act.

AI systems which are categorised as limited risk will need to be fully transparent. This means, for example, that users should be aware of whether they are interacting with an AI chatbot or a human.

What does the EU AI Act mean for financial services?

Many of the AI technologies used across the financial services industry fall into the high-risk category, such as trading algorithms, risk analysis and credit scoring.  The onus will be on financial institutions to demonstrate that their models can be understood and that their underlying data is unbiased and of good quality. 

Matching the requirements of the EU AI Act has the advantage of winning consumer trust, as consumers are becoming very aware of the importance of ethical AI and the need to respect their rights and privacy.

While the EU AI Act might take two years before coming into force, financial institutions should act now to analyse their AI technologies and make any necessary changes to comply with the required obligations.

Lingua Custodia’s Generative AI Document Analyser

Our latest generative AI financial document processing technology, the Document Analyser, allows the rapid extraction of key data from large PDF documents such as the EU AI Act. Like our other technologies, it is fully secure and multilingual, and it provides source references!

You can test it here!

Generative AI for finance

Generative AI is a powerful innovation and one which is being rapidly adopted by the financial industry to improve productivity and enhance workflows. Machine learning and AI have been used over the last decade within the financial services industry to automate and enhance previously manual processes such as fraud detection and compliance. However, generative AI for finance can add even more value and be used across a number of key areas, providing a clear competitive advantage.

What is Generative AI?

Generative AI is a type of AI which can generate content.  It can combine data from large language models (LLMs) and algorithms to generate content based on patterns it observes in other content.

Its ability to generate content takes it beyond traditional machine learning. Machine learning focuses on recognising patterns and making predictions and decisions based on this data. Generative AI is able to generate new content based on the data it was trained on, so creating new content which follows the underlying data patterns.

What are the use cases for Generative AI for finance?

There are several use cases for Generative AI for finance. 


Fraud Detection

Generative AI can be used to spot patterns and identify anomalies, such as transactions which do not follow typical patterns, which can then be flagged for further investigation. This helps to improve the productivity and efficiency of the fraud team, who can focus on a specific subset of transactions.
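As a rough illustration of what flagging transactions that do not follow typical patterns can look like, the sketch below uses a simple statistical rule (a modified z-score based on the median) rather than a generative model. The function name, the threshold and the transaction amounts are all assumptions made for this example; a production fraud system would rely on far richer data and models.

```python
from statistics import median


def flag_unusual_transactions(amounts, threshold=3.5):
    """Return the indices of amounts whose modified z-score exceeds the threshold."""
    med = median(amounts)
    # Median absolute deviation: a spread measure that outliers barely distort.
    mad = median([abs(a - med) for a in amounts])
    if mad == 0:
        return []  # no measurable spread, so nothing can be ranked as unusual
    return [
        i for i, a in enumerate(amounts)
        if 0.6745 * abs(a - med) / mad > threshold
    ]


# Hypothetical card payments; one amount is far outside the usual range.
amounts = [42.0, 38.5, 45.2, 40.1, 39.9, 41.3, 980.0, 43.7]

flagged = flag_unusual_transactions(amounts)
print("flagged for review:", [(i, amounts[i]) for i in flagged])
```

Only the flagged subset would then be passed to the fraud team, which is where the productivity gain described above comes from.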

Improve Client Satisfaction

Generative AI can be used in chatbots to provide human-like interactions and to help answer customer queries, for example queries relating to balances and transactions. The customer experience can also be personalised by analysing client data to provide recommendations for specific products.

Data Analysis and Extraction

As many financial institutions have to deal with large volumes of data, which is time consuming and laborious, Generative AI can be used to rapidly summarise large documents, extracting the key information and providing summaries for further review.

Lingua Custodia’s document analyser was created to meet the demands of its clients for a technology able to rapidly summarise and extract key data from large documents. It can also be used in conjunction with its specialised machine translation engines, to extract the data in different languages. This allows its clients to query large documents in a different language to the one the document is written in, which is invaluable for international financial institutions.

Innovation in Finance – Fintech Campus Corporate with ING

Our Managing Director Frederic Moioli was delighted to take part in this campus at the LHoFT (Luxembourg House of Financial Technology), in partnership with ING Luxembourg. The event was very focused on driving innovation in finance, with lively and interesting discussions on key trends for the financial sector. Lingua Custodia, with its extensive expertise in financial document processing, has always been aware of the importance of innovation in finance to optimise workflows and add value for its financial clients.

The day consisted of a series of key discussions on current innovations in finance and finished with an interactive workshop to provide attendees with an opportunity to explore innovative technologies.

Lingua Custodia, represented by Frederic, participated in a session on the future of finance through the application of Generative AI in financial services.  Generative AI is currently being used to automate financial tasks, rapidly analyse and summarise key data as well as improve client satisfaction through the creation of chatbots.

Frederic was very happy to share his knowledge and experience with the attendees, and highlight how Lingua Custodia has been driving innovation in finance since its inception. 

Size no longer matters for large language models!

Lingua Custodia’s new compact open source language model Fin-Pythia-1.4B outperforms larger language models.

The French Fintech company Lingua Custodia, a specialist in Natural Language Processing (NLP) applied to Finance since 2011, releases its first open source language model on the Hugging Face Hub specifically trained for sentiment analysis of financial text.

Fin-Pythia-1.4B is a language model that’s been fine-tuned on financial documents and instructions. It can understand complex financial jargon and terminology. It is compact in size, which makes it fast to run without compromising the quality of the output.

Lingua Custodia’s open source language model is extremely accurate in analysing financial sentiment and outperforms well-known models like GPT-4 and BloombergGPT.

Raheel Qader, head of Lingua Custodia’s Research and Development lab, highlights: “To produce accurate language models, the essential bases are data, expertise and experience. Following 4 months of research, we were delighted to find that our open source language model outperformed both GPT-4 and BloombergGPT in various financial NLP tasks. This demonstrates clearly that size is not necessary to create powerful models, and I am delighted that the research team here at Lingua Custodia was able to bring the model to the open source community so rapidly. This places Lingua Custodia firmly at the forefront of generative AI technologies.”

The Fin-Pythia-1.4B model card is available on the Hugging Face Hub.

Lingua Custodia will be releasing a series of other large language models targeting various financial document processing use cases as part of its research and development strategy for 2024.

New Study Reveals AI’s Carbon Footprint is Much Lower Than Humans’

A new study published on the Social Science Research Network (SSRN) found that AI systems have a much lower carbon footprint than humans for tasks such as writing and creating illustrations.

The researchers compared the emissions generated by AI systems like ChatGPT and DALL-E to the emissions generated by an average human completing the same tasks. They found that the AI systems emitted 130 to 1500 times less CO2 per page of text and 310 to 2900 times less per image. Researchers calculated the carbon footprint of the AI systems by looking at the emissions from training the models and from generating each individual response. Even when factoring in energy used for model training, the AI systems were far more efficient than humans. The emissions generated by a laptop or desktop computer used by a human were also greater than those of the AI systems.

The researchers note that AI could play an important role in reducing emissions for certain activities. However, they stress that AI does not substitute for all human tasks and that factors like job displacement must be considered. They suggest collaboration between AI and humans as the best approach in many fields.

While AI emissions may grow as the technology advances, this study highlights an important benefit of AI systems as they stand today. It adds a new perspective to concerns about the carbon footprint of AI by comparing it directly to human activities. Even with current technology, AI enables the completion of common tasks at much lower emissions than humans can achieve.

Lingua Custodia’s Generative AI Document Analyser

Lingua Custodia’s Document Analyser tool for rapid data extraction!

Our new data extraction technology, the Document Analyser, is easy to use and allows swift extraction of key information from large documents. It was developed in-house by our machine language experts and is fully secure, like all our other financial document processing technologies. The use cases include compliance and due diligence queries, calls for tender, and fund and research queries.

You simply upload your document and begin to type in your questions. It is available in a multilingual format, which means that you can load the document in one language and ask questions in a different language.

Do not hesitate to try it! You can sign up for 14-day test access, which allows you to try our AI translation services and the Document Analyser.

Job opportunity: 6-month internship - Training Large Language Models on Financial Conversational Data

6-month internship opportunity working for the Lingua Custodia Lab!

Would you like to work in an innovative and supportive environment at the forefront of AI developments? The objective of this 6-month internship working for the Lingua Custodia Lab (Research and Development team) is to study LLM fine-tuning in the context of financial conversational data (or instructions). You will be working on supervised instruction fine-tuning and reinforcement learning from human feedback. Theoretical and practical knowledge of all of these topics is essential to carry out the internship.

Internship organisation

Internship supervised by the Lingua Custodia Lab (R&D team)

Internship of 6 months in Paris (fully on-site)

Full time, 35 hours weekly. Meal vouchers (Ticket Restaurant) and 50% reimbursement of the Navigo pass

Starting date: Early 2024

Internship objectives

Study the state of the art in Large Language Model (LLM) fine-tuning.

Compare different models through systematic experiments.

Collect large amounts of conversational data (from public sources)

Fine-tune LLMs on conversational chat

Required qualifications and experience

Master’s Degree student in Computer Science, Machine Learning or Natural Language Processing.

Experience with LLMs (class or personal projects) (this is a must)

Experience with the Hugging Face library (this is a must)

Proficient in English and/or French.

To apply:

Please send your CV to :  Raheel Qader –