M4. Use Case: Retrieval-Augmented Generation system for the innovation domain

This module, related to OB5, aims to design and develop a specialised Retrieval-Augmented Generation (RAG) system for retrieving and generating information on innovation. The system will integrate the advances in QNLP tasks developed in M1, M2, and M3. The use case will serve as an experimental platform and testing ground for evaluating, both intrinsically and extrinsically, the impact and effectiveness of these NLP developments, ensuring a practical and results-oriented approach. RAG is an AI technique that combines information retrieval with text generation (Fan et al., 2024). The RAG process comprises the following three stages:
1) Information Retrieval: the system retrieves information relevant to the query from external sources.
2) Enrichment: the retrieved results are used as additional context to guide the language model. This allows the model to produce answers that are not only linguistically consistent but also supported by relevant data or information.
3) Response Generation: once the system has access to the retrieved documents, a language model (such as GPT or similar) uses this information to produce a response tailored to the query. The result is a more accurate and contextualised output than that of generative models that do not incorporate search.
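The three stages above can be illustrated with a minimal Python sketch. It is not the system proposed in this module: the corpus, the function names (retrieve, enrich, generate) and the bag-of-words cosine similarity are assumptions chosen only to keep the example self-contained, and the generation step is a placeholder where a real language model would be called.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus; a real system would index innovation-domain documents.
DOCUMENTS = [
    "Open innovation lets firms use external ideas alongside internal research.",
    "Patent filings are a common indicator of innovation activity in a region.",
    "The recipe calls for flour, sugar, and two eggs.",
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=2):
    """Stage 1: return up to k documents with non-zero similarity to the query."""
    query_vec = Counter(tokenize(query))
    scored = [(cosine(query_vec, Counter(tokenize(d))), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:k] if score > 0]


def enrich(query, passages):
    """Stage 2: place the retrieved passages in the prompt as additional context."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


def generate(prompt):
    """Stage 3: placeholder only; a real system would send the prompt to an LLM."""
    return f"[model output for a prompt of {len(prompt)} characters]"


def answer(query, documents=DOCUMENTS, k=2):
    """Run the three stages in order and return the response with its sources."""
    passages = retrieve(query, documents, k)
    prompt = enrich(query, passages)
    return generate(prompt), passages


if __name__ == "__main__":
    response, sources = answer("How do patent filings relate to innovation?")
    print(response)
    for source in sources:
        print("source:", source)
```

In this sketch the retrieval step is the natural place to swap in alternative components, for example a different similarity function or document representation, while the enrichment and generation steps stay unchanged.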
NLP is therefore essential within a RAG system: it ensures that the system retrieves and processes external information and generates relevant, accurate responses from it. Some of these tasks may be suited to QNLP processing where they could benefit from the ability of quantum systems to model ambiguous meanings, complex contextual relationships and overlapping meanings. This is in line with the tasks and applications to be studied in M1, M2, and M3, which include lexical and semantic disambiguation, semantic representation (contextual embedding and semantic composition), anaphora resolution, and metaphor detection and interpretation.

Task 4.1. Definition, development, and evaluation of a RAG system for the innovation domain.

The objective of this task is to define, develop and evaluate a RAG system for the innovation domain that uses and integrates the results obtained in M1, M2 and M3. In addition, for the final RAG stage (response generation), the impact of these results on the LLM's generation of summaries and simplified text will be evaluated, and classical and quantum-based approaches will be compared.

Milestone: Proposal and development of a RAG system for the innovation domain.