Retrieval-augmented generation (RAG) has revolutionized open-domain question answering, enabling systems to produce human-like responses to a wide range of queries. At the heart of RAG is a retrieval module that scans a large corpus to find contextually relevant passages, which are then processed by a neural generative module – often a pre-trained language model like GPT-3 – to formulate a final response.
Although this approach has proven to be very effective, it is not without limitations.
One of its most critical components, embedding-based vector search, has inherent constraints that can hinder the system’s ability to reason in a nuanced way. This is especially evident when questions require complex, multi-hop reasoning across multiple documents.
Vector search refers to searching for information using vector representations of data. This involves two key steps:
1. Encoding data into vectors
First, the data to be searched is encoded as numerical vector representations. For textual data such as passages or documents, this is done using embedding models such as BERT or RoBERTa. These models convert text into dense vectors of continuous numbers that represent semantic meaning. Images, audio, and other formats can also be encoded into vectors using appropriate deep learning models.
2. Search using vector similarity
Once the data is encoded into vectors, searching involves finding the vectors most similar to the vector representation of the search query. This relies on similarity or distance measures, such as cosine similarity, to quantify how close two vectors are and to rank the results. Vectors with the smallest distance (highest similarity) are returned as the most relevant search results.
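The two steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation: the three-dimensional vectors are hand-written stand-ins for real embeddings (an actual model such as BERT would produce hundreds of dimensions), and the names `index`, `search`, and the document IDs are hypothetical. The search is a brute-force scan, whereas real systems typically use an approximate nearest-neighbor index.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here: a zero vector matches nothing
    return dot / (norm_a * norm_b)

def search(query_vector, index, top_k=2):
    """Rank every stored vector against the query and keep the top_k."""
    scored = [
        (doc_id, cosine_similarity(query_vector, vector))
        for doc_id, vector in index.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Step 1 (assumed already done): documents encoded into vectors.
# These values are invented for illustration, not model output.
index = {
    "doc_cats": [0.9, 0.1, 0.0],
    "doc_dogs": [0.8, 0.3, 0.1],
    "doc_finance": [0.0, 0.2, 0.9],
}

# Step 2: encode the query the same way, then rank by similarity.
query_vector = [0.9, 0.1, 0.1]
results = search(query_vector, index)
for doc_id, score in results:
    print(f"{doc_id}: {score:.3f}")
```

With these toy vectors, `doc_cats` ranks first and `doc_dogs` second, while `doc_finance` falls outside the top two because its vector points in a very different direction from the query.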
The main advantage of vector search is the ability to search by semantic similarity, not just literal keyword matches. Vector representations capture conceptual meaning, so results that are relevant but worded differently from the query can still be identified. This can yield higher search quality than traditional keyword matching.
However, transforming data into vectors and searching in high-dimensional semantic space also have limitations. Balancing the tradeoffs of vector search is an active area of research.
In this article, we will analyze the limitations of vector search, exploring why it struggles…