Generative AI

Generative AI is a buzzword that has gained significant attention in technology discussions. It refers to algorithms that use statistical methods to recognize complex patterns in data (mostly unstructured data, such as text) and to create new content from those patterns.

The term “generative” reflects the technology’s ability to produce new content. It uses probabilities to predict the next word or pixel in a new artifact, based on the word or pixel association patterns it identified in existing data. This process can create a wide range of outputs, from text and images to music.

The buzz around generative AI has triggered a gold rush in the tech sector, riding an economy of promises built on current techno-messianic beliefs.


Neural networks (NN)

The concept of neural networks dates back to the 1950s.

NNs are a class of machine learning models inspired by the structure and function of the human brain. They consist of interconnected layers of nodes, or “neurons,” which process input data and learn to recognize patterns.

Each neuron receives input, applies a transformation (often a weighted sum followed by a non-linear activation function), and passes the output to the next layer.
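As a minimal sketch (the weights, inputs, and bias below are invented for illustration), a single neuron’s computation looks like this:

```python
import math

def neuron(inputs, weights, bias):
    # Weighted sum of the inputs plus a bias term
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    # Non-linear activation: a sigmoid squashes the result into (0, 1)
    return 1 / (1 + math.exp(-z))

# Illustrative values: three inputs feeding one neuron
output = neuron([0.5, -1.2, 3.0], [0.8, 0.1, -0.4], bias=0.2)
print(output)  # this value is passed on to the next layer
```

During training, the weights and bias are adjusted so that the network’s outputs move closer to the desired ones.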


LLMs

Large Language Models (LLMs) are a specialized type of neural network designed to process and generate human language. They utilize a neural network architecture called the transformer, which was introduced by Google researchers in 2017. These models are built with deep neural networks and are pre-trained on vast amounts of data sourced from the internet.
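The transformer’s defining operation is attention, which lets each token weigh every other token in the sequence when building its representation. Below is a toy NumPy sketch of scaled dot-product attention (the dimensions and random inputs are illustrative):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Compare each query with every key, scaled by the key dimension
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    # Softmax turns scores into attention weights that sum to 1 per row
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    # Each output row is a weighted mix of the value vectors
    return weights @ V

# Toy example: 4 tokens, each represented by an 8-dimensional vector
rng = np.random.default_rng(0)
Q = K = V = rng.normal(size=(4, 8))
print(scaled_dot_product_attention(Q, K, V).shape)  # (4, 8)
```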

“Pre-trained” refers to the initial phase of training, in which the model’s parameters (weights) are adjusted to optimize its performance. This involves tuning the model so that neurons in subsequent layers activate according to the likelihood of word sequences derived from a corpus. As a result, LLMs can store and utilize the patterns found in the data they were trained on.
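A toy illustration of the idea (bigram counts standing in for learned weights; the corpus is invented):

```python
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat slept".split()

# Count how often each word follows each other word in the corpus
following = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1

def next_word_probabilities(word):
    counts = following[word]
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

# The likeliest continuation is the one seen most often in training
print(next_word_probabilities("the"))  # 'cat' is twice as likely as 'mat'
```

A real LLM learns billions of weights rather than explicit counts, but the objective is the same: assign high probability to the word that actually follows.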

One of the main advantages of LLMs is their ability to generate diverse content (text, code, images, videos) on demand using natural language. However, they also have significant drawbacks, including the potential for hallucinations, biases inherited from their training data, and a tendency to provide overly general answers. Additionally, they may not cater specifically to individual business needs, and the size of the instructions they can accept is limited by their context window.


RAG

Retrieval-Augmented Generation (RAG) is a method that optimizes, upstream, the data fed to LLMs. It involves generating embeddings, which transform input artifacts into vector representations, and storing these in a vector database. This enables similarity calculations that select and retrieve the most relevant matches from the stored data.

In the mainstream RAG approach, both documents and queries are embedded, the documents are stored in a vector database, and similarity measures are run between the two so that the most significant matches can be retrieved. However, this process requires substantial computational power and expertise, which can be costly and difficult to optimize.
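A minimal sketch of the retrieval step (the embed() function below is a crude stand-in; a real pipeline would call a trained embedding model and query a vector database):

```python
import numpy as np

def embed(text):
    # Placeholder embedding: hash characters into a fixed-size unit vector.
    # A real pipeline would use a trained embedding model here.
    vec = np.zeros(16)
    for i, ch in enumerate(text.lower()):
        vec[i % 16] += ord(ch)
    return vec / np.linalg.norm(vec)

documents = [
    "Trial NCT-001 measures overall survival in lung cancer.",
    "Invoice processing guidelines for the finance team.",
    "Protocol amendments for the oncology cohort.",
]
doc_vectors = np.stack([embed(d) for d in documents])

def retrieve(query, k=2):
    # Cosine similarity reduces to a dot product on unit vectors
    scores = doc_vectors @ embed(query)
    top = np.argsort(scores)[::-1][:k]
    return [documents[i] for i in top]

# The top matches are then passed to the LLM as context
print(retrieve("survival outcomes in oncology trials"))
```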

RAG pipelines face challenges such as the inability to effectively handle proprietary or specialized information, which can lead to incomplete or inaccurate responses. Additionally, constructing efficient data pipelines for retrieval can be complex, particularly in niche domains like healthcare, where the specificity of information is crucial.


Graph RAG

Graph RAG integrates a knowledge graph into the data fed to LLMs, enhancing their ability to process and generate information. By utilizing structured data, it allows for a richer and more nuanced understanding of the relationships and entities within the data. This approach aims to improve the overall quality and relevance of the outputs generated by LLMs.

Graph RAG enhances LLM performance by providing densified inputs, resulting in more structured and accurate responses. It allows for personalization based on user-specific taxonomies and use cases, tailoring answers to individual or organizational needs. Additionally, by grounding LLMs in factual data, Graph RAG helps reduce hallucinations, leading to more reliable information generation.
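A sketch of the grounding step (the graph, entities, and facts below are invented for illustration):

```python
# A toy knowledge graph as (subject, predicate, object) triples
triples = [
    ("Trial-A", "tests_endpoint", "Overall Survival"),
    ("Trial-A", "studies_drug", "Drug-X"),
    ("Drug-X", "has_mechanism", "EGFR inhibition"),
]

def graph_facts(entity):
    # Deterministic lookup: every stored fact mentioning the entity
    return [t for t in triples if entity in (t[0], t[2])]

def build_grounded_prompt(question, entity):
    facts = "\n".join(f"{s} {p} {o}" for s, p, o in graph_facts(entity))
    # Facts from the graph densify the input before it reaches the LLM
    return f"Known facts:\n{facts}\n\nQuestion: {question}"

print(build_grounded_prompt("What does Trial-A measure?", "Trial-A"))
```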

In the context of adaptive RAG, Graph RAG emphasizes the importance of using the right resources for specific tasks, recognizing that LLMs and GPUs can be resource-intensive. When using Graph RAG, we suggest that close-ended queries should not involve LLMs, while open-ended queries benefit from a persona-optimized RAG approach, as sketched below. This tailored strategy ensures that the system operates efficiently and effectively, addressing each type of demand adequately.
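A sketch of that routing logic, reusing graph_facts and build_grounded_prompt from the previous example (the close-ended heuristic and the call_llm stub are assumptions, not a prescribed design):

```python
def call_llm(prompt):
    # Placeholder: a real system would call a hosted or local LLM here
    return f"[LLM response to: {prompt[:40]}...]"

def is_close_ended(question):
    # Crude heuristic for illustration: factual lookups vs. open questions
    return question.lower().startswith(("does", "is", "which", "how many"))

def answer(question, entity):
    if is_close_ended(question):
        # Close-ended: resolve directly from the knowledge graph, no LLM call
        return graph_facts(entity)
    # Open-ended: ground the LLM with graph facts before generating
    return call_llm(build_grounded_prompt(question, entity))

print(answer("Which endpoint does Trial-A test?", "Trial-A"))
print(answer("Summarize what we know about Drug-X.", "Drug-X"))
```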

KG and LLM complementarity

LLMs (as part of RAG pipelines) are stochastic, non-deterministic systems that analyze training data by breaking it into chunks to identify patterns for generating new outputs. For example, if trained on clinical trials, an LLM can predict a trial design (epochs, events, cohorts, clinical assessments, milestones…), even if such a specific trial doesn’t exist, but it may produce nonsensical results if the training data lacks designs tailored to specific drug classes, mechanisms of action, diseases, or therapeutic areas. This unpredictability makes LLMs unsuitable as databases, as they generate predictions rather than factual representations of data.

In contrast, symbolic deterministic systems utilize symbolic logic to derive outputs from established rules and datasets. For instance, a knowledge graph can accurately document and identify whether a clinical trial design is fit-for-purpose to test Progression-Free Survival (PFS) or Overall Survival (OS) for any drug, providing factual information rather than an imaginary representation. This reliability allows symbolic systems to confirm the existence or absence of specific entities within their domain of knowledge.
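Continuing the toy graph from above, the symbolic check is a plain membership test: the answer is True or False based solely on stored facts, never a guess:

```python
def trial_tests_endpoint(trial, endpoint):
    # Symbolic, deterministic answer: the triple is either stored or it isn't
    return (trial, "tests_endpoint", endpoint) in triples

print(trial_tests_endpoint("Trial-A", "Overall Survival"))           # True
print(trial_tests_endpoint("Trial-A", "Progression-Free Survival"))  # False
```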

Graph RAG is a movement towards unifying stochastic and symbolic systems, despite differing opinions on their effectiveness. Stochastic systems generate imaginative possibilities but lack reliability for factual answers, while symbolic systems offer predictable and precise responses based on known rules. By grounding stochastic outputs in the structure of symbolic systems, it becomes possible to establish parameters that ensure the accuracy and consistency of LLM-generated information.