2 posts tagged with "llms"

· 20 min read
Semih Salihoğlu

In my previous post, I gave an overview of question answering (Q&A) systems that use LLMs over private enterprise data. I covered the architectures of these systems and the common tools developers use to build them when the enterprise data is structured, i.e., when the data exists as records stored in some DBMS, relational or graph. I referred to these as RAG systems using structured data. In this post, I cover RAG systems that use unstructured data, such as text files, PDF documents, or internal HTML pages in an enterprise. I will refer to these as RAG-U systems or sometimes simply as RAG-U (I should have used the term RAG-S in the previous post!).

To remind readers, I decided to write these two posts after doing a lot of reading in the space to understand the role of knowledge graphs (KGs) and graph DBMSs in LLM applications. My goals are (i) to give an overview of the field to readers who want to get started but are intimidated by the area; and (ii) to point to several future work directions that I find important.¹

  1. In this post, I'm only covering approaches that ultimately retrieve some unstructured data (or a transformation of it) and put it into LLM prompts. I am not covering approaches that query a pre-existing KG directly and use the records in it as additional data in a prompt. See this post by Ben Lorica for an example: the three-point bullet list after the statement "Knowledge graphs significantly enhance RAG models" describes such an approach. According to my organization of RAG approaches, such approaches fall under RAG using structured data, since KGs are structured records.

· 26 min read
Semih Salihoğlu

During the holiday season, I did some reading on LLMs and specifically on the techniques that use LLMs together with graph databases and knowledge graphs. If you are new to the area like me, the amount of activity on this topic on social media as well as in research publications may have intimidated you. If so, you're exactly my target audience for this new blog post series I am starting. My goals are two-fold:

  1. Overview the area: I want to present what I learned with a simple and consistent terminology and at a more technical depth than you might find in other blog posts. I am aiming for a depth similar to what I aim for when preparing a lecture. I will link to many quality and technically satisfying pieces of content (mainly papers since the area is very researchy).
  2. Overview important future work: I want to cover several important directions for future work in the space. I don't necessarily mean work for research contributions but also simple approaches to experiment with if you are building question answering (Q&A) applications using LLMs and graph technology.

This post covers retrieval augmented generation (RAG) using structured data. In a follow-up post, I will cover RAG using unstructured data, where I will also mention a few ways people are building RAG-based Q&A systems that use both structured and unstructured data.