SuperDuperDB Blog

The Evolution: From Keyword Search to Vector Search

March 21, 2024 · 9 min read

Growth at SuperDuperDB

Search technology has undergone an incredible transformation over the past few years. It began with simple, directory-based methods and evolved into sophisticated algorithms capable of interpreting the nuances of human language.

Most systems to date rely heavily on keyword matching. However, this approach has many limitations and often overlooks the context or the true intent behind a user's query.

So the need for a more intelligent search became apparent. This need led to the rise of artificial intelligence (AI) in search technologies, giving birth to vector search, a method that understands queries and content at a much deeper level. By prioritizing context and semantics, vector search can discern the meaning behind words, providing more accurate and relevant results.
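To make the idea concrete, here is a minimal sketch of how a vector search ranks results. The three-dimensional vectors are hand-written for illustration and are purely an assumption; a real system would get its embeddings from a trained model, usually with hundreds of dimensions. Documents are ranked by cosine similarity to the query vector.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings, invented for this example.
# The dimensions loosely stand for: [bicycles, repair, finance].
doc_vectors = {
    "How to fix a flat tire on a bicycle": [0.9, 0.8, 0.0],
    "Repairing a punctured bike wheel at home": [0.8, 0.9, 0.1],
    "The tire company reported record profits": [0.2, 0.0, 0.9],
}

# Assumed embedding of the query "tire repair".
query_vector = [0.85, 0.85, 0.0]

# Rank documents from most to least similar to the query.
ranked = sorted(
    doc_vectors,
    key=lambda doc: cosine_similarity(query_vector, doc_vectors[doc]),
    reverse=True,
)

for doc in ranked:
    print(f"{cosine_similarity(query_vector, doc_vectors[doc]):.3f}  {doc}")
```

With these made-up vectors, both bicycle repair documents rank above the financial news item, even though one of them never mentions the word "tire".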

What is Search by Keywords?

Keyword-based search is the foundation upon which traditional search engines were built. It involves searching through documents to find matches for specified words or phrases. Despite its straightforward nature, this method has drawbacks: it lacks the ability to understand the context or the intent behind the search query. This limitation often results in a list of results that contain the keywords but may be irrelevant to what the user is actually looking for.
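The limitation is easy to reproduce with a toy example. The sketch below is an illustrative assumption, not how any particular search engine works: it returns every document that shares at least one exact word with the query. The sample documents are made up.

```python
def keyword_search(query, documents):
    """Return documents that share at least one exact word with the query."""
    terms = set(query.lower().split())
    return [doc for doc in documents if terms & set(doc.lower().split())]

# Hypothetical documents for illustration.
documents = [
    "How to fix a flat tire on a bicycle",
    "Repairing a punctured bike wheel at home",
    "The tire company reported record profits",
]

results = keyword_search("tire repair", documents)
for doc in results:
    print(doc)
```

The financial news item is returned because it contains "tire", while the second document, which is about exactly this topic, is missed because it says "repairing" and "wheel" rather than "repair" and "tire".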

What is RAG, and why should I care?

March 19, 2024 · 9 min read

Fernando Guerra

Growth at SuperDuperDB

Fotis Nikolaidis

Head of Cloud and Infra at SuperDuperDB

Retrieval-Augmented Generation (RAG) is a groundbreaking approach that combines the strengths of information retrieval (IR) techniques with the creative capabilities of LLMs. It transforms LLMs from mere conversationalists into experts capable of in-depth, contextually rich dialogue on specialized topics, significantly widening their applicability across domains.

Large Language Models (LLMs) are typically trained to converse on a wide range of topics with relative ease. However, their responses often lack depth and specificity, and they may struggle to engage in detailed discussions on specialized subjects because they lack domain-specific knowledge. To overcome this, RAG fetches relevant information from different data sources in real time and incorporates it into the model's responses. In this way, RAG turns a general LLM into a specialized one, capable of retrieving and using relevant information to provide precise responses, even to queries that require knowledge beyond its initial training data.
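The retrieve-then-generate flow can be sketched in a few lines. Everything here is a simplifying assumption for illustration: the retriever scores passages by word overlap instead of using embeddings, the knowledge base is invented, and `generate` is a placeholder where a real system would call an LLM.

```python
def retrieve(query, knowledge_base, top_k=1):
    """Toy retriever: rank passages by the number of words shared with the query."""
    query_terms = set(query.lower().split())
    ranked = sorted(
        knowledge_base,
        key=lambda passage: len(query_terms & set(passage.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]

def build_prompt(query, passages):
    """Augment the user's question with the retrieved context."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )

def generate(prompt):
    """Placeholder for an LLM call; a real system would send the prompt to a model."""
    return f"[LLM response to a prompt of {len(prompt)} characters]"

# Hypothetical domain knowledge the base model was never trained on.
knowledge_base = [
    "The warranty covers battery defects for 24 months.",
    "Returns are accepted within 30 days of purchase.",
    "Our office is closed on public holidays.",
]

question = "How long does the warranty cover the battery?"
passages = retrieve(question, knowledge_base)
prompt = build_prompt(question, passages)
print(prompt)
print(generate(prompt))
```

The key design point is that the model's weights never change: the specialization comes entirely from what is placed in the prompt at query time.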

Integrating Nomic API with MongoDB using SuperDuperDB

March 13, 2024 · 6 min read

Anita Okoh

Data Scientist at SuperDuperDB

Integrating Nomic API with MongoDB using SuperDuperDB

One of the major components of building a RAG system is the ability to perform a vector search, also known as semantic search. This typically requires an embedding model and a database of your choice.

For this demo, we will use Nomic’s embedding model and MongoDB.

Nomic AI builds tools to enable anyone to interact with AI-scale datasets and models. Nomic Atlas enables anyone to instantly visualize, structure, and derive insights from millions of unstructured data points. The text embedder, known as Nomic Embed, is the backbone of Nomic Atlas, allowing users to search and explore their data in new ways.

Implementing a RAG System on DuckDB Using Jina AI and SuperDuperDB

March 11, 2024 · 9 min read

Anita Okoh

Data Scientist at SuperDuperDB

Fernando Guerra

Growth at SuperDuperDB

Querying your SQL database purely in human language

RAG = DuckDB + SuperDuperDB + Jina AI

Unless you live under a rock, you must have heard the buzzword “LLMs”.

It’s the talk around town.

LLMs, as we all know, have enormous potential. But they suffer from two significant issues: hallucination and a knowledge cut-off.

The need to mitigate these two issues has led to the rise of RAG, and to implementing RAG directly on your existing database.

Integrating AI directly with your databases to eliminate complex MLOps and vector databases

December 5, 2023 · 10 min read

Duncan Blythe

CTO & Co-founder of SuperDuperDB

Timo Hagenow

CEO & Co-founder of SuperDuperDB

🔮TL;DR: We introduce SuperDuperDB, which has just published its first major v0.1 release. SuperDuperDB is an open-source AI development and deployment framework to seamlessly integrate AI models and APIs with your database. In the following, we will survey the challenges presented by current AI-data integration methods and tools, and how they motivated us in developing SuperDuperDB. We'll then provide an overview of SuperDuperDB, highlighting its core principles, features and existing integrations.

How to efficiently build AI chat applications for your own documents with MongoDB Atlas

October 4, 2023 · 4 min read

Duncan Blythe

CTO & Co-founder of SuperDuperDB

Despite the huge surge in popularity in building AI applications with LLMs and vector search, we haven't seen any walkthroughs boil this down to a super-simple, few-command process. With SuperDuperDB together with MongoDB Atlas, it's easier and more flexible than ever before.

info

We have built and deployed an AI chatbot for questioning technical documentation to showcase how efficiently and flexibly you can build end-to-end Gen-AI applications on top of MongoDB with SuperDuperDB: https://www.question-the-docs.superduperdb.com/

Implementing a (RAG) chat application like a question-your-documents service can be a tedious and complex process. There are several steps involved in doing this:

Walkthrough: How to enable and manage MongoDB Atlas Vector Search with SuperDuperDB

October 1, 2023 · 4 min read

Duncan Blythe

CTO & Co-founder of SuperDuperDB

Timo Hagenow

CEO & Co-founder of SuperDuperDB

In this step-by-step tutorial, we show how to leverage MongoDB Atlas Vector Search with SuperDuperDB, including the generation of vector embeddings. Learn how to connect embedding APIs such as OpenAI, or embedding models such as those from HuggingFace, to MongoDB Atlas with simple Python commands.

info

SuperDuperDB makes it very easy to set up multimodal vector search with different file types (text, image, audio, video, and more).

Install superduperdb Python package

Using vector-search with SuperDuperDB on MongoDB requires installing only one Python package:

Jumpstart AI development on MongoDB with SuperDuperDB

September 30, 2023 · 3 min read

Duncan Blythe

CTO & Co-founder of SuperDuperDB

MongoDB now supports vector-search on Atlas, enabling developers to build next-gen AI applications directly on their favourite database. SuperDuperDB now makes this process painless by letting you integrate, train and manage any AI models and APIs directly with your database using simple Python.

Build next-gen AI applications without the need for complex MLOps pipelines and infrastructure, and without data duplication or migration to specialized vector databases:

(RAG) chat applications on documents hosted in MongoDB Atlas
semantic-text-search & similarity-search, using vector embeddings of your data stored in Atlas
image similarity & image-search on images hosted in or referred to on MongoDB Atlas
video search including search within videos for key content
content-based recommendation for content hosted in MongoDB Atlas
...and much, much more!

SuperDuperDB now supports Cohere and Anthropic APIs

September 29, 2023 · 2 min read

Duncan Blythe

CTO & Co-founder of SuperDuperDB

We're happy to announce the integration of two more AI APIs, Cohere and Anthropic, into SuperDuperDB.

Cohere and Anthropic provide AI developers with sorely needed alternatives to OpenAI for key AI tasks, including:

text-embeddings as a service
chat-completions as a service.

Building a Documentation Chatbot using FastAPI, React, MongoDB and SuperDuperDB

September 12, 2023 · 9 min read

Nick Byrne

Engineer at SuperDuperDB

Imagine effortlessly infusing AI into your data repositories—databases, data warehouses, or data lakes—without breaking a sweat. With SuperDuperDB, we aim to make this dream a reality. We want to provide everyone with the tools to build AI applications directly on top of their data stores, with just a pinch of Python magic sprinkled on top! 🐍✨

In this latest blog post, we dive into one such example: a Retrieval-Augmented Generation (RAG) app we built directly on top of our MongoDB store.
