By Gonzalo Chiappara

This blog post explores Oracle AI Vector Search, a new feature that introduces vector capabilities to the Oracle Database. Vector embeddings are a powerful tool for tasks like semantic search and Retrieval-Augmented Generation (RAG). We’ll delve into the creation, storage and search functionalities offered by Oracle AI Vector Search.
Providing a practical guide for developers looking to leverage this technology in their applications.


What is a Vector Database?

Vector databases are emerging as a cornerstone of modern search and Retrieval-Augmented Generation (RAG) applications. So, what exactly are they, and why are they so important?

Understanding Embeddings

Imagine capturing the essence of a document, an image, or any data point, and translating it into a numerical code. This is the magic of embeddings. Moreover, these codes represented as arrays of numbers,  embody the semantic meaning of data objects. They essentially become a new language for computers to understand the relationships between different pieces of information.

High-Dimensional Space

Think of a vast, multi-dimensional space. Each embedding occupies a specific location within this space, with similar data points residing closer together. This allows for efficient comparison and retrieval based on semantic meaning, not just exact keyword matches.

The Power of the Vector Database

Here’s where vector databases come in. They are specialized data storage solutions designed specifically to manage these high-dimensional vector representations. Their key strength lies in their ability to perform Approximate Nearest Neighbor (ANN) searches with exceptional speed.

Approximate Nearest Neighbor

Imagine searching for a specific document in a massive library. A traditional database might meticulously compare keywords across every document. In contrast, ANN algorithms within vector databases work differently. They leverage the spatial organization of embeddings in the high-dimensional space to quickly identify the closest matches, significantly reducing the number of comparisons needed. This translates to lightning-fast and highly relevant search results.

Why Vector Databases Matter

The ability to efficiently handle large-scale, high-dimensional data is critical for various applications.Furthermore, vector databases excel in:

  • Semantic Search: Finding information based on meaning and context, not just keywords.
  • Retrieval-Augmented Generation (RAG): Providing LLMs (Large Language Models) with relevant context from retrieved data, leading to more accurate and informative responses.

In a nutshell, vector databases empower us to unlock the true potential of embeddings, paving the way for a new era of intelligent search and AI-driven applications.

The Vector Database Landscape

The world of data management is rapidly evolving, and vector databases have become a prominent player. But with various options available, selecting the right solution can be daunting. This section explores the three main categories of vector databases.

Vector Libraries (FAISS, ANNOY):

These established tools (dating back to the mid-2010s) excel at efficient similarity search. They are:

  • Pros: Powerful for research and development, highly customizable, and easy to integrate into existing projects.
  • Cons: Require you to manage your own infrastructure, offer limited or no support for filtering by metadata, and lack the convenience of SQL queries.

Vector-Only Databases (Milvus, Pinecone, Chroma, Qdrant, Weaviate):

These specialized databases have gained traction in recent years. Additionally, they are built specifically for high-dimensional vector data, making them ideal for tasks like:

  • Pros: Optimized for vector similarity search, efficient handling of large datasets, and generally user-friendly with built-in features.
  • Cons: Limited or no SQL support, may have limitations on enterprise-grade features, and scalability might become an issue for extremely large datasets.

Enterprise Databases with Vector Support (MongoDB, Neo4j, CouchDB, PostgreSQL (pgvector), Oracle, Cloud solutions from AWS & Google, Databricks):

Major players in the database world are starting to embrace vectors. Consequently, these solutions offer:

  • Pros: Eliminate the need for a separate database, enable richer context for RAG by combining vectors with other data types, and generally provide more robust enterprise features.
  • Cons: Vector search functionalities might be basic or under development, performance can vary significantly, and handling massive, high-dimensional datasets may require additional optimizations like data partitioning or specialized indexing techniques to maintain efficient query speeds.

The vector database landscape is constantly evolving. As the technology matures, expect further advancements and optimizations. By understanding the strengths and limitations of each category, you can make an informed decision and leverage the power of vector databases to unlock new possibilities in your data management strategy.

Oracle AI Vector Search

Overview

In 2024, Oracle introduced AI Vector Search, marking a significant advancement in their database technology. This offering goes beyond simply adding vector capabilities. It provides a comprehensive toolkit for working with embeddings, interacting with large language models (LLMs) and even implementing a complete Retrieval-Augmented Generation (RAG) pipeline. As a result, Oracle AI Vector Search positions itself as a powerful solution for developers working with AI applications.

One of the key strengths of AI Vector Search is its ability to perform all necessary tasks within the database itself. This includes creating embeddings, chunking documents, and leveraging LLMs for text processing. This centralized approach streamlines development workflows. However, it also maintains flexibility by allowing users to perform specific tasks externally if desired, using Oracle solely for vector storage.

This functionality is achieved through the addition of a new vector data type, vector indexes and a set of SQL functions and PL/SQL packages. Notably, users can import embedding models directly into the database using the ONNX format, providing flexibility in choosing the most suitable model for their project. In addition , integration with third-party services is seamless, with credentials securely managed within the database.

The true power of AI Vector Search lies in its ability to combine the strengths of an enterprise database with advanced vector search capabilities. Moreover, users can retrieve and search data by seamlessly blending semantic vectors with traditional business data filtering.

The potential applications for AI Vector Search are extensive. Developers can leverage it to expand the knowledge of LLMs with RAG, build robust recommender systems, or perform efficient text and semantic searches.

For those interested in exploring AI Vector Search, Oracle Cloud offers a cost-effective option. The free tier provides access to always-free Autonomous Databases, allowing developers to experiment with the features and assess its suitability for their projects.

Generate Embeddings

Oracle Database provides a versatile approach to generating embeddings, offering both in-database and external options.

Within the Database:

ONNX Embedding Models: Leverage pre-trained models in the ONNX format. The DBMS_VECTOR.LOAD_ONNX_MODEL procedure within the DBMS_VECTOR PL/SQL package facilitates importing these models. Simply provide the ONNX file and a chosen model name. Keep in mind, model size is limited to 1 gigabyte.

Here’s an example SQL query demonstrating how to generate an embedding with a loaded ONNX model named doc_model:

SELECT TO_VECTOR(VECTOR_EMBEDDING(doc_model USING 'hello' AS data)) AS embedding;

External Embedding Providers:

Oracle supports various third-party embedding providers like Cohere, Generative AI, Google AI, Hugging Face, OpenAI, and Vertex AI. To leverage them:

  1. Create Credentials: Utilize the DBMS_VECTOR.CREATE_CREDENTIAL procedure. Assign a name to the credential and provide the access token.
  2. Utilize DBMS_VECTOR.UTL_TO_EMBEDDING: This function allows you to interact with external providers. Here’s an example:
SELECT DBMS_VECTOR.UTL_TO_EMBEDDING('hello', json(:params)) FROM dual;

The params variable holds details like the provider, credential name, URL, model and additional provider-specific parameters.

Outside the Database:

This section demonstrates how to create an embedding outside the database using Cohere, convert the embedding into a suitable format and then insert it into an Oracle database table. The code performs the following steps:

  1. Import and Initialize: Import the CohereEmbeddings class from the langchain_cohere library and initialize it with your Cohere API key and model.
  2. Generate Embedding: Define a text document and generate its embedding using the embed_documents method.
  3. Convert to Array: Convert the resulting embedding to a 32-bit floating point array.
  4. Connect to Oracle Database: Establish a connection to the Oracle database using the oracledb library.
  5. Insert Embedding: Insert the embedding into a table named vector_table.
  6. Close Connections: Close the cursor and the database connection.
from langchain_cohere import CohereEmbeddings

cohere_embeddings = CohereEmbeddings(
    cohere_api_key="<<apiKey>>", model="embed-english-light-v3.0"
)

text = "This is a test document."

doc_result = cohere_embeddings.embed_documents([text])

import array

vector_data_32 = array.array("f", doc_result[0])

import oracledb

conn = oracledb.connect(user="<<user>>", password="<<password>>", dsn="<<dsn>>")

cursor = conn.cursor()

cursor.execute(
    "insert into vector_table (v32) values (:1)",
    [vector_data_32],
)

cursor.close()
conn.close()

This flexibility empowers developers to choose the embedding generation approach that best suits their specific needs and project requirements.

Store Embeddings

Oracle AI Vector Search introduces the VECTOR data type, specifically designed to store vector representations of data efficiently. In the following section, we will explore how to leverage this data type for embedding storage and management.

Creating Tables for Embeddings:

The process is straightforward. You can define a table with a VECTOR column to store the embeddings, as illustrated below:

CREATE TABLE my_vectors (id NUMBER, embedding VECTOR(768, INT8));

Here, the embedding column is defined as a VECTOR data type with 768 dimensions and the INT8 format.  Specifying the number of dimensions and format is optional, with options including INT8, FLOAT32, and FLOAT64. While vectors themselves can be null, their dimensions cannot.

Conversion Functions:

Oracle provides functions to seamlessly convert between VARCHAR2/CLOB data types and VECTOR. This simplifies the process of integrating existing data sources with your vector representations.

Large-Scale Loading:

For efficient loading of large numbers of vectors, Oracle’s SQLLoader steps in. It supports loading vectors from both text files and binary format through .fvec files. This streamlines the process of importing vast amounts of embedding data into your database.

Vector Indexes

Efficient vector searches hinge on the power of indexes. Oracle AI Vector Search empowers you to create vector indexes, enabling the implementation of Approximate Nearest Neighbor (ANN) search. This significantly reduces retrieval times compared to linear searches. It’s important to remember that these indexes provide approximate results.

Types of Vector Indexes:

Most vector indexes fall into two computational categories: clustering and graphs. Oracle supports:

  • Inverted File (IVF) Indexes (Neighbor Partitioning): These partition data points into clusters, enabling efficient searching within relevant clusters.
  • Hierarchical Navigable Small World (HNSW) Indexes (In-Memory Neighbor Graph): These create a navigable graph structure to locate similar vectors. However, HNSW indexes are not available in RAC environments.

Choosing the Right Distance Metric:

The distance metric you choose should align with the recommendations of your embedding model. It essentially calculates the similarity between vectors. Common options include EUCLIDEAN, L2_SQUARED, COSINE, DOT, MANHATTAN, and HAMMING. You can specify the metric during index creation or within the search query itself.

Fine-Tuning Index Parameters:

For IVF indexes, you can define the number of clusters and the number of centroids (representative points) to explore during a search. Additionally, indexes allow specifying a target accuracy percentage to influence the search. A value of 100% aims for an exact search, but this isn’t guaranteed. This target accuracy can be set during index creation, in the query or even overridden within the query itself.

Here’s an example of creating an IVF index:

CREATE VECTOR INDEX galaxies_ivf_idx ON galaxies (embedding) ORGANIZATION NEIGHBOR PARTITIONS
DISTANCE COSINE
WITH TARGET ACCURACY 95;

Maintaining Index Accuracy:

It’s important to note that DML operations (insert, update, delete) are not allowed after creating an HNSW index. For IVF indexes, DML operations can degrade accuracy over time. Periodically rebuilding the index may be necessary.

Monitoring and Optimization:

A valuable feature is Oracle’s provision of functions to report index accuracy based on query vectors used in approximate searches. This allows you to continuously monitor and optimize the accuracy of your vector search implementation.

Vector Distance

Computing vector distances is a core function in any vector search application. Oracle AI Vector Search provides a comprehensive set of tools to facilitate this process.

Distance Functions and Operators:

  • VECTOR_DISTANCE: This is the primary function for calculating distances between vectors. It allows you to specify the desired distance metric for the calculation.
  • Shorthand Functions: Oracle offers convenient shorthand functions for common distance metrics, such as L1_DISTANCE, L2_DISTANCE, COSINE_DISTANCE, and INNER_PRODUCT. These functions simplify your code and improve readability.
  • Distance Operators: As an alternative to functions, Oracle supports operators that mirror the functionality of distance functions. This provides a more concise approach to distance calculations within your SQL queries.

Exact Similarity Search:

The following example demonstrates how to perform an exact similarity search using the VECTOR_DISTANCE function:

SELECT docID
FROM vector_tab
ORDER BY VECTOR_DISTANCE(embedding, :query_vector, EUCLIDEAN_SQUARED)
FETCH FIRST 10 ROWS ONLY;

In this example, the query retrieves the document IDs from the vector_tab table, ordered by their distance to the provided query_vector. The EUCLIDEAN_SQUARED distance metric is used for the calculation.
Additionally, the FETCH FIRST 10 ROWS ONLY clause limits the results to the top 10 closest matches.

Approximate Similarity Search:

For scenarios where exact results are not essential, you can leverage approximate search for faster retrieval times. Here’s an example:

SELECT name
FROM galaxies
WHERE name <> 'NGC1073'
ORDER BY VECTOR_DISTANCE(embedding, TO_VECTOR('[0,1,1,0,0]'), COSINE)
FETCH APPROXIMATE FIRST 3 ROWS ONLY;

This query searches the galaxies table for galaxies closest to a specific vector representation ([0,1,1,0,0]), excluding the galaxy named 'NGC1073'. The COSINE distance metric is used for the search, and the FETCH APPROXIMATE FIRST 3 ROWS ONLY clause retrieves the top 3 approximate matches.

Matching Distance Metrics:

It’s crucial to ensure that the distance metric used in your queries aligns with the metric used during the training of your vector embedding model. This ensures that the calculated distances accurately reflect the similarity between vectors.

RAG

Large Language Models (LLMs) have become a game-changer in natural language processing, but their tendency to generate inaccurate or fabricated information (hallucinations) remains a concern. Retrieval-Augmented Generation (RAG) offers a powerful solution by providing LLMs with relevant context derived from data sources, enhancing the accuracy and factual basis of their outputs.

Oracle AI Vector Search shines in this area by offering a comprehensive, in-database solution for implementing RAG. This eliminates the need for external tools or frameworks, streamlining development and keeping sensitive data secure within the trusted Oracle Database environment.

Implementing RAG with Oracle AI Vector Search

Here’s a breakdown of the implementation process:

  1. Data Loading and Preprocessing:
    • Establish a database table to store documents as BLOB (Binary Large Object) columns.
    • Create a separate table to house vector embeddings, representing smaller, manageable chunks of the original documents.
    • Leverage utility functions within the DBMS_VECTOR_CHAIN PL/SQL package to automate data preparation:
      • DBMS_VECTOR_CHAIN.UTL_TO_TEXT: Easily converts documents (PDF, Word, etc.) into plain text.
      • DBMS_VECTOR_CHAIN.UTL_TO_CHUNKS: Splits documents into smaller, more manageable sections.
      • DBMS_VECTOR_CHAIN.UTL_TO_EMBEDDINGS: Generates vector embeddings for each document chunk.
  2. Query Processing and Retrieval:
    • Utilize the VECTOR_EMBEDDING SQL function to convert the user’s query into a vector representation, enabling efficient comparison with document chunk embeddings.
    • Employ the powerful combination of the VECTOR_DISTANCE SQL function and the FETCH APPROX clause to perform a similarity search. This identifies documents most relevant to the user’s query.
  3. LLM Integration (Optional):
    For even greater flexibility, Oracle provides the UTL_TO_GENERATE_TEXT function within the DBMS_VECTOR_CHAIN package. This function allows you to invoke a third-party LLM to generate text based on the retrieved documents and the user’s query. By feeding LLMs with relevant context from your reliable data sources, RAG ensures factual responses and minimizes the risk of hallucinations.

Wrapping Up

Oracle AI Vector Search packs a punch with its in-database functionalities. Specifically, it streamlines development by handling data loading, preprocessing, search, and even LLM integration – all within the familiar Oracle environment. Consequently, this not only simplifies things but also keeps sensitive data secure.

Moreover, the built-in RAG capabilities are a real game-changer. By feeding LLMs relevant context from your data, RAG ensures factual responses and minimizes the risk of “hallucinations” these models are prone to. As a result, this makes Oracle AI Vector Search a perfect fit for businesses already invested in Oracle databases, especially those looking to leverage AI for tasks like chatbots, personalized recommendations, or content filtering.

Although it’s not a free service, Oracle AI Vector Search’s potential return on investment is undeniable. Its availability on Oracle Cloud Infrastructure (OCI) offers a potentially cost-effective option.

However, the one missing piece of the puzzle right now is benchmarking data. While the features are impressive, real-world performance metrics would solidify Oracle AI Vector Search’s position as a leader. Therefore, we eagerly await these benchmarks to get the full picture.

In conclusion, Oracle AI Vector Search emerges as a secure, flexible, and highly promising solution for building reliable and informative AI applications. Ultimately, its tight integration with the Oracle database and its focus on RAG functionality make it a compelling option for businesses taking the leap into the world of AI.

References and Resource Links

The Ultimate Guide to the Vector Database Landscape: 2024 and …

Navigating the Landscape of Vector Databases – LinkedIn

Oracle Announces General Availability of AI Vector Search in Oracle Database 23ai

Oracle Database Oracle AI Vector Search User’s Guide, 23ai

langchain_cohere.embeddings.CohereEmbeddings — 🦜🔗 LangChain 0.2.3

14. Using Vector Data — python-oracledb 2.3.0b1 documentation

Other blogs

Shape
Get in touch with one of our specialists. Let's discover how can we help you.
Training, developing and delivering machine learning models into production
Document