[go: up one dir, main page]

8i | 9i | 10g | 11g | 12c | 13c | 18c | 19c | 21c | 23ai | 24ai | Misc | PL/SQL | SQL | RAC | WebLogic | Linux

Home » Articles » 23 » Here

AI Vector Search in Oracle Database 23ai

This article provides a simple example of using the AI Vector Search feature in Oracle database 23ai.

Choose the Model

We can't do anything useful with vectors unless we have some way of generating them from our data. This could be done externally and the vectors loaded into the database, or we can load a model into the database and generate vectors from our data using that model. I know nothing about creating my own model, so I'm going to use one provided by Oracle called all-MiniLM-L12-v2. This model is explained here. We are going to use it to generate vectors from plain text.

Download the ONNX model from here.

Load the Model

We create a directory to hold the model, download it, and unzip the model into that directory.

mkdir -p /u01/models
cd /u01/models

wget https://adwc4pm.objectstorage.us-ashburn-1.oci.customer-oci.com/p/VBRD9P8ZFWkKvnfhrWxkpPe8K03-JIoM5h_8EJyJcpE80c108fuUjg7R5L5O7mMZ/n/adwc4pm/b/OML-Resources/o/all_MiniLM_L12_v2_augmented.zip

unzip -oq all_MiniLM_L12_v2_augmented.zip

We connect to the database, create a test user, create a directory object pointing to the physical directory, and grant access to it for our test user.

conn sys/SysPassword1@//localhost:1521/freepdb1 as sysdba

create user if not exists testuser1 identified by testuser1 quota unlimited on users;
grant create session, db_developer_role, create mining model to testuser1;

create or replace directory model_dir as '/u01/models';
grant read, write on directory model_dir to testuser1;

We can now load the model into the database using the DBMS_VECTOR package.

begin
  dbms_vector.drop_onnx_model (
    model_name => 'ALL_MINILM_L12_V2',
    force => true);

  dbms_vector.load_onnx_model (
    directory  => 'model_dir',
    file_name  => 'all_MiniLM_L12_v2.onnx',
    model_name => 'ALL_MINILM_L12_V2');
end;
/

We see the model information in the USER_MINING_MODELS view.

column model_name format a30
column algorithm format a10
column mining_function format a15

select model_name, algorithm, mining_function
from   user_mining_models
where  model_name = 'ALL_MINILM_L12_V2';

MODEL_NAME                     ALGORITHM  MINING_FUNCTION
------------------------------ ---------- ---------------
ALL_MINILM_L12_V2              ONNX       EMBEDDING

SQL>

Generating Vectors (VECTOR Data Type)

We can now generate vectors using the VECTOR_EMBEDDING function. In the following example we generate a vector for the text "Quick test". As you can see, the resulting vector is really big considering the size of the text.

set long 1000000

select vector_embedding(all_minilm_l12_v2 using 'Quick test' as data) AS my_vector;

MY_VECTOR
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
[-3.8644433E-002,7.27762729E-002,-6.99380785E-003,-7.29616638E-003,8.81515723E-003,-6.36086538E-002,4.39667143E-003,-4.20215279E-002,-1.32307231E-001,-5.83763141E-003,-1.32369539E-002,-1.62914675E-002
,6.54895976E-003,-4.98352237E-002,-1.98450759E-002,-4.69920859E-002,1.03937693E-001,-8.96753371E-002,-2.77858647E-003,4.13947217E-002,-6.51627034E-002,-1.0990192E-001,-8.73052608E-003,2.533352E-002,-1
.42030744E-002,-2.42071413E-002,1.91592015E-002,4.93750861E-003,6.30867062E-003,-1.24127828E-001,-7.17296917E-003,3.73173878E-002,4.9763605E-002,4.52162512E-002,1.49683282E-002,-2.21796334E-002,-3.679
36343E-002,-6.20212057E-004,7.16803446E-002,5.33913262E-003,1.92087796E-002,-9.91346464E-002,3.90679725E-002,2.22725477E-002,5.04363291E-002,1.81943234E-002,5.34031764E-002,1.44161787E-002,-1.99908093
E-002,-1.20323906E-002,-2.63888612E-002,-4.14666645E-002,6.24738075E-002,-4.68838811E-002,1.16748791E-002,-2.43180897E-002,-3.11982501E-002,-7.57505326E-003,2.25466546E-002,-4.17359509E-002,1.23237111
E-002,4.31706719E-002,-7.83751085E-002,1.24918474E-002,5.42060696E-002,4.33742851E-002,2.52278009E-003,-1.15482165E-002,-9.98713658E-004,-2.12613232E-002,1.00960573E-002,3.17986757E-002,-1.13147125E-0
02,-1.26893111E-002,2.66182888E-002,-7.5068539E-003,-3.70341614E-002,1.9485198E-002,-2.921376E-002,-2.61210185E-002,2.86212768E-002,-9.15901735E-002,1.50552308E-002,-4.9816858E-002,2.29324102E-002,7.8
2513991E-003,4.22973074E-002,3.37974578E-002,-4.23457436E-002,-6.32970929E-002,3.84950414E-002,-1.93851739E-002,1.96237396E-003,-3.91559181E-004,7.80333811E-003,5.63595779E-002,4.45815139E-002,-4.9701
6683E-002,1.36383837E-002,2.76547611E-001,6.35802001E-002,-1.69337578E-002,-3.25948372E-002,2.74621435E-002,-1.84808951E-002,-3.58916447E-002,3.18281911E-003,-3.92074399E-002,-5.03929285E-003,-3.95198
613E-002,2.64223926E-002,5.44404797E-002,-2.97637819E-003,1.0656476E-002,4.55005579E-002,-9.6166715E-002,4.53018732E-002,3.02239601E-002,-1.11025617E-001,6.18583001E-002,8.55141282E-002,-1.51456818E-0
02,-5.6408301E-002,-5.93017961E-004,1.07500203E-001,-6.81523681E-002,1.85917076E-002,3.75313126E-002,-3.27163152E-002,-4.72421981E-002,5.59753105E-002,2.20437739E-002,2.74991728E-002,2.6306238E-002,-5

MY_VECTOR
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
.36230095E-002,6.86871773E-003,5.06029651E-003,8.86866376E-002,3.97678688E-002,-1.4952058E-003,-1.11991599E-001,-1.48920929E-002,-1.42183332E-002,5.44683374E-002,-5.46902604E-002,-3.37714627E-002,-3.9
3099897E-002,-8.8875033E-003,-2.50034444E-002,-3.8671799E-002,7.15422854E-002,-1.72948167E-002,5.7217218E-002,1.23445876E-002,-6.25340268E-002,-1.97963398E-002,4.08164188E-002,9.2235636E-003,2.3645639
4E-002,-4.27598739E-003,-1.24366455E-001,8.28649029E-002,-5.27118742E-002,-1.11121098E-002,4.35752161E-002,1.35777248E-002,-2.25060564E-002,4.55260463E-002,3.89738753E-002,-8.9330703E-002,1.17293425E-
001,5.51190637E-002,-2.56631747E-002,-5.30632809E-002,-8.39535818E-002,4.83218301E-003,6.57674894E-002,8.87271166E-002,-1.52742919E-002,1.05254455E-002,-1.58144645E-002,-3.10783144E-002,-2.9069094E-00
2,7.04223011E-003,-3.09849512E-002,-4.46299138E-003,-7.20088407E-002,-7.05658719E-002,4.65546325E-002,1.10276632E-001,3.60871851E-002,1.86070222E-002,-6.10643029E-002,3.21829244E-002,-1.43657476E-002,
-6.75653145E-002,8.07486102E-002,1.68783069E-002,-1.0059043E-001,-7.55800903E-002,-1.69591829E-002,-4.45710421E-002,-8.60542338E-003,4.33787964E-002,4.29520719E-002,3.94066162E-002,8.67492054E-003,-8.
52123275E-002,1.20206453E-001,-1.14268251E-001,-1.70285124E-002,8.87670461E-003,-4.69080843E-002,-3.02138515E-002,4.61057499E-002,-4.92520146E-002,1.56181203E-002,-9.27053615E-002,-6.08294345E-002,1.4
6451779E-002,-1.8469112E-002,-1.40407547E-001,5.35490811E-002,5.85880674E-033,7.62652978E-002,-3.0770693E-002,-6.7476593E-003,1.03074148E-001,7.20860213E-002,-9.75818858E-002,1.51840553E-001,7.4332207
4E-002,-2.99238227E-002,9.39518213E-002,1.50299622E-002,4.35530245E-002,-7.58084841E-003,-7.49263093E-002,-5.07647246E-002,4.0109925E-002,-7.43360296E-002,4.62087467E-002,9.6142469E-003,3.1514265E-004
,6.2026035E-002,1.57011151E-002,3.29307579E-002,5.6974791E-002,-7.89973959E-002,9.78369173E-003,1.16775157E-002,-3.65987495E-002,-5.30386977E-002,-1.22491308E-002,5.65312728E-002,3.41438502E-002,-4.26
849015E-002,9.84478667E-002,1.52464816E-003,-6.92429468E-002,9.64930728E-002,-1.85021404E-002,4.28027436E-002,-4.41830456E-002,-2.54553054E-002,5.20384647E-002,-1.3808256E-002,-1.59469489E-002,2.10003

MY_VECTOR
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
518E-002,-1.85695086E-002,2.29395907E-002,1.91418286E-002,4.09490913E-002,2.35137846E-002,-3.91655453E-002,3.57466601E-002,4.80409786E-002,-1.02699241E-002,1.45040052E-002,-4.27465513E-002,-5.0925903E
-002,-7.12790117E-002,-9.191221E-002,1.79740768E-002,-3.53490785E-002,-2.26370115E-002,1.64974518E-002,1.05952621E-001,-3.52565721E-002,-3.41151282E-002,-5.72828352E-002,-3.10265515E-002,6.9757998E-00
2,-2.50361152E-002,-3.91423702E-002,1.37532372E-002,6.76134787E-003,-5.15896529E-002,-3.5578981E-002,6.91288859E-002,-3.47342566E-002,-1.09835267E-002,-2.45214235E-002,-3.24611999E-002,1.08489944E-002
,8.2179103E-003,-2.97738314E-002,4.81431969E-002,-5.78631386E-002,2.85624769E-002,4.0292047E-002,2.82907616E-002,-3.79493125E-002,8.59350059E-003,1.02058621E-002,2.48055886E-002,7.16195907E-003,-6.245
35345E-002,-3.2572642E-002,4.26035373E-033,-7.57267745E-003,-4.15649898E-002,-4.98135239E-002,1.02479551E-002,3.28872576E-002,1.50394207E-003,-6.39199391E-002,-7.5357914E-002,-2.46184096E-002,-3.06450
184E-002,4.16100062E-002,7.04020485E-002,-8.15085992E-002,2.55300701E-002,1.8981915E-002,4.26408909E-002,-2.19986513E-002,7.1362392E-003,-3.42554934E-002,3.70068592E-003,-3.15256999E-003,1.41581176E-0
02,5.00134267E-002,7.54985586E-002,6.42605498E-002,7.55612329E-002,1.52721843E-002,1.15661159E-001,-2.45987438E-002,1.08358078E-002,5.02406172E-002,6.28810897E-002,-5.52952439E-002,-5.51969372E-002,-4
.60022828E-003,-1.46539817E-002,6.40283972E-002,5.18338121E-002,2.51765884E-002,6.45218417E-002,-8.35603774E-002,3.57579961E-002,6.28177961E-003,3.15946154E-003,2.32084282E-002,4.76812199E-002,-5.4778
5498E-003,-1.06323607E-001,-1.45862792E-002,-5.92180602E-002,-1.59236379E-002,-1.90922264E-002,4.61262129E-002,2.41158772E-002,-7.9021994E-003,1.11448221E-001,1.11205485E-002,-2.0573834E-002,-4.086579
75E-002,5.54621816E-002,5.37177958E-002,4.76263165E-002,-3.29907499E-002,4.3731384E-002]


SQL>

We can use the VECTOR_EMBEDDING function in DML, which is what we will do in the next example.

We need some text to use in our test, so we grab some movie quotes from Hugging Face. We will use the existing directory as it already has the correct setup.

cd /u01/model
wget https://huggingface.co/datasets/ygorgeurts/movie-quotes/resolve/main/movie_quotes.csv?download=true -O movie_quotes.csv

We connect to the database and create a new table from the CSV file of movie quotes.

conn testuser1/testuser1@//localhost:1521/freepdb1

drop table if exists movie_quotes purge;

create table movie_quotes as
select movie_quote, movie, movie_type, movie_year
from   external (
         (
           movie_quote  varchar2(400),
           movie        varchar2(200),
           movie_type   varchar2(50),
           movie_year   number(4)
         )
         type oracle_loader
         default directory model_dir
         access parameters (
           records delimited by newline
           skip 1
           badfile model_dir
           logfile model_dir:'moview_quotes_ext_tab_%a_%p.log'
           discardfile model_dir
           fields csv with embedded terminated by ',' optionally enclosed by '"'
           missing field values are null
           (
             movie_quote char(400),
             movie,
             movie_type,
             movie_year
           )
        )
        location ('movie_quotes.csv')
        reject limit unlimited
      );


desc movie_quotes
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 MOVIE_QUOTE                                        VARCHAR2(400)
 MOVIE                                              VARCHAR2(200)
 MOVIE_TYPE                                         VARCHAR2(50)
 MOVIE_YEAR                                         NUMBER(4)

SQL>

We add a new column to hold the vector data for each movie quote. we are using the new VECTOR data type.

alter table movie_quotes add (
  movie_quote_vector vector
);


desc movie_quotes
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 MOVIE_QUOTE                                        VARCHAR2(400)
 MOVIE                                              VARCHAR2(200)
 MOVIE_TYPE                                         VARCHAR2(50)
 MOVIE_YEAR                                         NUMBER(4)
 MOVIE_QUOTE_VECTOR                                 VECTOR(*, *)

SQL>

We populate the new column by generating vectors from the movie quotes.

update movie_quotes
set    movie_quote_vector = vector_embedding(all_minilm_l12_v2 using movie_quote as data);

commit;

Vector Search using VECTOR_DISTANCE

We perform a search using the VECTOR_DISTANCE function. This accepts two vectors and as the name suggests, returns a distance between them. Since the model we used generates vectors for text data, we would expect the vector distance to be smaller for two vectors that are similar. In the following examples we create a vector from our search text and order the output of the query by the vector distance between our search text and the quote text.

First we ask for "Films with motivational speaking in them".

variable search_text varchar2(100);
exec :search_text := 'Films with motivational speaking in them';

set linesize 200
column movie format a50
column movie_quote format a100

SELECT vector_distance(movie_quote_vector, (vector_embedding(all_minilm_l12_v2 using :search_text as data))) as distance,
       movie,
       movie_quote
FROM   movie_quotes
order by 1
fetch approximate first 5 rows only;

  DISTANCE MOVIE                                              MOVIE_QUOTE
---------- -------------------------------------------------- ----------------------------------------------------------------------------------------------------
6.786E-001 Once Upon a Time in Hollywood                      That was the best acting i've ever seen in my whole life.
6.979E-001 Dead Poets Society                                 You must strive to find your own voice because the longer you wait to begin, the less likely you are
                                                               going to find it at all.

7.169E-001 The Pursuit of Happyness                           Walk that walk and go forward all the time. Don't just talk that talk, walk it and go forward. Also,
                                                               the walk didn't have to be long strides; baby steps counted too. Go forward.

7.186E-001 Joker                                              My mother always tells me to smile and put on a happy face. She told me I had a purpose to bring lau
                                                              ghter and joy to the world.

7.234E-001 Blazing Saddles                                    Men, you are about to embark on a great crusade to stamp out runaway decency in the west. Now you me
                                                              n will only be risking your lives, whilst I will be risking an almost certain Academy Award nominati
                                                              on for Best Supporting Actor.

SQL>

Next we ask for "Films about war".

variable search_text varchar2(100);
exec :search_text := 'Films about war';

set linesize 200
column movie format a50
column movie_quote format a100

SELECT vector_distance(movie_quote_vector, (vector_embedding(all_minilm_l12_v2 using :search_text as data))) as distance,
       movie,
       movie_quote
FROM   movie_quotes
order by 1
fetch approximate first 5 rows only;

  DISTANCE MOVIE                                              MOVIE_QUOTE
---------- -------------------------------------------------- ----------------------------------------------------------------------------------------------------
5.682E-001 Dr. Strangelove                                    Gentlemen, you can't fight in here! This is the War Room!
6.346E-001 Blazing Saddles                                    Men, you are about to embark on a great crusade to stamp out runaway decency in the west. Now you me
                                                              n will only be risking your lives, whilst I will be risking an almost certain Academy Award nominati
                                                              on for Best Supporting Actor.

6.587E-001 Fury                                               Ideals are peaceful; history is violent.
7.243E-001 The Kill Team                                      You give me your loyalty, and I?ll guarantee that each and every one of you will have a chance to be
                                                               a warrior, to actually be a part of history.

7.253E-001 Dr. No                                             Bond. James Bond

SQL>

Create a Vector Index (optional)

If we want to create a vector index, we must set the VECTOR_MEMORY_SIZE parameter for the root container. The amount of memory we need depends on the size and complexity of the data being indexed.

conn / as sysdba

alter system set vector_memory_size = 1G scope=spfile;

shutdown immediate;
startup;

Provided we've assigned enough memory, we can create vector indexes now.

There are two types of vector indexes.

Here is an example of each.

conn testuser1/testuser1@//localhost:1521/freepdb1

drop index if exists movie_quotes_vector_idx;

-- Neighbor Partition Vector Index
create vector index movie_quotes_vector_idx on movie_quotes(movie_quote_vector) organization neighbor partitions
distance cosine
with target accuracy 95;


drop index if exists movie_quotes_vector_idx;

-- In-Memory Neighbor Graph Vector Index
create vector index movie_quotes_vector_idx on movie_quotes(movie_quote_vector) organization inmemory neighbor graph
distance cosine
with target accuracy 95;

More details about vector indexes can be found here.

Considerations

Here are some things to consider when using the AI vector search functionality.

For more information see:

Hope this helps. Regards Tim...

Back to the Top.