Using Models in AI Accelerator Pipelines

Pipelines has a model registry that manages configured instances of models. Any Pipelines functions that use models, such as embedding and retrieving, must reference a model in this registry.

Discover the preloaded models

Pipelines comes with a set of pre-created models that you can use out of the box.

To find them, you can run the following query:

SELECT * FROM aidb.list_models();

This will return a list of all the models that are currently registered in the system. If you have not registered any models, you'll see the default models that come with Pipelines.

         name          |     provider      |              options
-----------------------+-------------------+-----------------------------------
 bert                  | bert_local        | {"config={}"}
 clip                  | clip_local        | {"config={}"}
 t5                    | t5_local          | {"config={}"}
 dummy                 | dummy             | {"config={}"}

The bert, clip, and t5 models are all registered and ready to use. The dummy model is a placeholder model that can be used for testing purposes.

Creating a Model

You can also create your own models. To do this, you can use the aidb.create_model function. Here is an example of how to create a model:

SELECT aidb.create_model('my_model', 'bert_local');

This will create a model named my_model that uses the default bert_local model provider. But, this is essentially the same as using the bert model thats already registered.

Discovering the Model Providers

You can also find out what model providers are available by running the following query:

SELECT * FROM aidb.model_providers;
Output
    server_name     | server_options
--------------------+----------------
 t5_local           |
 openai_embeddings  |
 openai_completions |
 bert_local         |
 clip_local         |
 dummy              |

This will return a list of all the model providers that are currently available in the system. You can find out more about these providers and their capabilities in the Supported Models section.

Creating a Model with a Configuration

You can also pass options to the model when registering it. For example, you can specify the model configuration:

SELECT aidb.create_model('my_model', 
                            'bert_local', 
                            '{"model": "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2",
                             "revision": "main"}'::JSONB);

This will create a model named my_model that uses the bert_local model provider and the sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 model from HuggingFace. The revision option specifies the version of the model to use. The options are passed as a JSONB object, with a single quoted string that is then cast to JSONB. Within the string are the key-value pairs that define the model configuration in a single JSON object.

Registering a Model with Configuration and Credentials

This is where the other supported models come in. You can create a different model by specifying the model name in the configuration. The OpenAI Completions and OpenAI Embeddings models are both models which you can create to make use of OpenAI's completions and embeddings APIs.

You need to provide more information to the aidb.create_model function when registering a model like these. Completions has a number of options, including selecting which model it will use on OpenAI. Both Completions and Embeddings requires API credentials. Here is an example of how to create the OpenAI Completions model:

SELECT aidb.create_model(
  'my_openai_model',
  'openai_completions',
  '{"model": "gpt-4o"}'::JSONB,
  '{"api_key": "sk-abc123xyz456def789ghi012jkl345mn"}'::JSONB
  };

You should replace the api_key value with your own OpenAI API key. Now you can use the my_openai_model model in your Pipelines functions and, in this example, leverage the GPT-4o model from OpenAI.

You can also create the OpenAI Embeddings model in a similar way.

SELECT aidb.create_model(
  'my_openai_embeddings',
  'openai_embeddings',
  '{"model": "text-embedding-3-large"}'::JSONB,
  '{"api_key": "sk-abc123xyz456def789ghi012jkl345mn"}'::JSONB
  };

This will create the text-embedding-3-large model with the name my_openai_embeddings. You can now use this model in your Pipelines functions to generate embeddings for text data.

Using models with OpenAI compatible APIs

These OpenAI models work with any OpenAI compatible API. This allows you to connect and use an even wider range of models, just by passing the appropriate API endpoint to the url option in the aidb.create_model function's options.

For more information about the OpenAI models, see the OpenAI Completions and OpenAI Embeddings pages.


Could this page be better? Report a problem or suggest an addition!