CLIP
Model name: clip_local
About CLIP
CLIP (Contrastive Language-Image Pre-training) is a model that learns visual concepts from natural language supervision. It is a zero-shot learning model that can be used for a wide range of vision and language tasks.
Read more about CLIP on OpenAI's website.
Supported aidb operations
- encode_text
- encode_text_batch
- encode_image
- encode_image_batch
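To give a sense of how these operations are called, the following is a usage sketch only: it assumes a CLIP model has already been created under the name `my_clip_model` (see "Creating the default model" below) and that the encode operations are exposed as SQL functions taking the model name and the input. Check the aidb operations reference for the exact function signatures.

```sql
-- Usage sketch only. Assumes a model named 'my_clip_model' exists and that the
-- encode operations are exposed as aidb.encode_text(model_name, text) and
-- aidb.encode_image(model_name, bytea); consult the aidb operations reference
-- for the exact signatures.

-- Encode a text string into the shared text/image embedding space.
SELECT aidb.encode_text('my_clip_model', 'a photo of a cat');

-- Encode an image supplied as binary data (here read from the server filesystem).
SELECT aidb.encode_image('my_clip_model', pg_read_binary_file('/tmp/cat.png'));
```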
Supported models
- `openai/clip-vit-base-patch32` (default)
Creating the default model
There is only one model, the default `openai/clip-vit-base-patch32`, so we do not need to specify the model in the configuration. No credentials are required for the CLIP model.
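A minimal sketch of registering the default model, assuming the `aidb.create_model()` function used for other model providers in this documentation; the name `my_clip_model` is only an example:

```sql
-- Minimal sketch: register a CLIP model backed by the clip_local provider.
-- No configuration JSON or credentials are passed, because
-- openai/clip-vit-base-patch32 is the default (and only) model.
SELECT aidb.create_model('my_clip_model', 'clip_local');
```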
Creating a specific model
There are no other model configurations available for the CLIP model.
Model configuration settings
The following configuration settings are available for CLIP models:
- `model` - The CLIP model to use. The default is `openai/clip-vit-base-patch32`, and it is the only model available.
- `revision` - The revision of the model to use. The default is `refs/pr/15`. This is a reference to the model revision in the HuggingFace repository and specifies which version of the model to use, in this case that branch.
- `image_size` - The size of the image to use. The default is `224`.
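As an illustration only, these settings would typically be supplied as a JSON configuration when the model is created. The call below is a sketch that assumes the `aidb.create_model()` signature with a JSONB configuration argument, and it simply spells out the documented defaults:

```sql
-- Sketch only: create a CLIP model while spelling out the documented defaults.
-- The JSONB configuration argument is an assumption based on the create_model
-- pattern used for other model providers; value types (for example, whether
-- image_size is passed as a number or a string) may differ in practice.
SELECT aidb.create_model(
    'my_clip_model',
    'clip_local',
    '{"model": "openai/clip-vit-base-patch32",
      "revision": "refs/pr/15",
      "image_size": 224}'::JSONB
);
```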
Model credentials
No credentials are required for the CLIP model.