Generative AI Models and Regions for Enterprise AI Agents

Enterprise AI Agents in OCI Generative AI support a subset of OCI Generative AI pretrained models and regions. This page lists the supported models and regions for runtime agentic inference and project memory.

To Call Models

For OCI OpenAI-Compatible Endpoints and Tools
Note

The following models are supported for the OCI OpenAI-Compatible Endpoints and OpenAI-Compatible Tools.

Available Chat Models

Agents can call the following chat models for agentic inference use cases:

Google Vertex AI Platform
OpenAI Open Source
xAI Platform
Important

External Calls to xAI Grok Models

The xAI Grok models are hosted in an OCI data center, in a tenancy provisioned for xAI. The xAI Grok models, which can be accessed through the OCI Generative AI service, are managed by xAI.
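Because these endpoints follow the OpenAI request format, a chat call is expressed as a standard `/chat/completions` request body. The following is a minimal sketch in Python; the model identifier (`xai.grok-4` here) and any endpoint URL are placeholders, and the exact values depend on your OCI tenancy, region, and the models listed above:

```python
import json

# Hypothetical request body for an OCI OpenAI-Compatible chat completions call.
# "xai.grok-4" is a placeholder model ID, not a confirmed OCI identifier;
# check the service documentation for the model names available in your region.
payload = {
    "model": "xai.grok-4",
    "messages": [
        {"role": "system", "content": "You are a helpful enterprise agent."},
        {"role": "user", "content": "Summarize our open support tickets."},
    ],
    "max_tokens": 512,
}

# The serialized body is what an OpenAI-compatible client sends over HTTPS.
body = json.dumps(payload)
```

Any OpenAI-compatible client library can submit a body of this shape once it is pointed at the regional endpoint with valid OCI credentials.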

Available Commercial Regions (OC1)

You can access agentic inference models in one or more of the following OC1 regions:

Region Name | Location | Region Identifier | Region Key
Brazil East (Sao Paulo) | Sao Paulo | sa-saopaulo-1 | GRU
Germany Central (Frankfurt) | Frankfurt | eu-frankfurt-1 | FRA
India South (Hyderabad) | Hyderabad | ap-hyderabad-1 | HYD
Japan Central (Osaka) | Osaka | ap-osaka-1 | KIX
Saudi Arabia Central (Riyadh) | Riyadh | me-riyadh-1 | RUH
UK South (London) | London | uk-london-1 | LHR
US East (Ashburn) | Ashburn | us-ashburn-1 | IAD
US Midwest (Chicago) | Chicago | us-chicago-1 | ORD
US West (Phoenix) | Phoenix | us-phoenix-1 | PHX

Learn About Regions and Availability Domains.

Note

  • UAE East (Dubai): OCI OpenAI-Compatible endpoints and tools aren't available in this region.
  • Availability: Not every listed model is available in all of the preceding regions. For per-model supported regions and deployment details, see the Models by Region page.
  • External Calls: For notes about models with external calls, see External Calls.

To Enable Project Memory

For Short-Term Memory (Conversation History) Compaction

When you create a project, you can enable short-term memory compaction for conversations and responses related to that project. See the following table for the available models and regions for memory compaction.

For Extracting Key Information For Long-Term Memory

When you create a project, you can enable information extraction from conversations and responses for the long-term memory feature. You select an extraction model that extracts key information from conversations. See the following table for the regions and model supported for extracting key information to use for long-term memory.

Region | Available Extraction Model
Available Commercial Regions (OC1) in which OpenAI gpt-oss-120b is available | OpenAI gpt-oss-120b
For Storing Key Information as Embeddings For Long-Term Memory

When you create a project, you can select an embedding model to store extracted memories as searchable vectors. The available embedding model depends on the project region. See the following table for the embedding model available in each region:

Region | Region Code | Available Embed Model
Brazil East (Sao Paulo) | sa-saopaulo-1 | Cohere Embed Multilingual 3
Germany Central (Frankfurt) | eu-frankfurt-1 | Cohere Embed Multilingual 3
UK South (London) | uk-london-1 | Cohere Embed Multilingual 3
India South (Hyderabad) | ap-hyderabad-1 | Cohere Embed Multilingual Image 3
US East (Ashburn) (cross-region to US Midwest (Chicago)) | us-ashburn-1 | Cohere Embed 4
Japan Central (Osaka) | ap-osaka-1 | Cohere Embed 4
Saudi Arabia Central (Riyadh) | me-riyadh-1 | Cohere Embed 4
US Midwest (Chicago) | us-chicago-1 | Cohere Embed 4
US West (Phoenix) (cross-region to US Midwest (Chicago)) | us-phoenix-1 | Cohere Embed 4
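The region-to-model mapping in the preceding table can be captured in a small lookup helper when configuring projects programmatically. This is an illustrative sketch only: `EMBED_MODEL_BY_REGION` and `embed_model_for` are hypothetical names, not part of any OCI SDK, and the mapping mirrors the table above.

```python
# Region identifier -> embedding model available there, per the table above.
# Illustrative helper only; not part of the OCI SDK.
EMBED_MODEL_BY_REGION = {
    "sa-saopaulo-1": "Cohere Embed Multilingual 3",
    "eu-frankfurt-1": "Cohere Embed Multilingual 3",
    "uk-london-1": "Cohere Embed Multilingual 3",
    "ap-hyderabad-1": "Cohere Embed Multilingual Image 3",
    "us-ashburn-1": "Cohere Embed 4",   # served cross-region from us-chicago-1
    "ap-osaka-1": "Cohere Embed 4",
    "me-riyadh-1": "Cohere Embed 4",
    "us-chicago-1": "Cohere Embed 4",
    "us-phoenix-1": "Cohere Embed 4",   # served cross-region from us-chicago-1
}

def embed_model_for(region: str) -> str:
    """Return the embed model for a region, or raise for unsupported regions."""
    try:
        return EMBED_MODEL_BY_REGION[region]
    except KeyError:
        raise ValueError(f"No embedding model available in region {region!r}")
```

A lookup like this fails fast for regions (such as UAE East) where no embedding model is available, rather than silently defaulting to an unsupported model.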

External Calls

External Calls to xAI Grok Models

Important

The xAI Grok models are hosted in an OCI data center, in a tenancy provisioned for xAI. The xAI Grok models, which can be accessed through the OCI Generative AI service, are managed by xAI.

External Calls to Google Models

Important

External Calls to Google Gemini 2.5 Pro for US Regions

The Google Gemini 2.5 Pro model that can be accessed through the OCI Generative AI service in US regions is hosted externally by Google. Therefore, a call to a Google Gemini 2.5 Pro model (through the OCI Generative AI service) results in a call to a Google location. For Google Gemini 2.5 Pro, a Google Americas regional location is used, which routes the request only to a Google Americas location. Machine learning processing takes place within a Google Americas location.

Important

External Calls to Gemini 2.5 Flash for US Regions

The Gemini 2.5 Flash model that can be accessed through the OCI Generative AI service in US regions is hosted externally by Google. Therefore, a call to a Gemini 2.5 Flash model (through the OCI Generative AI service) results in a call to a Google location. For Gemini 2.5 Flash, a Google Americas regional location is used, which routes the request only to a Google Americas location. Machine learning processing takes place within a Google Americas location.

Important

External Calls to Gemini 2.5 Flash-Lite for US Regions

The Gemini 2.5 Flash-Lite model that can be accessed through the OCI Generative AI service in US regions is hosted externally by Google. Therefore, a call to a Gemini 2.5 Flash-Lite model (through the OCI Generative AI service) results in a call to a Google location. For Gemini 2.5 Flash-Lite, a Google Americas regional location is used, which routes the request only to a Google Americas location. Machine learning processing takes place within a Google Americas location.