Model Group
Multimodel Serving (MMS) lets you deploy and manage several machine learning models as a group through a construct called a Model Group.
A Model Group is a resource that represents a collection of models in the Model Store. You can deploy and manage up to 500 models (limited by shape) in a single model deployment, which reduces the overhead of managing many individual model deployments. This marks a shift from traditional single-model deployments toward more dynamic model management and cost-aware inferencing.
The key capabilities of the Model Group resource are as follows:
- Model Group Lifecycle Management - Model Groups support immutability and versioning, providing robust lifecycle tracking, reproducibility and safe iteration of deployments.
- Inference Keys - You can use SaaS-friendly names instead of model OCIDs for inferencing calls. An inference key is an alias for a model OCID.
- Custom Metadata - A list of key-value pairs passed to the inference container, plus model-specific variables in a Model Group. This feature can be used, for example, to pin several LLMs to GPU cards.
Before you use Model Groups, apply the Model Group Policies.
Key Concepts
Model Group
- A Model Group is a logical resource that holds several models.
- When deployed, each model in a Model Group is identified by its model OCID and, optionally, an inference key.
- The following types are supported:
- Homogeneous
- A group of models of the same type deployed together in a shared runtime environment. These models operate independently but use the same compute and memory resources for efficient infrastructure usage.
- Stacked
- An extension of the Homogeneous Group, designed for large language models (LLMs) with a base model and several fine-tuned weights.
- Heterogeneous
- The Model Group consists of models built on different ML frameworks, such as PyTorch, TensorFlow, or ONNX. This group type allows diverse model architectures to be deployed in a single serving environment.
Inference Keys
- Inference Keys let you use SaaS-friendly aliases instead of model OCIDs when making inference calls. An inference key acts as an alias mapped to a specific model OCID and is defined during the creation of a Model Group.
Note
Inference keys support a maximum length of 32 characters.
The following snippets show how to define an inference key using both the REST API and the SDK.
REST API:
    "memberModelEntries": {
      "memberModelDetails": [
        {
          "inferenceKey": "key1",
          "modelId": "ocid1.datasciencemodel.oc1.iad.aaaaaaaa4kqzxsqdmlf3x2hedpyghfpy727odfuwr3pwwhocw32wbtjuj5zq"
        },
        {
          "inferenceKey": "key2",
          "modelId": "ocid1.datasciencemodel.oc1.iad.aaaaaaaa5oyorntk2xa2swphlzqgjwmevnrentlcay7ixy5bahkuwb34xlpq"
        },
        {
          "inferenceKey": "key3",
          "modelId": "ocid1.datasciencemodel.oc1.iad.aaaaaaaatutjajr32s5uggnv3zud3ve4rya57innybhpkuam3egzmvow4zvq"
        }
      ]
    }
SDK:
    member_model_details_list = [
        MemberModelDetails(
            model_id="ocid1.datasciencemodel.oc1.iad.amaaaaaam3xyxziav7hda2c2xn57bifhvfjnb63teaxsyal4hie2uykkwrtq",
            inference_key="key-1"),
        MemberModelDetails(
            model_id="ocid1.datasciencemodel.oc1.iad.amaaaaaam3xyxzia4qrtrviyzlhkvaimsl6aub7nldtnzts72voejpdvmu2q",
            inference_key="key-2"),
        MemberModelDetails(
            model_id="ocid1.datasciencemodel.oc1.iad.amaaaaaam3xyxziaxicmn7domsjwl5ojmks3dki32ffy26prhey6tmxiwkeq",
            inference_key="key-3"),
    ]
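As a rough illustration of how an inference key behaves as an alias, the sketch below checks the 32-character limit and resolves either an inference key or a model OCID to the model OCID. The helper names (build_alias_map, resolve_model_id), the duplicate-key check, and the shortened example OCIDs are assumptions made for illustration; none of them are part of the OCI SDK, and the service may validate keys differently.

```python
MAX_INFERENCE_KEY_LENGTH = 32  # limit stated in the note on inference keys


def build_alias_map(members):
    """Map each inference key to its model OCID.

    `members` is a list of dicts shaped like the REST entries:
    {"modelId": "...", "inferenceKey": "..."}; the inference key is optional.
    """
    alias_map = {}
    for member in members:
        key = member.get("inferenceKey")
        if key is None:
            continue  # this model is addressable only by its OCID
        if len(key) > MAX_INFERENCE_KEY_LENGTH:
            raise ValueError(
                f"inference key {key!r} exceeds {MAX_INFERENCE_KEY_LENGTH} characters"
            )
        if key in alias_map:
            # Assumption: keys must be unique within a group to stay unambiguous.
            raise ValueError(f"duplicate inference key {key!r}")
        alias_map[key] = member["modelId"]
    return alias_map


def resolve_model_id(identifier, members):
    """Return the model OCID for an inference key or a member model OCID."""
    alias_map = build_alias_map(members)
    if identifier in alias_map:
        return alias_map[identifier]
    if any(member["modelId"] == identifier for member in members):
        return identifier
    raise KeyError(f"{identifier!r} is not a member of this model group")


# Hypothetical members; the OCIDs are shortened placeholders.
members = [
    {"inferenceKey": "key1", "modelId": "ocid1.datasciencemodel.oc1.iad.exampleA"},
    {"modelId": "ocid1.datasciencemodel.oc1.iad.exampleB"},
]
```

Calling resolve_model_id("key1", members) returns the OCID of the first member, while the second member, which has no inference key, can be resolved only by passing its OCID.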
Creating a Model Group
- Create a Model Group
- Use this option to create a new Model Group from scratch.
- Clone and Change an Existing Model Group
- Use this option if you already have an existing Model Group and intend to change it by adding or removing models.
- Clone from Model Group Version History
- Use this option to create an updated Model Group from the latest version in a Model Group Version History.
Create a Model Group
Use this option to create a new Model Group from scratch. Before you begin, apply the Model Group Policies.
If you already have an existing Model Group and intend to change it by adding or removing models, skip this step and go to Clone or Patch a Model Group.
Clone or Patch a Model Group
Use this option if you already have an existing Model Group and want to change its composition by adding or removing one or more models. If you don't need to change an existing group, move to the next step.
Use clone to:
- Create a new model group from an existing one.
- Change models (add or remove) while cloning.
Patch operations:
- INSERT: Add new models.
- REMOVE: Remove existing models.
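The effect of the two patch operations on a group's membership can be sketched in plain Python. This is only a model of the semantics described above, not the service implementation: the apply_patch helper, the short placeholder model IDs, and the choice to raise an error on a duplicate insert or a missing removal are all assumptions.

```python
def apply_patch(members, instructions):
    """Return a new membership mapping after applying patch instructions.

    `members` maps model ID -> inference key (or None).
    `instructions` is a list of (operation, {model_id: inference_key}) pairs.
    The input mapping is copied, so the source group is left unchanged,
    mirroring the idea of cloning into a new group.
    """
    result = dict(members)
    for operation, values in instructions:
        if operation == "INSERT":
            for model_id, inference_key in values.items():
                if model_id in result:
                    # Assumption: inserting an existing member is an error.
                    raise ValueError(f"{model_id} is already in the group")
                result[model_id] = inference_key
        elif operation == "REMOVE":
            for model_id in values:
                if model_id not in result:
                    # Assumption: removing a non-member is an error.
                    raise ValueError(f"{model_id} is not in the group")
                del result[model_id]
        else:
            raise ValueError(f"unsupported patch operation: {operation}")
    return result


# Hypothetical source group with placeholder model IDs.
source = {"model-a": "key-1", "model-b": "key-2", "model-c": "key-3"}
patched = apply_patch(
    source,
    [
        ("INSERT", {"model-d": "key-11"}),
        ("REMOVE", {"model-c": "key-3"}),
    ],
)
```

Here the patched membership contains model-a, model-b, and model-d, and the source mapping still holds its original three members.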
Clone a Model Group from a Model Group Version History
SDK
Clone the Model Group from the latest version of the Model Group version history:

    import json

    # Assumption: the model classes below are exported by oci.data_science.models.
    from oci.data_science.models import (
        CloneCreateFromModelGroupVersionHistoryDetails,
        CloneModelGroupDetails,
        MemberModelDetails,
        ModifyModelGroupDetails,
        PatchInsertNewMemberModels,
        PatchModelGroupMemberModelDetails,
        PatchRemoveMemberModels,
    )

    # data_science_client and logger are assumed to be initialized elsewhere.


    def __clone_from_model_group_version_history(compartment_id, project_id, model_group_version_history_id):
        print("cloning from the Model Group Version History")

        # Models to add to the cloned group
        new_member_model_details_list = [
            MemberModelDetails(
                model_id="ocid1.datasciencemodel.oc1.<ocid>",
                inference_key="key-11"),
            MemberModelDetails(
                model_id="ocid1.datasciencemodel.oc1.<ocid>",
                inference_key="key-12"),
        ]

        # Models to drop from the cloned group
        remove_member_model_details_list = [
            MemberModelDetails(
                model_id="ocid1.datasciencemodel.oc1.<ocid>",
                inference_key="key-3"),
        ]

        # Build the INSERT and REMOVE patch instructions
        patch_insert_model_details = PatchInsertNewMemberModels()
        patch_insert_model_details.values = new_member_model_details_list

        patch_remove_model_details = PatchRemoveMemberModels()
        patch_remove_model_details.values = remove_member_model_details_list

        patch_instruction_list = [patch_insert_model_details, patch_remove_model_details]

        patch_model_group_member_model_details_object = PatchModelGroupMemberModelDetails()
        patch_model_group_member_model_details_object.items = patch_instruction_list

        # Override the display name and description of the clone
        modify_model_group_details_object = ModifyModelGroupDetails()
        modify_model_group_details_object.display_name = "test model group clone from mgvh"
        modify_model_group_details_object.description = "test model group clone from mgvh"

        # Use the Model Group Version History as the clone source
        clone_create_from_model_group_version_history_object = CloneCreateFromModelGroupVersionHistoryDetails()
        clone_create_from_model_group_version_history_object.source_id = model_group_version_history_id
        clone_create_from_model_group_version_history_object.patch_model_group_member_model_details = patch_model_group_member_model_details_object
        clone_create_from_model_group_version_history_object.modify_model_group_details = modify_model_group_details_object

        clone_model_group_details_object = CloneModelGroupDetails()
        clone_model_group_details_object.compartment_id = compartment_id
        clone_model_group_details_object.project_id = project_id
        clone_model_group_details_object.model_group_clone_source_details = clone_create_from_model_group_version_history_object

        try:
            model_group_response = data_science_client.create_model_group(clone_model_group_details_object)
            model_group_id = json.loads(str(model_group_response.data))['id']
            logger.info(model_group_id)
            print(model_group_response.headers)
            return model_group_id
        except Exception as e:
            logger.error("Failed to create model group with error: %s", e)
API
Deploy the Model Group
- Select Models.
- Select Model Groups.
- Select the Model Group to deploy.
- Select Submit.
- Create the model group deployment using the SDK:

    # Assumes the model classes used below are imported from oci.data_science.models
    # and that data_science_client is an initialized client.

    # 1. Create model group configuration details object
    model_group_config_details = ModelGroupConfigurationDetails(
        model_group_id="ocid1.modelgroup.oc1..exampleuniqueID",
        bandwidth_mbps=<bandwidth-mbps>,
        instance_configuration=<instance-configuration>,
        scaling_policy=<scaling-policy>
    )

    # 2. Create infrastructure configuration details object
    infrastructure_config_details = InstancePoolInfrastructureConfigurationDetails(
        infrastructure_type="INSTANCE_POOL",
        instance_configuration=<instance-configuration>,
        scaling_policy=<scaling-policy>
    )

    # 3. Create environment configuration
    environment_config_details = ModelDeploymentEnvironmentConfigurationDetails(
        environment_configuration_type="DEFAULT",
        environment_variables={"WEB_CONCURRENCY": "1"}
    )

    # 4. Create category log details
    category_log_details = CategoryLogDetails(
        access=LogDetails(
            log_group_id=<log-group-id>,
            log_id=<log-id>
        ),
        predict=LogDetails(
            log_group_id=<log-group-id>,
            log_id=<log-id>
        )
    )

    # 5. Bundle into deployment configuration
    model_group_deployment_config_details = ModelGroupDeploymentConfigurationDetails(
        deployment_type="MODEL_GROUP",
        model_group_configuration_details=model_group_config_details,
        infrastructure_configuration_details=infrastructure_config_details,
        environment_configuration_details=environment_config_details
    )

    # 6. Set up parameters required to create a new model deployment.
    create_model_deployment_details = CreateModelDeploymentDetails(
        display_name=<deployment_name>,
        description=<description>,
        compartment_id=<compartment-id>,
        project_id=<project-id>,
        model_deployment_configuration_details=model_group_deployment_config_details,
        category_log_details=category_log_details
    )

    # 7. Create deployment using SDK client
    response = data_science_client.create_model_deployment(
        create_model_deployment_details=create_model_deployment_details
    )
    print("Model Deployment OCID:", response.data.id)
- Create the model group deployment using the REST API:

    {
      "displayName": "MMS Model Group Deployment",
      "description": "mms",
      "compartmentId": "<compartment-id>",
      "projectId": "<project-id>",
      "modelDeploymentConfigurationDetails": {
        "deploymentType": "MODEL_GROUP",
        "modelGroupConfigurationDetails": {
          "modelGroupId": "<model-group-id>"
        },
        "infrastructureConfigurationDetails": {
          "infrastructureType": "INSTANCE_POOL",
          "instanceConfiguration": {
            "instanceShapeName": "VM.Standard.E4.Flex",
            "modelDeploymentInstanceShapeConfigDetails": {
              "ocpus": 8,
              "memoryInGBs": 128
            }
          },
          "scalingPolicy": {
            "policyType": "FIXED_SIZE",
            "instanceCount": 1
          }
        },
        "environmentConfigurationDetails": {
          "environmentConfigurationType": "DEFAULT",
          "environmentVariables": {
            "WEB_CONCURRENCY": "1"
          }
        }
      },
      "categoryLogDetails": {
        "access": {
          "logGroupId": "ocid1.loggroup.oc1.iad.amaaaaaav66vvniaygnbicsbzb4anlmf7zg2gsisly3ychusjlwuq34pvjba",
          "logId": "ocid1.log.oc1.iad.amaaaaaav66vvniavsuh34ijk46uhjgsn3ddzienfgquwrr7dwa4dzt4pirq"
        },
        "predict": {
          "logGroupId": "ocid1.loggroup.oc1.iad.amaaaaaav66vvniaygnbicsbzb4anlmf7zg2gsisly3ychusjlwuq34pvjba",
          "logId": "ocid1.log.oc1.iad.amaaaaaav66vvniavsuh34ijk46uhjgsn3ddzienfgquwrr7dwa4dzt4pirq"
        }
      }
    }
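When the same request body is needed for several Model Groups, it may be convenient to assemble it from parameters. The sketch below builds a dictionary with the field names used in the request body above and serializes it with the standard json module. The function name, its parameters, and the placeholder identifiers are hypothetical, and the optional description and categoryLogDetails sections are left out for brevity.

```python
import json


def build_model_group_deployment_payload(
    display_name,
    compartment_id,
    project_id,
    model_group_id,
    shape_name="VM.Standard.E4.Flex",
    ocpus=8,
    memory_in_gbs=128,
    instance_count=1,
):
    """Assemble a model group deployment request body as a dict."""
    return {
        "displayName": display_name,
        "compartmentId": compartment_id,
        "projectId": project_id,
        "modelDeploymentConfigurationDetails": {
            "deploymentType": "MODEL_GROUP",
            "modelGroupConfigurationDetails": {"modelGroupId": model_group_id},
            "infrastructureConfigurationDetails": {
                "infrastructureType": "INSTANCE_POOL",
                "instanceConfiguration": {
                    "instanceShapeName": shape_name,
                    "modelDeploymentInstanceShapeConfigDetails": {
                        "ocpus": ocpus,
                        "memoryInGBs": memory_in_gbs,
                    },
                },
                "scalingPolicy": {
                    "policyType": "FIXED_SIZE",
                    "instanceCount": instance_count,
                },
            },
            "environmentConfigurationDetails": {
                "environmentConfigurationType": "DEFAULT",
                "environmentVariables": {"WEB_CONCURRENCY": "1"},
            },
        },
    }


# Placeholder identifiers; substitute real OCIDs before sending the request.
payload = build_model_group_deployment_payload(
    display_name="MMS Model Group Deployment",
    compartment_id="<compartment-id>",
    project_id="<project-id>",
    model_group_id="<model-group-id>",
)
body = json.dumps(payload, indent=2)
```

The resulting body string can then be sent with whichever signed HTTP client you already use for OCI REST calls.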