maxframe.learn.contrib.llm.deploy.config.ModelDeploymentConfig

class maxframe.learn.contrib.llm.deploy.config.ModelDeploymentConfig(*args, **kwargs)

Model deployment configuration for extending MaxFrame with custom models.

This configuration is designed for users who need to deploy models that are not available within MaxFrame’s built-in model offerings. It provides a way to specify custom deployment solutions by informing each MaxFrame worker which framework to use, which model path to load, and how to load it.

The configuration assumes that the models are already present in the container image or on mounted paths, and uses the current deploy_config to load them. Users are responsible for the state of the runtime environment and for its compatibility with the model.

Parameters:
  • model_name (str) – The name of the model.

  • model_file (str) –

    The local file path of the model, e.g., "/mnt/models/qwen/".

    Note: OSS paths (oss://...) are NOT supported directly.

  • inference_framework_type (InferenceFrameworkEnum) – The inference framework of the model.

  • required_resource_files (List[Union[str, Any]]) – The resource files required by the model.

  • load_params (Dict[str, Any]) – The parameters used to load the model.

  • required_cpu (int) – The CPU required by the model.

  • required_memory (int) – The memory required by the model.

  • required_gu (int) – The GU required by the model.

  • required_gpu_memory (int) – The GPU memory required by the model.

  • device (str, optional) – The device of the model. One of "cpu" or "cuda". Defaults to None, which allows the server to determine the device at runtime.

  • properties (Dict[str, Any]) – The properties of the model.

  • tags (List[str]) – The tags of the model.

  • envs (Dict[str, str]) – Custom environment variables for the inference subprocess. Example: {"CUDA_VISIBLE_DEVICES": "0", "HF_HOME": "/mnt/cache"}
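As a rough illustration of how the parameters above fit together, the following sketch collects them into a plain dictionary and applies two checks taken from the descriptions: `model_file` must be a local path rather than an `oss://` URI, and `envs` must map strings to strings. The helper `build_deploy_kwargs` is hypothetical and not part of MaxFrame, the model name and resource numbers are placeholders, and `inference_framework_type` is left out because the members of `InferenceFrameworkEnum` are not listed on this page.

```python
from typing import Any, Dict, Optional


def build_deploy_kwargs(
    model_name: str,
    model_file: str,
    envs: Optional[Dict[str, str]] = None,
    **extra: Any,
) -> Dict[str, Any]:
    """Hypothetical helper: gather keyword arguments for a deployment config."""
    # The parameter docs state that OSS paths are not supported directly.
    if model_file.lower().startswith("oss://"):
        raise ValueError(f"model_file must be a local path, got {model_file!r}")
    envs = dict(envs or {})
    # envs is documented as Dict[str, str], so reject non-string entries early.
    for key, value in envs.items():
        if not isinstance(key, str) or not isinstance(value, str):
            raise TypeError(f"envs entry {key!r} must map str to str")
    kwargs: Dict[str, Any] = {
        "model_name": model_name,
        "model_file": model_file,
        "envs": envs,
    }
    kwargs.update(extra)
    return kwargs


kwargs = build_deploy_kwargs(
    model_name="qwen-custom",  # placeholder name
    model_file="/mnt/models/qwen/",
    envs={"CUDA_VISIBLE_DEVICES": "0", "HF_HOME": "/mnt/cache"},
    required_cpu=8,  # placeholder values; units are not stated in this reference
    required_memory=32,
    device="cuda",
)
# Assumption: the class accepts the documented parameter names as keyword
# arguments, e.g. ModelDeploymentConfig(inference_framework_type=..., **kwargs)
```

Keeping the arguments in a dictionary first makes it easy to log or review exactly what each worker will be told before the configuration object is created.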

Notes

  • This is a preview version of model deployment; all fields may change in the future.

User Responsibility Notice: You must fully understand what you are computing and the implications of your configuration choices. You are responsible for:

  • Ensuring model compatibility with the specified inference framework

  • Verifying that model files exist and are accessible in the runtime environment

  • Confirming that resource requirements (CPU, memory, GPU) are adequate

  • Validating that all dependencies and libraries are properly installed

  • Understanding the computational behavior and characteristics of your chosen model
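Some of the responsibilities listed above can be checked mechanically before deployment. The sketch below is one possible pre-flight check, not a MaxFrame feature: `preflight_problems` is a hypothetical helper that reports whether the model path exists and is readable, and whether a list of top-level Python modules can be found. The module names passed in are assumptions chosen by the caller.

```python
import importlib.util
import os
from typing import Iterable, List


def preflight_problems(
    model_file: str, required_modules: Iterable[str] = ()
) -> List[str]:
    """Hypothetical check: list problems found in the current environment."""
    problems = []
    if not os.path.exists(model_file):
        problems.append(f"model path not found: {model_file}")
    elif not os.access(model_file, os.R_OK):
        problems.append(f"model path not readable: {model_file}")
    for name in required_modules:
        # find_spec returns None when a top-level module cannot be located.
        if importlib.util.find_spec(name) is None:
            problems.append(f"missing Python module: {name}")
    return problems


# Example call with a path and a module name that should not exist.
problems = preflight_problems(
    "/no/such/dir/for/this/sketch",
    ["json", "module_that_does_not_exist_xyz"],
)
for problem in problems:
    print(problem)
```

The result is only meaningful when the check runs in the same container image and with the same mounts as the MaxFrame workers; running it on a local machine says nothing about the runtime environment.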

__init__(*args, **kwargs)

Methods

__init__(*args, **kwargs)

check_validity()

Validate the configuration and raise ValueError if invalid.

copy()

copy_to(target)

is_reasoning_model()
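Because `check_validity()` is documented as raising ValueError for an invalid configuration, callers may want to surface that error before submitting work. The sketch below shows one way to do so. `validation_error` is a hypothetical wrapper, and `_FakeConfig` is a stand-in used only so the example runs without MaxFrame; the rule it enforces is invented for illustration and does not describe what the real method checks.

```python
from typing import Any, Optional


def validation_error(config: Any) -> Optional[str]:
    """Return the ValueError message from config.check_validity(), or None."""
    try:
        config.check_validity()
    except ValueError as exc:
        return str(exc)
    return None


class _FakeConfig:
    """Stand-in for illustration only; not the MaxFrame class."""

    def __init__(self, model_file: str):
        self.model_file = model_file

    def check_validity(self) -> None:
        # Invented rule: the real validation logic is not described here.
        if not self.model_file:
            raise ValueError("model_file is required")


bad = validation_error(_FakeConfig(""))
good = validation_error(_FakeConfig("/mnt/models/qwen/"))
```

With a real configuration object, the same wrapper could be called right after construction so that configuration mistakes are reported before any worker starts loading the model.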

Attributes

required_resource_files

image

tags

inference_parameters

required_gu

inference_framework_type

model_file

required_cpu

required_memory

properties

load_params

device

envs

required_gpu_memory

model_name