maxframe.learn.contrib.llm.deploy.config.ModelDeploymentConfig
- class maxframe.learn.contrib.llm.deploy.config.ModelDeploymentConfig(*args, **kwargs)
Model deployment configuration for extending MaxFrame with custom models.
This configuration is designed for users who need to deploy models that are not available within MaxFrame’s built-in model offerings. It provides a way to specify custom deployment solutions by informing each MaxFrame worker which framework to use, which model path to load, and how to load it.
The configuration assumes that models are already set up in the container image or in mounted paths, and uses the current deploy_config to load them. Users are responsible for ensuring that the runtime environment is in the expected state and is compatible with the model.
- Parameters:
model_name (str) – The name of the model.
model_file (str) – The local file path of the model, e.g., "/mnt/models/qwen/". When using OSS models, this should match one of the mount_path values in fs_mounts. Note: OSS paths (oss://...) are NOT supported directly. Use fs_mounts to mount OSS paths to local paths first.
inference_framework_type (InferenceFrameworkEnum) – The inference framework of the model.
required_resource_files (List[Union[str, Any]]) – The required resource files of the model.
load_params (Dict[str, Any]) – The load params of the model.
required_cpu (int) – The CPU resources required by the model.
required_memory (int) – The memory required by the model.
required_gu (int) – The GU resources required by the model.
required_gpu_memory (int) – The GPU memory required by the model.
device (str, optional) – The device of the model. One of “cpu” or “cuda”. Defaults to None, which allows the server to determine the device at runtime.
properties (Dict[str, Any]) – The properties of the model.
tags (List[str]) – The tags of the model.
fs_mounts (List[FsMountOptions]) – File system mount configurations for mounting OSS models to local paths. Each FsMountOptions contains:
- path: OSS source path, e.g., "oss://bucket/models/qwen/"
- mount_path: Local mount path, e.g., "/mnt/qwen"
- storage_options: Authentication config (role_arn)
envs (Dict[str, str]) – Custom environment variables for the inference subprocess. Example:
{"CUDA_VISIBLE_DEVICES": "0", "HF_HOME": "/mnt/cache"}
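The relationship between model_file and fs_mounts described above can be checked before a configuration is built. The following is a minimal sketch, not part of MaxFrame: the helper name model_file_is_mounted is hypothetical, mounts are represented as plain dicts with the path and mount_path keys rather than real FsMountOptions objects, and treating a model_file located underneath a mount_path as a match is an assumption (the documentation only says the values should match).

```python
from posixpath import normpath
from typing import Dict, List


def model_file_is_mounted(model_file: str, fs_mounts: List[Dict[str, str]]) -> bool:
    """Hypothetical helper: True if model_file equals or lies under a mount_path."""
    # OSS URIs are not accepted as model_file; they must be mounted first.
    if model_file.startswith("oss://"):
        return False
    target = normpath(model_file)
    for mount in fs_mounts:
        root = normpath(mount["mount_path"])
        # Assumption: a path below the mount point also counts as a match.
        if target == root or target.startswith(root.rstrip("/") + "/"):
            return True
    return False


# Plain-dict stand-ins for FsMountOptions entries (illustrative values from the docs).
mounts = [{"path": "oss://bucket/models/qwen/", "mount_path": "/mnt/qwen"}]
```

With these mounts, "/mnt/qwen/" is accepted, while "oss://bucket/models/qwen/" and the unrelated sibling "/mnt/qwen2" are rejected.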
Notes
This is a preview version for model deployments; all fields may change in the future.
User Responsibility Notice: Users must fully understand what they are computing and the implications of their configuration choices. You are responsible for:
Ensuring model compatibility with the specified inference framework
Verifying that model files exist and are accessible in the runtime environment
Confirming that resource requirements (CPU, memory, GPU) are adequate
Validating that all dependencies and libraries are properly installed
Understanding the computational behavior and characteristics of your chosen model
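One of the responsibilities listed above, verifying that model files exist and are accessible, can be partly automated with a pre-flight check run inside the runtime environment. This is only a sketch using the Python standard library: the function name find_inaccessible_paths is hypothetical and is not a MaxFrame API, and it checks existence and read permission for the current process only.

```python
import os
from typing import Iterable, List


def find_inaccessible_paths(paths: Iterable[str]) -> List[str]:
    """Hypothetical helper: return paths that are missing or not readable."""
    return [
        p for p in paths
        if not (os.path.exists(p) and os.access(p, os.R_OK))
    ]
```

An empty result means every listed path was found and readable at the time of the check; it does not confirm that the files are compatible with the chosen inference framework.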
- __init__(*args, **kwargs)
Methods
__init__(*args, **kwargs)
check_validity() – Validate the configuration and raise ValueError if invalid.
copy()
copy_to(target)
get_default_enable_thinking()
is_reasoning_model()
Attributes
model_name
required_cpu
required_gpu_memory
inference_parameters
envs
load_params
required_memory
fs_mounts
image
device
required_gu
tags
inference_framework_type
required_resource_files
properties
model_file