maxframe.learn.contrib.llm.multi_modal.embed
- maxframe.learn.contrib.llm.multi_modal.embed(data, model: MultiModalEmbeddingModel, input, simple_output: bool = False, params: Dict[str, Any] | None = None, **kw)
Embed multimodal input with a multimodal embedding model.
- Parameters:
data (DataFrame or Series) – Input data used to render one embedding request per row.
model (MultiModalEmbeddingModel) – Multimodal embedding model instance.
input (list or ContentPart) – Multimodal input template. Values may contain placeholders that reference columns in data. The template is rendered row by row and sent as a single multimodal embedding input for that row.
simple_output (bool, default False) – Whether to return embedding vectors directly when supported by the model executor, instead of the raw provider response.
params (dict, optional) – Additional embedding parameters.
- Returns:
A DataFrame with response and success columns. Failed requests store the error message in response.
- Return type:
DataFrame
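Because failed requests put an error message in the same response column that otherwise holds the embedding, callers usually need to branch on success before using a row. The following is a minimal sketch of that pattern in plain Python. It is not MaxFrame code: the list of dicts stands in for the rows of the returned DataFrame, the helper name split_results is hypothetical, and the values are made up for illustration. It also assumes simple_output=True, so that a successful response is the embedding vector itself.

```python
from typing import Any, Dict, List, Tuple


def split_results(
    records: List[Dict[str, Any]],
) -> Tuple[List[Any], List[str]]:
    """Separate successful responses from error messages.

    Hypothetical helper: each record mimics one row of the result,
    with a `response` value and a boolean `success` flag.
    """
    embeddings = [r["response"] for r in records if r["success"]]
    errors = [r["response"] for r in records if not r["success"]]
    return embeddings, errors


# Illustrative rows only; real values depend on the model and provider.
records = [
    {"response": [0.12, -0.07, 0.33], "success": True},
    {"response": "timeout while contacting provider", "success": False},
]

embeddings, errors = split_results(records)
```

With simple_output=False, the successful entries would hold the raw provider response instead of a bare vector, so any downstream parsing would differ.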
Examples
>>> from maxframe.learn.contrib.llm import ContentPart, ImageContentType
>>> input = [
...     ContentPart.text("Represent this product image."),
...     ContentPart.image(
...         data=df.image_url,
...         type=ImageContentType.IMAGE_URL,
...     ),
... ]
>>> result = model.embed(df, input=input, simple_output=True)
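The example above passes a column reference (df.image_url) inside the template, which is then rendered once per row. As a rough mental model of that row-by-row rendering, here is a self-contained sketch in plain Python. It is an illustration under assumptions, not MaxFrame's implementation: ColumnRef, render_template, and the dict-shaped content parts are all hypothetical stand-ins, and the URLs are placeholders.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class ColumnRef:
    """Hypothetical placeholder naming a column in the input data."""

    name: str


def render_template(
    template: List[Dict[str, Any]], row: Dict[str, Any]
) -> List[Dict[str, Any]]:
    """Return a copy of the template with each ColumnRef replaced by the row's value."""
    rendered = []
    for part in template:
        rendered.append(
            {
                key: row[value.name] if isinstance(value, ColumnRef) else value
                for key, value in part.items()
            }
        )
    return rendered


# Two input rows, standing in for the rows of `data`.
rows = [
    {"image_url": "https://example.com/a.png"},
    {"image_url": "https://example.com/b.png"},
]

# One shared template: a fixed text part plus an image part bound to a column.
template = [
    {"type": "text", "text": "Represent this product image."},
    {"type": "image_url", "data": ColumnRef("image_url")},
]

# One rendered multimodal input per row; the template itself is left unchanged.
requests = [render_template(template, row) for row in rows]
```

The fixed text part is repeated verbatim in every request, while the image part picks up each row's own value, which is why a single template can describe one embedding request per row.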