maxframe.dataframe.Series.mf.flatmap#
- Series.mf.flatmap(func: Callable, dtypes=None, dtype=None, name=None, args=(), **kwargs)#
Apply the given function to each row and then flatten results. Use this method if your transformation returns multiple rows for each input row.
- This function applies a transformation to each element of the Series, where the transformation can return zero
or multiple values, effectively flattening Python generator, list-liked collections and DataFrame.
- Parameters:
func (Callable) – Function to apply to each element of the Series. It should accept a scalar value (or an array if
raw=True
) and return a list or iterable of values.dtypes (Series, default None) – Specify dtypes of returned DataFrame. Can’t work with dtype.
dtype (numpy.dtype, default None) – Specify dtype of returned Series. Can’t work with dtypes.
name (str, default None) – Specify name of the returned Series.
args (tuple) – Positional arguments to pass to
func
.**kwargs – Additional keyword arguments to pass as keywords arguments to
func
.
- Returns:
Result of DataFrame when dtypes specified, else Series.
- Return type:
Notes
The
func
must return an iterable of values for each input element. Ifdtypes
is specified, flatmap will return a DataFrame, ifdtype
andname
is specified, a Series will be returned.The index of the resulting DataFrame/Series will be repeated based on the number of output rows generated by
func
.Examples
>>> import numpy as np >>> import maxframe.dataframe as md >>> df = md.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]}) >>> df.execute() A B 0 1 4 1 2 5 2 3 6
Define a function that takes a number and returns a list of two numbers:
>>> def generate_values_array(x): ... return [x * 2, x * 3]
Specify
dtype
with a function which returns list to return more elements as a Series:>>> df['A'].mf.flatmap(generate_values_array, dtype="int", name="C").execute() 0 2 0 3 1 4 1 6 2 6 2 9 Name: C, dtype: int64
Specify
dtypes
to return multi columns as a DataFrame:>>> def generate_values_in_generator(x): ... yield pd.Series([x * 2, x * 4]) ... yield pd.Series([x * 3, x * 5])
>>> df['A'].mf.flatmap(generate_values_in_generator, dtypes={"A": "int", "B": "int"}).execute() A B 0 2 4 0 3 5 1 4 8 1 6 10 2 6 12 2 9 15