DataFrame#

Constructor#

DataFrame([data, index, columns, dtype, ...])

Attributes and underlying data#

Axes

DataFrame.dtypes

Return the dtypes in the DataFrame.

DataFrame.select_dtypes([include, exclude])

Return a subset of the DataFrame's columns based on the column dtypes.

DataFrame.ndim

Return an int representing the number of axes / array dimensions.

DataFrame.shape
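
The attributes above carry pandas-style names, so a rough illustration can be given with plain pandas as a stand-in. This is a sketch under the assumption that the MaxFrame attributes behave like their pandas namesakes; the sample data is hypothetical, and the setup needed to run against MaxFrame is not shown.

```python
import pandas as pd

# Hypothetical sample data; pandas is used here as a stand-in for MaxFrame.
df = pd.DataFrame(
    {"id": [1, 2, 3], "score": [0.5, 1.5, 2.5], "name": ["a", "b", "c"]}
)

# dtypes holds one dtype per column.
column_dtypes = df.dtypes

# select_dtypes keeps only the columns whose dtype matches the filter.
numeric = df.select_dtypes(include="number")

print(list(numeric.columns))  # ['id', 'score']
print(df.ndim)                # 2
print(df.shape)               # (3, 3)
```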

Conversion#

DataFrame.astype(dtype[, copy, errors])

Cast a pandas object to a specified dtype.
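
A minimal sketch of a per-column cast, written with plain pandas on the assumption that MaxFrame's `astype` follows the same calling convention. The column names and values are hypothetical.

```python
import pandas as pd

# Hypothetical data: quantities arrive as strings and need to be integers.
df = pd.DataFrame({"qty": ["1", "2", "3"], "price": [1.0, 2.5, 4.0]})

# A dict maps column names to target dtypes; unlisted columns are left as-is.
converted = df.astype({"qty": "int64"})

total = int(converted["qty"].sum())
print(total)  # 6
```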

Indexing, iteration#

DataFrame.head([n])

Return the first n rows.

DataFrame.insert(loc, column, value[, ...])

Insert column into DataFrame at specified location.

DataFrame.pop(item)

Return the item and drop it from the frame.

DataFrame.query(expr[, inplace])

Query the columns of a DataFrame with a boolean expression.
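
The four methods in this section can be combined as in the sketch below. It uses plain pandas as a stand-in and assumes the MaxFrame methods mirror pandas behavior, including `insert` and `pop` modifying the frame in place. The data is hypothetical.

```python
import pandas as pd

# Hypothetical data; pandas stands in for MaxFrame.
df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [10, 20, 30, 40]})

# insert adds a column at a given position (in place in pandas).
df.insert(1, "mid", df["a"] * 2)

# pop removes the column and returns it as a Series.
popped = df.pop("mid")

# query filters rows with a boolean expression over column names.
filtered = df.query("a > 1 and b < 40")

# head returns the first n rows.
first_two = df.head(2)

print(popped.tolist())         # [2, 4, 6, 8]
print(filtered["a"].tolist())  # [2, 3]
print(len(first_two))          # 2
```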

Binary operator functions#

DataFrame.add(other[, axis, level, fill_value])

Return the element-wise addition of the DataFrame and other (binary operator add).

DataFrame.sub(other[, axis, level, fill_value])

Return the element-wise subtraction of other from the DataFrame (binary operator sub).

DataFrame.mul(other[, axis, level, fill_value])

Return the element-wise multiplication of the DataFrame and other (binary operator mul).

DataFrame.div(other[, axis, level, fill_value])

Return the element-wise floating-point division of the DataFrame by other (binary operator truediv).

DataFrame.truediv(other[, axis, level, ...])

Return the element-wise floating-point division of the DataFrame by other (binary operator truediv).

DataFrame.floordiv(other[, axis, level, ...])

Return the element-wise integer (floor) division of the DataFrame by other (binary operator floordiv).

DataFrame.mod(other[, axis, level, fill_value])

Return the element-wise modulo of the DataFrame by other (binary operator mod).

DataFrame.pow(other[, axis, level, fill_value])

Return the DataFrame raised to the power other, element-wise (binary operator pow).

DataFrame.radd(other[, axis, level, fill_value])

Return the element-wise addition other + DataFrame (reverse binary operator radd).

DataFrame.rsub(other[, axis, level, fill_value])

Return the element-wise subtraction other - DataFrame (reverse binary operator rsub).

DataFrame.rmul(other[, axis, level, fill_value])

Return the element-wise multiplication other * DataFrame (reverse binary operator rmul).

DataFrame.rdiv(other[, axis, level, fill_value])

Return the element-wise floating-point division other / DataFrame (reverse binary operator rtruediv).

DataFrame.rtruediv(other[, axis, level, ...])

Return the element-wise floating-point division other / DataFrame (reverse binary operator rtruediv).

DataFrame.rfloordiv(other[, axis, level, ...])

Return the element-wise integer (floor) division other // DataFrame (reverse binary operator rfloordiv).

DataFrame.rmod(other[, axis, level, fill_value])

Return the element-wise modulo other % DataFrame (reverse binary operator rmod).

DataFrame.rpow(other[, axis, level, fill_value])

Return other raised to the power of the DataFrame, element-wise (reverse binary operator rpow).

DataFrame.lt(other[, axis, level, fill_value])

Return the element-wise comparison DataFrame < other (binary operator lt).

DataFrame.gt(other[, axis, level, fill_value])

Return the element-wise comparison DataFrame > other (binary operator gt).

DataFrame.le(other[, axis, level, fill_value])

Return the element-wise comparison DataFrame <= other (binary operator le).

DataFrame.ge(other[, axis, level, fill_value])

Return the element-wise comparison DataFrame >= other (binary operator ge).

DataFrame.ne(other[, axis, level, fill_value])

Return the element-wise comparison DataFrame != other (binary operator ne).

DataFrame.eq(other[, axis, level, fill_value])

Return the element-wise comparison DataFrame == other (binary operator eq).
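
To make the difference between the forward, reverse, and comparison methods concrete, here is a small sketch in plain pandas. It assumes the MaxFrame methods follow pandas semantics, including the handling of `fill_value`; the frames are hypothetical.

```python
import pandas as pd

# Hypothetical frames with one missing value each; pandas stands in for MaxFrame.
left = pd.DataFrame({"x": [1.0, None, 3.0]})
right = pd.DataFrame({"x": [10.0, 20.0, None]})

# fill_value replaces a value missing on one side before the operation.
added = left.add(right, fill_value=0)

# Reverse operators swap the operands: rsub(100) computes 100 - left.
reversed_sub = left.rsub(100)

# Comparison methods return a boolean frame; a missing value compares as False.
below_two = left.lt(2)

print(added["x"].tolist())      # [11.0, 20.0, 3.0]
print(below_two["x"].tolist())  # [True, False, False]
```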

Function application, GroupBy & window#

DataFrame.apply(func[, axis, raw, ...])

Apply a function along an axis of the DataFrame.

DataFrame.agg([func, axis])

DataFrame.aggregate([func, axis])

DataFrame.groupby([by, level, as_index, ...])

DataFrame.transform(func[, axis, dtypes, ...])

Call func on self producing a DataFrame with transformed values.
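
The following sketch contrasts the four kinds of function application listed in this section. It is written in plain pandas and assumes the MaxFrame methods accept the same basic arguments. The `dtypes` parameter shown in the `transform` signature above has no pandas counterpart and is not used here. The data is hypothetical.

```python
import pandas as pd

# Hypothetical data; pandas stands in for MaxFrame.
df = pd.DataFrame(
    {"team": ["a", "a", "b"], "points": [1, 2, 5], "bonus": [10, 20, 30]}
)
numbers = df[["points", "bonus"]]

# apply with axis=1 calls the function once per row.
row_total = numbers.apply(lambda row: row["points"] + row["bonus"], axis=1)

# agg reduces each column to a single value.
col_max = numbers.agg("max")

# groupby splits rows by key before reducing.
per_team = df.groupby("team")["points"].sum()

# transform returns a result with the same shape as its input.
doubled = df[["points"]].transform(lambda col: col * 2)

print(row_total.tolist())          # [11, 22, 35]
print(per_team.to_dict())          # {'a': 3, 'b': 5}
print(doubled["points"].tolist())  # [2, 4, 10]
```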

Computations / descriptive stats#

DataFrame.abs()

DataFrame.all([axis, bool_only, skipna, ...])

DataFrame.any([axis, bool_only, skipna, ...])

DataFrame.count([axis, level, numeric_only])

DataFrame.describe([percentiles, include, ...])

DataFrame.eval(expr[, inplace])

Evaluate a string describing operations on DataFrame columns.

DataFrame.max([axis, skipna, level, ...])

DataFrame.mean([axis, skipna, level, ...])

DataFrame.median([axis, skipna, level, ...])

DataFrame.min([axis, skipna, level, ...])

DataFrame.nunique([axis, dropna])

Count distinct observations over requested axis.

DataFrame.pct_change([periods, fill_method, ...])

Percentage change between the current and a prior element.

DataFrame.prod([axis, skipna, level, ...])

DataFrame.product([axis, skipna, level, ...])

DataFrame.quantile([q, axis, numeric_only, ...])

Return values at the given quantile over requested axis.

DataFrame.round([decimals])

Round a DataFrame to a variable number of decimal places.

DataFrame.sem([axis, skipna, level, ddof, ...])

DataFrame.std([axis, skipna, level, ddof, ...])

DataFrame.sum([axis, skipna, level, ...])

DataFrame.var([axis, skipna, level, ddof, ...])
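
A few of the statistics above are illustrated in the sketch below, using plain pandas as a stand-in and assuming the MaxFrame methods compute the same quantities by default. The data is hypothetical.

```python
import pandas as pd

# Hypothetical data; pandas stands in for MaxFrame.
df = pd.DataFrame({"a": [1.0, 2.0, 3.0, 4.0], "b": [2.0, 2.0, 4.0, 8.0]})

means = df.mean()           # column-wise mean
distinct = df.nunique()     # number of distinct values per column
medians = df.quantile(0.5)  # the 0.5 quantile is the median
growth = df.pct_change()    # relative change from the previous row
ratio = df.eval("b / a")    # expression over column names

print(means["a"])                  # 2.5
print(distinct["b"])               # 3
print(medians["b"])                # 3.0
print(growth["b"].tolist()[1:])    # [0.0, 1.0, 1.0]
```

The first row of `growth` has no prior element, so it is missing; that is why the printout starts from the second row.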

Reindexing / selection / label manipulation#

DataFrame.add_prefix(prefix)

Prefix labels with string prefix.

DataFrame.add_suffix(suffix)

Suffix labels with string suffix.

DataFrame.drop([labels, axis, index, ...])

Drop specified labels from rows or columns.

DataFrame.drop_duplicates([subset, keep, ...])

Return DataFrame with duplicate rows removed.

DataFrame.duplicated([subset, keep, method])

Return boolean Series denoting duplicate rows.

DataFrame.head([n])

Return the first n rows.

DataFrame.rename([mapper, index, columns, ...])

Alter axes labels.

DataFrame.rename_axis([mapper, index, ...])

Set the name of the axis for the index or columns.

DataFrame.reset_index([level, drop, ...])

Reset the index, or a level of it.

DataFrame.sample([n, frac, replace, ...])

Return a random sample of items from an axis of the object.

DataFrame.set_axis(labels[, axis, inplace])

Assign desired index to given axis.

DataFrame.set_index(keys[, drop, append, ...])

Set the DataFrame index using existing columns.

DataFrame.tail([n])

Return the last n rows.
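
Several of the label-manipulation methods above are often chained. The sketch below shows one such chain in plain pandas, on the assumption that the MaxFrame methods behave the same way; column names and values are hypothetical.

```python
import pandas as pd

# Hypothetical data with one duplicated row; pandas stands in for MaxFrame.
df = pd.DataFrame({"key": ["x", "y", "y"], "val": [1, 2, 2]})

# duplicated flags repeats of an earlier row; drop_duplicates removes them.
dup_mask = df.duplicated()
deduped = df.drop_duplicates()

# rename changes selected labels; add_prefix changes all column labels.
renamed = deduped.rename(columns={"val": "value"}).add_prefix("col_")

# set_index moves a column into the index; reset_index moves it back.
indexed = deduped.set_index("key")
restored = indexed.reset_index()

# drop removes columns (or rows) by label.
trimmed = restored.drop(columns=["val"])

print(dup_mask.tolist())      # [False, False, True]
print(list(renamed.columns))  # ['col_key', 'col_value']
print(list(trimmed.columns))  # ['key']
```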

Missing data handling#

DataFrame.isna()

Detect missing values.

DataFrame.isnull()

Detect missing values.

DataFrame.notna()

Detect existing (non-missing) values.

DataFrame.notnull()

Detect existing (non-missing) values.
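
The four methods in this section form two pairs of aliases. The sketch below shows this in plain pandas, assuming MaxFrame keeps the same aliasing; the data is hypothetical.

```python
import pandas as pd

# Hypothetical data with one missing value per column; pandas stands in for MaxFrame.
df = pd.DataFrame({"a": [1.0, None], "b": ["x", None]})

missing = df.isna()   # True where a value is missing
present = df.notna()  # the element-wise inverse of isna

# Summing the boolean mask counts missing values per column.
missing_per_column = missing.sum()

print(missing["a"].tolist())         # [False, True]
print(present["b"].tolist())         # [True, False]
print(df.isnull().equals(missing))   # True
print(df.notnull().equals(present))  # True
```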

Reshaping, sorting, transposing#

DataFrame.sort_values(by[, axis, ascending, ...])

Sort by the values along either axis.

DataFrame.sort_index([axis, level, ...])

Sort object by labels (along an axis).
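
The difference between sorting by values and sorting by labels can be sketched as follows. Plain pandas is used as a stand-in, assuming the MaxFrame methods order rows the same way; the data and index are hypothetical.

```python
import pandas as pd

# Hypothetical data with a deliberately unordered index; pandas stands in for MaxFrame.
df = pd.DataFrame({"city": ["c", "a", "b"], "n": [3, 1, 2]}, index=[2, 0, 1])

# sort_values orders rows by the contents of one or more columns.
by_value = df.sort_values("n", ascending=False)

# sort_index orders rows by their index labels.
by_label = df.sort_index()

print(by_value["city"].tolist())  # ['c', 'b', 'a']
print(by_label.index.tolist())    # [0, 1, 2]
```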

Combining / joining / merging#

DataFrame.join(other[, on, how, lsuffix, ...])

Join columns of another DataFrame.

DataFrame.merge(right[, how, on, left_on, ...])

Merge DataFrame or named Series objects with a database-style join.
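
As a rough illustration of the two methods, the sketch below merges on a shared column and joins on the index. It uses plain pandas and assumes the MaxFrame methods accept the same `on` and `how` arguments; the tables are hypothetical.

```python
import pandas as pd

# Hypothetical tables; pandas stands in for MaxFrame.
orders = pd.DataFrame({"cust": [1, 2, 3], "amount": [10, 20, 30]})
names = pd.DataFrame({"cust": [1, 2], "name": ["ann", "bob"]})

# merge matches rows on a column; a left join keeps every row of `orders`.
merged = orders.merge(names, on="cust", how="left")

# join matches rows on the index; an inner join keeps only shared labels.
joined = orders.set_index("cust").join(names.set_index("cust"), how="inner")

print(len(merged))                     # 3
print(merged["name"].isna().tolist())  # [False, False, True]
print(joined.index.tolist())           # [1, 2]
```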

Plotting#

DataFrame.plot is both a callable method and a namespace attribute for specific plotting methods of the form DataFrame.plot.<kind>.

DataFrame.plot

alias of DataFramePlotAccessor

DataFrame.plot.area(*args, **kwargs)

Draw a stacked area plot.

DataFrame.plot.bar(*args, **kwargs)

Vertical bar plot.

DataFrame.plot.barh(*args, **kwargs)

Make a horizontal bar plot.

DataFrame.plot.box(*args, **kwargs)

Make a box plot of the DataFrame columns.

DataFrame.plot.density(*args, **kwargs)

Generate Kernel Density Estimate plot using Gaussian kernels.

DataFrame.plot.hexbin(*args, **kwargs)

Generate a hexagonal binning plot.

DataFrame.plot.hist(*args, **kwargs)

Draw one histogram of the DataFrame's columns.

DataFrame.plot.kde(*args, **kwargs)

Generate Kernel Density Estimate plot using Gaussian kernels.

DataFrame.plot.line(*args, **kwargs)

Plot Series or DataFrame as lines.

DataFrame.plot.pie(*args, **kwargs)

Generate a pie plot.

DataFrame.plot.scatter(*args, **kwargs)

Create a scatter plot with varying marker point size and color.

Serialization / IO / conversion#

DataFrame.to_odps_table(table[, partition, ...])

Write DataFrame object into a MaxCompute (ODPS) table.

DataFrame.to_pandas([session])

MaxFrame Extensions#

DataFrame.mf.apply_chunk(func, batch_rows[, ...])

Apply a function that takes pandas DataFrame and outputs pandas DataFrame/Series.

DataFrame.mf.flatmap(func[, dtypes, raw, args])

Apply the given function to each row and then flatten results.

DataFrame.mf.reshuffle([group_by, sort_by, ...])

Shuffle data in DataFrame or Series to make data distribution more randomized.

DataFrame.mf provides methods unique to MaxFrame. They are drawn from application scenarios in MaxCompute and can be accessed as DataFrame.mf.<function/property>.