DataFrame#
Constructor#
|
Attributes and underlying data#
Axes
Return the dtypes in the DataFrame. |
|
|
Return the memory usage of each column in bytes. |
Return an int representing the number of axes / array dimensions. |
|
|
Return a subset of the DataFrame's columns based on the column dtypes. |
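The attribute entries above can be illustrated with a small sketch. It uses plain pandas, whose API this reference mirrors; that MaxFrame returns the same values for the same calls is an assumption, and the column names are made up for illustration.

```python
import pandas as pd

# Hypothetical three-column frame with mixed dtypes.
df = pd.DataFrame(
    {"id": [1, 2, 3], "price": [9.5, 3.0, 7.25], "name": ["a", "b", "c"]}
)

dtypes = df.dtypes                             # one dtype per column
ndim = df.ndim                                 # number of axes of a DataFrame
mem = df.memory_usage(index=False)             # bytes per column, index excluded
numeric = df.select_dtypes(include="number")   # keeps only numeric columns
```

Here `select_dtypes` drops the `name` column because it is not numeric.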
Conversion#
|
Cast a pandas object to a specified dtype |
|
Convert columns to best possible dtypes using dtypes supporting |
|
Attempt to infer better dtypes for object columns. |
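As a rough illustration of the conversion entries, the sketch below casts a string column to integers and lets pandas infer a better dtype for an object column. It is written against plain pandas; identical behavior in MaxFrame is assumed, not confirmed here.

```python
import pandas as pd

# Explicit cast: strings that hold digits become 64-bit integers.
raw = pd.DataFrame({"a": ["1", "2", "3"]})
as_int = raw.astype({"a": "int64"})

# Inference: integers stored in an object column get a numeric dtype.
obj = pd.DataFrame({"b": [1, 2, 3]}, dtype="object")
inferred = obj.infer_objects()
```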
Indexing, iteration#
Access a single value for a row/column label pair. |
|
|
Return the first n rows. |
Access a single value for a row/column pair by integer position. |
|
Purely integer-location based indexing for selection by position. |
|
|
Insert column into DataFrame at specified location. |
Access a group of rows and columns by label(s) or a boolean array. |
|
|
Replace values where the condition is True. |
|
Return item and drop from frame. |
|
Query the columns of a DataFrame with a boolean expression. |
|
Return the last n rows. |
|
Return cross-section from the Series/DataFrame. |
|
Replace values where the condition is False. |
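The difference between label-based and position-based access, and between replacing where a condition is False versus True, is easiest to see side by side. The following is a minimal pandas sketch with invented data; it assumes the MaxFrame accessors follow the same semantics.

```python
import pandas as pd

df = pd.DataFrame(
    {"x": [1, 2, 3, 4], "y": [10, 20, 30, 40]},
    index=["a", "b", "c", "d"],
)

by_label = df.at["b", "y"]          # single value by row/column label
by_pos = df.iat[2, 0]               # single value by integer position
rows = df.loc[["a", "c"], ["y"]]    # label-based selection
first_two = df.iloc[:2]             # position-based selection
big = df.query("x > 2")             # boolean expression over columns

kept = df.where(df > 2, 0)          # replace where the condition is False
masked = df.mask(df > 2, 0)         # replace where the condition is True
```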
Binary operator functions#
|
Get Addition of dataframe and other, element-wise (binary operator add). |
|
Get Subtraction of dataframe and other, element-wise (binary operator sub). |
|
Get Multiplication of dataframe and other, element-wise (binary operator mul). |
|
Get Floating division of dataframe and other, element-wise (binary operator truediv). |
|
Get Floating division of dataframe and other, element-wise (binary operator truediv). |
|
Get Integer division of dataframe and other, element-wise (binary operator floordiv). |
|
Get Modulo of dataframe and other, element-wise (binary operator mod). |
|
Get Exponential power of dataframe and other, element-wise (binary operator pow). |
|
Compute the matrix multiplication between the DataFrame and other. |
|
Get Addition of dataframe and other, element-wise (binary operator radd). |
|
Get Subtraction of dataframe and other, element-wise (binary operator rsub). |
|
Get Multiplication of dataframe and other, element-wise (binary operator rmul). |
|
Get Floating division of dataframe and other, element-wise (binary operator rtruediv). |
|
Get Floating division of dataframe and other, element-wise (binary operator rtruediv). |
|
Get Integer division of dataframe and other, element-wise (binary operator rfloordiv). |
|
Get Modulo of dataframe and other, element-wise (binary operator rmod). |
|
Get Exponential power of dataframe and other, element-wise (binary operator rpow). |
|
Get Less than of dataframe and other, element-wise (binary operator lt). |
|
Get Greater than of dataframe and other, element-wise (binary operator gt). |
|
Get Less than or equal to of dataframe and other, element-wise (binary operator le). |
|
Get Greater than or equal to of dataframe and other, element-wise (binary operator ge). |
|
Get Not equal to of dataframe and other, element-wise (binary operator ne). |
|
Get Equal to of dataframe and other, element-wise (binary operator eq). |
|
Perform column-wise combine with another DataFrame. |
|
Update null elements with value in the same location in other. |
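The reflected operators listed above swap the operand order: a reflected subtraction computes `other - self` rather than `self - other`. A small pandas sketch of that relationship follows, together with a comparison and a null-filling combine. The frames are hypothetical, and matching MaxFrame behavior is an assumption.

```python
import pandas as pd

a = pd.DataFrame({"x": [1.0, None], "y": [3.0, 4.0]})
b = pd.DataFrame({"x": [10.0, 20.0], "y": [30.0, 40.0]})

diff = b.sub(a)             # b - a, element-wise
reflected = a.rsub(b)       # also b - a, written from a's side
above = b.gt(15)            # element-wise comparison, boolean result
patched = a.combine_first(b)  # nulls in a filled from the same location in b
```

The missing value in `a` propagates through the subtraction as NaN, while `combine_first` replaces it with the value from `b`.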
Function application, GroupBy & window#
|
Apply a function along an axis of the DataFrame. |
|
Apply a function to a DataFrame elementwise. |
|
Aggregate using one or more operations over the specified axis. |
|
Aggregate using one or more operations over the specified axis. |
|
Provide exponential weighted functions. |
|
Provide expanding transformations. |
|
Group DataFrame using a mapper or by a Series of columns. |
|
Apply a function to a DataFrame elementwise. |
|
Provide rolling window calculations. |
|
Call |
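To make the function-application entries concrete, here is a short sketch of column-wise apply, multi-function aggregation, grouping, and a rolling window. It runs on plain pandas with toy data; whether MaxFrame accepts the same arguments in every case is an assumption.

```python
import pandas as pd

df = pd.DataFrame({"key": ["a", "a", "b"], "v": [1, 2, 3]})

# Apply a function along axis 0: the lambda receives each column as a Series.
spread = df[["v"]].apply(lambda col: col.max() - col.min())

# Aggregate with several operations at once.
stats = df[["v"]].agg(["min", "max"])

# Group rows by a column, then reduce each group.
per_key = df.groupby("key")["v"].sum()

# Rolling window of two rows; the first row has no full window.
rolled = df[["v"]].rolling(2).sum()
```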
Computations / descriptive stats#
|
|
|
|
|
Trim values at input threshold(s). |
|
|
|
Compute pairwise correlation of columns, excluding NA/null values. |
|
Compute pairwise correlation. |
|
Compute pairwise covariance of columns, excluding NA/null values. |
|
Generate descriptive statistics. |
|
First discrete difference of element. |
|
Evaluate a string describing operations on DataFrame columns. |
|
|
|
|
|
|
|
|
|
Get the mode(s) of each element along the selected axis. |
|
Count distinct observations over requested axis. |
|
Percentage change between the current and a prior element. |
|
|
|
|
|
Return values at the given quantile over requested axis. |
|
Compute numerical data ranks (1 through n) along axis. |
|
Round a DataFrame to a variable number of decimal places. |
|
|
|
|
|
|
|
|
|
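A few of the descriptive statistics above, sketched on a tiny frame. The example uses plain pandas and invented numbers; it assumes the corresponding MaxFrame methods compute the same quantities.

```python
import pandas as pd

df = pd.DataFrame({"a": [1.0, 2.0, 3.0, 4.0], "b": [2.0, 4.0, 6.0, 8.0]})

corr = df.corr()                      # pairwise correlation of columns
clipped = df.clip(lower=2, upper=3)   # trim values at the thresholds
step = df.diff()                      # first discrete difference per column
median = df.quantile(0.5)             # value at the 0.5 quantile per column
distinct = df.nunique()               # count of distinct values per column
```

Because `b` is exactly twice `a`, the two columns are perfectly correlated.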
Reindexing / selection / label manipulation#
|
Prefix labels with string prefix. |
|
Suffix labels with string suffix. |
|
Align two objects on their axes with the specified join method. |
|
Select values at particular time of day (e.g., 9:30AM). |
|
Select values between particular times of the day (e.g., 9:00-9:30 AM). |
|
Drop specified labels from rows or columns. |
|
Return DataFrame with duplicate rows removed. |
|
Return Series/DataFrame with requested index / column level(s) removed. |
|
Return boolean Series denoting duplicate rows. |
|
Subset the dataframe rows or columns according to the specified index labels. |
|
Return the first n rows. |
|
Return index of first occurrence of maximum over requested axis. |
|
Return index of first occurrence of minimum over requested axis. |
|
Conform Series/DataFrame to new index with optional filling logic. |
|
Return an object with matching indices as other object. |
|
Alter axes labels. |
|
Set the name of the axis for the index or columns. |
|
Reset the index, or a level of it. |
|
Return a random sample of items from an axis of object. |
|
Assign desired index to given axis. |
|
Set the DataFrame index using existing columns. |
|
Return the elements in the given positional indices along an axis. |
|
Truncate a Series or DataFrame before and after some index value. |
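The label-manipulation entries chain naturally. The sketch below removes duplicate rows, renames a column, moves a column into the index, conforms to a new index, and restores the default index. It is a pandas illustration with made-up data, assuming MaxFrame follows the same semantics.

```python
import pandas as pd

df = pd.DataFrame({"id": [1, 2, 2], "v": [10, 20, 20]})

unique = df.drop_duplicates()                  # duplicate rows removed
renamed = unique.rename(columns={"v": "value"})
indexed = renamed.set_index("id")              # existing column becomes the index
conformed = indexed.reindex([1, 2, 3])         # label 3 is new, so its row is NaN
prefixed = indexed.add_prefix("col_")          # prefix applied to column labels
restored = indexed.reset_index()               # index moved back into a column
```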
Missing data handling#
|
Remove missing values. |
|
Fill NA/NaN values using the specified method. |
Detect missing values. |
|
Detect missing values. |
|
Detect existing (non-missing) values. |
|
Detect existing (non-missing) values. |
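Detecting, dropping, and filling missing values can be sketched as follows. This is plain pandas on a hypothetical frame; equivalent results in MaxFrame are assumed rather than verified.

```python
import pandas as pd

df = pd.DataFrame({"a": [1.0, None, 3.0], "b": [None, 5.0, 6.0]})

missing = df.isna()      # True where a value is missing
present = df.notna()     # exact complement of isna()
dropped = df.dropna()    # keeps only rows with no missing values
filled = df.fillna(0)    # replaces missing values with a constant
```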
Reshaping, sorting, transposing#
|
Unpivot a DataFrame from wide to long format, optionally leaving identifiers set. |
|
Return the first n rows ordered by columns in descending order. |
|
Return the first n rows ordered by columns in ascending order. |
|
Return reshaped DataFrame organized by given index / column values. |
|
Create a spreadsheet-style pivot table as a DataFrame. |
|
Rearrange index levels using input order. |
|
Sort by the values along either axis. |
|
Sort object by labels (along an axis). |
|
Swap levels i and j in a |
|
Stack the prescribed level(s) from columns to index. |
|
Unstack, also known as pivot, Series with MultiIndex to produce DataFrame. |
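Unpivoting from wide to long format and pivoting back are inverse operations, which the sketch below demonstrates along with sorting and top-n selection. It relies on plain pandas and invented column names; identical MaxFrame behavior is an assumption.

```python
import pandas as pd

wide = pd.DataFrame({"id": [1, 2], "q1": [5, 7], "q2": [6, 8]})

# Wide to long: one row per (id, quarter) pair.
tall = wide.melt(id_vars="id", var_name="quarter", value_name="score")

# Long back to wide: quarters become columns again.
back = tall.pivot(index="id", columns="quarter", values="score")

top = tall.nlargest(2, "score")                        # two highest scores
ordered = tall.sort_values("score", ascending=False)   # full descending sort
```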
Combining / comparing / joining / merging#
|
Append rows of other to the end of caller, returning a new object. |
|
Assign new columns to a DataFrame. |
|
Compare to another DataFrame and show the differences. |
|
Join columns of another DataFrame. |
|
Merge DataFrame or named Series objects with a database-style join. |
|
Modify in place using non-NA values from another DataFrame. |
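A database-style join and a column assignment, sketched in plain pandas. The key and column names are hypothetical, and the example assumes the MaxFrame methods take the same `on` and `how` arguments.

```python
import pandas as pd

left = pd.DataFrame({"k": [1, 2, 3], "a": ["x", "y", "z"]})
right = pd.DataFrame({"k": [2, 3, 4], "b": [20, 30, 40]})

# Inner join keeps only keys present on both sides.
inner = left.merge(right, on="k", how="inner")

# assign returns a new frame; the original is left unchanged.
with_flag = left.assign(is_big=left["k"] > 1)
```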
Plotting#
DataFrame.plot is both a callable method and a namespace attribute for
specific plotting methods of the form DataFrame.plot.<kind>.
|
Draw a stacked area plot. |
|
Vertical bar plot. |
|
Make a horizontal bar plot. |
|
Make a box plot of the DataFrame columns. |
|
Generate Kernel Density Estimate plot using Gaussian kernels. |
|
Generate a hexagonal binning plot. |
|
Draw one histogram of the DataFrame's columns. |
|
Generate Kernel Density Estimate plot using Gaussian kernels. |
|
Plot Series or DataFrame as lines. |
|
Generate a pie plot. |
|
Create a scatter plot with varying marker point size and color. |
Serialization / IO / conversion#
|
Construct DataFrame from dict of array-like or dicts. |
|
Convert structured or record ndarray to DataFrame. |
|
Copy object to the system clipboard. |
|
Write object to a comma-separated values (csv) file. |
|
Convert the DataFrame to a dictionary. |
|
Convert the object to a JSON string. |
|
Write DataFrame object into a MaxCompute (ODPS) table. |
|
|
|
Write a DataFrame to the binary Parquet format; each chunk is written to a separate Parquet file. |
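The in-memory conversions above can be sketched with plain pandas. Note the assumptions: calling `to_csv` and `to_json` without a path returns a string in pandas, and whether MaxFrame allows that form (rather than requiring a target path) is not confirmed by this reference.

```python
import json

import pandas as pd

# Build a frame from a dict of array-likes.
df = pd.DataFrame.from_dict({"a": [1, 2], "b": ["x", "y"]})

records = df.to_dict(orient="records")      # list of one dict per row
csv_lines = df.to_csv(index=False).splitlines()
parsed = json.loads(df.to_json(orient="records"))
```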
MaxFrame Extensions#
|
Apply a function that takes pandas DataFrame and outputs pandas DataFrame/Series. |
|
Merge values in specified columns into a key-value represented column. |
|
Extract values in key-value represented columns into standalone columns. |
|
Apply the given function to each row and then flatten results. |
|
Map-reduce API over certain DataFrames. |
|
Make data more balanced across entire cluster. |
|
Shuffle data in DataFrame or Series to make data distribution more randomized. |
DataFrame.mf provides methods unique to MaxFrame. These methods are drawn from
common MaxCompute application scenarios and can be accessed as DataFrame.mf.<function/property>.