maxframe.dataframe.DataFrame.pivot_table

DataFrame.pivot_table(values=None, index=None, columns=None, aggfunc='mean', fill_value=None, margins=False, dropna=True, margins_name='All', sort=True)

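Create a spreadsheet-style pivot table as a DataFrame.

The levels in the pivot table are stored in MultiIndex objects (hierarchical indexes) on the index and columns of the resulting DataFrame.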

Parameters:

values (column or list of columns, optional) -- Column or columns to aggregate.
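index (column, Grouper, array, or list of the previous) -- Keys to group by on the pivot table index. If a list is passed, it can contain any of the other types (except list). If an array is passed, it must be the same length as the data and is used in the same manner as column values.
columns (column, Grouper, array, or list of the previous) -- Keys to group by on the pivot table columns. If a list is passed, it can contain any of the other types (except list). If an array is passed, it must be the same length as the data and is used in the same manner as column values.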
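aggfunc (function, list of functions, dict, default 'mean') -- If a list of functions is passed, the resulting pivot table has hierarchical columns whose top level is the function names (inferred from the function objects themselves). If a dict is passed, the keys are the columns to aggregate and the values are functions or lists of functions.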
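fill_value (scalar, default None) -- Value used to replace missing values in the resulting pivot table, after aggregation.
margins (bool, default False) -- Add an extra row and column that aggregate across all rows and columns (for example, subtotals and grand totals).
dropna (bool, default True) -- Do not include columns whose entries are all NaN.
margins_name (str, default 'All') -- Name of the row and column that contain the totals when margins is True.
sort (bool, default True) -- Specifies whether the result should be sorted.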

Returns:

An Excel-style pivot table.

Return type:

DataFrame

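See also

DataFrame.pivot: Pivot without aggregation; can handle non-numeric data.
DataFrame.melt: Unpivot a DataFrame from wide to long format, optionally leaving identifier columns set.
wide_to_long: Wide panel to long format. Less flexible but more user-friendly than melt.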

Examples

>>> import numpy as np
>>> import maxframe.dataframe as md
>>> df = md.DataFrame({"A": ["foo", "foo", "foo", "foo", "foo",
...                          "bar", "bar", "bar", "bar"],
...                    "B": ["one", "one", "one", "two", "two",
...                          "one", "one", "two", "two"],
...                    "C": ["small", "large", "large", "small",
...                          "small", "large", "small", "small",
...                          "large"],
...                    "D": [1, 2, 2, 3, 3, 4, 5, 6, 7],
...                    "E": [2, 4, 5, 5, 6, 6, 8, 9, 9]})
>>> df.execute()
     A    B      C  D  E
0  foo  one  small  1  2
1  foo  one  large  2  4
2  foo  one  large  2  5
3  foo  two  small  3  5
4  foo  two  small  3  6
5  bar  one  large  4  6
6  bar  one  small  5  8
7  bar  two  small  6  9
8  bar  two  large  7  9

This first example aggregates values by taking the sum.

>>> table = md.pivot_table(df, values='D', index=['A', 'B'],
...                        columns=['C'], aggfunc=np.sum)
>>> table.execute()
C        large  small
A   B
bar one    4.0    5.0
    two    7.0    6.0
foo one    4.0    1.0
    two    NaN    6.0
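
The NaN in the table above comes from a combination of keys that has no rows in the input (A="foo", B="two", C="large"). The following plain-Python sketch reproduces the grouping and summing by hand to make that visible. It is only an illustration of the semantics, not how MaxFrame computes the result; the names rows, groups, and table are made up for this sketch.

```python
from collections import defaultdict

# (A, B, C, D) tuples copied from the example DataFrame above.
rows = [
    ("foo", "one", "small", 1), ("foo", "one", "large", 2),
    ("foo", "one", "large", 2), ("foo", "two", "small", 3),
    ("foo", "two", "small", 3), ("bar", "one", "large", 4),
    ("bar", "one", "small", 5), ("bar", "two", "small", 6),
    ("bar", "two", "large", 7),
]

# Collect D values per (index key, column key) = ((A, B), C).
groups = defaultdict(list)
for a, b, c, d in rows:
    groups[((a, b), c)].append(d)

index_keys = sorted({key[0] for key in groups})
column_keys = sorted({key[1] for key in groups})

# Sum each group; a combination with no rows stays None,
# which corresponds to NaN in the pivot table.
table = {
    idx: {
        col: sum(groups[(idx, col)]) if (idx, col) in groups else None
        for col in column_keys
    }
    for idx in index_keys
}

for idx in index_keys:
    print(idx, table[idx])
```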

We can also fill missing values using the fill_value parameter.

>>> table = md.pivot_table(df, values='D', index=['A', 'B'],
...                        columns=['C'], aggfunc=np.sum, fill_value=0)
>>> table.execute()
C        large  small
A   B
bar one      4      5
    two      7      6
foo one      4      1
    two      0      6
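
The margins and margins_name parameters are not exercised by the examples on this page. The sketch below uses pandas rather than MaxFrame so that it runs locally without a session; it assumes that MaxFrame follows pandas semantics for these two parameters, which the shared signature suggests but this sketch does not verify. The label "Total" is an arbitrary choice.

```python
import pandas as pd

# Same A, C, and D columns as the example DataFrame on this page.
df = pd.DataFrame({
    "A": ["foo", "foo", "foo", "foo", "foo", "bar", "bar", "bar", "bar"],
    "C": ["small", "large", "large", "small", "small",
          "large", "small", "small", "large"],
    "D": [1, 2, 2, 3, 3, 4, 5, 6, 7],
})

# margins=True appends one extra row and one extra column holding the
# aggregate over all groups; margins_name sets the label of both.
table = pd.pivot_table(df, values="D", index="A", columns="C",
                       aggfunc="sum", margins=True, margins_name="Total")
print(table)
```

With this data the "Total" column holds the per-A sums (22 for bar, 11 for foo), the "Total" row holds the per-C sums (15 for large, 18 for small), and the bottom-right cell is the grand total, 33.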

The next example aggregates by taking the mean across multiple columns.

>>> table = md.pivot_table(df, values=['D', 'E'], index=['A', 'C'],
...                        aggfunc={'D': np.mean,
...                                 'E': np.mean})
>>> table.execute()
                D         E
A   C
bar large  5.500000  7.500000
    small  5.500000  8.500000
foo large  2.000000  4.500000
    small  2.333333  4.333333

We can also calculate multiple types of aggregations for any given value column.

>>> table = md.pivot_table(df, values=['D', 'E'], index=['A', 'C'],
...                        aggfunc={'D': np.mean,
...                                 'E': [min, max, np.mean]})
>>> table.execute()
                D    E
            mean  max      mean  min
A   C
bar large  5.500000  9.0  7.500000  6.0
    small  5.500000  9.0  8.500000  8.0
foo large  2.000000  5.0  4.500000  4.0
    small  2.333333  6.0  4.333333  2.0
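
The aggfunc description also covers passing a list of functions, in which case the top level of the resulting columns is the function name. A minimal sketch of that case follows, again written against pandas so it can run without a MaxFrame session; that MaxFrame produces the same column layout is an assumption here.

```python
import pandas as pd

# Same A and D columns as the example DataFrame on this page.
df = pd.DataFrame({
    "A": ["foo", "foo", "foo", "foo", "foo", "bar", "bar", "bar", "bar"],
    "D": [1, 2, 2, 3, 3, 4, 5, 6, 7],
})

# A list of aggregations yields two-level columns:
# the top level is the function name, the second level is the value column.
table = pd.pivot_table(df, values="D", index="A", aggfunc=["sum", "mean"])

print(table.columns.nlevels)
print(table.loc["bar", ("sum", "D")])
print(table.loc["foo", ("mean", "D")])
```

Individual cells are addressed with a (function name, column) tuple, as in the last two lines.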