maxframe.dataframe.DataFrame.update

DataFrame.update(other, join='left', overwrite=True, filter_func=None, errors='ignore')

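Modify the DataFrame in place using non-NA values from another DataFrame.

Values are aligned on index and column labels. The method returns nothing.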

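Parameters:

other (DataFrame, or object coercible into a DataFrame) -- Should have at least one index/column label that matches the original DataFrame. If a Series is passed, its name attribute must be set; the name is used as the column name when aligning with the original DataFrame.
join ({'left'}, default 'left') -- Only left join is implemented, which keeps the index and columns of the original object.
overwrite (bool, default True) -- How to handle non-NA values for overlapping keys:
- True: overwrite the original DataFrame's values with values from other.
- False: only update values that are NA in the original DataFrame.
filter_func (callable(1d-array) -> bool 1d-array, optional) -- Can be used to replace values other than NA. Should return True for values that should be updated.
errors ({'raise', 'ignore'}, default 'ignore') -- If 'raise', a ValueError is raised when the DataFrame and other both contain non-NA data in the same place.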
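How overwrite and filter_func interact can be easier to see in a small model. The following is a pure-Python sketch of the rules for a single column, not MaxFrame's implementation. The helper name update_column is hypothetical, a column is modeled as a dict mapping index label to value, and None stands in for NA. It assumes that filter_func is evaluated on the original column's values and takes precedence over overwrite, as pandas does; whether MaxFrame matches in every detail is an assumption. The real filter_func receives a whole 1d array, while this sketch calls it per element for brevity.

```python
def update_column(original, other, overwrite=True, filter_func=None):
    """Return an updated copy of `original` (dict: index label -> value).

    None models NA. Hypothetical helper, for illustration only.
    """
    result = dict(original)
    for label, old in original.items():   # left join: only the original's labels
        new = other.get(label)
        if new is None:                    # NA (or absent) in other: keep the old value
            continue
        if filter_func is not None:
            replace = filter_func(old)     # caller decides which values get replaced
        elif overwrite:
            replace = True                 # any non-NA value in other wins
        else:
            replace = old is None          # only fill NA holes in the original
        if replace:
            result[label] = new
    return result


original = {0: 1, 1: None, 2: 3}
other = {0: 10, 1: 20, 2: None, 3: 40}

print(update_column(original, other))                                # {0: 10, 1: 20, 2: 3}
print(update_column(original, other, overwrite=False))               # {0: 1, 1: 20, 2: 3}
print(update_column(original, other, filter_func=lambda v: v == 1))  # {0: 10, 1: None, 2: 3}
```

Label 3 exists only in other and never appears in the result, and the NA at label 2 of other leaves the original value untouched in all three calls.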

Returns:

This method modifies the calling object directly.

Return type:

None

Raises:

ValueError --
- When errors='raise' and there is overlapping non-NA data.
- When errors is neither 'ignore' nor 'raise'.
NotImplementedError --
- If join != 'left'.
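The three error conditions can be summarized as a small validation routine. This is a sketch under stated assumptions rather than MaxFrame's code: check_update_args and outcome are hypothetical helpers, the error messages are invented for illustration, columns are modeled as dicts from index label to value, and None stands in for NA.

```python
def check_update_args(original, other, join="left", errors="ignore"):
    """Raise for the conditions listed above (hypothetical helper)."""
    if join != "left":
        raise NotImplementedError("only left join is supported")
    if errors not in ("ignore", "raise"):
        raise ValueError("errors must be either 'ignore' or 'raise'")
    if errors == "raise":
        # Labels where both sides hold non-NA data.
        overlap = [label for label, value in original.items()
                   if value is not None and other.get(label) is not None]
        if overlap:
            raise ValueError(f"data overlaps at labels {overlap}")


def outcome(**kwargs):
    """Return the name of the exception raised, or 'ok' if none was."""
    try:
        check_update_args({0: 1, 1: None}, {0: 10, 1: 20}, **kwargs)
    except (ValueError, NotImplementedError) as exc:
        return type(exc).__name__
    return "ok"


print(outcome())                 # ok
print(outcome(errors="raise"))   # ValueError: label 0 is non-NA on both sides
print(outcome(errors="strict"))  # ValueError: unsupported value for errors
print(outcome(join="outer"))     # NotImplementedError
```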

See also

dict.update: Similar method for dictionaries.
DataFrame.merge: For column(s)-on-column(s) operations.

Examples

>>> import maxframe.tensor as mt
>>> import maxframe.dataframe as md
>>> df = md.DataFrame({'A': [1, 2, 3],
...                    'B': [400, 500, 600]})
>>> new_df = md.DataFrame({'B': [4, 5, 6],
...                        'C': [7, 8, 9]})
>>> df.update(new_df)
>>> df.execute()
   A  B
0  1  4
1  2  5
2  3  6

The DataFrame's length does not increase as a result of the update; only values at matching index/column labels are updated.

>>> df = md.DataFrame({'A': ['a', 'b', 'c'],
...                    'B': ['x', 'y', 'z']})
>>> new_df = md.DataFrame({'B': ['d', 'e', 'f', 'g', 'h', 'i']})
>>> df.update(new_df)
>>> df.execute()
   A  B
0  a  d
1  b  e
2  c  f

>>> df = md.DataFrame({'A': ['a', 'b', 'c'],
...                    'B': ['x', 'y', 'z']})
>>> new_df = md.DataFrame({'B': ['d', 'f']}, index=[0, 2])
>>> df.update(new_df)
>>> df.execute()
   A  B
0  a  d
1  b  y
2  c  f
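The left-join behavior in the examples above differs from dict.update, which this method is otherwise compared to: a dictionary update adds keys that were missing, while DataFrame.update only touches labels that already exist. A minimal pure-Python contrast follows; left_update is a hypothetical helper that models one column as a dict from index label to value and ignores NA handling.

```python
# dict.update adds keys that were missing from the target.
target = {"a": 1, "b": 2}
target.update({"b": 20, "c": 30})
print(target)  # {'a': 1, 'b': 20, 'c': 30}


def left_update(original, other):
    """Hypothetical model of the left-join rule: update existing labels only."""
    return {label: other.get(label, value) for label, value in original.items()}


column_b = {0: "x", 1: "y", 2: "z"}
updated = left_update(column_b, {0: "d", 2: "f", 5: "i"})
print(updated)  # {0: 'd', 1: 'y', 2: 'f'}
```

Label 5 is dropped because it does not exist in the original, and label 1 keeps its value because other has nothing to say about it.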

For a Series, its name attribute must be set.

>>> df = md.DataFrame({'A': ['a', 'b', 'c'],
...                    'B': ['x', 'y', 'z']})
>>> new_column = md.Series(['d', 'e', 'f'], name='B')
>>> df.update(new_column)
>>> df.execute()
   A  B
0  a  d
1  b  e
2  c  f

If other contains NaN, the corresponding values in the original DataFrame are not updated.

>>> df = md.DataFrame({'A': [1, 2, 3],
...                    'B': [400., 500., 600.]})
>>> new_df = md.DataFrame({'B': [4, mt.nan, 6]})
>>> df.update(new_df)
>>> df.execute()
   A      B
0  1    4.0
1  2  500.0
2  3    6.0