maxframe.dataframe.Series.str.replace

Replace each occurrence of a pattern/regex in the Series/Index.

Equivalent to str.replace() or re.sub(), depending on the value of regex.
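The equivalence above can be sketched for a single string using only the standard library. This is an illustration of the per-element semantics, not MaxFrame's implementation; the helper name replace_one is hypothetical, and the mapping of n=-1 to "replace all" is an assumption taken from the n parameter described below.

```python
import re


def replace_one(value, pat, repl, n=-1, regex=False):
    """Hypothetical helper: apply one replacement rule to a single string."""
    if regex:
        # re.sub uses count=0 to mean "replace all", so map a negative n to 0.
        return re.sub(pat, repl, value, count=0 if n < 0 else n)
    # str.replace already treats a negative count as "replace all".
    return value.replace(pat, repl, n)


print(replace_one("f.o", "f.", "ba"))              # literal match: bao
print(replace_one("fuz", "f.", "ba"))              # no literal "f.": fuz
print(replace_one("fuz", "f.", "ba", regex=True))  # "." matches any character: baz
```

With regex=False the pattern "f." only matches a literal "f" followed by a dot, which is why "fuz" is left unchanged in the second call.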

Parameters:

pat (str, compiled regex, or a dict) -- String can be a character sequence or regular expression. Dictionary contains <key : value> pairs of strings to be replaced along with the updated value.
repl (str or callable) -- Replacement string or a callable. The callable is passed the regex match object and must return a replacement string to be used. Must be None if pat is a dict. See re.sub().
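n (int, default -1 (all)) -- Number of replacements to make, counting from the start.
case (bool, default None) -- Determines whether the replacement is case sensitive:
- If True, case sensitive (the default when pat is a string).
- Set to False for case-insensitive matching.
- Cannot be set if pat is a compiled regex.
flags (int, default 0 (no flags)) -- Regex module flags, e.g. re.IGNORECASE. Cannot be set if pat is a compiled regex.
regex (bool, default False) -- Determines whether the passed-in pattern is a regular expression:
- If True, the passed-in pattern is treated as a regular expression.
- If False, the pattern is treated as a literal string.
- Cannot be set to False if pat is a compiled regex or repl is a callable.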
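As a rough illustration of how n, case and flags interact, the following sketch uses re.sub() directly on one string. It assumes that n corresponds to the count argument of re.sub() and that case=False behaves like adding re.IGNORECASE; both are assumptions based on the parameter descriptions above rather than confirmed details of MaxFrame.

```python
import re

text = "Foo foo FOO"

# Limiting the number of replacements (the role of n): only the first
# case-sensitive match of "foo" is replaced.
first_match = re.sub("foo", "bar", text, count=1)
print(first_match)  # Foo bar FOO

# Case-insensitive matching (the role of case=False or flags=re.IGNORECASE).
ignore_case = re.sub("foo", "bar", text, flags=re.IGNORECASE)
print(ignore_case)  # bar bar bar

# Both together: the first match, ignoring case.
first_ignore_case = re.sub("foo", "bar", text, count=1, flags=re.IGNORECASE)
print(first_ignore_case)  # bar foo FOO
```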

Returns:

A copy of the object with all occurrences matching pat replaced by repl.

Return type:

Series or Index of object

Raises:

ValueError --

- if regex is False and repl is a callable or pat is a compiled regex
- if pat is a compiled regex and case or flags is set
- if pat is a dictionary and repl is not None
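The three conditions above can be written out as a small validator. This is only a sketch of the documented rules; check_replace_args is a hypothetical name, and the error messages are invented for illustration rather than taken from MaxFrame.

```python
import re


def check_replace_args(pat, repl=None, case=None, flags=0, regex=False):
    """Hypothetical validator mirroring the documented ValueError rules."""
    if isinstance(pat, dict):
        # A dict already carries its replacements, so repl must stay None.
        if repl is not None:
            raise ValueError("repl must be None when pat is a dict")
        return
    compiled = isinstance(pat, re.Pattern)
    if not regex and (callable(repl) or compiled):
        raise ValueError("regex=False needs a plain string pat and a string repl")
    if compiled and (case is not None or flags != 0):
        raise ValueError("case and flags cannot be set with a compiled regex")


# A compiled pattern combined with regex=False is rejected.
try:
    check_replace_args(re.compile("fuz"), "bar", regex=False)
except ValueError as exc:
    print("rejected:", exc)

# The same pattern with regex=True passes the checks.
check_replace_args(re.compile("fuz"), "bar", regex=True)
```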

See also

Series.str.replace: Method to replace occurrences of a substring with another substring.
Series.str.extract: Extract substrings using a regular expression.
Series.str.findall: Find all occurrences of a pattern or regex in each string.
Series.str.split: Split each string by a specified delimiter or pattern.

Notes

When pat is a compiled regex, all flags should be included in the compiled regex. Using case, flags, or regex=False with a compiled regex raises an error.

Examples

When pat is a dictionary, every key in pat is replaced with its corresponding value:

>>> import maxframe.tensor as mt
>>> import maxframe.dataframe as md
>>> md.Series(["A", "B", mt.nan]).str.replace(pat={"A": "a", "B": "b"}).execute()
0    a
1    b
2    NaN
dtype: str

When pat is a string and regex is True, the given pat is compiled as a regex. When repl is a string, it replaces matching regex patterns as with re.sub(). NaN values in the Series are left as is:

>>> md.Series(["foo", "fuz", mt.nan]).str.replace("f.", "ba", regex=True).execute()
0    bao
1    baz
2    NaN
dtype: str

When pat is a string and regex is False, every occurrence of pat is replaced with repl as with str.replace():

>>> md.Series(["f.o", "fuz", mt.nan]).str.replace("f.", "ba", regex=False).execute()
0    bao
1    fuz
2    NaN
dtype: str

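When repl is a callable, it is called on every match of pat using re.sub(). The callable should expect one positional argument (a regex match object) and return a string.

To get the idea: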

>>> md.Series(["foo", "fuz", mt.nan]).str.replace("f", repr, regex=True).execute()
0    <re.Match object; span=(0, 1), match='f'>oo
1    <re.Match object; span=(0, 1), match='f'>uz
2                                            NaN
dtype: str

Reverse every lowercase alphabetic word:

>>> repl = lambda m: m.group(0)[::-1]
>>> ser = md.Series(["foo 123", "bar baz", mt.nan])
>>> ser.str.replace(r"[a-z]+", repl, regex=True).execute()
0    oof 123
1    rab zab
2        NaN
dtype: str

Using regex groups (extract the second group and swap its case):

>>> pat = r"(?P<one>\w+) (?P<two>\w+) (?P<three>\w+)"
>>> repl = lambda m: m.group("two").swapcase()
>>> ser = md.Series(["One Two Three", "Foo Bar Baz"])
>>> ser.str.replace(pat, repl, regex=True).execute()
0    tWO
1    bAR
dtype: str

Using a compiled regex with flags:

>>> import re
>>> regex_pat = re.compile(r"FUZ", flags=re.IGNORECASE)
>>> md.Series(["foo", "fuz", mt.nan]).str.replace(regex_pat, "bar", regex=True).execute()
0    foo
1    bar
2    NaN
dtype: str