maxframe.tensor.random.zipf

maxframe.tensor.random.zipf(a, size=None, chunk_size=None, gpu=None, dtype=None)

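Draw samples from a Zipf distribution.

Samples are drawn from a Zipf distribution with the specified parameter a > 1.

The Zipf distribution (also known as the zeta distribution) is a discrete probability distribution that satisfies Zipf's law: the frequency of an item is inversely proportional to its rank in a frequency table.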

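Parameters:

a (float or array_like of floats) -- Distribution parameter. Must be greater than 1.
size (int or tuple of ints, optional) -- Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if a is a scalar. Otherwise, mt.array(a).size samples are drawn.
chunk_size (int or tuple of ints, optional) -- Desired chunk size on each dimension.
gpu (bool, optional) -- Allocate the tensor on GPU if True, False as default.
dtype (data-type, optional) -- Data type of the returned tensor.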

Returns:

out -- Samples drawn from the parameterized Zipf distribution.

Return type:

Tensor or scalar

See also

scipy.stats.zipf: probability mass function, distribution, or cumulative distribution function, etc.

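Notes

The probability mass function of the Zipf distribution, defined for integers \(x \geq 1\), is

\[p(x) = \frac{x^{-a}}{\zeta(a)},\]

where \(\zeta\) is the Riemann zeta function.

The distribution is named after the American linguist George Kingsley Zipf, who noted that the frequency of any word in a sample of a language is inversely proportional to its rank in the frequency table.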
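
The mass function in these notes can be evaluated directly, which is useful for checking the normalization constant. The following is a minimal sketch using only the Python standard library. The helper names `zeta` and `zipf_pmf` are hypothetical and are not part of maxframe. The zeta value is approximated by a partial sum plus an Euler-Maclaurin tail estimate, which is an assumption of this sketch rather than how any particular library computes it.

```python
import math


def zeta(a, terms=1000):
    """Approximate the Riemann zeta function for real a > 1 (hypothetical helper)."""
    if a <= 1:
        raise ValueError("a must be greater than 1")
    n = terms
    # Direct sum of the first n terms of sum_{k>=1} k^(-a).
    head = sum(k ** -a for k in range(1, n + 1))
    # Tail estimate for k > n: integral of x^(-a) from n to infinity,
    # minus half of the n-th term (first Euler-Maclaurin correction).
    tail = n ** (1 - a) / (a - 1) - 0.5 * n ** -a
    return head + tail


def zipf_pmf(x, a):
    """Evaluate p(x) = x^(-a) / zeta(a) for an integer x >= 1 (hypothetical helper)."""
    if x < 1:
        raise ValueError("x must be a positive integer")
    return x ** -a / zeta(a)


# For a = 2, zeta(2) = pi^2 / 6, so p(1) = 6 / pi^2.
print(zipf_pmf(1, 2.0), 6 / math.pi ** 2)
```

Because p(x) decays as a power of x, doubling x divides the probability by 2^a. For a = 2, the value 2 is therefore four times less likely than the value 1.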

Examples

Draw samples from the distribution:

>>> import maxframe.tensor as mt

>>> a = 2. # parameter
>>> s = mt.random.zipf(a, 1000)

Display the histogram of the samples, along with the probability mass function:

>>> import matplotlib.pyplot as plt
>>> from scipy import special

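Truncate s values at 50 so the plot is more interesting:

>>> # density=True replaces the `normed` argument removed from matplotlib
>>> count, bins, ignored = plt.hist(s[s<50].execute(), 50, density=True)
>>> x = mt.arange(1., 50.)
>>> # normalize by zeta(a); special.zetac(a) would give zeta(a) - 1
>>> y = x**(-a) / special.zeta(a)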
>>> plt.plot(x.execute(), (y/mt.max(y)).execute(), linewidth=2, color='r')
>>> plt.show()