maxframe.dataframe.Series.value_counts

Series.value_counts(normalize=False, sort=True, ascending=False, bins=None, dropna=True, method='auto')

Return a Series containing counts of unique values.

By default, the resulting object is sorted in descending order so that the first element is the most frequently occurring value. NA values are excluded by default.

Parameters:

normalize (bool, default False) – If True then the object returned will contain the relative frequencies of the unique values.
sort (bool, default True) – Sort by frequencies.
ascending (bool, default False) – Sort in ascending order.
bins (int, optional) – Rather than counting values, group them into half-open bins. This is a convenience for pd.cut and only works with numeric data.
dropna (bool, default True) – Don’t include counts of NaN.
method (str, default 'auto') – One of ‘auto’, ‘shuffle’, or ‘tree’. The ‘tree’ method provides better performance, while ‘shuffle’ is recommended if the aggregated result is very large. ‘auto’ uses the ‘shuffle’ method in distributed mode and the ‘tree’ method in local mode.
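
The trade-off behind the method parameter can be pictured with a small plain-Python sketch. This is only a conceptual illustration of the two general aggregation strategies, not MaxFrame's implementation. The helper names tree_value_counts and shuffle_value_counts, the chunks input, and the n_reducers argument are all hypothetical.

```python
from collections import Counter

# Conceptual sketch only: these helpers are hypothetical and do not
# reflect how MaxFrame implements the 'tree' or 'shuffle' methods.

def tree_value_counts(chunks):
    """Count each chunk locally, then merge partial counts pairwise."""
    partials = [Counter(chunk) for chunk in chunks]
    while len(partials) > 1:
        # Merge neighbouring partial results; each pass roughly halves the list.
        partials = [
            sum(partials[i:i + 2], Counter())
            for i in range(0, len(partials), 2)
        ]
    return partials[0] if partials else Counter()

def shuffle_value_counts(chunks, n_reducers=2):
    """Route each value to a reducer by hash, so each reducer holds only part of the result."""
    reducers = [Counter() for _ in range(n_reducers)]
    for chunk in chunks:
        for value in chunk:
            reducers[hash(value) % n_reducers][value] += 1
    return reducers

chunks = [[3, 1, 2], [3, 4]]  # the example data, split into two chunks
tree_result = tree_value_counts(chunks)
shuffle_result = shuffle_value_counts(chunks)
```

In the tree-style sketch the final merge produces the whole result in one place, which is cheap when there are few distinct values. In the shuffle-style sketch the result stays partitioned across reducers, which is plausibly why a shuffle is the safer choice when the aggregated result is very large.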

Return type:

Series

See also

Series.count: Number of non-NA elements in a Series.
DataFrame.count: Number of non-NA elements in a DataFrame.

Examples

>>> import maxframe.dataframe as md
>>> import maxframe.tensor as mt

>>> s = md.Series([3, 1, 2, 3, 4, mt.nan])
>>> s.value_counts().execute()
3.0    2
4.0    1
2.0    1
1.0    1
dtype: int64

With normalize set to True, returns the relative frequencies by dividing each count by the total number of counted values.

>>> s = md.Series([3, 1, 2, 3, 4, mt.nan])
>>> s.value_counts(normalize=True).execute()
3.0    0.4
4.0    0.2
2.0    0.2
1.0    0.2
dtype: float64
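
The arithmetic behind these frequencies can be checked by hand. The following is a minimal plain-Python sketch, assuming the default dropna=True so that NaN is excluded before the total is taken. It uses collections.Counter as a stand-in and is not how MaxFrame computes the result.

```python
import math
from collections import Counter

values = [3, 1, 2, 3, 4, math.nan]

# Drop NaN first, mirroring the default dropna=True.
non_na = [v for v in values if not math.isnan(v)]

counts = Counter(non_na)
total = sum(counts.values())  # 5 counted values, not 6

# Each relative frequency is count / total.
freqs = {value: count / total for value, count in counts.items()}
```

Because the NaN entry is dropped before the division, the value 3 has frequency 2 / 5 = 0.4 rather than 2 / 6.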

bins

Bins can be useful for going from a continuous variable to a categorical variable. Instead of counting unique occurrences of values, divide the index into the specified number of half-open bins.

>>> s.value_counts(bins=3).execute()
(2.0, 3.0]      2
(0.996, 2.0]    2
(3.0, 4.0]      1
dtype: int64
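
To see where these counts come from, the binning can be reproduced by hand. The sketch below assumes equal-width bins that are closed on the right, with the lowest edge lowered by 0.1% of the data range so that the minimum falls inside the first bin, as the pd.cut documentation describes. The helper equal_width_bin_counts is hypothetical and only illustrates the idea.

```python
from bisect import bisect_left

def equal_width_bin_counts(values, bins):
    """Hypothetical helper: count values in equal-width, right-closed bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    edges = [lo + i * width for i in range(bins + 1)]
    edges[-1] = hi                    # guard against floating-point drift
    edges[0] -= 0.001 * (hi - lo)     # assumption: extend the lowest edge by 0.1%
    counts = [0] * bins
    for v in values:
        # bisect_left finds the first edge >= v, so v lands in (edges[i], edges[i + 1]].
        counts[bisect_left(edges, v) - 1] += 1
    return edges, counts

# NaN is left out here, matching the default dropna=True.
edges, counts = equal_width_bin_counts([3, 1, 2, 3, 4], bins=3)
```

With these inputs the three bins hold 2, 2 and 1 values, matching the counts shown above. The values 1 and 2 share the first bin because each bin includes its right edge.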

dropna

With dropna set to False we can also see NaN index values.

>>> s.value_counts(dropna=False).execute()
3.0    2
NaN    1
4.0    1
2.0    1
1.0    1
dtype: int64