cudf.DataFrame.quantile
- DataFrame.quantile(q=0.5, axis=0, numeric_only=True, interpolation=None, columns=None, exact=True, method='single')
Return values at the given quantile.
- Parameters
- q : float or array-like
The quantile(s) to compute, with 0 <= q <= 1.
- axis : int
axis is a NON-FUNCTIONAL parameter
- numeric_only : bool, default True
If False, the quantile of datetime and timedelta data will be computed as well.
- interpolation : {linear, lower, higher, midpoint, nearest}
Specifies the interpolation method to use when the desired quantile lies between two data points i and j. Default is linear for method="single", and nearest for method="table".
- columns : list of str
List of column names to include.
- exact : bool
Whether to use the exact quantile algorithm (True) or an approximate one (False).
- method : {single, table}, default single
Whether to compute quantiles per-column (‘single’) or over all columns (‘table’). When ‘table’, the only allowed interpolation methods are ‘nearest’, ‘lower’, and ‘higher’.
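The interpolation options described above can be illustrated with a minimal pure-Python sketch for a single column. This is not cuDF's implementation: the function name `quantile_sketch` is hypothetical, the position formula `q * (n - 1)` is the common convention assumed here, and the tie-breaking rule for `nearest` is an assumption that may differ from cuDF's.

```python
import math


def quantile_sketch(values, q, interpolation="linear"):
    """Illustrative single-column quantile (hypothetical helper, not cuDF code)."""
    if not 0 <= q <= 1:
        raise ValueError("q must satisfy 0 <= q <= 1")
    data = sorted(values)
    if not data:
        raise ValueError("values must be non-empty")

    # Fractional position of the quantile in the sorted data (assumed convention).
    pos = q * (len(data) - 1)
    i, j = math.floor(pos), math.ceil(pos)
    lower, upper = data[i], data[j]   # the two neighbouring data points
    frac = pos - i                    # how far the position lies between them

    if interpolation == "linear":
        return lower + (upper - lower) * frac
    if interpolation == "lower":
        return lower
    if interpolation == "higher":
        return upper
    if interpolation == "midpoint":
        return (lower + upper) / 2
    if interpolation == "nearest":
        # Assumption: an exact tie (frac == 0.5) resolves to the lower point.
        return lower if frac <= 0.5 else upper
    raise ValueError(f"unknown interpolation: {interpolation!r}")


column_a = [1, 2, 3, 4]
linear_a = quantile_sketch(column_a, 0.1)                # approximately 1.3
lower_a = quantile_sketch(column_a, 0.1, "lower")        # 1
higher_a = quantile_sketch(column_a, 0.1, "higher")      # 2
midpoint_a = quantile_sketch(column_a, 0.1, "midpoint")  # 1.5
```

With q=0.1 and four values, the position is 0.3, so the quantile lies between the first two data points; each interpolation option only changes how that gap is resolved.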
- Returns
- Series or DataFrame
If q is an array or numeric_only is set to False, a DataFrame will be returned where the index is q, the columns are the columns of self, and the values are the quantiles.
If q is a float, a Series will be returned where the index is the columns of self and the values are the quantiles.
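How the result shape follows from q can be sketched with plain dictionaries standing in for Series and DataFrame. The helper names `linear_quantile` and `quantile_shape_sketch` are hypothetical, only linear interpolation is modelled, and the numeric_only=False case mentioned above is not covered.

```python
import math


def linear_quantile(values, q):
    """Linear-interpolated quantile of one column (illustrative helper)."""
    data = sorted(values)
    pos = q * (len(data) - 1)
    i, j = math.floor(pos), math.ceil(pos)
    return data[i] + (data[j] - data[i]) * (pos - i)


def quantile_shape_sketch(columns, q):
    """columns: dict mapping column name -> list of numbers."""
    if isinstance(q, (int, float)):
        # Scalar q: one value per column, analogous to a Series
        # whose index is the column names.
        return {name: linear_quantile(vals, q) for name, vals in columns.items()}
    # Array-like q: one row per quantile, analogous to a DataFrame
    # whose index is q and whose columns are the input columns.
    return {
        qi: {name: linear_quantile(vals, qi) for name, vals in columns.items()}
        for qi in q
    }


table = {"a": [1, 2, 3, 4], "b": [1, 10, 100, 100]}
series_like = quantile_shape_sketch(table, 0.5)        # {'a': 2.5, 'b': 55.0}
frame_like = quantile_shape_sketch(table, [0.1, 0.5])  # keyed by 0.1 and 0.5
```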
Pandas Compatibility Note
DataFrame.quantile
One notable difference from pandas arises when the DataFrame contains non-numeric types and pandas would return a Series. In that case cuDF returns a DataFrame, because it does not support mixed types in a Series.
Examples
>>> import cupy as cp
>>> import cudf
>>> df = cudf.DataFrame(cp.array([[1, 1], [2, 10], [3, 100], [4, 100]]),
...                     columns=['a', 'b'])
>>> df
   a    b
0  1    1
1  2   10
2  3  100
3  4  100
>>> df.quantile(0.1)
a    1.3
b    3.7
Name: 0.1, dtype: float64
>>> df.quantile([.1, .5])
       a     b
0.1  1.3   3.7
0.5  2.5  55.0