cudf.core.groupby.groupby.GroupBy.agg

GroupBy.agg(func)

Apply aggregation(s) to the groups.

Parameters
func : str, callable, list or dict

Argument specifying the aggregation(s) to perform on the groups. func can be any of the following:

  • string: the name of a supported aggregation

  • callable: a function that accepts a Series/DataFrame and performs a supported operation on it.

  • list: a list of strings/callables specifying the aggregations to perform on every column.

  • dict: a mapping of column names to a string, a callable, or a list of these, specifying the aggregations to perform on those columns.

See the user guide (basics.groupby) for supported aggregations.

Returns
A Series or DataFrame containing the combined results of the
aggregation(s).
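
As a rough illustration of the return types, the sketch below reuses the example frame from this page. It assumes that selecting a single column before aggregating yields a Series for one aggregation and a DataFrame for a list of aggregations. The pandas fallback import, the alias `xdf`, and the variable names are illustrative choices, not part of the cudf documentation.

```python
# Assumption: fall back to pandas when cudf cannot be imported, so the
# sketch also runs on a CPU-only machine. The calls used below exist
# under the same names in both libraries.
try:
    import cudf as xdf
except Exception:  # cudf not installed, or no usable GPU
    import pandas as xdf

df = xdf.DataFrame({'a': [1, 1, 2], 'b': [1, 2, 3], 'c': [2, 2, 1]})

# Whole-frame groupby with one aggregation: a DataFrame with one
# column per non-key input column.
frame_result = df.groupby('a').agg('sum').sort_index()

# One selected column with one aggregation: a Series.
series_result = df.groupby('a')['b'].agg('sum').sort_index()

# One selected column with a list of aggregations: a DataFrame with
# one column per aggregation.
multi_result = df.groupby('a')['b'].agg(['sum', 'min']).sort_index()

print(type(frame_result).__name__)   # DataFrame
print(type(series_result).__name__)  # Series
print(type(multi_result).__name__)   # DataFrame
```

The `sort_index()` calls only make the row order deterministic; they do not affect which type is returned.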

Examples

>>> import cudf
>>> a = cudf.DataFrame(
...     {'a': [1, 1, 2], 'b': [1, 2, 3], 'c': [2, 2, 1]})
>>> a.groupby('a').agg('sum')
   b  c
a
2  3  1
1  3  4

Specifying a list of aggregations to perform on each column.

>>> a.groupby('a').agg(['sum', 'min'])
    b       c
  sum min sum min
a
2   3   3   1   1
1   3   1   4   2

Using a dict to specify aggregations to perform per column.

>>> a.groupby('a').agg({'a': 'max', 'b': ['min', 'mean']})
    a   b
  max min mean
a
2   2   3  3.0
1   1   1  1.5

Using lambdas/callables to specify aggregations taking parameters.

>>> f1 = lambda x: x.quantile(0.5); f1.__name__ = "q0.5"
>>> f2 = lambda x: x.quantile(0.75); f2.__name__ = "q0.75"
>>> a.groupby('a').agg([f1, f2])
     b          c
  q0.5 q0.75 q0.5 q0.75
a
1  1.5  1.75  2.0   2.0
2  3.0  3.00  1.0   1.0
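
The lambdas above need `__name__` assigned by hand because every lambda is otherwise named `<lambda>`. A possible alternative, sketched here under the assumption that functions defined with `def` are accepted in the same way as the lambdas, lets each function carry its own name into the column labels. The names `q50` and `q75`, the alias `xdf`, and the pandas fallback are hypothetical choices for this sketch.

```python
# Assumption: fall back to pandas when cudf cannot be imported, so the
# sketch also runs on a CPU-only machine.
try:
    import cudf as xdf
except Exception:  # cudf not installed, or no usable GPU
    import pandas as xdf

df = xdf.DataFrame({'a': [1, 1, 2], 'b': [1, 2, 3], 'c': [2, 2, 1]})

def q50(x):
    # Median of each group, expressed as the 0.5 quantile.
    return x.quantile(0.5)

def q75(x):
    # Upper quartile of each group.
    return x.quantile(0.75)

# The function names become the second level of the column labels.
quantiles = df.groupby('a').agg([q50, q75]).sort_index()
print(quantiles)
```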