# groupbycount()

`groupbycount(col [, ...cols])` counts the number of occurences of each distinct value of the provided column(s). If multiple columns are provided, their values are treated as a tuple.

Note that `groupbycount col` is effectively the same as `stats by col`, differing only in its approximation behavior for very large numbers of groups.

## Technical Notes

* If there are fewer than 1000 distinct groups, `groupbycount` will return those groups and their exact counts.
* If there are >1000 distinct groups, it will return the 1000 most numerous groups, and will approximate their counts using a variant of the [count-min sketch](https://en.wikipedia.org/wiki/Count%E2%80%93min_sketch)—specifically the count-mean-min sketch detailed [here](https://webdocs.cs.ualberta.ca/~drafiei/papers/cmm.pdf).
  * For most inputs, this has a p99 error of 0.1%.
* `groupbycount` does not currently respect `max_bytes` in the API, and will always return at most 1000 distinct groups and their counts.

## Returns

A table with one row for each distinct value of `col [, ...cols]`, with the following columns:

* One column for each `col` provided, containing the value of `col`.
* `@q.count`, containing the number of occurrences of that value.

## Example

Analyze AWS CloudTrail logs to find the most frequently called AWS APIs.

```python
%ingest.source_type: "aws:cloudtrail"
| groupbycount(eventSource)
```
