Of course, the general purpose of a GROUP BY
clause is to apply aggregate functions to the groups. A simple example is to make a simple view with two tables side by side. Name the first one Source
and leave its sample data in place. In the second one, place this expression binding on props.data
:
orderBy(
forEach(
groupBy({../Source.props.data}, it()['country']),
asMap(
'country', it()[0],
'cities', len(it()[1]),
'population', sum(
forEach(
it()[1],
it()['population']
)
)
)
),
descending(it()['population'])
)
You get this output:
[
{
"country": "China",
"cities": 1,
"population": 24153000
},
{
"country": "United States",
"cities": 8,
"population": 19547689
},
{
"country": "Japan",
"cities": 2,
"population": 16308742
},
{
"country": "Nigeria",
"cities": 1,
"population": 16060303
},
{
"country": "Pakistan",
"cities": 1,
"population": 14910352
},
{
"country": "Bangladesh",
"cities": 1,
"population": 14399000
},
{
"country": "Turkey",
"cities": 1,
"population": 14025000
},
{
"country": "Iran",
"cities": 3,
"population": 13024236
},
{
"country": "Indonesia",
"cities": 2,
"population": 12763073
},
{
"country": "India",
"cities": 1,
"population": 11034555
},
{
"country": "Mexico",
"cities": 2,
"population": 10469913
},
{
"country": "Egypt",
"cities": 1,
"population": 10230350
},
{
"country": "Saudi Arabia",
"cities": 2,
"population": 9132880
},
{
"country": "United Kingdom",
"cities": 1,
"population": 8825001
},
{
"country": "Colombia",
"cities": 1,
"population": 7878783
},
{
"country": "Brazil",
"cities": 1,
"population": 6429923
},
{
"country": "Singapore",
"cities": 1,
"population": 5535000
},
{
"country": "Russia",
"cities": 1,
"population": 5191690
},
{
"country": "Ivory Coast",
"cities": 1,
"population": 4765000
},
{
"country": "Tanzania",
"cities": 1,
"population": 4364541
},
{
"country": "Italy",
"cities": 2,
"population": 4237120
},
{
"country": "Germany",
"cities": 1,
"population": 3517424
},
{
"country": "Afghanistan",
"cities": 1,
"population": 3414100
},
{
"country": "Spain",
"cities": 1,
"population": 3233527
},
{
"country": "Ukraine",
"cities": 1,
"population": 2908703
},
{
"country": "Nicaragua",
"cities": 1,
"population": 2560789
},
{
"country": "France",
"cities": 1,
"population": 2229621
},
{
"country": "Philippines",
"cities": 1,
"population": 1780148
},
{
"country": "Canada",
"cities": 1,
"population": 1649519
},
{
"country": "Czech Republic",
"cities": 1,
"population": 1241664
},
{
"country": "Armenia",
"cities": 1,
"population": 1060138
},
{
"country": "Tunisia",
"cities": 1,
"population": 1056247
},
{
"country": "New Zealand",
"cities": 2,
"population": 810000
},
{
"country": "Finland",
"cities": 1,
"population": 635591
},
{
"country": "Australia",
"cities": 1,
"population": 208374
}
]
Since the data is in JSON format, not dataset format, there must be a forEach()
inside the aggregate function to pull the desired data out of the map. If the source data to groupBy()
is a dataset, then the value of it()[1]
when iterating through the groups will also be a dataset, and you would use sum(it()[1], 'population')
instead of generating the sequence for sum()
. Like so:
orderBy(
forEach(
groupBy({../Source.props.data}, it()['country']),
asMap(
'country', it()[0],
'cities', len(it()[1]),
'population', sum(it()[1], 'population')
)
),
descending(it()['population'])
)
Also, found a bug in orderBy()
. Link fixed above.
I should emphasize that when you look at these iterators, remember that the first argument, the source, is evaluated completely before that function starts its own loop. So if you see it()
or idx()
in that position, it is referencing an outer iteration loop.
{ timeMe()
says that first expression runs in 1.7-2.2 milliseconds. }