Hi databricksuser-5173,

You're right: make_set() does not guarantee any specific order of elements, as stated in the documentation. If preserving the input order is important, you can use make_list() instead, which retains the order of the rows it receives:
| summarize ordered_vals = make_list(columnName) by groupColumn
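Note that make_list() can only follow the input order when the rows reaching summarize are in a defined order. If you need a specific order, sort the rows explicitly before summarize. To keep the existing row order well defined in distributed contexts, you can add serialize before the aggregation: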
| serialize
| summarize ordered_vals = make_list(columnName) by groupColumn
Note: make_list() does not remove duplicates.
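As a small illustration of the difference, here is a minimal sketch on made-up sample data. The datatable contents and the ts column are assumptions for the example only; groupColumn and columnName are the same placeholders used above. It sorts explicitly so that the order seen by summarize is unambiguous:

```kusto
// Hypothetical sample data; replace with your own table.
datatable(groupColumn:string, ts:int, columnName:string)
[
    "a", 1, "x",
    "a", 2, "y",
    "a", 3, "x",
    "b", 1, "z"
]
| sort by groupColumn asc, ts asc            // make the row order explicit before aggregating
| summarize
    ordered_vals  = make_list(columnName),   // keeps duplicates, follows the sorted input
    distinct_vals = make_set(columnName)     // removes duplicates, order not guaranteed
    by groupColumn
```

For group "a", ordered_vals should contain the value "x" twice, while distinct_vals should contain it only once, in no guaranteed position.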
I hope this information helps. Please do let us know if you have any further queries.
Kindly consider upvoting the comment if the information provided is helpful. This can assist other community members in resolving similar issues.