java - Union of sets in Spark querying Cassandra

The table structure in Cassandra:

identifier, date, set(integer)

What I want to achieve using Spark is to group the rows by identifier and date, and aggregate the set values. A clearer example:

Raw data (consider the letters as representing integers):

id1, 05-05-2017, {a,b,c}
id1, 05-05-2017, {c,d}
id1, 26-05-2017, {a,b,c}
id1, 26-05-2017, {b,c}
id2, 26-05-2017, {a,b,c}
id2, 26-05-2017, {b,c,d}

Output:

id1, 05-05-2017, {a,b,c,d}
id1, 26-05-2017, {a,b,c}
id2, 26-05-2017, {a,b,c,d}

Since it is a set, I want only unique values in the aggregated results. I am using Java and the Dataset API.
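To make the intended result concrete, here is a minimal sketch of the grouping and set union described above, written with plain Java collections and no Spark or Cassandra involved. The class name SetUnionSketch, the Row type, and the method unionByKey are hypothetical names used only for illustration, and the letters a, b, c, d from the example are assumed to stand for 1, 2, 3, 4.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class SetUnionSketch {
    // Hypothetical row type mirroring the table: identifier, date, set of integers.
    static final class Row {
        final String identifier;
        final String date;
        final Set<Integer> values;

        Row(String identifier, String date, Set<Integer> values) {
            this.identifier = identifier;
            this.date = date;
            this.values = values;
        }
    }

    // Groups rows by (identifier, date) and unions the sets within each group.
    // A TreeSet keeps the values unique and sorted; a LinkedHashMap keeps
    // the groups in first-seen order.
    static Map<List<String>, Set<Integer>> unionByKey(List<Row> rows) {
        Map<List<String>, Set<Integer>> result = new LinkedHashMap<>();
        for (Row r : rows) {
            result.computeIfAbsent(List.of(r.identifier, r.date), k -> new TreeSet<>())
                  .addAll(r.values);
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumption: a=1, b=2, c=3, d=4.
        List<Row> rows = List.of(
            new Row("id1", "05-05-2017", Set.of(1, 2, 3)),
            new Row("id1", "05-05-2017", Set.of(3, 4)),
            new Row("id1", "26-05-2017", Set.of(1, 2, 3)),
            new Row("id1", "26-05-2017", Set.of(2, 3)),
            new Row("id2", "26-05-2017", Set.of(1, 2, 3)),
            new Row("id2", "26-05-2017", Set.of(2, 3, 4)));

        // Prints:
        // [id1, 05-05-2017] -> [1, 2, 3, 4]
        // [id1, 26-05-2017] -> [1, 2, 3]
        // [id2, 26-05-2017] -> [1, 2, 3, 4]
        unionByKey(rows).forEach((key, set) -> System.out.println(key + " -> " + set));
    }
}
```

This only illustrates the semantics on in-memory data; the goal is to get the same result from Spark over the Cassandra table.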

If your DataFrame has the columns mentioned, you can do this (with a static import of org.apache.spark.sql.functions.*):

df.withColumn("set", explode(col("set"))).groupBy("identifier", "date").agg(collect_set("set"))

TY