Aggregate Unique Values From Multiple Columns With Pandas GroupBy


Answer:

Cast the frame to strings with astype(str) so the values can be joined, then use groupby and agg, keeping only the unique values of each column by calling Series.unique:

    df.astype(str).groupby('prop1').agg(lambda x: ','.join(x.unique()))

                prop2       prop3      prop4
    prop1
    K20       12,1,66  travis,leo   10.0,4.0
    L30    3,54,11,10    bob,john  11.2,10.0
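The input frame is not shown here, so the following is a minimal self-contained sketch under stated assumptions: the sample rows are hypothetical, chosen only to be consistent with the output above, and join_unique is an illustrative helper name standing in for the lambda.

```python
import pandas as pd

# Hypothetical sample data (an assumption; the original frame is not shown).
df = pd.DataFrame({
    'prop1': ['L30', 'L30', 'L30', 'L30', 'K20', 'K20', 'K20'],
    'prop2': [3, 54, 11, 10, 12, 1, 66],
    'prop3': ['bob', 'bob', 'john', 'bob', 'travis', 'travis', 'leo'],
    'prop4': [11.2, 10, 10, 10, 10, 4, 10],  # stored as float64
})

def join_unique(col):
    # col is one column of one group; unique() keeps first-seen order.
    return ','.join(col.unique())

# astype(str) is needed because str.join only accepts strings.
result = df.astype(str).groupby('prop1').agg(join_unique)
print(result)
```

Note that Series.unique preserves the order in which values first appear within each group, which is why prop2 for L30 reads 3,54,11,10 rather than a sorted list.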

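To keep the groups in their order of first appearance instead of sorting them by key, pass sort=False:

    df.astype(str).groupby('prop1', sort=False).agg(lambda x: ','.join(x.unique()))

                prop2       prop3      prop4
    prop1
    L30    3,54,11,10    bob,john  11.2,10.0
    K20       12,1,66  travis,leo   10.0,4.0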
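The effect of sort=False is easiest to see on the index of the result. A small sketch, assuming a hypothetical two-column frame in which L30 appears before K20 (the data and the join_unique helper are illustrative, not from the original post):

```python
import pandas as pd

# Hypothetical data: 'L30' occurs first, 'K20' second.
df = pd.DataFrame({
    'prop1': ['L30', 'K20', 'L30', 'K20'],
    'prop3': ['bob', 'travis', 'john', 'leo'],
})

def join_unique(col):
    return ','.join(col.unique())

# Default (sort=True): group keys are sorted.
sorted_result = df.groupby('prop1').agg(join_unique)

# sort=False: group keys keep their order of first appearance.
unsorted_result = df.groupby('prop1', sort=False).agg(join_unique)

print(list(sorted_result.index))    # ['K20', 'L30']
print(list(unsorted_result.index))  # ['L30', 'K20']
```

Only the order of the groups changes; the joined values inside each group are the same in both results.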

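If the data contains NaNs, call fillna('') before astype(str); otherwise each missing value becomes the literal string 'nan'. The regex then collapses the doubled commas that the empty strings leave behind: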

    import re

    df.fillna('').astype(str).groupby('prop1').agg(
        lambda x: re.sub(',+', ',', ','.join(x.unique()))
    )

                prop2       prop3      prop4
    prop1
    K20       12,1,66  travis,leo   10.0,4.0
    L30    3,54,11,10    bob,john  11.2,10.0
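One caveat with the regex approach: it collapses repeated commas in the middle of a string, but it can still leave a leading or trailing comma when the empty string happens to be the first or last unique value in a group. A possible alternative, sketched here as a swapped-in technique rather than the method above, is to drop the missing values inside the aggregation function. The data and the join_unique_skip_nan name are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data with missing values at the start and end of a group.
df = pd.DataFrame({
    'prop1': ['L30', 'L30', 'K20', 'K20'],
    'prop2': [3, 54, 12, 1],
    'prop3': [np.nan, 'bob', 'travis', np.nan],
})

def join_unique_skip_nan(col):
    # Drop NaNs first, then stringify, so no 'nan' text or stray
    # commas can appear in the joined result.
    return ','.join(col.dropna().astype(str).unique())

# No frame-wide astype(str) here: converting inside the function
# keeps NaN recognisable as missing until dropna() has run.
result = df.groupby('prop1').agg(join_unique_skip_nan)
print(result)
```

With this variant a group whose values are all missing yields an empty string, which may or may not be what is wanted for a given data set.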
