Aggregate Unique Values From Multiple Columns With Pandas GroupBy


Answer:

Cast the frame to strings (str.join only accepts strings), then use groupby and agg, joining only the unique values of each group with Series.unique:

    df.astype(str).groupby('prop1').agg(lambda x: ','.join(x.unique()))

                prop2       prop3      prop4
    prop1
    K20       12,1,66  travis,leo   10.0,4.0
    L30    3,54,11,10    bob,john  11.2,10.0

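By default groupby sorts the group keys. To keep the groups in their order of first appearance, pass sort=False:

    df.astype(str).groupby('prop1', sort=False).agg(lambda x: ','.join(x.unique()))

                prop2       prop3      prop4
    prop1
    L30    3,54,11,10    bob,john  11.2,10.0
    K20       12,1,66  travis,leo   10.0,4.0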
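For anyone who wants to run the two calls above end to end, here is a minimal self-contained sketch. The sample frame is hypothetical (the question's original data is not reproduced here), and `join_unique` is just an illustrative name for the lambda.

```python
import pandas as pd

# Hypothetical sample data, not the frame from the original question.
df = pd.DataFrame({
    "prop1": ["L30", "L30", "K20", "K20", "L30"],
    "prop2": [3, 54, 12, 1, 3],
    "prop3": ["bob", "john", "travis", "leo", "bob"],
    "prop4": [11.2, 10.0, 10.0, 4.0, 11.2],
})

def join_unique(s):
    # Series.unique keeps values in order of first appearance.
    return ",".join(s.unique())

# Default: group keys are sorted (K20 before L30).
sorted_result = df.astype(str).groupby("prop1").agg(join_unique)

# sort=False: group keys keep their order of first appearance (L30 first).
unsorted_result = df.astype(str).groupby("prop1", sort=False).agg(join_unique)

print(sorted_result)
print(unsorted_result)
```

Both results hold the same joined strings; only the row order differs.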

If handling NaNs is important, call fillna in advance:

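    import re

    df.fillna('').astype(str).groupby('prop1').agg(
        # Collapse the empty slots left by NaNs, then drop any leading
        # or trailing comma (when the NaN was first or last in the group).
        lambda x: re.sub(',+', ',', ','.join(x.unique())).strip(',')
    )

                prop2       prop3      prop4
    prop1
    K20       12,1,66  travis,leo   10.0,4.0
    L30    3,54,11,10    bob,john  11.2,10.0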
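As an alternative to filling and then cleaning up commas, the missing values can be dropped inside the aggregation itself. This is only a sketch of a different technique, not the approach from the answer above; the frame `df_nan` and the helper `join_unique_non_null` are hypothetical names introduced for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data with missing values in both a text and a numeric column.
df_nan = pd.DataFrame({
    "prop1": ["K20", "K20", "L30", "L30"],
    "prop3": [np.nan, "leo", "bob", np.nan],
    "prop4": [10.0, np.nan, np.nan, 11.2],
})

def join_unique_non_null(s):
    # Drop NaNs first, then cast to str so that join accepts the values.
    return ",".join(s.dropna().astype(str).unique())

result_nan = df_nan.groupby("prop1").agg(join_unique_non_null)
print(result_nan)
```

Because the NaNs never reach the join, no regular expression is needed to tidy the separators.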
