alexa
Facebook
Twitter
LinkedIn
Instagram
Whatsapp
Call Now
Quick Inquiry

How to speed up conditional groupby sum in python pandas ?

How to speed up conditional groupby sum in python pandas ?

One option is a double merge and a groupby:

 date = pd.Series(df.Date1.unique(), name='Date')
step1 = df.merge(date, left_on = 'Date2', right_on = 'Date', how = 'outer')
step2 = step1.loc[step1.Date1 < step1.Date]
step2 = step2.groupby(['Case', 'Id', 'Date']).agg(sum=('Quantity','sum'))
(df
.loc[:, ['Case', 'Id', 'Date2']]
.drop_duplicates()
.rename(columns={'Date2':'Date'})
.merge(step2, how = 'left', on = ['Case', 'Id', 'Date'])
.fillna({'sum': 0}, downcast='infer')
)

   Case  Id       Date  sum
0     1   1 2020-01-01    0
1     1   1 2020-02-01  100
2     1   2 2020-01-01    0
3     1   2 2020-02-01   35

15 0
7

Write a Comments


* Be the first to Make Comment

GoodFirms Badge
GoodFirms Badge

Fix Your Meeting With Our SEO Consultants in India To Grow Your Business Online