Getting the rows that hold the maximum value within each pandas groupby group
Take the DataFrame below as an example: group it by Mt and, for each group, extract the row(s) with the largest Count.
    import pandas as pd

    df = pd.DataFrame({'Sp': ['a', 'b', 'c', 'd', 'e', 'f'],
                       'Mt': ['s1', 's1', 's2', 's2', 's2', 's3'],
                       'Value': [1, 2, 3, 4, 5, 6],
                       'Count': [3, 2, 5, 10, 10, 6]})
    df

| | Count | Mt | Sp | Value |
|---|---|---|---|---|
| 0 | 3 | s1 | a | 1 |
| 1 | 2 | s1 | b | 2 |
| 2 | 5 | s2 | c | 3 |
| 3 | 10 | s2 | d | 4 |
| 4 | 10 | s2 | e | 5 |
| 5 | 6 | s3 | f | 6 |
Method 1: within each group, keep only the rows whose Count equals the group maximum
    df.groupby('Mt').apply(lambda t: t[t.Count == t.Count.max()])

| Mt | | Count | Mt | Sp | Value |
|---|---|---|---|---|---|
| s1 | 0 | 3 | s1 | a | 1 |
| s2 | 3 | 10 | s2 | d | 4 |
| | 4 | 10 | s2 | e | 5 |
| s3 | 5 | 6 | s3 | f | 6 |
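Method 1 keeps ties, which is why group s2 returns both rows d and e. If exactly one row per group is wanted instead, one possible variation (not part of the original post) is `idxmax`, which returns the index label of the first maximum in each group. A minimal sketch, assuming the same sample data as above:

```python
import pandas as pd

# Same sample data as in the article.
df = pd.DataFrame({
    'Sp': ['a', 'b', 'c', 'd', 'e', 'f'],
    'Mt': ['s1', 's1', 's2', 's2', 's2', 's3'],
    'Value': [1, 2, 3, 4, 5, 6],
    'Count': [3, 2, 5, 10, 10, 6],
})

# idxmax gives, per group, the index label of the FIRST row holding the maximum Count.
first_max_labels = df.groupby('Mt')['Count'].idxmax()

# Select those rows from the original frame; tied rows after the first are dropped.
one_per_group = df.loc[first_max_labels]
print(one_per_group)
```

Whether dropping the tied row is acceptable depends on the use case; when every maximal row must be kept, the filtering approaches in this article are the safer choice.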
Method 2: use transform to get each group's maximum aligned to the original DataFrame's index, then filter out the rows you need
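    # Maximum Count per group (one value per Mt)
    print(df.groupby(['Mt'])['Count'].agg('max'))
    # Same maximum, broadcast back to every row of the original frame
    idx = df.groupby(['Mt'])['Count'].transform('max')
    print(idx)
    # True where a row's Count equals its group's maximum
    idx1 = idx == df['Count']
    print(idx1)
    df[idx1]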
    Mt
    s1     3
    s2    10
    s3     6
    Name: Count, dtype: int64
    0     3
    1     3
    2    10
    3    10
    4    10
    5     6
    dtype: int64
    0     True
    1    False
    2    False
    3     True
    4     True
    5     True
    dtype: bool
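The output above stops at the boolean mask and does not show the final filtered frame. The following self-contained sketch repeats Method 2 end to end so the result of the last step can be inspected. The variable names `group_max`, `mask`, and `result` are chosen here for illustration and do not appear in the original.

```python
import pandas as pd

# Same sample data as in the article.
df = pd.DataFrame({
    'Sp': ['a', 'b', 'c', 'd', 'e', 'f'],
    'Mt': ['s1', 's1', 's2', 's2', 's2', 's3'],
    'Value': [1, 2, 3, 4, 5, 6],
    'Count': [3, 2, 5, 10, 10, 6],
})

# transform returns a Series with the same index as df,
# where each row carries the maximum Count of its own Mt group.
group_max = df.groupby('Mt')['Count'].transform('max')

# Boolean mask: True for rows whose Count equals the group maximum (ties included).
mask = df['Count'] == group_max

# Boolean indexing keeps the original index and column layout.
result = df[mask]
print(result)
```

Unlike Method 1, the result here keeps the original flat index (rows 0, 3, 4, and 5) rather than gaining an extra Mt index level.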