Groupby - Pandas User Guide: Original Text and Translation, Part 9


2020-06-10


Applying different functions to DataFrame columns

By passing a dict to aggregate you can apply a different aggregation to the columns of a DataFrame. Each key of the dict names a column, and each value is the aggregation applied to that column:

In [85]: grouped.agg({'C': np.sum,
   ....:              'D': lambda x: np.std(x, ddof=1)})

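### np.sum is applied directly to the values of column C in each group

### np.std (with ddof=1) is applied directly to the values of column D in each group

The result for column D in In [85] is the same as that of the statement below, because the pandas version documented here substitutes its own std (sample standard deviation, ddof=1) when np.std is passed to agg; called on its own, np.std defaults to ddof=0:

grouped["D"].agg(np.std)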
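The dict form above can be tried end to end with a small frame. This is a minimal sketch: the names df and grouped follow the guide, but the data values are made up for illustration, and lambdas are used for both columns so that the result does not depend on how a given pandas version treats NumPy callables.

```python
import numpy as np
import pandas as pd

# Hypothetical data shaped like the guide's df (the values are made up).
df = pd.DataFrame({
    "A": ["foo", "bar", "foo", "bar", "foo", "bar"],
    "B": ["one", "one", "two", "three", "two", "two"],
    "C": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "D": [2.0, 4.0, 6.0, 8.0, 10.0, 13.0],
})
grouped = df.groupby("A")

# One aggregation per column: sum of C, sample standard deviation of D.
result = grouped.agg({
    "C": lambda x: x.sum(),
    "D": lambda x: np.std(x, ddof=1),
})
print(result)
```

The result has one row per group in A and only the columns named in the dict, so column B is dropped.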

The function names can also be strings. In order for a string to be valid it must be either implemented on GroupBy or available via dispatching, that is, it must name a method that GroupBy defines itself or one that GroupBy can forward to the underlying pandas objects:

In [86]: grouped.agg({'C': 'sum', 'D': 'std'})

The result is the same as In [85].
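One way to check that string names and callables agree is to compute both and compare them. The sketch below assumes a small made-up frame; only the column names A, C and D come from the guide.

```python
import pandas as pd

# Hypothetical data; only the column names follow the guide.
df = pd.DataFrame({
    "A": ["foo", "bar", "foo", "bar", "foo", "bar"],
    "C": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "D": [2.0, 4.0, 6.0, 8.0, 10.0, 13.0],
})
grouped = df.groupby("A")

# Aggregations named by string, resolved by GroupBy itself.
by_name = grouped.agg({"C": "sum", "D": "std"})

# The same aggregations written as explicit callables.
by_callable = grouped.agg({
    "C": lambda x: x.sum(),
    "D": lambda x: x.std(ddof=1),
})

# Raises AssertionError if the two frames differ.
pd.testing.assert_frame_equal(by_name, by_callable)
print(by_name)
```

The string form is usually the better choice when it is available, since it names the GroupBy method directly instead of routing each group through a Python function.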

Cython-optimized aggregation functions

Some common aggregations, currently only sum, mean, std, and sem, have optimized Cython implementations. Of course sum and mean are implemented on pandas objects, so the following code would work even without the special versions via dispatching (see the guide's later section on dispatching):

In [87]: df.groupby('A').sum()
Out[87]:
            C         D
A
bar  0.392940  1.732707
foo -1.796421  2.824590

In [88]: df.groupby(['A', 'B']).mean()
Out[88]:
                  C         D
A   B
bar one    0.254161  1.511763
    three  0.215897 -0.990582
    two   -0.077118  1.211526
foo one   -0.491888  0.807291
    three -0.862495  0.024580
    two    0.024925  0.592714

Again, sum and mean are implemented on pandas objects, so the code above works even without the special versions via dispatching.
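The two calls above can be reproduced on a small frame to see that calling sum directly and naming it through agg give the same table. This is a sketch with made-up values. The numeric columns are selected explicitly, which is an addition to the guide's code, so that the text column B is never summed when grouping by A alone.

```python
import pandas as pd

# Hypothetical data; the column names follow the guide, the values do not.
df = pd.DataFrame({
    "A": ["foo", "bar", "foo", "bar", "foo", "bar"],
    "B": ["one", "one", "two", "three", "two", "two"],
    "C": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "D": [2.0, 4.0, 6.0, 8.0, 10.0, 13.0],
})

# Group by one key and sum the numeric columns.
values = df.groupby("A")[["C", "D"]]
totals = values.sum()

# The same aggregation requested by name through agg.
via_agg = values.agg("sum")

# Group by two keys; the result is indexed by a MultiIndex of (A, B).
means = df.groupby(["A", "B"])[["C", "D"]].mean()

print(totals)
print(means)
```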
