Pandas: How to Use Groupby and Count with Condition

You can use the following basic syntax to perform a groupby and count with condition in a pandas DataFrame:
df.groupby('var1')['var2'].apply(lambda x: (x=='val').sum()).reset_index(name='count')
This particular syntax groups the rows of the DataFrame by var1 and then counts the rows in each group where var2 is equal to ‘val’.
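To see why the lambda in this syntax produces a count, it may help to look at the comparison and the sum separately. The sketch below uses a small stand-alone Series (the name s and its values are made up for illustration only): the comparison returns a boolean Series, and summing it treats each True as 1.

```python
import pandas as pd

# Hypothetical stand-alone Series, used only to illustrate the counting step
s = pd.Series(['val', 'other', 'val', 'val'])

# Element-wise comparison: True where the element equals 'val'
mask = (s == 'val')

# Summing a boolean Series counts the True values (True -> 1, False -> 0)
n_matches = mask.sum()

print(mask.tolist())  # [True, False, True, True]
print(n_matches)      # 3
```

Inside groupby(...).apply(...), the same two steps run once per group, which is what yields one count per value of var1.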
The following example shows how to use this syntax in practice.
Example: Groupby and Count with Condition in Pandas
Suppose we have the following pandas DataFrame that contains information about various basketball players:
import pandas as pd
#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
                   'pos': ['Gu', 'Fo', 'Fo', 'Fo', 'Gu', 'Gu', 'Fo', 'Fo'],
                   'points': [18, 22, 19, 14, 14, 11, 20, 28]})
#view DataFrame
print(df)
  team pos  points
0    A  Gu      18
1    A  Fo      22
2    A  Fo      19
3    A  Fo      14
4    B  Gu      14
5    B  Gu      11
6    B  Fo      20
7    B  Fo      28
The following code shows how to group the DataFrame by the team variable and count the number of rows where the pos variable is equal to ‘Gu’:
#groupby team and count number of 'pos' equal to 'Gu'
df_count = df.groupby('team')['pos'].apply(lambda x: (x=='Gu').sum()).reset_index(name='count')
#view results
print(df_count)
  team  count
0    A      1
1    B      2
From the output we can see:
Team A has 1 row where the pos column is equal to ‘Gu’
Team B has 2 rows where the pos column is equal to ‘Gu’
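As an alternative sketch, the same counts can likely be obtained without apply() by building the boolean mask first and then grouping that mask by the team column. This is not the syntax the tutorial uses; it is shown only as an equivalent formulation, and it rebuilds the same example DataFrame so that it runs on its own.

```python
import pandas as pd

# Same example DataFrame as in the tutorial
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
                   'pos': ['Gu', 'Fo', 'Fo', 'Fo', 'Gu', 'Gu', 'Fo', 'Fo'],
                   'points': [18, 22, 19, 14, 14, 11, 20, 28]})

# Build the boolean mask once, group it by the team column, then sum per group
df_count = (df['pos'].eq('Gu')
              .groupby(df['team'])
              .sum()
              .reset_index(name='count'))

print(df_count)
```

Because the comparison is done once on the whole column rather than inside a Python lambda per group, this form tends to be faster on large DataFrames, although for a small table like this one the difference is negligible.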
We can use similar syntax to perform a groupby and count with a numerical condition.
For example, the following code shows how to group by the team variable and count the number of rows where the points variable is greater than 15:
#groupby team and count number of 'points' greater than 15
df_count = df.groupby('team')['points'].apply(lambda x: (x>15).sum()).reset_index(name='count')
#view results
print(df_count)
  team  count
0    A      3
1    B      2
From the output we can see:
Team A has 3 rows where the points column is greater than 15
Team B has 2 rows where the points column is greater than 15 
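If both counts from this example are needed in one table, one possible approach is to add temporary boolean columns and sum them with named aggregation. The column names is_gu, gt_15, gu_count and over_15_count below are hypothetical names chosen for this sketch and do not appear in the tutorial.

```python
import pandas as pd

# Same example DataFrame as in the tutorial
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
                   'pos': ['Gu', 'Fo', 'Fo', 'Fo', 'Gu', 'Gu', 'Fo', 'Fo'],
                   'points': [18, 22, 19, 14, 14, 11, 20, 28]})

summary = (
    df.assign(is_gu=df['pos'].eq('Gu'),      # True where pos equals 'Gu'
              gt_15=df['points'].gt(15))     # True where points exceeds 15
      .groupby('team')
      .agg(gu_count=('is_gu', 'sum'),        # number of 'Gu' rows per team
           over_15_count=('gt_15', 'sum'))   # number of rows with points > 15 per team
      .reset_index()
)

print(summary)
```

The resulting gu_count and over_15_count columns should match the two separate results shown earlier (1 and 2 for ‘Gu’, 3 and 2 for points greater than 15).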
You can use similar syntax to perform a groupby and count with any specific condition you’d like.
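As one illustration of a more specific condition, the sketch below counts, for each team, the rows where pos is equal to ‘Fo’ and points is greater than 15. The particular combination of conditions is an assumption made for this example; the two masks are combined with the & operator and then summed per team.

```python
import pandas as pd

# Same example DataFrame as in the tutorial
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
                   'pos': ['Gu', 'Fo', 'Fo', 'Fo', 'Gu', 'Gu', 'Fo', 'Fo'],
                   'points': [18, 22, 19, 14, 14, 11, 20, 28]})

# Combine two element-wise conditions; both must hold for a row to be counted
mask = df['pos'].eq('Fo') & df['points'].gt(15)

# Group the combined mask by team and count the True values in each group
df_count = mask.groupby(df['team']).sum().reset_index(name='count')

print(df_count)
```

Here both teams should end up with a count of 2: team A has ‘Fo’ rows with 22 and 19 points, and team B has ‘Fo’ rows with 20 and 28 points. Use | in place of & to count rows where either condition holds.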
Additional Resources
The following tutorials explain how to perform other common tasks in pandas:
How to Count Unique Values Using Pandas GroupBy
How to Apply Function to Pandas Groupby
How to Create Bar Plot from Pandas GroupBy
Zach Bobbitt
Hi, thanks for this post.
Is there a way to add another column to this DataFrame, alongside the ‘count’ column that shows how many points values are greater than 15? For example, what if I want to add a column named ‘average’ to the same DataFrame that shows the average per team?
Thanks