TechBeamersTechBeamers
  • Learn ProgrammingLearn Programming
    • Python Programming
      • Python Basic
      • Python OOP
      • Python Pandas
      • Python PIP
      • Python Advanced
      • Python Selenium
    • Python Examples
    • Selenium Tutorials
      • Selenium with Java
      • Selenium with Python
    • Software Testing Tutorials
    • Java Programming
      • Java Basic
      • Java Flow Control
      • Java OOP
    • C Programming
    • Linux Commands
    • MySQL Commands
    • Agile in Software
    • AngularJS Guides
    • Android Tutorials
  • Interview PrepInterview Prep
    • SQL Interview Questions
    • Testing Interview Q&A
    • Python Interview Q&A
    • Selenium Interview Q&A
    • C Sharp Interview Q&A
    • PHP Interview Questions
    • Java Interview Questions
    • Web Development Q&A
  • Self AssessmentSelf Assessment
    • Python Test
    • Java Online Test
    • Selenium Quiz
    • Testing Quiz
    • HTML CSS Quiz
    • Shell Script Test
    • C/C++ Coding Test
Search
  • Python Multiline String
  • Python Multiline Comment
  • Python Iterate String
  • Python Dictionary
  • Python Lists
  • Python List Contains
  • Page Object Model
  • TestNG Annotations
  • Python Function Quiz
  • Python String Quiz
  • Python OOP Test
  • Java Spring Test
  • Java Collection Quiz
  • JavaScript Skill Test
  • Selenium Skill Test
  • Selenium Python Quiz
  • Shell Scripting Test
  • Latest Python Q&A
  • CSharp Coding Q&A
  • SQL Query Question
  • Top Selenium Q&A
  • Top QA Questions
  • Latest Testing Q&A
  • REST API Questions
  • Linux Interview Q&A
  • Shell Script Questions
© 2024 TechBeamers. All Rights Reserved.
Reading: Pandas Get Average Of Column Or Mean in Python
Font ResizerAa
TechBeamersTechBeamers
Font ResizerAa
  • Python
  • SQL
  • C
  • Java
  • Testing
  • Selenium
  • Agile
  • Linux
  • MySQL
  • Python Quizzes
  • Java Quiz
  • Testing Quiz
  • Shell Script Quiz
  • WebDev Interview
  • Python Basic
  • Python Examples
  • Python Advanced
  • Python OOP
  • Python Selenium
  • General Tech
Search
  • Programming Tutorials
    • Python Tutorial
    • Python Examples
    • Java Tutorial
    • C Tutorial
    • MySQL Tutorial
    • Selenium Tutorial
    • Testing Tutorial
  • Top Interview Q&A
    • SQL Interview
    • Web Dev Interview
  • Best Coding Quiz
    • Python Quizzes
    • Java Quiz
    • Testing Quiz
    • ShellScript Quiz
Follow US
© 2024 TechBeamers. All Rights Reserved.
PandasPython Tutorials

Pandas Get Average Of Column Or Mean in Python

Last updated: May 26, 2024 5:33 pm
By Soumya Agarwal
Share
11 Min Read
Pandas Get Average Of Column Or Mean in Python
SHARE

Welcome to another Python tutorial on Pandas! In this guide, we’ll explore how to get the average of a column using the powerful Pandas library. Pandas is a versatile data manipulation and analysis library that provides easy-to-use data structures for handling and analyzing structured data. If you’re working with datasets in Python, Pandas is a must-have tool in your toolkit.

Contents
Importing PandasLoading a Sample DatasetUsing Pandas to Get the Average or Mean for a ColumnDealing with Missing ValuesPandas to Get the Average for Multiple ColumnsGet the Average Using Pandas Describe() MethodConclusion

Setting Up Pandas to Get the Average of a Column

Before we dive into calculating column averages, let’s make sure you have Pandas installed. Open your terminal or command prompt and type the following command:

# Installing Pandas library
pip install pandas

If you haven’t installed Pandas yet, this command will download and install it for you.

Now, let’s create a Python script or Jupyter Notebook to follow along with the examples.

Importing Pandas

Start by importing Pandas into your script or notebook. Conventionally, Pandas is imported with the alias ‘pd’:

# Import Pandas and refer it using the pds alias
import pandas as pds

This allows us to use ‘pd’ as a shorthand reference for Pandas functions throughout the tutorial.

Loading a Sample Dataset

For the purpose of this tutorial, let’s use a complex dataset. You can easily adapt the code to work with your own dataset later. We’ll use a hypothetical dataset representing the sales of different products:

Empl_IDNameDeptSalaryExp_Years
101EmmaHR600003
102AlexIT800005
103SamSales700002
104JackIT900008
105OliviaHR6500011
106MaxFinance750004
107SophiaIT850006
108CharlieSales7200011
109LilyHR690001
110LucasFinance800007
Complex data set for Pandas to get the average or mean of a column

the following is using the above data and creating a data frame for calculating the mean or average.

import pandas as pds

# Create a sample DataFrame
empl_data = {
    'Empl_ID': [101, 102, 103, 104, 105, 106, 107, 108, 109, 110],
    'Name': ['Emma', 'Alex', 'Sam', 'Jack', 'Olivia', 'Max', 'Sophia', 'Charlie', 'Lily', 'Lucas'],
    'Dept': ['HR', 'IT', 'Sales', 'IT', 'HR', 'Finance', 'IT', 'Sales', 'HR', 'Finance'],
    'Salary': [60000, 80000, 70000, 90000, 65000, 75000, 85000, 72000, 69000, 80000],
    'Exp_Years': [3, 5, 2, 8, 11, 4, 6, 11, 1, 7]
}

dfr = pds.DataFrame(empl_data)

If you want the above data read from CSV using Pandas, then use the following code:

# Load the dataset from CSV
dfr = pds.read_csv('empl_data.csv')

Now, you have a basic DataFrame with five columns: ‘Empl_ID’, ‘Name’, ‘Dept’, ‘Salary’, and ‘Exp_Years’.

Using Pandas to Get the Average or Mean for a Column

To find the average of a specific column, you can use the mean() function provided by Pandas. Let’s calculate the average salary:

average_sales = dfr['Salary'].mean()
print(f'The average sales is: {average_sales}')

In this example, dfr['Salary'] selects the ‘Sales’ column, and .mean() calculates the average. The result is then printed to the console. Easy, right?

In real-world scenarios, datasets are often more extensive. That’s why, we took a rich dataset with more rows and columns: However, you can load your own CSV having larger data.

Pandas seamlessly handle larger datasets, making Pandas library an efficient choice for data analysis tasks.

Dealing with Missing Values

In real-world datasets, you might encounter missing values. Pandas provide a convenient way to handle them. Let’s introduce a missing value in our sales data:

# Introduce a missing value
dfr.at[2, 'Exp_Years'] = None
dfr.at[5, 'Exp_Years'] = None
dfr.at[4, 'Salary'] = None
dfr.at[7, 'Salary'] = None

Now, if you try to calculate the average as before, Pandas will automatically skip the missing value:

avg_exp = df['Exp_Years'].mean()
print(f'The Average Experience is: {avg_exp}')

avg_sal = df['Salary'].mean()
print(f'The Average Salary is: {avg_sal}')

Pandas take care of missing values by excluding them from calculations, ensuring you get accurate results.

Pandas to Get the Average for Multiple Columns

What if you want to find the average for multiple columns? Pandas make it straightforward. Let’s extend our dataset with an ‘Expenses’ column:

import pandas as pds

# Reuse the Eployee Sample Data
empl_data = {
    'Empl_ID': [101, 102, 103, 104, 105, 106, 107, 108, 109, 110],
    'Name': ['Emma', 'Alex', 'Sam', 'Jack', 'Olivia', 'Max', 'Sophia', 'Charlie', 'Lily', 'Lucas'],
    'Dept': ['HR', 'IT', 'Sales', 'IT', 'HR', 'Finance', 'IT', 'Sales', 'HR', 'Finance'],
    'Salary': [60000, 80000, 70000, 90000, 65000, 75000, 85000, 72000, 69000, 80000],
    'Exp_Years': [3, 5, 2, 8, 11, 4, 6, 11, 1, 7]
}

dfr = pds.DataFrame(empl_data)

To calculate the average for both ‘Salary’ and ‘Exp_Years’, you can use the following code:

avg_sal = dfr['Salary'].mean()
avg_exp = dfr['Exp_Years'].mean()

print(f'The Average Salary is: {avg_sal}')
print(f'The Average Experience is: {avg_exp}')

You can easily adapt this approach to calculate averages for as many columns as needed. Let’s now combine all the pieces of code and print the final output.

import pandas as pds

# Reuse the Eployee Sample Data
empl_data = {
    'Empl_ID': [101, 102, 103, 104, 105, 106, 107, 108, 109, 110],
    'Name': ['Emma', 'Alex', 'Sam', 'Jack', 'Olivia', 'Max', 'Sophia', 'Charlie', 'Lily', 'Lucas'],
    'Dept': ['HR', 'IT', 'Sales', 'IT', 'HR', 'Finance', 'IT', 'Sales', 'HR', 'Finance'],
    'Salary': [60000, 80000, 70000, 90000, 65000, 75000, 85000, 72000, 69000, 80000],
    'Exp_Years': [3, 5, 2, 8, 11, 4, 6, 11, 1, 7]
}

dfr = pds.DataFrame(empl_data)

avg_sal = dfr['Salary'].mean()
avg_exp = dfr['Exp_Years'].mean()

print("========= Printing Avg With Actual Data ========")
print(f'The Average Salary is: {avg_sal}')
print(f'The Average Experience is: {avg_exp}')

# Introduce a missing value
dfr.at[2, 'Exp_Years'] = None
dfr.at[5, 'Exp_Years'] = None
dfr.at[4, 'Salary'] = None
dfr.at[7, 'Salary'] = None

avg_sal = dfr['Salary'].mean()
avg_exp = dfr['Exp_Years'].mean()

print("========= Printing Avg With Missing Values ========")
print(f'The Average Salary is: {avg_sal}')
print(f'The Average Experience is: {avg_exp}')

When you run this code, it will give you the following result:

========= Printing Avg With Actual Data ========
The Average Salary is: 74600.0
The Average Experience is: 5.8
========= Printing Avg With Missing Values ========
The Average Salary is: 76125.0
The Average Experience is: 6.5

Once we intentionally introduced some missing values, Pandas reported a slight increase in the average or mean value of the ‘Salary’ and ‘Exp_Year’ columns. This is because some of the rows were discarded due to missing values.

Get the Average Using Pandas Describe() Method

Pandas also has a describe() method which calculates several other things including the mean or average. So, to test this, let’s get the above code modified to call the describe() method.

import pandas as pds

# Reuse the Employee Sample Data
empl_data = {
    'Empl_ID': [101, 102, 103, 104, 105, 106, 107, 108, 109, 110],
    'Name': ['Emma', 'Alex', 'Sam', 'Jack', 'Olivia', 'Max', 'Sophia', 'Charlie', 'Lily', 'Lucas'],
    'Dept': ['HR', 'IT', 'Sales', 'IT', 'HR', 'Finance', 'IT', 'Sales', 'HR', 'Finance'],
    'Salary': [60000, 80000, 70000, 90000, 65000, 75000, 85000, 72000, 69000, 80000],
    'Exp_Years': [3, 5, 2, 8, 11, 4, 6, 11, 1, 7]
}

dfr = pds.DataFrame(empl_data)

avg_sal = dfr['Salary'].mean()
avg_exp = dfr['Exp_Years'].mean()

print("========= Printing Avg With Actual Data ========")
print(f'The Average Salary is: {avg_sal}')
print(f'The Average Experience is: {avg_exp}')

# Introduce missing values
dfr.at[2, 'Exp_Years'] = None
dfr.at[5, 'Exp_Years'] = None
dfr.at[4, 'Salary'] = None
dfr.at[7, 'Salary'] = None

# Use describe() to get a summary including mean
summary = dfr.describe()

# Display the mean from the summary
avg_sal_from_desc = summary.loc['mean', 'Salary']
avg_exp_from_desc = summary.loc['mean', 'Exp_Years']

print("\n========= Printing Avg With Missing Values Using describe() ========")
print(f'The Average Salary is: {avg_sal_from_desc}')
print(f'The Average Experience is: {avg_exp_from_desc}')

So, even with this new method, we get the similar results as shown below:

========= Printing Avg With Actual Data ========
The Average Salary is: 74600.0
The Average Experience is: 5.8

========= Printing Avg With Missing Values Using describe() ========
The Average Salary is: 76125.0
The Average Experience is: 6.5

Conclusion

Congratulations! You’ve successfully learned how to calculate the average for a column in Pandas. We covered importing Pandas, loading a sample dataset, handling missing values, and calculating averages for both single and multiple columns.

Pandas provides a powerful and intuitive way to work with data, making tasks like this a breeze. As you continue your journey with Python, you’ll find Pandas to be an invaluable tool for data manipulation and analysis.

Python for Data Science

Check this Beginners’s Guide to Learn Pandas Series and DataFrames.

19 Min ReadPython Pandas Tutorial

If you want us to continue writing such tutorials, support us by sharing this post on your social media accounts like Facebook / Twitter. This will encourage us and help us reach more people.

Happy Coding,
Team TechBeamers

You Might Also Like

How to Connect to PostgreSQL in Python

Generate Random IP Address (IPv4/IPv6) in Python

Python Remove Elements from a List

Selenium Python Extent Report Guide

10 Python Tricky Coding Exercises

Sign Up For Daily Newsletter

Be keep up! Get the latest breaking news delivered straight to your inbox.
Loading
By signing up, you agree to our Terms of Use and acknowledge the data practices in our Privacy Policy. You may unsubscribe at any time.
Soumya Agarwal Avatar
By Soumya Agarwal
Follow:
I'm a BTech graduate from IIITM Gwalior. I have been actively working with large MNCs like ZS and Amazon. My development skills include Android and Python programming, while I keep learning new technologies like data science, AI, and LLMs. I have authored many articles and published them online. I frequently write on Python programming, Android, and popular tech topics. I wish my tutorials are new and useful for you.
Previous Article Linux Commands for Beginners With Examples Basic Linux Commands for Beginners With Examples
Next Article How to Get the List Shape in Python How to Find the List Shape in Python

Popular Tutorials

SQL Interview Questions List
50 SQL Practice Questions for Good Results in Interview
SQL Interview Nov 01, 2016
Demo Websites You Need to Practice Selenium
7 Sites to Practice Selenium for Free in 2024
Selenium Tutorial Feb 08, 2016
SQL Exercises with Sample Table and Demo Data
SQL Exercises – Complex Queries
SQL Interview May 10, 2020
Java Coding Questions for Software Testers
15 Java Coding Questions for Testers
Selenium Tutorial Jun 17, 2016
30 Quick Python Programming Questions On List, Tuple & Dictionary
30 Python Programming Questions On List, Tuple, and Dictionary
Python Basic Python Tutorials Oct 07, 2016
//
Our tutorials are written by real people who’ve put in the time to research and test thoroughly. Whether you’re a beginner or a pro, our tutorials will guide you through everything you need to learn a programming language.

Top Coding Tips

  • PYTHON TIPS
  • PANDAS TIPSNew
  • DATA ANALYSIS TIPS
  • SELENIUM TIPS
  • C CODING TIPS
  • GDB DEBUG TIPS
  • SQL TIPS & TRICKS

Top Tutorials

  • PYTHON TUTORIAL FOR BEGINNERS
  • SELENIUM WEBDRIVER TUTORIAL
  • SELENIUM PYTHON TUTORIAL
  • SELENIUM DEMO WEBSITESHot
  • TESTNG TUTORIALS FOR BEGINNERS
  • PYTHON MULTITHREADING TUTORIAL
  • JAVA MULTITHREADING TUTORIAL

Sign Up for Our Newsletter

Subscribe to our newsletter to get our newest articles instantly!

Loading
TechBeamersTechBeamers
Follow US
© 2024 TechBeamers. All Rights Reserved.
  • About
  • Contact
  • Disclaimer
  • Privacy Policy
  • Terms of Use
TechBeamers Newsletter - Subscribe for Latest Updates
Join Us!

Subscribe to our newsletter and never miss the latest tech tutorials, quizzes, and tips.

Loading
Zero spam, Unsubscribe at any time.
x