How to Use If-Else Function in Pandas DataFrame
As a data scientist or software engineer, you are probably familiar with the concept of conditional statements. One of the most common conditional statements used in programming is the if-else statement. The if-else statement is used to execute a block of code if a certain condition is true, and another block of code if the condition is false.
In this article, we will discuss how to apply if-else logic to a Pandas DataFrame. Pandas is a popular data analysis library in Python that provides various functions to manipulate and analyze data.
What is Pandas DataFrame?
Before we dive into if-else logic, let’s first understand what a Pandas DataFrame is. A Pandas DataFrame is a two-dimensional, size-mutable tabular data structure with columns of potentially different types. It is similar to a spreadsheet or SQL table, where each column represents a variable and each row represents an observation.
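As a quick illustration of that description, here is a minimal sketch with made-up sample data (the column names and values are assumptions for illustration only). It shows columns of different types living in one DataFrame, and a column being added after construction.

```python
import pandas as pd

# Hypothetical sample data: each column holds a different type.
df = pd.DataFrame({
    "Name": ["Alice", "Bob"],      # strings
    "Score": [70, 40],             # integers
    "Enrolled": [True, False],     # booleans
})

print(df.shape)   # (number of rows, number of columns)
print(df.dtypes)  # one dtype per column

# Size-mutable: a new column can be added after the DataFrame is created.
df["Bonus"] = [1.5, 0.0]
print(df.shape)
```

Each row here is one observation (a student) and each column is one variable, which is the layout the rest of this article assumes.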
Pandas provides various functions to manipulate and analyze data in a DataFrame. It has no dedicated if-else function, but you can apply if-else logic to a column by combining existing Pandas methods with a Python conditional expression, for example to derive a new column whose values depend on a condition.
How to Use If-Else Function in Pandas DataFrame
To apply if-else logic in a Pandas DataFrame, you can use the apply() function along with a lambda function. When called on a single column (a Series), apply() calls the function once for each value; when called on a DataFrame, it calls the function along an axis, that is, once per column or once per row. A lambda function is a short, anonymous function; here it takes in a value and returns a result based on a certain condition.
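The difference between calling apply() on one column and calling it on the whole DataFrame can be sketched as follows. The column names, values, and labels are hypothetical, chosen only to illustrate the two call styles.

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"Math": [70, 40, 55], "Art": [45, 80, 30]})

# Series.apply: the lambda receives one value at a time.
math_level = df["Math"].apply(lambda x: "High" if x >= 50 else "Low")

# DataFrame.apply with axis=1: the lambda receives one row (a Series) at a time,
# so the condition can compare several columns.
best_subject = df.apply(
    lambda row: "Math" if row["Math"] >= row["Art"] else "Art",
    axis=1,
)

print(math_level.tolist())
print(best_subject.tolist())
```

The column form is enough when the condition depends on a single column; the row-wise form with axis=1 is useful when it depends on more than one.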
Let’s take an example to understand how if-else logic works in a Pandas DataFrame. Suppose you have a DataFrame with two columns, 'Name' and 'Score', and you want to add a third column, 'Result', which will contain the value 'Pass' if the score is greater than or equal to 50 and 'Fail' otherwise.
import pandas as pd
# Create a DataFrame
df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie', 'Dave', 'Eve'],
                   'Score': [70, 40, 60, 80, 30]})
# Apply an if-else condition to each score to create a new column 'Result'
df['Result'] = df['Score'].apply(lambda x: 'Pass' if x >= 50 else 'Fail')
# Print the DataFrame
print(df)
Output:
      Name  Score Result
0    Alice     70   Pass
1      Bob     40   Fail
2  Charlie     60   Pass
3     Dave     80   Pass