How to select columns by condition in Pandas

In Pandas, you can select columns by condition using boolean indexing: you build a mask of True/False values and pass it to the DataFrame, which keeps only the entries marked True.

To select columns, the mask must contain one boolean per column, not one per row. You can build such a mask by testing the column labels (df.columns), the column dtypes, or a per-column summary of the values, for example a comparison (==, >, <, >=, or <=) reduced with .any() or .all(). Passing the mask in the column position of .loc then returns the columns that meet the condition. Here are some examples:

import pandas as pd

# create a sample dataframe
data = {'name': ['John', 'Jane', 'Bob'],
        'age': [30, 25, 40],
        'city': ['New York', 'Paris', 'London']}
df = pd.DataFrame(data)

# select numeric columns that contain a value greater than 25
numeric = df.select_dtypes(include='number')
age_mask = (numeric > 25).any()  # one boolean per numeric column
age_columns = numeric.loc[:, age_mask]
print(age_columns)

# select columns that contain either 'Paris' or 'London'
city_mask = df.isin(['Paris', 'London']).any()  # one boolean per column
city_columns = df.loc[:, city_mask]
print(city_columns)

# select columns whose label starts with 'n'
name_mask = df.columns.str.startswith('n')
name_columns = df.loc[:, name_mask]
print(name_columns)

This will output:

   age
0   30
1   25
2   40

       city
0  New York
1     Paris
2    London

   name
0  John
1  Jane
2   Bob

In the example above, we first created a sample DataFrame with 'name', 'age', and 'city' columns. We then used boolean indexing to select different subsets of columns, each time with a mask that holds one True/False value per column:

  • numeric.loc[:, age_mask] selects the numeric columns that contain at least one value greater than 25 (here, age). select_dtypes is applied first because comparing the text columns with a number would raise a TypeError.
  • df.loc[:, city_mask] selects the columns that contain 'Paris' or 'London' (here, city).
  • df.loc[:, name_mask] selects the columns whose label starts with 'n' (here, name).
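
One point worth checking before using a mask in the column position is what it is indexed by. The short sketch below contrasts a row mask with a column mask on the same sample data; the variable names row_mask and col_mask are illustrative only, and using pd.api.types.is_numeric_dtype as the column condition is just one possible choice.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['John', 'Jane', 'Bob'],
    'age': [30, 25, 40],
    'city': ['New York', 'Paris', 'London'],
})

# A condition on one column gives one boolean per ROW (index 0, 1, 2),
# so it belongs in the row position of .loc.
row_mask = df['age'] > 25
rows = df.loc[row_mask]

# A per-column test gives one boolean per COLUMN (index name, age, city),
# so it belongs in the column position of .loc.
col_mask = df.apply(pd.api.types.is_numeric_dtype)
cols = df.loc[:, col_mask]

print(rows)
print(cols)
```

The row mask keeps whole rows and every column, while the column mask keeps whole columns and every row.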

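Note that a boolean mask for columns goes in the second position of the .loc indexer. The : in the first position selects all rows, while the mask in the second position selects the columns that meet the condition. The mask needs one entry per column: a mask built from a single column, such as df['age'] > 25, has one entry per row and filters rows instead.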
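
If the same pattern is needed repeatedly, it can be wrapped in a small function. The sketch below is one possible way to do that: select_columns is a hypothetical helper (not part of pandas), and the 'score' column with a missing value is made up for the illustration.

```python
import pandas as pd

def select_columns(df, predicate):
    """Return the columns of df for which predicate(column) is truthy.

    Hypothetical helper for illustration; it is not a pandas function.
    """
    # Evaluate the condition once per column: one boolean per column.
    mask = [bool(predicate(column)) for _, column in df.items()]
    # ':' keeps all rows, the mask picks the columns.
    return df.loc[:, mask]

# Sample data; the 'score' column is an assumption added for this example.
df = pd.DataFrame({
    'name': ['John', 'Jane', 'Bob'],
    'age': [30, 25, 40],
    'score': [88.5, None, 92.0],
})

# Columns without any missing values.
complete = select_columns(df, lambda column: column.notna().all())

# Columns with a numeric dtype.
numeric = select_columns(df, pd.api.types.is_numeric_dtype)

print(complete.columns.tolist())
print(numeric.columns.tolist())
```

Passing the condition as a function keeps the mask-building step in one place, so each call only has to state what makes a column qualify.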
