How to replace NaN with the mean in Pandas

In this post, you will learn how to replace NaN values in a Pandas dataframe with the column mean.

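There are many ways to handle missing values in a dataframe. Here I will use the SimpleImputer class from scikit-learn (the sklearn package) to replace each NaN value with the mean of its column. First, let's build a small dataframe with one missing value in Column2.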

import pandas as pd

my_data = {'Column1': [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7],
           'Column2': [100000000, 120000000, None, 260000000, 210000000, 80000000, 40000000]}

my_df = pd.DataFrame(my_data)


The SimpleImputer class lets you replace NaN values using one of several strategies. The code below replaces NaN with the column mean.

import pandas as pd
from sklearn.impute import SimpleImputer


my_data = {'Column1': [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7],
           'Column2': [100000000, 120000000, None, 260000000, 210000000, 80000000, 40000000]}

my_df = pd.DataFrame(my_data)

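# Imputer that fills each missing value with the mean of its column
nan_to_mean = SimpleImputer(strategy='mean')

# fit_transform expects 2D input (hence the double brackets) and returns
# a 2D array, so flatten it before assigning back to the column
my_df['Column2'] = nan_to_mean.fit_transform(my_df[['Column2']]).ravel()

print(f'No more NaN values in my dataframe:\n{my_df}')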
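
For comparison, the same result can be obtained with Pandas alone, without scikit-learn. The snippet below is a minimal sketch that reuses the sample data from above. It relies on the fact that Series.mean() skips NaN values by default; the variable name column_mean is my own choice for illustration.

```python
import pandas as pd

my_data = {'Column1': [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7],
           'Column2': [100000000, 120000000, None, 260000000, 210000000, 80000000, 40000000]}

my_df = pd.DataFrame(my_data)

# Series.mean() ignores NaN by default, so this is the mean of the six known values
column_mean = my_df['Column2'].mean()

# fillna returns a new Series with every NaN replaced by the given value
my_df['Column2'] = my_df['Column2'].fillna(column_mean)

print(my_df)
```

This is handy for a quick one-off cleanup. SimpleImputer is probably the better fit when the same fill value has to be learned on one dataset and then applied to another, because the fitted imputer remembers the mean.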

There are also other strategies for replacing missing values. The strategy parameter of SimpleImputer accepts:

  • mean
  • median
  • most_frequent
  • constant

You can choose the strategy that best suits your data and analysis needs.
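
To see how these strategies differ, here is a small sketch that runs several of them on the same Column2 data and records the value that ends up in the row that was NaN. The results dictionary and the fill value of 0 for the constant strategy are assumptions made for this illustration only. The most_frequent strategy is left out of the loop because every value in this sample occurs exactly once.

```python
import pandas as pd
from sklearn.impute import SimpleImputer

my_df = pd.DataFrame(
    {'Column2': [100000000, 120000000, None, 260000000, 210000000, 80000000, 40000000]})

# Maps each strategy name to the value used in place of NaN (row index 2)
results = {}

for strategy in ['mean', 'median', 'constant']:
    # The 'constant' strategy needs an explicit fill_value; 0 is an arbitrary choice here
    kwargs = {'fill_value': 0} if strategy == 'constant' else {}
    imputer = SimpleImputer(strategy=strategy, **kwargs)

    # fit_transform returns a 2D array with one column
    filled = imputer.fit_transform(my_df[['Column2']])
    results[strategy] = filled[2, 0]

    print(f'{strategy}: {results[strategy]}')
```

With this data the mean strategy fills in 135000000 and the median strategy fills in 110000000, so the choice matters most when the column is skewed or contains outliers.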

See also:
Documentation of the SimpleImputer class
