How to replace NaN with the mean in Pandas

In this post, you will learn how to replace NaN values in a Pandas DataFrame with the column mean.

There are many ways to deal with missing values in a DataFrame. In this post, I will use scikit-learn (the sklearn module) to replace each NaN value with the mean of its column. The sample DataFrame used here has a single missing value in Column2:

import pandas as pd

my_data = {'Column1': [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7],
           'Column2': [100000000, 120000000, None, 260000000, 210000000, 80000000, 40000000]}

my_df = pd.DataFrame(my_data)

SimpleImputer is a scikit-learn class that replaces NaN values using one of several strategies. The code below replaces NaN with the column mean.

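import pandas as pd
from sklearn.impute import SimpleImputer

my_data = {'Column1': [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7],
           'Column2': [100000000, 120000000, None, 260000000, 210000000, 80000000, 40000000]}

my_df = pd.DataFrame(my_data)

# Replace each missing value with the mean of the known values in its column
nan_to_mean = SimpleImputer(strategy='mean')

# fit_transform expects 2D input (hence the double brackets) and returns a 2D array;
# ravel() flattens it back to one dimension before assigning it to the column
my_df['Column2'] = nan_to_mean.fit_transform(my_df[['Column2']]).ravel()

print(f'No more NaN values in my dataframe:\n{my_df}')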
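For comparison, the same replacement can be done with pandas alone, without scikit-learn. The sketch below is one of the "many ways" mentioned at the start of the post, not the SimpleImputer approach itself; the variable name column_mean is only illustrative.

```python
import pandas as pd

my_data = {'Column1': [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7],
           'Column2': [100000000, 120000000, None, 260000000, 210000000, 80000000, 40000000]}

my_df = pd.DataFrame(my_data)

# Series.mean() skips NaN by default, so the mean is computed
# from the six known values only
column_mean = my_df['Column2'].mean()

# fillna returns a new Series with every NaN replaced by the given value
my_df['Column2'] = my_df['Column2'].fillna(column_mean)

print(my_df)
```

This is convenient for a quick one-off fix. SimpleImputer is the better fit when the same fitted mean has to be reused on other data, for example inside a scikit-learn pipeline.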

SimpleImputer supports several strategies for replacing missing values. The strategy parameter accepts:

  • mean
  • median
  • most_frequent
  • constant
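To see how the strategies listed above differ, the following sketch applies each one to the same column and records the value that ends up in the row that was NaN. The dictionary name filled and the choice of 0.0 as the constant are assumptions made for illustration; fill_value is the SimpleImputer parameter used together with strategy='constant'.

```python
import pandas as pd
from sklearn.impute import SimpleImputer

my_data = {'Column1': [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7],
           'Column2': [100000000, 120000000, None, 260000000, 210000000, 80000000, 40000000]}

my_df = pd.DataFrame(my_data)

# Row index 2 holds the missing value in Column2
missing_row = 2
filled = {}

# These three strategies compute the replacement from the known values
for strategy in ('mean', 'median', 'most_frequent'):
    imputer = SimpleImputer(strategy=strategy)
    result = imputer.fit_transform(my_df[['Column2']]).ravel()
    filled[strategy] = result[missing_row]

# 'constant' ignores the data and inserts fill_value instead
constant_imputer = SimpleImputer(strategy='constant', fill_value=0.0)
result = constant_imputer.fit_transform(my_df[['Column2']]).ravel()
filled['constant'] = result[missing_row]

for strategy, value in filled.items():
    print(f'{strategy}: {value}')
```

The original my_df is left unchanged here, so each strategy starts from the same data.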

See also:
Documentation of the SimpleImputer class
