You will learn how to apply a function to a column in Pandas.
Applying a function to a column is a common step in data preprocessing with Python. It is also useful when you find errors or validation problems in the data, e.g. redundant whitespace, and need to clean every value in the column the same way.
Fortunately, this is very easy in Pandas, because a column (a Series) has built-in methods that run your code on each of its values. This post covers two of them: apply and map.
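As a quick illustration of the whitespace case mentioned above, here is a minimal sketch. The DataFrame names_df, its name column, and the helper strip_whitespace are hypothetical names invented for this example.

```python
import pandas as pd

# Hypothetical sample data with stray whitespace around the values.
names_df = pd.DataFrame({'name': ['  Alice', 'Bob  ', ' Carol ']})

def strip_whitespace(value):
    # Remove leading and trailing whitespace from a single string.
    return value.strip()

# Run the helper on every value in the column and store the cleaned result.
names_df['name'] = names_df['name'].apply(strip_whitespace)
print(names_df['name'].tolist())
```

The same pattern works for any function that takes one value and returns one value.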
How to apply a function to a column using the apply method
To apply a function to an entire column of data, you can use the apply method. As its name suggests, it applies your code to every value in the column.
Below I created an example function and applied it to the first column of numerical data.
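import pandas as pd

my_df = pd.DataFrame({'id': ['id1', 'id2', 'id3', 'id2'],
                      'Column1': [2, 7, 6, 7],
                      'Column2': [2, 5, 8, 5],
                      'Column3': [4, 1, 9, 1]})

# Example function: takes a single value and returns it increased by 1.
def my_func(x):
    x = x + 1
    return x

# Call my_func on every value in Column1 and overwrite the column with the result.
my_df['Column1'] = my_df['Column1'].apply(my_func)
print(f'This is how to apply function to the whole column \n{my_df}')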
The syntax of the apply method is intuitive: the only argument I pass is the function I want to apply. Note that I pass the function itself (my_func), without calling it.
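The function passed to apply does not have to be a named one. The sketch below shows the same call with a lambda, and with a function that needs an extra argument, which Series.apply forwards through its args tuple or as a keyword argument. The helper add_amount and its amount parameter are hypothetical names used only for illustration.

```python
import pandas as pd

my_df = pd.DataFrame({'Column1': [2, 7, 6, 7]})

# Hypothetical helper that needs a second argument besides the cell value.
def add_amount(x, amount):
    return x + amount

# A lambda works for short one-off transformations.
with_lambda = my_df['Column1'].apply(lambda x: x + 1)

# Extra positional arguments are forwarded through the args tuple...
with_args = my_df['Column1'].apply(add_amount, args=(10,))

# ...and extra keyword arguments are forwarded as they are.
with_kwargs = my_df['Column1'].apply(add_amount, amount=10)

print(with_lambda.tolist())
print(with_args.tolist())
print(with_kwargs.tolist())
```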
How to apply a function to a column using the map method
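In the second example, I use the map method. Its name is less self-explanatory than apply, but in practice I prefer it. Besides a function, map also accepts a dictionary or another Series as a lookup table, which makes it more flexible for element-wise transformations. For a plain Python function like the one below, both methods call the function once per value, so any difference in speed is usually small.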
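import pandas as pd

my_df = pd.DataFrame({'id': ['id1', 'id2', 'id3', 'id2'],
                      'Column1': [2, 7, 6, 7],
                      'Column2': [2, 5, 8, 5],
                      'Column3': [4, 1, 9, 1]})

# Example function: takes a single value and returns it increased by 1.
def my_func(x):
    x = x + 1
    return x

# Call my_func on every value in Column1 and overwrite the column with the result.
my_df['Column1'] = my_df['Column1'].map(my_func)
print(f'This is how to map function to the whole column \n{my_df}')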
As you can see, for a simple function like this one the map method gives the same result as the apply method. I recommend using map for element-wise transformations of a single column; I showed apply mainly for comparison.
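The equivalence described above can be checked directly, as in the minimal sketch below. It also shows map with a dictionary, where each value is looked up as a key and values missing from the dictionary become NaN. The label strings 'first' and 'second' are made up for this example.

```python
import pandas as pd

my_df = pd.DataFrame({'id': ['id1', 'id2', 'id3', 'id2'],
                      'Column1': [2, 7, 6, 7]})

def my_func(x):
    return x + 1

# Run the same function through both methods and compare the results.
applied = my_df['Column1'].apply(my_func)
mapped = my_df['Column1'].map(my_func)
same = applied.equals(mapped)
print(same)

# map also accepts a dictionary as a lookup table.
# 'id3' is not a key in the dictionary, so it is mapped to NaN.
labels = my_df['id'].map({'id1': 'first', 'id2': 'second'})
print(labels)
```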