This Python package cleans your data easily!

Manpreet Singh
2 min read · Oct 15, 2021

Welcome back! Python is an amazing programming language that's great for data science, so what if there was an easy way to clean up your data with Python? Well, there are actually a ton of ways to do this! So let's take a look at dataprep, another awesome package for data cleansing. If you want to check out their GitHub page, check out the link below:

I’m sure everyone has used Pandas before, but this package makes it easy to prepare your data with a consistent set of commands. If you want to install this package, use the following command:

pip install -U dataprep

Once installed, we can use many of the built-in functions. Let's take a look at an example!

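from dataprep.clean import clean_country
import pandas as pd

# Messy country values: a name, noisy text, a numeric code, a padded code, and a null marker
df = pd.DataFrame({'country': ['USA', 'country: Canada', '233', ' tr ', 'NA']})
# Clean the 'country' column and keep the result in a new DataFrame
df2 = clean_country(df, 'country')
df2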

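In the example above, we’re cleaning the values in the country column: a mix of abbreviations, noisy text, numeric codes, padded strings, and null markers. The cleaned result is stored in df2, which is displayed by the last line of the script.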
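To get a feel for what this kind of cleaning involves, here is a minimal pure-Python sketch of the idea. It is not dataprep's implementation: the lookup table, the list of missing-value markers, and the normalize_country helper are all hypothetical and written just for this illustration, and a real library relies on a far more complete table.

```python
import re

# Hypothetical, hand-written lookup table for illustration only.
LOOKUP = {
    "usa": "United States",
    "canada": "Canada",
    "233": "Estonia",  # ISO 3166-1 numeric code
    "tr": "Turkey",    # ISO 3166-1 alpha-2 code
}

# Assumption of this sketch: these strings are treated as missing values.
MISSING = {"", "na", "n/a", "null", "nan"}


def normalize_country(raw):
    """Return a canonical country name, or None if nothing matches."""
    text = raw.strip().lower()
    if text in MISSING:
        return None
    if text in LOOKUP:
        return LOOKUP[text]
    # Fall back to looking for a known token inside noisy text,
    # for example "country: Canada".
    for token in re.findall(r"[a-z0-9]+", text):
        if token in LOOKUP:
            return LOOKUP[token]
    return None


values = ["USA", "country: Canada", "233", " tr ", "NA"]
cleaned = [normalize_country(v) for v in values]
print(cleaned)
```

Each messy value is trimmed, lowercased, and matched against the table, so different spellings of the same country end up with one consistent name, while anything unrecognized comes back as None instead of a guess.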
