This Python package cleans your data easily!
Welcome back! Python is a great programming language for data science, so what if there was an easy way to clean up your data with Python? Well, there are actually a ton of ways to do this! So let's take a look at dataprep, another awesome package for data cleansing. If you want to see their GitHub page, check out the link below:
You've probably used pandas before, but this package makes it easy to prepare your data with a consistent set of commands. If you want to install this package, use the following command:
pip install -U dataprep
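If you want to double-check that the install worked before moving on, a quick check like the one below may help. This is just a minimal sketch using the standard library, and the helper name is_installed is made up for illustration; it is not part of dataprep.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if a top-level package can be found by the import system."""
    # find_spec returns None when no top-level module with this name exists
    return importlib.util.find_spec(package_name) is not None


# is_installed is a hypothetical helper for this post, not a dataprep function
if is_installed("dataprep"):
    print("dataprep is ready to use")
else:
    print("dataprep was not found; try running the pip command again")
```

The check only looks the package up without importing it, so it runs quickly even for large packages.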
Once it's installed, we can use many of the built-in commands. Let's take a look at some examples!
from dataprep.clean import clean_country
import pandas as pd

# Sample data with messy country values
df = pd.DataFrame({'country': ['USA', 'country: Canada', '233', ' tr ', 'NA']})
# Clean the values in the 'country' column
df2 = clean_country(df, 'country')
df2
In the example above, we're cleaning the values in the country column. Take a look at the output of this script: