Best Python projects for beginner Data Scientists!

Welcome back! Let’s talk about some of my favorite projects to start with when using Python. There are probably better or easier projects out there, but this is my article, so I get to choose what goes on here ☺️ Luckily for you, I’ve already coded every project in this article and written a post on each one, so I’ll leave those linked within each section as well. Let’s get started!

1. Scraping Google Trends data

One of my favorite things to do is web scraping, and it’s an important skill for almost any programmer (especially data analysts and data scientists). With this project you learn how to use web scraping packages within Python, you learn a bit about HTML elements and attributes and how to inspect them within your browser, and you learn how to store the scraped values in a variable that you can print out or save to a dataset to manipulate further. Click here to view the article that I posted on exactly how to do this.
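To make the inspect, extract, and store steps a bit more concrete, here is a minimal sketch. It is not the code from the linked article: it uses only Python’s built-in html.parser and csv modules, and the HTML snippet, the class names trend-term and trend-value, and the TermCollector helper are all made up for illustration. The real Google Trends page is structured differently.

```python
import csv
import io
from html.parser import HTMLParser

# Made-up HTML standing in for a page you inspected in your browser.
SAMPLE_HTML = """
<div class="results">
  <span class="trend-term">python tutorial</span><span class="trend-value">Breakout</span>
  <span class="trend-term">web scraping</span><span class="trend-value">+250%</span>
</div>
"""


class TermCollector(HTMLParser):
    """Collects the text inside elements with a wanted class (hypothetical helper)."""

    def __init__(self, wanted_classes):
        super().__init__()
        self.wanted = set(wanted_classes)
        self.current = None  # class of the wanted element we are inside, if any
        self.found = []      # list of (class_name, text) tuples

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        for name in classes:
            if name in self.wanted:
                self.current = name

    def handle_data(self, data):
        if self.current and data.strip():
            self.found.append((self.current, data.strip()))

    def handle_endtag(self, tag):
        self.current = None


def scrape_terms(html):
    """Return one dict per search term found in the HTML."""
    parser = TermCollector(["trend-term", "trend-value"])
    parser.feed(html)
    terms = [text for cls, text in parser.found if cls == "trend-term"]
    values = [text for cls, text in parser.found if cls == "trend-value"]
    return [{"term": t, "value": v} for t, v in zip(terms, values)]


def to_csv(rows):
    """Store the scraped rows as CSV text that can be saved or loaded elsewhere."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["term", "value"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = scrape_terms(SAMPLE_HTML)
print(rows)
print(to_csv(rows))
```

Once the values are in a list of dictionaries like this, printing them, writing them to a file, or loading them into a dataset are all small steps.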

Adding to this project: After going through that post, you can make this project even more valuable by adding more keywords to the search, or by scraping more data points than just the top “breakout” attribute. You could also try adding a GUI that lets you type keywords into a text box and then searches for them on the Google Trends page (you should read this article I posted about building GUIs within Python). This may seem impossible at first, but once you follow that post, these next steps should seem a bit more clear.

2. Building a TikTok Scraping Tool

We all love TikTok, or at least I do, and one of my favorite projects was scraping data from it. The project I wrote about shows exactly how to develop a TikTok scraping tool with a front end (GUI), which can be a very valuable starting point for both software development and backend data processing. We start off by installing the Python packages we need, then build out the backend by setting up our web scraping tool to visit certain TikTok webpages and extract the metadata for certain data points. Finally, we build the front end using a Python package called Streamlit. The user runs the Python code and inputs a link in the text box, and the output is shown below it.
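The backend step of pulling metadata out of a page can be sketched roughly as follows. This is only an illustration under assumptions, not the code from my article: the page source is made up, MetaTagParser and extract_meta are hypothetical names, and real TikTok markup differs and changes over time. Fetching the page for a given link is left out.

```python
from html.parser import HTMLParser

# Made-up page source; real TikTok pages look different.
SAMPLE_PAGE = """
<html><head>
<meta property="og:title" content="Example creator on TikTok">
<meta property="og:description" content="A short demo clip">
<meta name="viewport" content="width=device-width">
</head><body></body></html>
"""


class MetaTagParser(HTMLParser):
    """Collects <meta> tags into a dictionary keyed by property or name."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        key = attributes.get("property") or attributes.get("name")
        if key and "content" in attributes:
            self.meta[key] = attributes["content"]


def extract_meta(html):
    """Return every meta tag found in the HTML as a {key: content} dict."""
    parser = MetaTagParser()
    parser.feed(html)
    return parser.meta


meta = extract_meta(SAMPLE_PAGE)
print(meta["og:title"])
print(meta["og:description"])
```

In a Streamlit front end, the link could come from st.text_input and the resulting dictionary could be shown with st.write, which keeps the scraping logic separate from the GUI.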

Adding to this project: This is one of my favorite projects because it allows so much flexibility. Since this is barebones software, you could realistically add whatever you want. I would look into adding some graphs, and then maybe some more outputs (such as the number of followers).

3. Creating an Instagram Scraping Tool

Just like Google Trends, Instagram holds a lot of data that may be useful to the end user. I have written 2 articles on scraping different Instagram data points: follower count (click here to read) and the number of posts (click here to read). With these projects we learn how to build a web scraping bot that takes in an Instagram user link and outputs either the follower count or the number of posts (depending on the article that you read).

Adding to these projects: First off, these are 2 different articles, so I would recommend combining the 2 projects into one. Secondly, I would recommend building out a front end (just like the TikTok project above) that reads in a link from a text box and outputs the data to the GUI. You should read this article I posted about building GUIs within Python.
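Here is one rough way the combined version could look: a single function that returns both numbers at once. It assumes you have already scraped a description string that mentions followers and posts. The sample text, the abbreviations it handles, and the names parse_count and profile_stats are all assumptions for illustration, and the real page may word things differently.

```python
import re

# Hypothetical description text, made up for this example.
SAMPLE_DESCRIPTION = "1.2M Followers, 350 Following, 1,045 Posts - example account"

# Multipliers for abbreviated counts such as 12K or 1.2M.
SUFFIXES = {"K": 1_000, "M": 1_000_000, "B": 1_000_000_000}


def parse_count(text):
    """Turn strings such as '1,045' or '1.2M' into an integer."""
    text = text.replace(",", "").strip().upper()
    if text and text[-1] in SUFFIXES:
        return round(float(text[:-1]) * SUFFIXES[text[-1]])
    return int(text)


def profile_stats(description):
    """Return follower and post counts from one description string."""
    stats = {}
    for label in ("Followers", "Posts"):
        match = re.search(r"([\d.,]+[KMB]?)\s+" + label, description, re.IGNORECASE)
        # A missing label gives None instead of raising an error.
        stats[label.lower()] = parse_count(match.group(1)) if match else None
    return stats


print(profile_stats(SAMPLE_DESCRIPTION))
```

Returning a dictionary makes it easy to add more data points later without changing the code that displays them.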

4. Predicting stock prices using Machine Learning

This wouldn’t be a Data Scientist article without throwing some Machine Learning into the mix. First off, the article I wrote is pretty thorough (and a bit boring), but it’s so important to understand the key concepts of Machine Learning, so read the article here to understand the project. Secondly, this project is a very good, practical introduction to Machine Learning. With this project we learn how to split our data into separate data sets, create our training data set, create a Machine Learning model, and forecast our data while plotting those points on a graph. This is definitely not the easiest project to follow, but as a beginner Data Scientist it would be a very good place to start understanding some Machine Learning code.
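The overall workflow (split the data, train, forecast, check the result) can be sketched in a few lines. To keep it dependency-free, this sketch swaps in a plain least-squares line that predicts tomorrow’s price from today’s, which is much simpler than the model in my article. The prices are made up, and the function names are hypothetical.

```python
# Made-up closing prices, oldest first; swap in real data for your own experiments.
PRICES = [100.0, 101.5, 102.0, 101.0, 103.5, 104.0,
          105.5, 105.0, 106.5, 108.0, 107.5, 109.0]


def make_pairs(prices):
    """Pair each day's price (feature) with the next day's price (target)."""
    return list(zip(prices[:-1], prices[1:]))


def chronological_split(pairs, train_fraction=0.75):
    """Keep the earliest pairs for training and the latest for testing (no shuffling)."""
    cut = int(len(pairs) * train_fraction)
    return pairs[:cut], pairs[cut:]


def fit_line(pairs):
    """Ordinary least squares for y = slope * x + intercept."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


pairs = make_pairs(PRICES)
train, test = chronological_split(pairs)
slope, intercept = fit_line(train)

# Forecast the held-out days and compare against what actually happened.
forecasts = [slope * x + intercept for x, _ in test]
errors = [abs(forecast - actual) for forecast, (_, actual) in zip(forecasts, test)]
print("mean absolute error:", sum(errors) / len(errors))
```

The split is chronological on purpose: with time series data, shuffling before splitting would let the model peek at the future. Plotting the forecasts against the actual prices is a natural next step.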

Adding to this project: Honestly, try using data sets from different stocks. You can also try messing around with the number of epochs to train the Machine Learning model a bit more.

That pretty much covers it. Have fun creating some of these projects!

Data Scientist / Engineer
