How To Scrape Websites Using Python & BeautifulSoup4

Manpreet Singh
6 min read · May 3, 2021

Welcome back! I’ve discussed many ways to scrape data from websites using different languages and packages. Now let’s talk about one of the biggest web scraping packages for Python: Beautiful Soup.

Installation

First off, we’re going to install the Beautiful Soup package. To do this, use one of the following pip commands:

pip install beautifulsoup4
# OR
pip3 install beautifulsoup4
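
A quick way to confirm the install worked is to import the package and parse a tiny snippet. This is only a sanity-check sketch: the `<b>hello</b>` string and the variable names are made up for illustration, and it assumes Python's built-in "html.parser" so no extra parser is required.

```python
import bs4
from bs4 import BeautifulSoup

# Print the installed Beautiful Soup version to confirm the import works.
print(bs4.__version__)

# Parse a throwaway snippet (illustrative only) with Python's built-in parser.
soup = BeautifulSoup("<b>hello</b>", "html.parser")
greeting = soup.b.get_text()
print(greeting)
```

If the import fails with ModuleNotFoundError, pip most likely installed the package into a different Python environment than the one running the script.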

Awesome, we’re pretty much ready to start scraping websites, but there is one more important thing to keep in mind.

Understanding HTML

If you’re wondering, yes, I took this from my other article, but it’s such a good description of this process ☺️. First, we must learn the layout of the data we’re going to scrape. When I started using this package, I saw a ton of tutorials speed past this part, which left me stuck on basic steps. It’s a very important concept to understand in web scraping, so let’s take a look at the following HTML code:

<!DOCTYPE html>
<html>
<body><p>This is a test.</p>
<p>This is not a test.</p>
<p>This is still a test.</p></body>
</html>
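
To make the layout concrete, here is a minimal sketch of how Beautiful Soup might read a page like this one. It assumes the three paragraphs are marked up with ordinary <p> tags and uses Python's built-in "html.parser"; the variable names are just for illustration.

```python
from bs4 import BeautifulSoup

# Sample page, assuming the paragraphs use ordinary <p> tags.
html = """<!DOCTYPE html>
<html>
<body><p>This is a test.</p>
<p>This is not a test.</p>
<p>This is still a test.</p></body>
</html>"""

# "html.parser" ships with Python, so no extra parser install is needed.
soup = BeautifulSoup(html, "html.parser")

# find_all returns every matching tag; get_text pulls out the text inside it.
paragraphs = [p.get_text() for p in soup.find_all("p")]
print(paragraphs)
```

Each entry in the list is the text between one opening <p> and its closing </p>, which is why knowing the tag layout matters before you start scraping.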

As most of you know, every single web page is built using HTML, CSS, JavaScript or some variation of…
