
A New Web Scraper For Python

Manpreet Singh
2 min read · Mar 14, 2022


Welcome back! Python is one of my favorite programming languages. If you’re new to Python, check out the link below to learn more about it:

So, let’s take a look at a new web scraper you can use with Python. It’s called dude, and it’s an uncomplicated way of extracting data. Luckily for us, the project is hosted on GitHub. Use the link below to visit the repository:

This web scraper is built around simplicity, but the package also comes with a ton of features:

Installing this web scraper is fairly easy. Use the following pip command, then install the Playwright browser binaries:

pip install pydude
playwright install # Install playwright binaries for Chrome, Firefox and Webkit.

You can then start using this web scraper! Here is an example of the simplest scraper you could make (taken from their GitHub page):

from dude import select


@select(css="a")
def get_link(element):
    return {"url": element.get_attribute("href")}
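The handler in this example is just a function that takes an element and returns a dictionary, so you can sanity-check its logic without opening a browser. Here is a minimal sketch of that idea. FakeElement is a hypothetical stand-in that I made up for illustration (it is not part of dude or Playwright), and the @select decorator is left off so the snippet runs without pydude installed:

```python
class FakeElement:
    """Hypothetical stand-in for a browser element; it only mimics get_attribute()."""

    def __init__(self, attributes):
        self._attributes = attributes

    def get_attribute(self, name):
        # A missing attribute comes back as None.
        return self._attributes.get(name)


def get_link(element):
    # Same body as the scraper above, minus the @select decorator.
    return {"url": element.get_attribute("href")}


result = get_link(FakeElement({"href": "https://example.com/about"}))
print(result)
```

In the real scraper the element is supplied by dude for every match of the CSS selector, so a link without an href would produce a record whose "url" is None.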

You can then run this scraper by using the following command:

dude scrape --url "<url>" --output data.json path/to/script.py

Note: remember to replace “<url>” with the page you want to scrape and “path/to/script.py” with the path to your Python file. You will then see your script run inside your terminal, with the results saved to data.json.
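Once you have scraped links, you will probably want to clean them up. The sketch below assumes the output is a JSON list of records that each carry the "url" key returned by the handler; the exact layout of data.json may include extra fields, so treat the record shape, the absolute_urls helper, and the example.com base URL as assumptions for illustration. It uses only the standard library to turn relative links into absolute ones and drop blanks and duplicates:

```python
import json
from urllib.parse import urljoin


def absolute_urls(records, base_url):
    """Resolve each record's "url" against base_url, skipping blanks and duplicates."""
    seen = set()
    result = []
    for record in records:
        href = record.get("url")
        if not href:
            continue  # links without an href have no URL to resolve
        full = urljoin(base_url, href)
        if full not in seen:
            seen.add(full)
            result.append(full)
    return result


# Stand-in for the contents of data.json; the record shape is an assumption.
sample = json.loads(
    '[{"url": "/about"}, {"url": null}, {"url": "/about"},'
    ' {"url": "https://example.org/x"}]'
)
links = absolute_urls(sample, "https://example.com/")
print(links)
```

To use it on real output, load the file with json.load and pass the URL you scraped as base_url.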

There you have it! Do you plan on using this web scraper? I would love to hear your thoughts about this!

Thanks So Much!

If you have any suggestions or thoughts, or just want to connect, feel free to contact or follow me on Twitter! Also, below is a link to some of my favorite resources for learning programming:

Thanks so much for reading!
