Get Blog Post Titles, Links and Tags From an RSS Link Using Python Feedparser
I wanted to get metadata from my other blog, sysadmins.co.za, such as each post’s title, link and tags, using the RSS link. I stumbled upon feedparser, which I will use to scrape the details of every post from the feed and append them to a list that I can then ingest into a database or something like that.
Install feedparser and requests:
$ pip install feedparser requests
The Python Code:
I’m not too sure at this point how to get pagination going, so I’ve set a range of pages to check. If a request returns a status code of 200, the script checks whether each post’s title is already in the list that I defined and, if not, appends the post to the list.
At the end of the loop, the script returns the list that was defined, which provides the info mentioned earlier:
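A minimal sketch of the approach described above, not necessarily the exact script. The function names (`collect_posts`, `scrape_feed`), the `url_template` and `max_pages` parameters, and the idea that the page number is substituted into the feed URL are all assumptions made for illustration; the real paginated URL depends on how the blog exposes its feed.

```python
def collect_posts(entries, posts):
    """Append title, link and tags for every entry whose title is not in posts yet."""
    seen_titles = {post["title"] for post in posts}
    for entry in entries:
        title = entry.get("title")
        if title is None or title in seen_titles:
            continue  # skip entries without a title and posts we already have
        seen_titles.add(title)
        posts.append({
            "title": title,
            "link": entry.get("link"),
            # feedparser exposes tags as a list of dict-like objects with a "term" key
            "tags": [tag.get("term") for tag in entry.get("tags", [])],
        })
    return posts


def scrape_feed(url_template, max_pages=10):
    """Request pages 1..max_pages of a feed and return the de-duplicated posts.

    url_template is a hypothetical pattern containing "{page}".
    """
    # Imported here so collect_posts() can be used without the third-party packages.
    import feedparser
    import requests

    posts = []
    for page in range(1, max_pages + 1):
        response = requests.get(url_template.format(page=page), timeout=10)
        if response.status_code != 200:
            continue  # only parse pages that were served successfully
        feed = feedparser.parse(response.content)
        collect_posts(feed.entries, posts)
    return posts
```

Calling `scrape_feed("https://example.com/rss/{page}/")` (a placeholder URL, not the real feed address) would return a list of dictionaries, each with `title`, `link` and `tags` keys, ready to be written to a database. Keeping the de-duplication in `collect_posts` also means it can be tried on plain dictionaries without touching the network.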