In 2017, I whipped up a quick post summarizing some reusable Extract, Transform, Load (ETL) functions using Python and Pandas. To my pleasant surprise, it still receives a steady stream of visitors, which suggests there is enough interest in the topic to justify a sequel.
In this follow-on post, I am going to push the complexity a little further. The key differences include:
- Reading data from multi-sheet Excel files, instead of CSV files
- Checking for blank data
- Removing unwanted columns
- Visualizing data with basic bar graphs and pivot tables
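As a rough preview, the four steps above can be sketched in a few lines of Pandas. This is only a minimal sketch, not the code from the post: the helper names, the column names (`region`, `sales`, `notes`) and the in-memory sample sheets are all made up for illustration, and the dictionary stands in for what `pd.read_excel(path, sheet_name=None)` returns for a real workbook.

```python
import pandas as pd


def load_sheets(path):
    """Read every sheet of an Excel workbook into a dict of DataFrames."""
    # sheet_name=None asks pandas for all sheets, keyed by sheet name.
    # Reading .xlsx files needs an Excel engine such as openpyxl installed.
    return pd.read_excel(path, sheet_name=None)


def combine_sheets(sheets):
    """Stack all sheets into one DataFrame, remembering the source sheet."""
    frames = [df.assign(sheet=name) for name, df in sheets.items()]
    return pd.concat(frames, ignore_index=True)


def count_blanks(df):
    """Return the number of blank (NaN) cells in each column."""
    return df.isna().sum()


def drop_unwanted(df, unwanted):
    """Remove unwanted columns, ignoring names that are not present."""
    return df.drop(columns=unwanted, errors="ignore")


# Hypothetical stand-in for load_sheets("some_workbook.xlsx"),
# so the sketch runs without a file on disk.
sheets = {
    "2017": pd.DataFrame(
        {"region": ["East", "West"], "sales": [100.0, None], "notes": ["a", "b"]}
    ),
    "2018": pd.DataFrame(
        {"region": ["East", "West"], "sales": [150.0, 120.0], "notes": ["c", "d"]}
    ),
}

combined = combine_sheets(sheets)
blanks = count_blanks(combined)               # blank cells per column
cleaned = drop_unwanted(combined, ["notes"])  # drop a column we do not need

# Pivot table: one row per region, one column per source sheet.
summary = cleaned.pivot_table(
    index="region", columns="sheet", values="sales", aggfunc="sum"
)

# With matplotlib installed, summary.plot(kind="bar") draws a basic bar graph.
```

Keeping each step in its own small function makes it easy to reuse them across workbooks with different layouts.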
A Jupyter (IPython) notebook version is also available.
The sample data files are published on GitHub: