Here is the general workflow I used to complete a recent data merge and cleanup project.

Tasks:
- Merge data from 5 different data structures (approximately 4 GB in 148 files)
- Extract businesses related to commercial painting
- Format the data
- Create a consolidated master file
- Create separate files divided by the state where each business is located

Tools: Python’s Pandas library

Continue reading “Business Data Merge and Cleanup”
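The tasks listed above can be sketched in pandas roughly as follows. This is only an illustration, not the code from the post: the column names (`Company`, `business_name`, `ST`, `State`, `Category`, `industry`), the `painting` keyword filter, and the tiny in-memory sample frames are all assumptions, since the excerpt does not describe the real file schemas.

```python
import pandas as pd

# Hypothetical mapping from each source's column names to one shared schema.
COLUMN_MAP = {
    "Company": "name", "business_name": "name",
    "ST": "state", "State": "state",
    "Category": "category", "industry": "category",
}

def normalize(df):
    """Rename known column variants and keep only the shared columns."""
    df = df.rename(columns=COLUMN_MAP)
    return df[["name", "category", "state"]]

def build_master(frames, keyword="painting"):
    """Merge all frames, keep rows matching the keyword, and tidy the text."""
    merged = pd.concat([normalize(f) for f in frames], ignore_index=True)
    mask = merged["category"].str.contains(keyword, case=False, na=False)
    master = merged[mask].copy()
    # Crude formatting step: trim whitespace and standardize letter case.
    master["name"] = master["name"].str.strip().str.title()
    master["state"] = master["state"].str.strip().str.upper()
    # Treat the same business name in the same state as one record.
    master = master.drop_duplicates(subset=["name", "state"])
    return master.reset_index(drop=True)

def split_by_state(master):
    """Return one DataFrame per state, keyed by the state code."""
    return {
        state: group.reset_index(drop=True)
        for state, group in master.groupby("state")
    }

# Made-up sample data standing in for two of the source structures.
frames = [
    pd.DataFrame({
        "Company": [" acme painting co ", "Bob's Diner"],
        "Category": ["Commercial Painting", "Restaurant"],
        "ST": ["tx", "TX"],
    }),
    pd.DataFrame({
        "business_name": ["Brush Bros", "Acme Painting Co"],
        "industry": ["painting contractor", "commercial painting"],
        "State": ["CA", "TX"],
    }),
]

master = build_master(frames)
by_state = split_by_state(master)
```

With real files, each frame would typically come from `pd.read_csv`, and the master and per-state frames could be written out with `DataFrame.to_csv(path, index=False)`. For roughly 4 GB of input, reading only the needed columns (the `usecols` parameter of `pd.read_csv`) is one way to keep memory use down.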
Category Archives: Data Cleanup
Data Cleanup in Python
This notebook shows the initial data cleanup workflow for a capstone project in fulfilment of Springboard’s Data Science Track Program. The data was retrieved from the National Transportation Safety Board website. The original data resides in a 20-table MS Access database. The pertinent information was exported to Comma-Separated Value (CSV) files utilizing Access’ query
Continue reading “Data Cleanup in Python”