Excel Power Query for Data Cleanup

Check out this introductory series detailing data cleanup workflows with Power Query in MS Excel.

Video 1: Introduction to Data Query
– Standardize text formatting on select columns
– Remove duplicate entries

Video 2: Formatting Phone Numbers
– How to remove unwanted characters
– How to clean ‘null’ entries
– How to split a column by character count
– How to create…

Continue reading “Excel Power Query for Data Cleanup”
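For comparison, the phone-number steps listed for Video 2 can be sketched outside Power Query. The snippet below is a Python approximation, not the M code from the videos. The function name `clean_phone`, the 10-digit validity rule, and the 3-3-4 split are hypothetical choices made for illustration.

```python
import re


def clean_phone(raw):
    """Return (area, prefix, line) for a 10-digit number, or None otherwise."""
    # Treat missing values and literal 'null' text as empty entries.
    if raw is None or raw.strip().lower() == "null":
        return None
    # Remove unwanted characters: keep digits only.
    digits = re.sub(r"\D", "", raw)
    # Assumption: only 10-digit numbers are considered valid here.
    if len(digits) != 10:
        return None
    # Split by character count: 3 + 3 + 4.
    return digits[:3], digits[3:6], digits[6:]


# Hypothetical sample values covering each cleanup case.
samples = ["(555) 123-4567", "null", "555.987.6543 ", None]
cleaned = [clean_phone(value) for value in samples]
```

Returning `None` for unusable entries keeps the 'null' handling and the length check in one place, so later steps only need to test for a single missing-value marker.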

Business Data Merge and Cleanup

Here is the general workflow I used to complete a recent data merge and cleanup project.

Tasks:
– Merge data from 5 different data structures (approximately 4 GB in 148 files)
– Extract businesses related to commercial painting
– Format data
– Create a consolidated master file
– Create separate files divided by the state where each business is located

Tools: Python’s pandas library

Continue reading “Business Data Merge and Cleanup”
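The merge, filter, and split steps in that workflow can be sketched with pandas roughly as follows. This is a minimal illustration, not the project's actual code: the column names (`name`, `category`, `state`), the sample rows, and the helper `split_by_state` are all hypothetical, and two small in-memory frames stand in for the 148 source files.

```python
import pandas as pd

# Hypothetical stand-ins for two of the source structures; the column
# names and rows are assumptions made for illustration only.
source_a = pd.DataFrame({
    "name": ["Acme Painting", "Bolt Plumbing"],
    "category": ["Commercial Painting", "Plumbing"],
    "state": ["TX", "TX"],
})
source_b = pd.DataFrame({
    "Business Name": ["Coastal Painters"],
    "Industry": ["commercial painting"],
    "ST": ["FL"],
})

# Map each structure onto one shared schema before merging.
source_b = source_b.rename(
    columns={"Business Name": "name", "Industry": "category", "ST": "state"}
)
master = pd.concat([source_a, source_b], ignore_index=True)

# Keep only businesses related to commercial painting (case-insensitive).
is_painting = master["category"].str.contains(
    "commercial painting", case=False, na=False
)
master = master[is_painting].reset_index(drop=True)

# Basic formatting: trim whitespace and normalize state codes.
master["name"] = master["name"].str.strip()
master["state"] = master["state"].str.strip().str.upper()


def split_by_state(frame: pd.DataFrame) -> dict[str, pd.DataFrame]:
    """Return one DataFrame per state, keyed by state code."""
    return {
        state: group.reset_index(drop=True)
        for state, group in frame.groupby("state")
    }


by_state = split_by_state(master)
```

Each entry in `by_state` could then be written to its own file with `DataFrame.to_csv`, and `master` saved the same way as the consolidated file.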

Data Cleanup in Python

This notebook shows the initial data cleanup workflow for a capstone project in fulfilment of Springboard’s Data Science Track Program. The data was retrieved from the National Transportation Safety Board website. The original data resides in a 20-table MS Access database. The pertinent information was exported to Comma-Separated Value (CSV) files using Access’ query… Continue reading “Data Cleanup in Python”

Merging EXCEL Data

Data Integration

This notebook highlights the process of integrating data from four different data sources (Excel files) into a master file. For efficiency, the data manipulation and cleanup are done with Python. After the processing is done, a master Excel file is exported. Raw data files and table definitions can be found at https://www.kaggle.com/anikannal/solar-power-generation-data

Continue reading “Merging EXCEL Data”