If you hang around the nonprofit digital world long enough, chances are you’ll eventually need to clean up some data. And if you’re doing data hygiene tasks, then you’ll probably use a spreadsheet program like Excel or Google Sheets to organize and clean your data. Here are some of our favorite quick tricks to make this task as painless as possible.
- Freeze top row
- Special Format for ZIP Codes
- Change ALL CAPS to Title Case
- Text to Columns
- Look for transposed data by sorting
- Flag Duplicates
- Remove Duplicates
- Use Filter to count
1. Freeze the top row of column labels. Have you ever sorted a spreadsheet and your column header gets sorted too? Annoying. In Google Sheets you can select the top row and then use View – Freeze to define the top row as column headers. In Excel, freezing the top row won’t prevent it from being sorted, but will keep it visible as you scroll down (useful). (To prevent the top row from being sorted as data, choose Custom Sort and check the box for “my data has headers.”)
1. Special Format for ZIP Codes – Lots of times Excel will drop a leading zero in a ZIP code. To prevent that, format the cells or column as ZIP Code. Also works for Phone Numbers! Here’s how: Home – Cells – Format Cells, choose Special, then ZIP Code (or ZIP Code + 4, if you have the +4). You can also shortcut to this by selecting the cells or column and right-clicking to choose Format Cells.
3. Change ALL CAPS to Title Case – sometimes your data might be displayed in ALL CAPS, but you want it displayed in Title Case instead (where the first letter of each word is capitalized). You can transform your data by using the formula =PROPER(cells). Then copy and paste the Values back into your original column for import (since Excel will “see” a formula instead of the actual data). Make sure to doublecheck for words like “of” and “the” if you don’t want them capitalized too, and also any contractions like ‘s or ‘ve – you’ll need to change them by hand.
4. Text to Columns. Did you ever get data where the first and last names are all in one field, but the CRM you’re uploading into wants First Name and Last Name to be separate? Its easy to split them up. Just add an additional column (or do this in another worksheet so you don’t overwrite other data). Select the names, choose Data – Text to Columns, and choose a custom delimiter of a space. You’ll need to manually clean up any middle initials that get spaced out too, and account for last names with two or more separate words (like de Leon). Then copy and paste back into your original data set.
5. Look for transposed data by sorting. We all make mistakes, and sometimes people copy/paste or data-enter data into the wrong field. If your data set is large, it’s hard to spot them just by scrolling through the spreadsheet. You can rigorously find errors like this by doing advanced validation, but many of us aren’t up for that. For a quick and dirty way to spot data in the wrong place, Custom Sort your columns A-Z and Z-A to scan for anything unexpected at the top or bottom (numbers where you expect to see text, and vice versa).
6. Flag duplicates. If the destination for your data would be better off without duplicates, then it’s a good idea to catch them in advance. Excel can flag duplicates in your file using Conditional Formatting. Select a column, and then use Home – Conditional Formatting – Highlight Cells Rules – Duplicate Values. Duplicates will be flagged using colors. Then pick the one you want to keep.
7. Remove duplicates. If you wish, you can also use Excel to remove the duplicates you’ve highlighted. Find this in Data – Data Tools – Remove Duplicates. Be careful with this, especially if you have two rows with different values in a different column – you might want to keep both phone numbers if they’re different. If it’s a matter of data in one cell and a blank value in the other, Excel will keep the data rather than the blank.
8. Want to count how many of each type of value exists in a column – or how many blanks there are? Don’t count by hand – use the Filter tool. In Home – Sort and Filter, highlight the column and click Filter. You can select the specific values you want to count, or select Blank. Once you press OK, then at the bottom of the worksheet, Excel will tell you the count. Then you can report back on the status of the data before the cleanup and after.
These are our favorite simple tricks in Excel to make data cleanup faster. Do you have a favorite trick? Let us know!