Most of the datasets that fisheries scientists work with have problems.
This week, we focus on working with imperfect datasets and discuss how to identify and reproducibly correct common data entry errors.
- Identifying common problems:
  - Mixed data types
  - Impossible values
  - Hard-to-detect issues (blanks, numbers written with “e”, such as scientific notation)
  - Data entry errors (typos, implausibly large or small numbers)
- Data exploration as part of the verification workflow
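The problem types listed above can be sketched as a small set of checks on a single column. This is a minimal illustration in Python using only the standard library, not the course's own code: the column names, the sample values, and the plausible range of 0 to 1500 mm are all assumptions made up for the example.

```python
import csv
import io
import math

# Hypothetical raw data: the column names and values are invented for illustration.
RAW = """fish_id,length_mm
1,412
2,
3,4.5e2
4,-30
5,four hundred
6,4120
"""


def classify(value, low=0.0, high=1500.0):
    """Return a label for the first problem found in a raw text value.

    `low` and `high` are assumed plausibility limits, not values from any real survey.
    """
    text = value.strip()
    if text == "":
        return "blank"
    try:
        number = float(text)
    except ValueError:
        return "not numeric"  # text mixed into a numeric column
    if math.isnan(number):
        return "not numeric"  # the literal text "nan" parses as a float
    if "e" in text.lower():
        return "scientific notation"  # parses fine, but easy to overlook
    if number < low:
        return "impossible value"
    if number > high:
        return "out of expected range"
    return "ok"


rows = list(csv.DictReader(io.StringIO(RAW)))
flags = {row["fish_id"]: classify(row["length_mm"]) for row in rows}

for fish_id, flag in flags.items():
    print(fish_id, flag)
```

Reading every value as text first is a deliberate choice here: it keeps blanks and values such as `4.5e2` visible instead of letting them be silently converted on import.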
The data wrangling cheat sheet is critical to today’s activity.
- Clean up a dataset
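One way to keep a cleanup reproducible is to record each correction in a script instead of editing the raw file by hand. The sketch below, again in plain Python, assumes a hypothetical record layout and a hypothetical correction (record 6, `4120` changed to `412`); in practice a correction like this would need to be confirmed against the original datasheet.

```python
# Hypothetical raw values keyed by record id; invented for illustration.
raw_lengths = {"1": "412", "3": "4.5e2", "6": "4120"}

# Each correction records the id, the value expected in the raw data, and the fix.
corrections = [
    ("6", "4120", "412"),  # assumed typo: an extra trailing zero
]


def apply_corrections(records, fixes):
    """Return a corrected copy of `records`, leaving the raw data untouched."""
    cleaned = dict(records)
    for record_id, old, new in fixes:
        # Refuse to apply a fix if the raw value is not what was expected,
        # so the script fails loudly when the input file changes.
        if cleaned.get(record_id) != old:
            raise ValueError(f"record {record_id}: expected {old!r}, found {cleaned.get(record_id)!r}")
        cleaned[record_id] = new
    return cleaned


cleaned_lengths = apply_corrections(raw_lengths, corrections)
print(cleaned_lengths)
```

Because the raw values are never overwritten, rerunning the script on the same input always gives the same cleaned result, and the list of corrections doubles as a record of what was changed.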
Code and Data
- Week 6 code
- Week 6 in-class data - Lake Trout Broken
- Week 6 activity data - Inch Lake Broken
- Solution to Week 6 assignment
Slides are available via Speaker Deck.