An introduction to the resources available to support your learning
When you work with data, it is important to be able to evaluate the quality of the information you are using. This page looks at things to consider and suggests places you can go to build your skills in evaluating data.
If you will be working with your own or others' data, you may also need to think about research data management. Find out more in the managing your data section of this guide.
It is worth considering the following factors when evaluating the quality of a data object:
Is there a content map or guide of some sort? What is covered? What is not covered? Is metadata included?
Who created the data? Who is managing it? Who paid for the data? What bias might be implicit? Is the data object currently maintained? Are there any references on how this data object has been used in the past? Are there clear release versions and updates information?
Are there clear format expectations? What units are used? What fields are present? What naming conventions are used? Are the dates of creation or last update easily located?
Is quality control explicitly outlined? Who is in charge of checking for quality? What process do they use? How is missing data handled?
Can a user open a file and understand its content? Is the file available for download in an open format? Is there a clear process to download?
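Some of the questions above, such as which fields are present and how missing data is handled, can be partly checked by opening the file in code. The following is a minimal sketch in Python, assuming the data object is a CSV file. The sample content, the column names, and the list of markers treated as "missing" are all invented for illustration; a real dataset's documentation should say how missing values are recorded.

```python
import csv
import io

# Hypothetical sample standing in for a downloaded CSV file; the column
# names and values are invented for illustration.
SAMPLE_CSV = """site,date,temperature_c
A,2023-01-01,4.5
B,2023-01-01,
C,,5.1
"""

# Assumed markers for a missing value; check the dataset's own documentation.
MISSING_MARKERS = {"", "NA", "N/A", "null"}


def summarise(csv_text):
    """Return the field names, the row count, and missing values per field."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fields = list(reader.fieldnames or [])
    missing = {name: 0 for name in fields}
    rows = 0
    for row in reader:
        rows += 1
        for name in fields:
            value = row.get(name)
            if value is None or value.strip() in MISSING_MARKERS:
                missing[name] += 1
    return fields, rows, missing


fields, rows, missing = summarise(SAMPLE_CSV)
print("Fields:", fields)
print("Rows:", rows)
print("Missing values per field:", missing)
```

A summary like this only shows what is in the file. It cannot tell you who created the data, what bias might be implicit, or whether the quality control process is sound, so it supplements the questions above and does not replace them.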
Statistical fallacies are common tricks data can play on you, which lead to mistakes in data interpretation and analysis. Geckoboard explore some common fallacies, with real-life examples, and suggest how you can avoid them.
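As an illustration of how data can mislead, the sketch below demonstrates Simpson's paradox, one well-known statistical fallacy, in which a trend that appears in every subgroup reverses when the subgroups are combined. The counts are invented purely for this example and do not come from any real study.

```python
# Invented counts of (successes, trials) for two options, A and B,
# measured in two separate groups. These numbers are illustrative only.
counts = {
    "group 1": {"A": (9, 10), "B": (80, 100)},
    "group 2": {"A": (30, 100), "B": (2, 10)},
}


def rate(successes, trials):
    """Success rate as a fraction between 0 and 1."""
    return successes / trials


def overall_rate(option):
    """Success rate for one option after pooling every group together."""
    successes = sum(group[option][0] for group in counts.values())
    trials = sum(group[option][1] for group in counts.values())
    return rate(successes, trials)


# Within each group, option A has the higher success rate...
for name, group in counts.items():
    print(name, {opt: round(rate(*group[opt]), 2) for opt in group})

# ...but once the groups are pooled, option B comes out ahead.
print("overall", {opt: round(overall_rate(opt), 2) for opt in ("A", "B")})
```

The reversal happens because the two options were tried unequally often in each group, so the pooled rates are dominated by different groups. Checking how a dataset was grouped before trusting a headline figure is one way to avoid this mistake.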
Data fallacies infographic reused with permission from Geckoboard
Data credibility checklist by Zilinski, Nelson and Epps (2014) under CC-BY 4.0 license