Can you describe the characteristics of a well-formed dataset?

joergbigdada
Jörg Lehmann, FUB

If not, don’t worry: you are in good company. Most people who work with data have never been trained in this respect, and statisticians or data scientists are surprisingly rare among those who set up and collect datasets. We know that datasets consist of variables and observations, but it is surprisingly difficult to define variables and observations precisely in general. Tidy datasets provide a standardized way to link the structure of a dataset with its meaning; they form the basis for effective computation, modeling and visualization. In his article “Tidy data”, Hadley Wickham, Chief Scientist at RStudio and Adjunct Professor of Statistics, explains what determines the quality of a dataset and how datasets can be cleaned and prepared for analysis effectively. The article, published in the Journal of Statistical Software, is freely accessible.

https://www.jstatsoft.org/article/view/v059i10/v59i10.pdf
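To make the idea concrete, here is a minimal sketch in plain Python of what tidying can look like. In a tidy dataset each variable is a column and each observation is a row. The table below is made up for illustration (the sites, years and counts are hypothetical, not taken from the article), and the helper name `tidy` is my own, not an API from any library. The "messy" version stores values of the variable year as column headers; the tidy version gives every (site, year) observation its own row.

```python
# Hypothetical "messy" table: the headers "2019" and "2020" are values of a
# variable (year), not names of variables.
messy = [
    {"site": "A", "2019": 12, "2020": 15},
    {"site": "B", "2019": 7, "2020": 9},
]


def tidy(rows, id_key, var_name, value_name):
    """Return one row per (id, variable value) pair instead of one row per id."""
    out = []
    for row in rows:
        for key, value in row.items():
            if key == id_key:
                continue  # the identifier is repeated on every output row
            out.append({id_key: row[id_key], var_name: key, value_name: value})
    return out


tidy_rows = tidy(messy, "site", "year", "count")
for r in tidy_rows:
    print(r)
```

After this step every column (`site`, `year`, `count`) holds exactly one variable, which is the shape that grouping, modeling and plotting tools generally expect. Data-frame libraries offer the same reshaping as a built-in operation; the loop above only spells out what happens.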
