How Do Users Evaluate Your Data For Use?

Once users have found a dataset, they evaluate whether it meets their needs across multiple dimensions. Some of these are evident in the data itself, but others require further information. As covered in the section on Creating an Aligned Open Data Culture, data that is incomplete or has some cleanliness errors can still be worth publishing: users' ideas about perfection and their willingness to work with incomplete data vary.

Relevance:

- Scope (e.g., topical, geographical, temporal)
- Granularity (e.g., number of traffic incidents per hour, day, week)
- Comparability (e.g., identifiers, units of measurement)
- Context (e.g., original purpose of the data)
- Documentation (e.g., explanation of the variables, samples)

Usability:

- Format (e.g., data type, structure, encodings)
- Documentation (e.g., understandability of variables, samples)
- Comparability (e.g., identifiers, units of measurement)
- References to connected sources (e.g., links, URLs)
- Size (e.g., MB, GB)
- Language (e.g., used in headers or for string values)

Quality:

- Provenance (e.g., authoritativeness, context and original purpose)
- Accuracy (e.g., correctness of data)
- Completeness (e.g., missing values)
- Cleanliness (e.g., well-formatted, no spelling mistakes, error-free)
- Methodology (e.g., how was the data collected, sample)
- Timeliness (e.g., how often is it updated)
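
Some of these quality dimensions can be summarised for users before they download anything. As a rough illustration of the Completeness dimension, the sketch below computes the share of non-missing values per column. It is only one possible approach: the sample data, the column names, the `completeness_by_column` helper and the set of strings treated as "missing" are all assumptions made for this example, not part of any particular tool.

```python
import csv
import io

# Hypothetical sample data; column names and values are made up for illustration.
SAMPLE_CSV = """incident_id,date,district,severity
1,2023-01-05,North,minor
2,2023-01-06,,major
3,,South,
4,2023-01-09,East,minor
"""

# Assumption: these strings count as "missing". Adjust for your own data.
MISSING_MARKERS = {"", "na", "n/a", "null", "none"}


def completeness_by_column(rows):
    """Return the share of non-missing values for each column (0.0 to 1.0)."""
    rows = list(rows)
    if not rows:
        return {}
    result = {}
    for column in rows[0].keys():
        # Count the rows where this column holds a usable value.
        filled = sum(
            1
            for row in rows
            if (row.get(column) or "").strip().lower() not in MISSING_MARKERS
        )
        result[column] = filled / len(rows)
    return result


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
report = completeness_by_column(rows)
for column, share in report.items():
    print(f"{column}: {share:.0%} complete")
```

Publishing a short summary like this alongside a dataset may help users judge whether gaps in the data matter for their purpose, rather than leaving them to discover the gaps after download.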

How do we know this? Read the original paper.

