# Data Quality

Data quality is a contingent concept: exactly what constitutes quality depends heavily on the audience for the data. Further, [research shows that 'strange things' such as errors and outliers in the data may actually help users engage with the data](https://user-centric-open-data-publishin.gitbook.io/user-centric-open-data-publishing/preparation/data-quality-2) by giving them an opportunity to question it.

The first step in the data quality process is to map out exactly what your version of data quality looks like.

<figure><img src="https://2677491958-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F4tiWPkv75at4eCXxyJ7O%2Fuploads%2FEO4rgThZf5lG73KND1Cd%2FML%20quality%20planning.png?alt=media&#x26;token=6a960b60-dbdb-4905-a164-f37cfab7962a" alt=""><figcaption><p>The 4 steps of data quality </p></figcaption></figure>

Source: [A Survey of Data Quality Requirements That Matter in Machine Learning Pipelines](https://dl.acm.org/doi/10.1145/3592616)
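
One way to make "your version of data quality" concrete is to write it down as explicit, checkable expectations. The sketch below is a minimal illustration, not a prescribed method: the `QualityRule` class, the `failure_counts` helper, the `age` field, and the 0 to 120 range are all hypothetical names and thresholds chosen for the example, and a real audience would define its own.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class QualityRule:
    """One audience-specific expectation about a record (hypothetical helper)."""
    name: str
    check: Callable[[dict], bool]  # returns True when the record passes


# Assumed example rules; the field name and thresholds are illustrative only.
rules = [
    QualityRule("age is present", lambda r: r.get("age") is not None),
    QualityRule(
        "age is plausible",
        # Missing values are handled by the rule above, so they pass here.
        lambda r: r.get("age") is None or 0 <= r["age"] <= 120,
    ),
]


def failure_counts(records: list[dict], rules: list[QualityRule]) -> dict[str, int]:
    """Count how many records fail each rule, without modifying the data."""
    return {
        rule.name: sum(not rule.check(record) for record in records)
        for rule in rules
    }


records = [{"age": 34}, {"age": None}, {"age": 250}]
report = failure_counts(records, rules)
print(report)
```

The sketch reports failures rather than dropping the offending records, which keeps outliers and errors visible so that users can still question them.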
