This is an overview of the DAMA data quality dimensions. It can be extended with the implications for a specific project.
Version | 1.0 | Creation date | 02-05-2021 |
Accuracy refers to the degree to which a data entity reflects reality. Accuracy can be determined by comparing a data entity with the real-world entity it describes. An example is a difference between a mailing list of clients and the actual clients of an organisation.
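The comparison between a mailing list and the actual clients can be sketched as a set comparison. This is a minimal illustration, not a prescribed method: the function name `accuracy_report`, the client identifiers and the definition of the accuracy ratio (share of recorded clients that exist in reality) are all assumptions made for the example.

```python
def accuracy_report(recorded, actual):
    """Compare recorded client identifiers with the real-world client set."""
    recorded, actual = set(recorded), set(actual)
    correct = recorded & actual
    return {
        # Present in the data set but not a client in reality.
        "not_in_reality": sorted(recorded - actual),
        # A client in reality but absent from the data set.
        "missing_from_data": sorted(actual - recorded),
        # Assumed metric: fraction of recorded entries that match reality.
        "accuracy": len(correct) / len(recorded) if recorded else 1.0,
    }


# Hypothetical sample data.
mailing_list = ["anna", "bram", "carla", "daan"]
true_clients = ["anna", "bram", "carla", "eva"]

report = accuracy_report(mailing_list, true_clients)
print(report["accuracy"])  # 0.75
```

In practice the hard part is obtaining a trustworthy picture of reality to compare against; the code only shows the comparison itself.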
Actuality is the degree to which a data entity reflects the current state of reality. A typical example is a deceased person receiving a letter that was addressed from an outdated data set. Replication of data is often a source of low actuality.
Completeness refers to the degree to which the required attributes are present within a data entity. It also covers whether a required set of entities (rows) is always present within a data set. For example, a person could be recorded with only a single name attribute, or with nickname, first names, surname and maiden name. The latter case has higher completeness.
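Attribute completeness for the name example can be expressed as the fraction of expected attributes that are filled. The sketch below assumes a record is a dictionary and that an empty string or `None` counts as missing; the attribute list and the function name `attribute_completeness` are illustrative.

```python
# Assumed set of name attributes, taken from the example in the text.
NAME_ATTRIBUTES = ["nickname", "first_names", "surname", "maiden_name"]


def attribute_completeness(record, attributes=NAME_ATTRIBUTES):
    """Return the fraction of expected attributes that have a value."""
    filled = [a for a in attributes if record.get(a) not in (None, "")]
    return len(filled) / len(attributes)


# Hypothetical records: one with a single name, one fully populated.
sparse = {"surname": "Jansen"}
rich = {
    "nickname": "An",
    "first_names": "Anna Maria",
    "surname": "Jansen",
    "maiden_name": "de Vries",
}

print(attribute_completeness(sparse))  # 0.25
print(attribute_completeness(rich))    # 1.0
```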
Consistency means that one data set describing a certain entity is equal to another data set describing that same entity. In other words, a data entity is the same regardless of the source. An example of low consistency is differences between data sets of the same entities originating from different sources. Replication of data is often a source of low consistency.
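A consistency check between two sources can be sketched by comparing the attributes of entities that occur in both. The source names (`crm`, `billing`), the keys and the function name `inconsistencies` are hypothetical; each source is assumed to be a dictionary keyed by entity identifier.

```python
def inconsistencies(source_a, source_b):
    """Return {key: {attribute: (value_a, value_b)}} for shared entities that differ."""
    diffs = {}
    for key in source_a.keys() & source_b.keys():
        a, b = source_a[key], source_b[key]
        changed = {
            attr: (a.get(attr), b.get(attr))
            for attr in a.keys() | b.keys()
            if a.get(attr) != b.get(attr)
        }
        if changed:
            diffs[key] = changed
    return diffs


# Hypothetical replicated data: entity 2 has drifted between the sources.
crm = {
    1: {"city": "Utrecht", "surname": "Jansen"},
    2: {"city": "Delft", "surname": "Bakker"},
}
billing = {
    1: {"city": "Utrecht", "surname": "Jansen"},
    2: {"city": "Den Haag", "surname": "Bakker"},
}

print(inconsistencies(crm, billing))  # {2: {'city': ('Delft', 'Den Haag')}}
```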
Precision is the degree of detail with which a data entity reflects reality, for example the precision of numbers. Stored numbers and dates can be insufficiently precise because values have to be rounded to fit the storage format. The domain of an attribute can also be insufficiently precise (such as a Dutch postal code format in an international data store).
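The loss of precision through rounding in storage can be made visible with Python's `decimal` module. The sketch assumes a hypothetical column that keeps a fixed number of decimals and rounds half up; `store_with_scale` is an illustrative helper, not part of any storage engine.

```python
from decimal import Decimal, ROUND_HALF_UP


def store_with_scale(value, scale):
    """Round a value to `scale` decimals, as a fixed-scale column might.

    Returns the stored value and a flag telling whether detail was lost.
    """
    exact = Decimal(value)
    quantum = Decimal(10) ** -scale  # scale=2 gives Decimal('0.01')
    stored = exact.quantize(quantum, rounding=ROUND_HALF_UP)
    return stored, stored != exact


stored, lossy = store_with_scale("12.3456", 2)
print(stored, lossy)  # 12.35 True

stored_ok, lossy_ok = store_with_scale("12.30", 2)
print(stored_ok, lossy_ok)  # 12.30 False
```

Flagging the lossy cases at write time makes it possible to decide whether the storage scale is adequate before the detail is gone.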
For some data entities, access control (authorisation and authentication) or monitoring of use is needed. Take, for example, the requirements placed on access to confidential data. The GBA has multiple levels of confidentiality: queries by officials are logged and shown to the citizen, whereas queries by investigating officers are logged but not shown.
This mainly refers to expectations within a certain operational context. Examples are accepting lower performance during peak loads, or having to wait a long time for a result set of archived data entities.
Referential integrity is the situation in which references from one data entity always correctly point to the related data entities. Examples of violations are duplicate keys in a data set, which leave the connected entities unable to determine which entity is their parent, and dangling (floating) references, where the parent no longer exists.
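Both violations, duplicate keys and dangling references, can be detected with simple checks. The sketch below assumes rows are dictionaries; the table names (`customers`, `orders`) and column names (`id`, `parent_id`) are hypothetical.

```python
from collections import Counter


def duplicate_keys(rows, key="id"):
    """Return key values that occur more than once in a data set."""
    counts = Counter(row[key] for row in rows)
    return sorted(k for k, n in counts.items() if n > 1)


def dangling_references(children, parents, fk="parent_id", pk="id"):
    """Return child rows whose foreign key matches no existing parent."""
    parent_keys = {p[pk] for p in parents}
    return [c for c in children if c[fk] not in parent_keys]


# Hypothetical data: customer 2 is duplicated, order 11 points at a missing customer.
customers = [{"id": 1}, {"id": 2}, {"id": 2}]
orders = [{"id": 10, "parent_id": 1}, {"id": 11, "parent_id": 3}]

print(duplicate_keys(customers))               # [2]
print(dangling_references(orders, customers))  # [{'id': 11, 'parent_id': 3}]
```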
Timeliness asks whether a data set is available on time, within the set expectations. It is the difference between the moment the data is needed and the moment it becomes available. An example is requesting data in a call centre, where waiting five minutes for a response is not acceptable.
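The difference between moment of need and moment of availability can be measured directly and compared with an agreed limit. The ten-second limit below is an assumption chosen for the call centre illustration; the source only states that five minutes is too long.

```python
from datetime import datetime, timedelta

# Assumed expectation for a call centre lookup; not taken from the source.
MAX_WAIT = timedelta(seconds=10)


def timeliness_gap(needed_at, available_at):
    """Time between the moment of need and the moment of availability."""
    return available_at - needed_at


def is_timely(needed_at, available_at, max_wait=MAX_WAIT):
    """True when the data arrived within the agreed expectation."""
    return timeliness_gap(needed_at, available_at) <= max_wait


asked = datetime(2021, 5, 2, 9, 0, 0)
answered = asked + timedelta(minutes=5)

print(timeliness_gap(asked, answered))  # 0:05:00
print(is_timely(asked, answered))       # False
```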
Uniqueness of a data entity means that there are no other data entities with the same data. An example from practice was a pair of twins with the same initials, surname and birth date. They could not be told apart because the completeness was too low.
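The twin example can be reproduced by grouping records on the attributes used to identify a person. The sketch assumes dictionary records; the sample names and the idea of adding `first_names` to the identifying attributes are illustrative, showing how higher completeness can restore uniqueness.

```python
from collections import defaultdict


def indistinguishable_groups(records, key_attributes):
    """Return groups of records that share all identifying attributes."""
    groups = defaultdict(list)
    for record in records:
        identity = tuple(record.get(a) for a in key_attributes)
        groups[identity].append(record)
    return [group for group in groups.values() if len(group) > 1]


# Hypothetical twins with the same initials, surname and birth date.
twins = [
    {"initials": "J.", "surname": "Smit", "birth_date": "1990-04-01", "first_names": "Jan"},
    {"initials": "J.", "surname": "Smit", "birth_date": "1990-04-01", "first_names": "Joost"},
]

weak_key = ["initials", "surname", "birth_date"]
strong_key = weak_key + ["first_names"]

print(len(indistinguishable_groups(twins, weak_key)))    # 1
print(len(indistinguishable_groups(twins, strong_key)))  # 0
```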
Validity is the degree to which a data entity meets the desired format in storage and exchange. This covers, for example, the domain as well as the data type of the attributes of a data entity. In chain exchange this is of the highest importance: nobody wants to find out at the end of the chain that the data is not valid, so this needs to be detected at an early stage. A familiar example is older web applications that responded to a submitted message with “invalid data” and no further explanation.
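Early detection with an explanation, rather than a bare "invalid data", can be sketched as a validator that checks data type and domain and reports what was expected. The record layout, the choice of fields and the function name `validate` are assumptions; the pattern reflects the common Dutch postal code layout of four digits followed by two capital letters.

```python
import re
from datetime import date

# Domain check: four digits (not starting with 0), optional space, two capitals.
DUTCH_POSTAL_CODE = re.compile(r"^[1-9][0-9]{3} ?[A-Z]{2}$")


def validate(record):
    """Return a list of explanatory error messages; empty means valid."""
    errors = []
    # Data type check.
    if not isinstance(record.get("birth_date"), date):
        errors.append("birth_date: expected a date value")
    # Domain (format) check.
    postal = record.get("postal_code")
    if not isinstance(postal, str) or not DUTCH_POSTAL_CODE.match(postal):
        errors.append("postal_code: expected four digits and two capital letters, e.g. '1234 AB'")
    return errors


good = {"birth_date": date(1990, 4, 1), "postal_code": "1234 AB"}
bad = {"birth_date": "01-04-1990", "postal_code": "12345"}

print(validate(good))  # []
for message in validate(bad):
    print(message)
```

Running such a check at the point of entry into the chain, and returning the messages to the sender, avoids discovering the problem only at the end of the chain.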