2.4 Considerations in Data Quality, Normalisation, and Consistency
The power of a database ultimately depends on the consistency and quality of the data it contains. Queries can only be as precise as the terms and categories on which they rely. If the same artist appears under slightly different spellings—for instance, “Terbrugghen” in one record and “Ter Brugghen” in another—search results will fragment, and broader patterns may remain hidden. To avoid such issues, the DttG database makes use of controlled vocabularies wherever possible. For artist names, we drew on RKDartists, which provides standardised forms of names. Variants were recorded in a separate field so that queries capture these forms too. Wherever possible, paintings were also linked to their RKDimages record, ensuring a stable and authoritative reference point even in cases where museum catalogues lack a permanent digital identifier.
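The role of the variants field can be illustrated with a minimal sketch. It assumes a simple in-memory record structure; the names `ArtistRecord`, `normalise`, and `build_index` are hypothetical and do not reflect the actual DttG schema, and the preferred name shown is only an illustrative placeholder for a standardised form.

```python
from dataclasses import dataclass, field


@dataclass
class ArtistRecord:
    # Standardised form of the name (illustrative placeholder only).
    preferred_name: str
    # Alternative spellings kept in a separate field.
    variants: list[str] = field(default_factory=list)


def normalise(name: str) -> str:
    """Case-fold and drop spaces and punctuation so close spellings compare equal."""
    return "".join(ch for ch in name.casefold() if ch.isalnum())


def build_index(artists: list[ArtistRecord]) -> dict[str, str]:
    """Map every known spelling (preferred or variant) to the preferred name."""
    index: dict[str, str] = {}
    for artist in artists:
        for form in [artist.preferred_name, *artist.variants]:
            index[normalise(form)] = artist.preferred_name
    return index


artists = [ArtistRecord("Hendrick ter Brugghen", ["Terbrugghen", "Ter Brugghen"])]
index = build_index(artists)

# Both spellings from the example above resolve to the same record.
for query in ("Terbrugghen", "Ter Brugghen"):
    print(query, "->", index.get(normalise(query)))
```

Because every spelling resolves to one preferred form, a query on any variant returns the same set of paintings rather than a fragment of it.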
Other controlled vocabularies exist within the humanities. Longstanding systems such as ICONCLASS, or more recent initiatives like the Getty Vocabularies, provide models for consistency.1 The CIDOC Conceptual Reference Model (CIDOC-CRM), developed under the International Council of Museums (ICOM), has become a central ontology for structuring and linking cultural heritage data across institutions.2 Although the DttG database does not currently include iconographical or typological information—and therefore does not employ these systems directly—they provide important models for interoperability and linked open data. Should the scope expand to include pigment use, conservation treatments, or iconographic categories, such vocabularies would form an essential point of reference. In domains where no such frameworks exist, institutions have created their own. For instance, the Victoria and Albert Museum’s Chinese Iconography Thesaurus was designed to standardise terminology that had previously remained inconsistent or ambiguous.3 Beyond terminology, shared analytical and imaging protocols are also critical for comparability. The IPERION-CH project, for example, demonstrated how simple adjustments to imaging protocols could achieve significantly more consistent cross-section images across multiple institutions.4
In developing the DttG database, we also encountered areas where no suitable controlled vocabulary was available. Two aspects were particularly critical for the project’s goals: the description of ground colours and the assessment of the reliability of data from varied sources. Without consistent colour terminology, the database could not support meaningful queries across different paintings, as free-text descriptions such as “reddish brown,” “light umber,” or “brownish red” vary widely between sources and observers. Similarly, given that the collected data derives from a mixture of published analyses, unpublished reports, and visual observations, it was essential to provide users with a clear way of judging the relative reliability of each record. To address these challenges, we developed two bespoke systems: a colour classification scheme tailored to ground layers, and a reliability index that rates the strength of evidence for each entry. Together, these frameworks ensure that users can filter and compare records in meaningful and reproducible ways.
The Down to the Ground colour system was developed by Hall-Aquitania as a means of transparently describing and systematically grouping ground colours. It has been used to normalise confusing or inconsistent colour descriptions of similar layer compositions. Such inconsistencies often arise because researchers prioritise detailed and specific language over consistent categorisation. For example, the same researcher could describe a light brown ground of brown earth, some yellow iron oxide, and lead white as “light yellowish brown,” “pale yellow-brown,” or “light brown with yellow, white, and brown particles.” While all of these descriptions are accurate, they do not allow records to be grouped, and thus queried, across larger amounts of data. Reducing such variation to a shared category (“Light Brown”) allows similar records to be compared and queried together. Detailed descriptions are preserved as free-text notes, while the standardised term enables large-scale analysis. More targeted investigations will make greater use of the detailed free-text descriptions, but such investigations are often facilitated by first identifying larger patterns.5
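The principle of pairing a standardised category with a preserved free-text note can be sketched as follows. This is an illustration only, assuming a hand-curated lookup table; `COLOUR_CATEGORIES`, `GroundColour`, and `classify` are hypothetical names, and the sketch does not reproduce the procedure by which DttG categories were actually assigned.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

# Hypothetical lookup table: free-text descriptions mapped to a shared category.
# The three descriptions and the category are the examples given in the text.
COLOUR_CATEGORIES = {
    "light yellowish brown": "Light Brown",
    "pale yellow-brown": "Light Brown",
    "light brown with yellow, white, and brown particles": "Light Brown",
}


@dataclass
class GroundColour:
    category: Optional[str]  # standardised term; None if no mapping exists yet
    note: str                # original description, kept unchanged


def classify(description: str) -> GroundColour:
    """Attach a standardised category while preserving the original wording."""
    key = " ".join(description.casefold().split())  # ignore case and extra spaces
    return GroundColour(category=COLOUR_CATEGORIES.get(key), note=description)


descriptions = [
    "Light yellowish brown",
    "pale yellow-brown",
    "light brown with yellow, white, and brown particles",
]
records = [classify(d) for d in descriptions]

# Grouping on the standardised term brings the three records together.
print(Counter(r.category for r in records))
```

Descriptions without a mapping receive the category `None` in this sketch, which would flag them for manual review rather than force them into a category.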
The DttG colour system uses a custom colour checker (Figure 2.4) that was designed to describe sixteenth- and seventeenth-century Netherlandish ground colours. Language is limited to basic colour terms, and hexadecimal colour codes are included to increase transparency, so that subsequent researchers can see a representation of what “Light Brown” means for this project.6
Figure 2.4 Down to the Ground Colour Checker for coloured grounds. A reference chart showing the standardised colour categories and corresponding HEX codes used in the DttG database to ensure consistent description and visualisation of ground-layer colours.
The Down to the Ground Reliability System is a simple rating system for the level of technical analysis performed on a painting or cross-section, i.e. what the database entry is based on. Within this system, each painting is assigned a numerical rating from 1 to 4, with 1 indicating the highest reliability and 4 the lowest:
1. Elemental analysis (such as SEM-EDX) confirming microscopic cross-section analysis.
2. Microscopic cross-section analysis without elemental confirmation.
3. Surface analysis by a skilled individual, ideally a paintings conservator with magnification.
4. Written descriptions form the basis, but the research methods are unclear.
Most of the entries fall within categories 1 or 2, meaning that the information on the ground is based on cross-section analysis. Filtering by these ratings, which is possible in the Advanced Search function of the database, allows users to critically assess the quality of the data, and to check whether observed trends are consistent across sampled and unsampled paintings.7
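The kind of filtering described here can be sketched in a few lines. The example below is an assumption-laden illustration rather than the database's Advanced Search implementation: the names `Reliability`, `Entry`, and `filter_by_reliability` are hypothetical, and the four entries are invented sample data.

```python
from collections import Counter
from dataclasses import dataclass
from enum import IntEnum


class Reliability(IntEnum):
    """Ratings 1 to 4 as defined above; a lower number means higher reliability."""
    ELEMENTAL_ANALYSIS = 1    # cross-section confirmed by elemental analysis
    CROSS_SECTION = 2         # cross-section without elemental confirmation
    SURFACE_ANALYSIS = 3      # surface analysis by a skilled individual
    WRITTEN_DESCRIPTION = 4   # written description, methods unclear


@dataclass
class Entry:
    painting: str
    ground_colour: str
    reliability: Reliability


def filter_by_reliability(entries, worst_allowed):
    """Keep entries rated at least as reliable as `worst_allowed`."""
    return [e for e in entries if e.reliability <= worst_allowed]


# Invented sample data, for illustration only.
entries = [
    Entry("Painting A", "Light Brown", Reliability.ELEMENTAL_ANALYSIS),
    Entry("Painting B", "Light Brown", Reliability.CROSS_SECTION),
    Entry("Painting C", "Grey", Reliability.SURFACE_ANALYSIS),
    Entry("Painting D", "Light Brown", Reliability.WRITTEN_DESCRIPTION),
]

# Ratings 1 and 2 rest on cross-section analysis (sampled paintings).
sampled = filter_by_reliability(entries, Reliability.CROSS_SECTION)
unsampled = [e for e in entries if e.reliability > Reliability.CROSS_SECTION]

# Compare the colour distribution in the two groups.
print(Counter(e.ground_colour for e in sampled))
print(Counter(e.ground_colour for e in unsampled))
```

Comparing the two distributions is one way to check whether a trend seen in the sampled paintings also holds for entries that rest on weaker evidence.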
Notes
1 “ICONCLASS,” Iconclass, accessed 10 October 2025, https://iconclass.org/; The Getty Vocabularies include the Art & Architecture Thesaurus (AAT), the Cultural Object Name Authority (CONA), the Getty Iconography Authority (IA), the Getty Thesaurus of Geographic Names (TGN),
2 “Home,” CIDOC-CRM Conceptual Reference Model, accessed 10 October 2025, https://www.cidoc-crm.org/.
3 “Chinese Iconography Thesaurus (CIT),” Victoria and Albert Museum, accessed 10 October 2025,
4 Joanna Russell et al., “Experiments Using Image Processing Software (NIP2) to Define the Colour of Preparatory Layers in 16th-Century Italian Paintings,” in Ground Layers in European Painting 1550–1750, ed. Anne Haack Christensen et al., CATS Proceedings V (Archetype, 2020).
5 For patterns, see Hall-Aquitania, “Common Grounds.”
6 For further information on the development of the DttG Colour System, see chapter 1 of Hall-Aquitania, “Common Grounds.” For more on basic colour terms, see Brent Berlin and Paul Kay, Basic Color Terms: Their Universality and Evolution (Berkeley: University of California Press, 1969).
7 Hall-Aquitania, “Common Grounds,” 41–43.