Linda van den Brink
Publications on Geodesy 93
2018, 232 pages.
Geospatial data is an increasingly important information asset for decision-making, from simple everyday decisions like where to park your car, to national and international policy on topics like infrastructure and environment. Because of the location aspect, geospatial data is often the linking pin between different datasets and therefore important for data integration. A lot of geospatial data is created, for example, as part of governmental processes and is nowadays also disseminated as open data, traditionally through "Spatial data infrastructures" (SDIs).
There is a lot of potential for reusing this data in domains other than the domain and use case for which it was originally created. My main research question was: "How to reuse geospatial data, from different, heterogeneous sources, via the web across communities?" Several aspects of data dissemination must be addressed before open data is actually in a good position to be reused. These aspects have been coined the "FAIR principles": findability, accessibility, interoperability, and reusability.
A general foundation of my work is the common knowledge in the geospatial domain that interoperability between systems is required to make reuse of data possible, and that standards are able to realise this interoperability. Based on this, I have addressed several different problem areas where the potential to reuse geospatial data was there, but was hampered in some way. These problems are introduced in chapter 1. All the research was exploratory in nature; the methodology was a combination of desk study, analysis, literature study and experimentation.
Chapters 2 and 3 deal with the lack of a standard for describing three-dimensional (3D) geospatial data in the case of the Netherlands, hindering the reuse of 3D data. To solve this, a national standard for two-dimensional (2D) geospatial, topographic data, Information model Geography (IMGeo), was combined with an international 3D standard, CityGML. Both standards describe topographic objects that represent physical objects in the real world and are largely similar.
Chapter 2 describes how CityGML was selected as the basis for a national 3D standard; how IMGeo was semantically aligned with it; how it was formally defined as an extension of CityGML on the level of classes, properties and code lists; and how other interoperability aspects were addressed: geometry, topological structure, and reference system.
Chapter 3 describes in more detail how IMGeo was formally defined as an extension of CityGML. This addresses technical information modelling issues, related to the use of Unified Modeling Language (UML) and the specific extension mechanism defined by CityGML. Based on the IMGeo case, I defined a model-driven framework for developing CityGML application domain extensions.
In the case of IMGeo and CityGML, the semantic harmonisation, i.e. the alignment of concepts defined in both standards, was relatively straightforward. Both standards describe the same kinds of things; as a result, most IMGeo classes can be said to be the same as or a subclass of a class in CityGML. This is not always the case. Independently developed domain models model similar concepts in different ways, which makes reuse of data in other domains difficult.
This problem of semantic harmonisation is addressed in chapter 4. Different Dutch standards were examined to discover areas where they overlap and where semantic harmonisation can solve real-world reuse issues. To aid this examination, a methodology was developed which combines human interaction with computer-aided analysis. In cases where information models were developed in cooperation between domains, the semantics were already harmonised. However, results showed that in the Netherlands, most domain models are developed independently, and reuse of concepts from other models occurs only on an ad hoc basis. This is for a large part due to the lack of discoverability and accessibility of domain models.
Semantic harmonisation improves usability of data, but it is not enough to fully enable reuse of geospatial data outside the geospatial sector. Geospatial data disseminated via SDI methods is difficult to find, access and use for non-geospatial experts, who are familiar with more general data publication methods.
Chapter 5 describes how general methods for data publication on the World Wide Web, Linked Data standards in particular, can be applied to geospatial data. Conversion of geospatial data formats such as Geography Markup Language (GML) to the linked data standard Resource Description Framework (RDF) is straightforward, but a choice between different ways to encode geometry in RDF is required, as is a URI strategy to ensure persistent and scalable URI identifiers for all data objects. Less straightforward is the conversion of geospatial UML-based data models to linked data models expressed in RDF Schema and Web Ontology Language (OWL), because of different underlying paradigms. One aspect, the reuse of existing vocabularies, is addressed in detail.
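The two choices named for chapter 5, a geometry encoding and a URI strategy, can be illustrated with a minimal sketch. It is not the method used in the thesis. It assumes one of several possible geometry encodings (a GeoSPARQL WKT literal) and a hypothetical URI pattern under the placeholder domain example.org; the function names, the "id"/"def" path segments, the object type and the coordinates are all made up for illustration.

```python
# Minimal sketch: serialise one geospatial object as RDF (N-Triples).
# Assumptions: GeoSPARQL WKT literals for geometry; a hypothetical URI
# pattern https://example.org/{kind}/{object_type}/{local_id}.

GEO = "http://www.opengis.net/ont/geosparql#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "https://example.org"  # placeholder domain, not a real registry


def mint_uri(kind: str, object_type: str, local_id: str) -> str:
    """Build a URI from a fixed pattern so identifiers stay predictable."""
    return f"{BASE}/{kind}/{object_type}/{local_id}"


def feature_to_ntriples(object_type: str, local_id: str, wkt: str) -> list[str]:
    """Return three N-Triples lines: type, geometry link, WKT literal."""
    feature = mint_uri("id", object_type, local_id)
    geometry = feature + "/geometry"
    cls = f"{BASE}/def#{object_type}"  # hypothetical class URI
    return [
        f"<{feature}> <{RDF_TYPE}> <{cls}> .",
        f"<{feature}> <{GEO}hasGeometry> <{geometry}> .",
        f'<{geometry}> <{GEO}asWKT> "{wkt}"^^<{GEO}wktLiteral> .',
    ]


if __name__ == "__main__":
    # Made-up building with a made-up point location (longitude latitude).
    for line in feature_to_ntriples("building", "b-001", "POINT(5.387 52.155)"):
        print(line)
```

Keeping URI construction in a single function means the pattern is applied uniformly to every data object, which is one way to keep identifiers consistent as a dataset grows.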
Linked data, while broad in its applicability, is somewhat of a niche set of standards. When applied to geospatial data, it enables reuse of this data by linked data practitioners; however, it would still keep other potential users away, who experience linked data as an impediment to ease of use. To further improve reuse of geospatial data outside the geospatial sector, it is necessary to apply general web architecture principles and standards without mandating a specific metamodel such as linked data.
Chapter 6 describes a set of best practices for publishing geospatial data on the web, discovered in practice, and based on general web principles and standards. When implemented, these best practices make it easier to discover, interpret and use geospatial data for data users in general, such as web developers, and not just for geospatial experts. In addition, some areas are identified where a best practice has not yet emerged.
Chapter 7 gives an overview of relevant developments since the research was carried out. 3D standardisation is ongoing in an international context. Semantic harmonisation in the Netherlands is progressing slowly but steadily. Geospatial linked data availability has grown significantly in the last few years, although there are still a few issues to solve. Several datasets implementing the Spatial Data on the Web Best Practices are available. These best practices have also triggered the further evolution of geospatial standards towards alignment with general web standards and principles.
Chapter 8 concludes the thesis and identifies the shift to (lighter) general web standards and principles as an important development and area for future work.