In this white paper we look back over ten years of developing and managing standards. We focus explicitly on the set of geo standards listed on the Comply or Explain list of the Standardisation Forum. This set of geo standards acts as the backbone of the Dutch Spatial Data Infrastructure. In this white paper we identify a number of trends and technological developments. The Dutch SDI can only successfully participate in these developments if the underlying set of standards adequately supports them. This is why we ask ourselves, in this white paper, whether the current set of standards still suffices.

This is the English translation of the final version of the white paper. The preliminary findings were presented by Friso Penninga during the 'Open Geodag' on May 31, 2017. A public consultation started that day and ended July 21, 2017. This final version incorporates the responses received during the consultation round.

Introduction

Looking back over ten years of developing and managing standards, we can conclude that, in the Netherlands, we now have a mature set of geo standards which acts as the backbone of our national Spatial Data Infrastructure (SDI). This set of standards is listed on the Comply or Explain list of the Standardisation Forum. Geonovum, as guardian of this backbone, manages this set and is also the manager of most of the individual standards. The set of geo standards contains the Basismodel Geo-informatie [[NEN3610]], the 'mother model' of many information models with which the semantics of datasets are captured unequivocally, and a number of standards that define how we can exchange these datasets ([[GML]]), provide them (Dutch profiles on WMS [[WMS-NL]] and WFS [[WFS-NL]]) and make them discoverable (Dutch metadata profiles for geography [[ISO19115-nl]] and for services [[ISO19119-nl]]).

Even though the current SDI, with the INSPIRE directive, the datasets of the geo key registers, and facilities such as PDOK, is now common in the daily practice of the geo-information field, the development of an SDI never ends. Technological developments are progressing rapidly and societal requirements are evolving. Where the SDI was originally focused only on the geo-professional, nowadays we expect the SDI to be accessible to a much larger group of users, including scientists and web developers. This intended broadening results from the increased attention given to the (re)usability of data, where originally the focus was more on the unlocking itself. This shift in focus is also seen under the heading 'from Open data to FAIR data', a concept originally conceived for scientific (research) data. In essence: data should not only be open, but also findable, accessible, interoperable and re-usable.

"Fair" open data

In the future, we expect more coherence (and consistency) between individual datasets, more detail in the datasets (a transition from 2D to 3D), more data volume, data that changes much more frequently or even continuously (think of sensor data), and forms of access to data that match the demands of the much broader user group of the Dutch SDI.

The Dutch SDI can only successfully participate in these developments if the underlying set of standards adequately supports them. This is why we ask ourselves, in this white paper, whether the current set of standards still suffices. To what extent have innovations and developments become so mature that they have an impact on the standards we are now managing? Following the structure of the set of geo standards on the Comply or Explain list, we outline which updates of these standards can be expected over the next five years. Input for these updates comes from developments that Geonovum has explored and followed in recent years, in the innovation platforms around 3D, Linked Data, Sensors and the Web, and in an international context. On some points we can already make recommendations; in other areas questions still remain.

Scope

The scope of this white paper is limited to the set of geo standards on the Comply or Explain list of the Standardisation Forum. Together these standards form the backbone of the Dutch SDI. By looking ahead to the developments around this set of standards, we automatically outline the future development of the Dutch SDI. However, it is also quite conceivable that, in five years' time, the Dutch SDI will be less recognisable as a separate infrastructure and will instead be seen as the spatial component of the generic digital infrastructure.

Relevant trends

More 3D geo-information

Although the increasingly realistic-looking visualisations, including tools such as the Oculus Rift VR glasses and the HoloLens AR glasses, appeal to our imagination, the emergence of these tools is not the main reason that SDIs are moving more and more towards 3D. The move from 2D to 3D is also a move from a coarser to a more detailed model. And these more detailed models are increasingly needed, as society's demand for more detailed spatial information, for example about the physical living environment and its quality, rises. "How much noise will I experience after the realisation of that new road?", "Will I have shadow flicker in my garden when this wind turbine is actually placed?" At the same time, the world about which these questions are being asked is becoming increasingly complex. To be able to answer more detailed questions about a more complex world, models must approximate that reality more closely. 3D is, therefore, increasingly necessary to provide answers to those societal questions.

In addition to this increasing demand, we also see a number of technological breakthroughs that will make it possible to meet the demand for up-to-date 3D models at reasonable cost. In the coming years, 3D models with nationwide coverage will become available at a scale level comparable to the BGT. The Dutch SDI must be able to deal with such models.

More sensors

The use of sensors in public space is really taking off. The applications for public order and security, for example the Stratumseind Living Lab, appeal to the imagination, but we also see many applications in the areas of environment (e.g. measuring air quality), accessibility (including, for example, traffic flow and available parking spaces) and the self-driving car. Many applications are created within Smart City initiatives or experiments around the Internet of Things. With this enormous growth in applications, the number of standardisation initiatives is rapidly increasing too. International organisations such as OGC, W3C, IETF and IEEE all work on standards in this field. These standards focus both on capturing information about sensors and on unlocking the information collected by these sensors.

Although the INSPIRE framework provides standards for unlocking sensor data, there are hardly any concrete applications in this field. Compared to the trends in the field of 3D and linked data, the sensor trend is newer and less mature. One development, however, that is clearly visible, and can be seen as a precursor to the breakthrough of sensors, is the growing relevance of data in the Observations and Measurements category. Technological developments in acquisition techniques are increasing the supply of coverages, for example around elevation (think of the AHN, the national height model of the Netherlands) and depth (bathymetry). We see this increase in domains such as hydrography, meteorology and the subsurface. Another trend we see around sensors is lighter standards, partly with a view to minimising data traffic (which is really important for LoRa) and maximising the battery life of certain sensors.

More linked data

In the Platform Linked Data Netherlands (formerly PiLOD), Geonovum and an active community have, over the past four years, investigated what linked data is, how to apply it, why it is useful and how you can publish and use geodata as linked data. The #geo4web testbed has also provided insights into this. Meanwhile, the first serious Dutch geo linked datasets have seen the light of day, thanks to the Dutch Cadastre (cadastral parcels (BRK) and small-scale topography (BRT)).

In the coming five years, a lot of (government) linked geodata will be published. In addition to the Netherlands, Ireland, Scotland, the UK, Flanders, Switzerland and Finland have already published linked geodata or are working on it. Linked data is important for publishing government data in as rich a way as is practicable. It makes it possible to link the meaning, the semantics of the data, immediately to the data itself, and to establish, at a high granularity, links between various datasets. In short, linked data is important for semantic interoperability and data integration. Currently, it is the richest and most powerful way of making data available.

Linked data is currently a niche technology. For most web developers it is too complex. But this could change over the next few years. We see a development towards the 'web of data', which inherits some basic principles of linked data, but not the (complex) data model. In other words, "linked data" is not necessarily "Linked Data": every object on the web gets its own URI, metadata and links to other objects, but not necessarily RDF [[rdf-concepts]], triple stores and SPARQL endpoints.
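To make the distinction concrete, the sketch below shows such a 'web of data' object: its own URI, some metadata and links to related objects, served as plain JSON without RDF, triple stores or SPARQL. All URLs and attributes are hypothetical.

```python
# A minimal sketch of a "web of data" resource (all URLs hypothetical):
# an object with its own URI, metadata and links, but no RDF machinery.
resource = {
    "id": "https://example.org/id/bridge/42",  # the object's own URI
    "name": "Example bridge",
    "built": 1932,
    "links": {
        "locatedIn": "https://example.org/id/municipality/0307",  # link to another object
        "describedBy": "https://example.org/doc/bridge/42",       # human-readable page
    },
}
```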

In the summer of 2017, the W3C/OGC Spatial Data on the Web Best Practices [[sdw-bp]] appeared. This document, to which we also contributed, is not exclusively about linked data, but contains many recommendations about it drawn from current practice, and can serve as a guideline for publishing linked geodata in the Netherlands. In a number of respects, the Best Practices document does not yet offer clear good practices. For example, in the coming years we still need to work on a standardised vocabulary for geodata and also one for geo-metadata.

Linked data is a continuation of the semantic web. Slowly but surely we are working towards it, for example through semantic harmonisation. We do not want to harmonise the entire geo world: that would be a huge job and we could not guarantee its practical utility. But we will continue to focus on areas where semantic harmonisation leads to new possibilities for re-using data and optimising processes, and on applying linked data principles to reach a web of data on which data is no longer redundantly copied.

More users: Geo & World Wide Web

A wealth of geo-information, offered for the most part as open data, is unlocked via the Dutch SDI. However, the use of this data is often limited to the traditional field of geo-information, although the key register data in particular should be usable for a much broader group of users. These days, the intermediary between the data provider and the end user is more often than not a web developer. But these web developers currently use only tiny amounts of the high-quality geo-information in the Dutch SDI. They often choose platforms such as Google Maps or OpenStreetMap instead. This is caused by a combination of factors: partly unfamiliarity with the existence and structure of the geo-information, partly the fact that geo-information is poorly found by search engines, and partly the use of specific geo standards, which are not well known outside the geo domain and thus form a barrier for web developers. Ultimately, this means that the re-use of the open data remains limited, so that the full potential of its social impact cannot be realised. Web developers certainly play an important role in commercial and citizen-orientated applications.

In a number of recent initiatives (e.g. the testbed Spatial Data on the Web and the joint OGC and W3C working group Spatial Data on the Web), opportunities were explored to broaden the target group of the Dutch SDI beyond its traditional geo-information domain. The world of web developers, including the standards used in this world, is central here. Directions for solutions include the use of lighter data standards and the provision of APIs, code examples and other coordinate systems. Improving findability in search engines is also being worked on. The W3C Data on the Web Best Practices [[dwbp]] and Spatial Data on the Web Best Practices [[sdw-bp]], which were partly written by Geonovum, are a good place to start. Another interesting concept in this area is the Feature Catalogue of the Digital System Environment and Planning Act, which is now being developed. This catalogue provides information about which data will become available in this Digital System and what this data means (semantics). The catalogue does this by unlocking concepts and definitions, giving an overview of datasets, publishing information models and providing an overview of available products and services. The Feature Catalogue can help make data easier to find for new users: if the catalogue is indexed by search engines, users who search via Google or Bing for data on high-voltage masts can end up at the BGT (large-scale topography) data, something that would not work via PDOK or the NGR.

More channels: the platform concept

All the developments described above may raise the question as to whether the current SDI is still meaningful, and therefore also whether the underlying set of standards on the Comply or Explain list is still meaningful. To answer that question, we must look at one of the latest developments in the field: the rise of the platform. Data platforms are characterised by the multitude of ways in which data can be obtained: via downloads, via services, via APIs or via SPARQL endpoints. The underlying reason for this multitude is the recognition that there are many different target groups, each with their own applications and their own way of retrieving data. The emergence of platforms illustrates that we are no longer striving for a uniform, one-size-fits-all approach, but for a multi-channel, fit-for-purpose approach. This target group-specific approach is the result of the shift in focus from unlocking data to the (re)usability of data. Non-specialist users can also benefit from platforms that visualise the data, for example when assessing the usability of the data for their intended application.

With the emergence of platforms, the question also arises as to what extent the current SDI can already be considered a platform. Currently, the SDI seems to be aimed primarily at the geo-professional, with standards from the geo world (OGC). Other target groups, such as web developers, are clearly less well supported. A platform approach would be more appropriate for the SDI, in which the set of standards is extended with lighter standards that relate better to web-based practice.

Impact on Comply or Explain standards

In this chapter we try, based on the trends outlined above, to predict the influence that these trends will exert on the current geo standards. In the near future we do not expect to remove any geo standards from the Comply or Explain list. However, existing standards may change under the influence of these trends, and the list may grow to meet the demands of new user groups.

Information model

Information models remain a cornerstone of the SDI. The role of semantics, which you can describe with the help of information models, is becoming more and more important. [[NEN3610]] not only acts as a mother model for all sectoral information models, but also forms a working method for the system of sectoral information models. NEN3610 will develop further in both respects over the coming years. The first development is the step towards harmonisation between information models, for example between the information models under the different geo key registers. Now that the use of these registers continues to increase, partly thanks to new users who are not so familiar with the geo sector, the (apparent) discrepancies between the various datasets are experienced as increasingly bothersome. There are several initiatives in this area, including further development in the context of the Ministry of Infrastructure and the Environment and successive projects at Geonovum, such as the projects around semantic alignment between the NWB (road networks) and the BGT (key register large-scale topography). From these projects we learn that harmonisation is not only about data (harmonising the population of datasets), but also about harmonising processes (recognising triggers that are relevant to each other's processes). It also appears that different definitions or delimitations of (apparently) similar objects are sometimes not only explainable, but even highly desirable, given the different ways individual datasets are used. The inconsistencies that users sometimes experience are, to a large extent, caused by the lack of a proper separation between function and physical appearance in some of the models. It would, for example, be logical for a pavement type to be linked to a BGT road part, while the qualification as to whether something is a motorway, regional road or local road could better be linked to an NWB road. Although the actual modifications will take place at the level of the individual information models (for example, in IMGeo 3.0), modelling guidelines such as the distinction between function and physical appearance could be anchored in the NEN3610 working method.

Another development is related to the emergence of Linked Data: more and more sectoral datasets are being 'interlinked'. At the moment, this is done at the level of individual information models. The risk here is that the coherence which now exists between the different information models in the UML world is lost in the Linked Data world. Therefore, we need a Linked Data profile on NEN3610. Such a profile could provide a helping hand for publishing geo-information as linked data, but could also, for example, establish more common concepts and create relationships with other, common vocabularies, such as schema.org. With both, we intend to make the 'linkability' of data in the Dutch SDI as large as possible, as this increases both the use (through better findability) and the usability (through a better understanding of the meaning).

Applying linked data to NEN3610 models, however, raises a fundamental question. Central to linked data is the 'open world assumption', under which 'linkability' is at its maximum. This justifies the question whether it would be desirable, or perhaps even necessary, to introduce this as a starting point for NEN3610. At this point, however, we consciously choose not to do so and therefore keep the closed world assumption of NEN3610. The most important consideration here is that the closed world assumption allows validation. And because reliable and correct government data is central to the SDI, validation is a must. We therefore choose to leave NEN3610 intact and to increase linkability with a Linked Data profile. For this purpose we provide, for example, assistance on choosing the right semantic level of detail. Look, for example, at the BAG (key register Addresses and Buildings): the BAG contains the object Public Space. The probability that 'random' users who are not familiar with the BAG context want to link relevant data to public spaces is small, because this is an unknown concept at a reasonably high level of abstraction, while there is a much larger probability that they would search for the different types of Public Space (including Road, Water and Railway). By taking Road as the object (instead of the more generic Public Space), you increase linkability.
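As an illustration of this choice of semantic level, the sketch below publishes such an object at the more linkable level of Road rather than the abstract Public Space, while keeping the hierarchy intact. The URIs and vocabulary are hypothetical; rdflib is used as generic tooling.

```python
# A minimal sketch (hypothetical URIs and vocabulary) of publishing a BAG
# public space at the more linkable semantic level of its type, Road.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF, RDFS

DATA = Namespace("https://example.org/id/")  # hypothetical data namespace
DEF = Namespace("https://example.org/def/")  # hypothetical vocabulary namespace

g = Graph()
road = URIRef(DATA["openbareruimte/0307400000000001"])  # fictitious identifier

g.add((road, RDF.type, DEF.Road))                    # concrete, well-known type
g.add((DEF.Road, RDFS.subClassOf, DEF.PublicSpace))  # hierarchy is preserved
g.add((road, RDFS.label, Literal("Stationsstraat", lang="nl")))

print(g.serialize(format="turtle"))
```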

From a Linked Data perspective, establishing better definitions of the relationships between concepts will be a focus for the coming years, but that is certainly not the only motive. We can see that, as the SDI matures, the range of datasets increases. And these different datasets more and more often contain data about (apparently) the same objects. Take roads, for example: increasingly, the question arises as to why the populations of the NWB (National Road Database), the BGT (key register large-scale topography) and the BAG (key register Addresses and Buildings) are, in fact, different. Experts from the geo domain are often familiar with such differences and usually know the causes, but with the increasingly widespread use outside the traditional working field, the number of users who cannot explain such differences is growing. To be able to deal with this, insight into the relationships between these datasets is necessary. Sometimes understanding the interrelationship will lead to acceptance (for example, different definitions, each of which is justified within its own context); in other cases this may be a reason to harmonise certain datasets. The definitions of, and relationships between, concepts should be unlocked in the form of concept registers.

Exchange format

First of all, exchange formats play a role in different contexts: we distinguish exchange within a chain of government parties from exchange with users, also known as unlocking. Geography Markup Language [[GML]] is listed on the Comply or Explain list as an exchange format. GML is widely used within the SDI, not only within chains but also as a format for unlocking, in which a Web Feature Service [[WFS]] can return data. Statements about future developments around standards for exchange are not automatically valid for all GML applications. Before we make such statements, it is good to first consider the requirements that the geo sector places on chain exchange standards, such as support for multiple coordinate systems, for all geometry types and for 3D geometries (solids).

GML is still the only open standard that meets all these criteria. Lighter exchange standards, including GeoJSON [[rfc7946]] and [[GeoPackage]], do not. GeoJSON, for example, only uses WGS 84 as a coordinate system and does not support all geometry types (and explicitly excludes extending it with additional types). GeoPackage does not support solids and for that reason is not suitable for 3D data. This means that the position of GML as a chain exchange format (for example between source holders and national facilities) is not under discussion. Where lighter formats are mentioned as possible alternatives, this refers to the use of GML for the unlocking of data (see also the section Providing data). Moreover, with the emergence of 3D geo-information, it is quite conceivable that CityGML [[citygml20]] (a 3D-focused extension of GML) will be added to the set of geo standards on the Comply or Explain list.
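To illustrate the GeoJSON limitation mentioned above: per [[rfc7946]], coordinates are always WGS 84 longitude/latitude and the set of geometry types is fixed, as the (illustrative) feature below shows.

```python
import json

# A minimal GeoJSON feature. RFC 7946 fixes the coordinate reference system
# to WGS 84 (longitude, latitude) and defines a closed set of geometry types;
# solids and other CRSs, which GML supports, cannot be expressed.
feature = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [5.3872, 52.1561]},  # lon, lat
    "properties": {"name": "Amersfoort"},
}
print(json.dumps(feature, indent=2))
```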

In the longer term, the preceding statement about the position of GML might lose its validity. The Open Geospatial Consortium is now discussing the development of a GeoJSON-like standard that does meet the requirements of the geo sector. Further adoption of linked data could change things too. After all, with the emergence of linked data there is no longer a strict separation between the information model (semantics) and the exchange format (as in the schema, in which the implementation of an information model is captured). With the Web Ontology Language, OWL for short [[owl2-overview]], you capture both the information model and the schema in the world of linked data. The term 'exchange', by the way, is then no longer applicable, because with linked data, data is not exchanged (duplicated); instead, you gain access to data available on the web.

Although the scope of the set of geo standards is not explicitly limited to standards focusing on vector data, GML focuses specifically on vector data. With the emergence of coverages (point clouds and grids) from the angle of Observations and Measurements, it is clear that GML is not applicable to all types of data. Other exchange formats are then more obvious. In these areas Geonovum chooses, in line with OGC, not to develop or actively prescribe exchange standards for coverages itself, but to adopt the community standards that arise in the relevant domains (such as LAS/LAZ for point clouds). Geonovum also strives for good support within the Dutch SDI for such data. Applications within, for example, INSPIRE can give a better insight into possible bottlenecks and the need for further standardisation. Our choice to focus for now on (sensor) data that will be adopted into the Dutch SDI (so mostly data that is validated, meant for re-use and carrying a 'quality stamp' from the government, and less on raw data from individual sensors) is in line with the delimitation of the current standards. Take the BGT as a use case: we do not standardise the acquisition method or raw measurements, but only the resulting dataset.

Providing data

Most developments in the coming years are to be expected in the area of providing data. The aim of those developments is to increase the use and usability of the SDI. To this end, the supply of unlocking mechanisms will widen where, in line with the platform idea, different types of unlocking mechanisms will be supported for different target groups. A professional geo-user will probably want to get a full, centimetre-precise dataset as the starting point for spatial analysis, whereas a web developer may need the approximate locations of charging stations in the lightest possible format, for optimal performance of an app. The web developer is becoming more important because, as mentioned in the section More users, he or she is nowadays more often the intermediary between data provider and end user.

Emergence of REST convenience APIs

A category of unlocking mechanisms that will greatly increase in popularity is the REST convenience API. REST convenience APIs simplify queries (because they use generic protocols such as HTTP instead of query languages as in, for example, [[WFS]]) and are much more fit for purpose: small convenience APIs provide precisely the desired data for a specific purpose, instead of generic APIs which you could use to ask every possible question. This fits the specific questions that users have: hardly anyone asks for 'the BGT', but people do want information about trees, parking places or electric street lights. An important advantage of REST convenience APIs is that occasional users such as (web) developers do not have to delve into specific querying mechanisms. To query a WFS, for example, a developer first needs to understand how to build a request in order to successfully obtain data. Compared to a REST convenience API, you could describe a WFS as a generic, XML-based API. Two aspects are important in this difference: the question-specific interface of a convenience API versus the generic ('all questions are possible') interface of a WFS, and REST (using generic protocols for queries) versus the specific XML-based queries that you ask of a WFS. Please note that a WFS can also be considered an API: although in everyday speech we sometimes still talk about 'APIs vs. services', this is not correct, because both a REST API and a Web Feature Service provide a specification allowing machines to communicate with each other and exchange data.
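The difference in developer experience can be sketched as follows. The endpoint URLs and the convenience API are hypothetical; the WFS request follows the WFS 2.0 key-value encoding.

```python
import requests

# Generic OGC WFS 2.0 GetFeature request: powerful, but the developer must
# know the protocol parameters and the exact feature type name.
wfs_response = requests.get(
    "https://example.org/geo/wfs",  # hypothetical endpoint
    params={
        "service": "WFS",
        "version": "2.0.0",
        "request": "GetFeature",
        "typeNames": "bgt:wegdeel",  # hypothetical feature type
        "count": 10,
    },
)

# Hypothetical REST convenience API: one resource, one question.
trees = requests.get(
    "https://example.org/api/v1/trees",
    params={"bbox": "5.38,52.15,5.40,52.17"},
)
```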

The question now is to what extent the emergence of REST APIs calls for an expansion of the current set of geo standards on the Comply or Explain list. As far as international, generic standards (OGC, ISO, W3C) are concerned, we can state that REST APIs neither involve nor demand geo-specific standards. So the set of geo standards itself will not change; at most, generic standards can be added to the Comply or Explain list. For example, it follows from the analysis in the discussion document 'RESTful APIs within the government' (in Dutch) of the Standardisation Forum that OAuth, as an authentication standard, would be a desirable addition to the list. The Data on the Web Best Practices [[dwbp]], which contain many general directives for APIs, are also a potential candidate. At this level, therefore, adjusting the set of geo standards is not an obvious step.

The set of geo standards not only contains international standards, but also several Dutch profiles in which such standards are sharpened (the set contains Dutch profiles for WFS [[WFS-NL]], WMS [[WMS-NL]] and metadata for datasets and services). At that level one might think of profiles for REST APIs, but at the moment that seems too narrow an approach. REST is an architectural pattern, not a standard, and standardising it requires a different approach from the well-known exchange standards of the OGC or the Dutch government (StUF, Digikoppeling). When applying the REST architectural pattern, some things can be fully standardised, while for other things a best practice or some design considerations are adequate. A generic API strategy or guidance for the government could capture this without pouring everything into concrete, instead giving the user the freedom to put in place a solution appropriate to his or her situation. A number of recommendations can be made at both the strategic and the technical level (e.g. with regard to RESTful principles, security, geo-aspects and documentation).

REST convenience APIs & OGC webservices

What does the emergence of REST convenience APIs mean for the current OGC web service standards? Are we saying goodbye to geo-specific standards with the increasing use of generic web standards? We do not see that happening. We expect both categories to exist alongside each other and to retain their roles: they are intended to be used by different target groups for different purposes. The OGC web services have a steep learning curve, but then offer much functionality. Another benefit of standardised geo web services such as [[WMS]] and [[WFS]] is that standard toolkits, software libraries and applications are available that can use data from a WMS or WFS service without programming work. OGC web services are, therefore, ideal for geo-specialists and other daily users. REST convenience APIs have a much gentler learning curve, but also offer less functionality. This makes them especially suitable for simple, frequent queries. As a variation on the 80/20 rule: with convenience APIs you can answer 80 per cent of the questions with 20 per cent of the effort. For many uses and many users this suffices, but for some professional users it is not enough. In summary: we do not envisage a radical transition from OGC web services to APIs, but a broadening of unlocking mechanisms in which REST convenience APIs are offered alongside OGC web services.

Although we do not see REST convenience APIs as the successor of OGC web services, we do anticipate that the emergence of these APIs will influence the further development of the services. In the coming years, lighter, RESTful versions of the OGC web service standards will probably become available. We also see evolution taking place at the level of output: it is desirable to offer more choice of output formats, including lighter formats such as GeoJSON [[rfc7946]] where possible. For view services we see an evolution towards vector tiling, which gives the output of a WMS more flexibility to be adapted for styling purposes. At the level of the Dutch profiles, there is certainly room for evolution. From the point of view of broader usability (including for the web), it is conceivable to prescribe more coordinate systems (e.g. not only RD but also Web Mercator) and, for WFS, more output formats (e.g. at minimum GML and GeoJSON).
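Part of this broadening is already possible in practice: some WFS implementations can return GeoJSON via the outputFormat parameter and reproject via srsName, as the sketch below shows (hypothetical endpoint and feature type; supported outputFormat values vary per server).

```python
import requests

response = requests.get(
    "https://example.org/geo/wfs",  # hypothetical endpoint
    params={
        "service": "WFS",
        "version": "2.0.0",
        "request": "GetFeature",
        "typeNames": "bgt:wegdeel",          # hypothetical feature type
        "outputFormat": "application/json",  # GeoJSON instead of the default GML
        "srsName": "EPSG:3857",              # Web Mercator instead of RD (EPSG:28992)
        "count": 10,
    },
)
features = response.json()["features"]
```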

File formats

The broadening of output formats is not limited to WFS, but also applies to the various download facilities in the Dutch SDI. [[GML]] is still the best option because of its broad applicability, but nothing stands in the way of data providers offering multiple formats. For example, OGC GeoPackage seems a suitable solution for all those users who, in practice, ask for a shapefile, provided that no 3D data is involved (currently [[GeoPackage]] does not support solids). Other users increasingly ask for JSON [[rfc7159]], GeoJSON [[rfc7946]] or [[JSON-LD]], with or without additional working arrangements about, for example, how to deal with coordinate systems. Geonovum wants to explore (possibly in the form of a testbed) the extent to which it can draw up a guide, in collaboration with the professional field, for offering lighter formats in addition to GML.
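The gain for users is mainly in tooling: widely used libraries read such formats directly. A sketch, with hypothetical file names:

```python
import geopandas as gpd  # common Python geodata library

# GeoPackage: a single, indexed file that most GIS tooling opens directly.
roads = gpd.read_file("bgt_wegdeel.gpkg")  # hypothetical file
# The same library also reads GML, but lighter formats lower the threshold
# for users outside the traditional geo domain.
print(roads.crs, len(roads))
```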

3D

When it comes to providing 3D data, the standards still have to prove themselves. The OGC 3D Portrayal Service [[3dps]] may play a role in this, supplemented by OGC Community Standards (at the time of writing still candidates) such as 3D Tiles (based on Cesium) or I3S. The specific challenge of 3D data lies in the data volume: for both visualisation facilities and download facilities, performance and the required memory capacity and computing power must explicitly be taken into account. Downloads will, for the time being, mainly take place in CityGML [[citygml20]]. However, we do see an interesting, albeit still very early, development around CityJSON, a JSON encoding of the CityGML information model.
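To give an impression, the fragment below sketches the skeleton of the CityJSON idea: the CityGML information model encoded as compact JSON with one shared vertex list. It is schematic and deliberately incomplete, not a valid file.

```python
# A schematic impression of CityJSON (not complete or valid; version and
# attributes illustrative): city objects plus one shared coordinate list.
city_model = {
    "type": "CityJSON",
    "version": "1.0",  # illustrative
    "CityObjects": {
        "building-1": {
            "type": "Building",
            "attributes": {"yearOfConstruction": 1995},
            "geometry": [
                {"type": "Solid", "lod": 1, "boundaries": []}  # indices into "vertices"
            ],
        }
    },
    "vertices": [],  # all geometries index into this shared list
}
```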

Linked Data

APIs are going to play an important role in the unlocking of Linked Data, but we will also see more SPARQL endpoints. These endpoints allow the querying of Linked Data in an RDF structure. The rate at which such endpoints become commonplace will also depend on the rate at which triple stores become commonplace. Currently, we see no need in the professional field for a Dutch profile on [[GeoSPARQL]], nor for adding it to the set of geo standards on the Comply or Explain list.
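For illustration, the sketch below queries a hypothetical SPARQL endpoint with [[GeoSPARQL]] vocabulary and functions (the endpoint and data are hypothetical; the geo: and geof: namespaces are those defined by GeoSPARQL).

```python
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("https://example.org/sparql")  # hypothetical endpoint
sparql.setQuery("""
PREFIX geo:  <http://www.opengis.net/ont/geosparql#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>

SELECT ?parcel ?wkt WHERE {
  ?parcel geo:hasGeometry/geo:asWKT ?wkt .
  FILTER(geof:sfWithin(?wkt,
    "POLYGON((5.3 52.1, 5.5 52.1, 5.5 52.2, 5.3 52.2, 5.3 52.1))"^^geo:wktLiteral))
}
LIMIT 10
""")
sparql.setReturnFormat(JSON)
results = sparql.query().convert()
```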

Sensor data

Standards for the unlocking of sensor data are under development in different places. Work is being done on OGC standards ([[SOS]], the [[SENSORTHINGS]] API), but also on W3C standards (including SSN [[vocab-ssn]] and the Web of Things standards). Even where these standards are already fairly mature, their application is still limited. In this respect, therefore, it is still too early to make a statement about the desirability of standards for the unlocking of sensor data. However, we see a clear growth in the unlocking of data from Observations and Measurements, also in the INSPIRE context (bathymetry, elevation). This broadens the focus of Geonovum. We will continue to actively monitor developments in this broader area.
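As an impression of the direction these standards take, the sketch below reads observations from a hypothetical [[SENSORTHINGS]] API endpoint: plain HTTP and JSON with OData-style query options, exactly the kind of lightweight unlocking discussed above.

```python
import requests

base = "https://example.org/sta/v1.0"  # hypothetical SensorThings endpoint

# Fetch the five most recent observations of one datastream.
observations = requests.get(
    f"{base}/Datastreams(1)/Observations",
    params={"$top": 5, "$orderby": "phenomenonTime desc"},
).json()

for obs in observations["value"]:
    print(obs["phenomenonTime"], obs["result"])
```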

Discoverability / metadata

Good discoverability is a crucial prerequisite for the further growth of the use of geo-information and of the SDI. In the current SDI, findability is strongly based on the concept of a metadata catalogue. This catalogue is itself searchable as a front end (e.g. on keywords, categories, provider and location) and can be unlocked via an OGC catalogue service [[CSW]]. The Achilles heel of this approach is that the searching party must know where to find the catalogue, or at least a catalogue that is synchronised with the geo metadata catalogue, which in the Netherlands is the Nationaal Georegister. The degree to which a user can find data successfully is thus determined by the level of knowledge of the user (which effectively amounts to the degree to which the user is familiar with the geo domain).

The findability of geo-information can be increased considerably when search engines also start indexing the available data. Metadata must become crawlable and machine readable. So far, the metadata in catalogues is not, or barely, indexed. Right now, the most effective workaround seems to be the use of landing pages for data, a concept that in any case fits the current trend towards data platforms. As an example, you might think of a new, fully data-driven website for PDOK where, when searching for the BAG, you end up on a landing page. On such a central BAG page you would ideally find the descriptive elements (metadata), an explanation of the BAG (which is now scattered across various web pages), links to all forms in which the BAG is unlocked (view and download services, APIs, linked data, the BAG extract, ...), support from a helpdesk and forum (community) and a viewer to get an immediate impression of the data.
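To sketch what 'crawlable and machine readable' could look like in practice: such a landing page can embed a schema.org Dataset description as JSON-LD, the format search engines index. All values below are illustrative.

```python
import json

# Illustrative schema.org Dataset description for embedding in a landing page.
dataset = {
    "@context": "https://schema.org",
    "@type": "Dataset",
    "name": "BAG - key register Addresses and Buildings",
    "description": "Addresses and buildings in the Netherlands.",
    "license": "https://creativecommons.org/licenses/by/4.0/",  # illustrative
    "distribution": [{
        "@type": "DataDownload",
        "encodingFormat": "application/gml+xml",
        "contentUrl": "https://example.org/downloads/bag.gml",  # hypothetical
    }],
}

# Embedded in the landing page's HTML so that search engines can index it.
html_snippet = f'<script type="application/ld+json">{json.dumps(dataset)}</script>'
```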

In this form, metadata may become a less recognisable (separate) front end, but the catalogue will become more of a back end in which metadata is captured once, properly. Based on this back end you could still run an OGC catalogue service, which, for example, is mandatory for INSPIRE. The trend for metadata in the coming years is similar to that of the OGC web services for unlocking: we expect an evolution aimed at a better connection to the web. In the direction of Linked Data, unlocking the metadata in RDF [[rdf-concepts]] via the GeoDCAT application profile [[GeoDCAT-AP]] is a good example of such an evolution: this way, ISO-compliant metadata can also be unlocked for other open data portals. Abandoning the ISO standards is currently not on the agenda: GeoDCAT-AP is an additional unlocking mechanism of metadata towards the generic open data world, not a replacement. In addition, INSPIRE will continue to ask for ISO-compliant metadata. At the same time, Geonovum is working on updating the Dutch metadata profiles and, in a W3C context (in the Dataset Exchange Working Group), on the harmonisation of metadata from various contexts such as DCAT, ISO, CKAN and INSPIRE.
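A minimal sketch of metadata unlocked in this spirit (hypothetical URIs; a handful of DCAT properties rather than the full [[GeoDCAT-AP]] profile): a dcat:Dataset with a distribution, expressed in RDF so that generic open data portals can consume it.

```python
from rdflib import Graph, Literal, URIRef
from rdflib.namespace import DCAT, DCTERMS, RDF

g = Graph()
ds = URIRef("https://example.org/id/dataset/bag")             # hypothetical
dist = URIRef("https://example.org/id/distribution/bag-wfs")  # hypothetical

g.add((ds, RDF.type, DCAT.Dataset))
g.add((ds, DCTERMS.title, Literal("BAG", lang="nl")))
g.add((ds, DCAT.distribution, dist))
g.add((dist, RDF.type, DCAT.Distribution))
g.add((dist, DCAT.accessURL, URIRef("https://example.org/geo/wfs")))

print(g.serialize(format="turtle"))
```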

Another important purpose of metadata is to allow users to assess whether data is usable for their application. With an increasing number of users from outside the geo domain, this function of metadata is only becoming more important. This is not just about understanding the 'what' (what does the dataset relate to?), but also the 'who' (to whom does the dataset belong?) and, more importantly, the quality (what does it tell me about the quality?). This insight into the origin of data (the term provenance is becoming increasingly common) is an important supporting argument for whether or not to use the data. Reasoning from this fit-for-purpose assessment, it may be interesting to explore the extent to which the concept of the Feature Catalogue of the Digital System Environment and Planning Act could be used for more purposes (nationally, rather than just within the context of this act). By applying a number of Linked Data concepts to this architecture, you would be able to unlock the descriptive elements of the Feature Catalogue, including notions and definitions from information models, at the same location as the data itself (via services, REST convenience APIs and/or SPARQL endpoints). This would eliminate the need for a separate metadata catalogue, and data could be searched based on terms from the related information model. Geonovum wants to explore what such an architecture might look like and, together with the working field, look at whether such an outline could serve as a dot on the horizon. A focus in this exploration is the way in which users are informed about the quality of data. Among other things, Best Practice 6 from the Data on the Web Best Practices [[dwbp]] and Best Practice 14 from the Spatial Data on the Web Best Practices [[sdw-bp]] provide useful contributions on this point.

Discussion

In this white paper we ask ourselves whether the current set of geo standards is future proof, and we outline developments for the coming five years. This outline is primarily intended to offer guidance in the further development of the Dutch SDI, but secondarily also to provide insight into the mutual relationships between the innovation projects and management projects that Geonovum has carried out over the last decade. In the meantime, techniques from a number of innovation processes have become so mature that they will have an impact on the design of the Dutch SDI.

In summary, we can state that the current SDI is mature, but at the same time will have to continue to develop if it is to respond sustainably to technological and societal developments. The emergence of new techniques such as Linked Data and the further development of web standards do not appear to be disruptive: thanks to the target group-specific approach (no longer one size fits all) and the platform concept, current and future standards will, for the time being, be applied in parallel. Moreover, we expect the current geo standards to slowly evolve towards the new techniques and standards, making the differences less fundamental. This way, users of the Dutch SDI will not be confronted with radical breaks in the field of technology either.

The development perspective provides a general direction for the evolution of the standards that form the backbone of the Dutch SDI, but at the same time presents a number of strategic choices:

Conclusion

Based on this white paper and the responses received during the public consultation round, we can draw a number of conclusions: