Building IoT based applications for Smart Cities: How can ontology catalogs help?

The Internet of Things (IoT) plays an ever-increasing role in enabling Smart City applications. An ontology-based semantic approach can help improve interoperability between a variety of IoT-generated as well as complementary data needed to drive these applications. While multiple ontology catalogs exist, using them for IoT and smart city applications requires a significant amount of work. In this paper, we demonstrate how ontology catalogs can be used more effectively to design and develop smart city applications. We consider four ontology catalogs that are relevant for IoT and smart cities: READY4SmartCities, LOV, OpenSensingCity (OSC), and LOV4IoT. To support semantic interoperability with the reuse of ontology-based smart city applications, we present a methodology to enrich ontology catalogs with those ontologies. Our methodology is generic enough to be applied to other domains, as is demonstrated by its adoption by the OSC and LOV4IoT ontology catalogs. Researchers and developers have completed a survey-based evaluation of the LOV4IoT catalog. The usefulness of ontology catalogs ascertained through this evaluation has encouraged their ongoing growth and maintenance. The quality of IoT and smart city ontologies has been evaluated to improve the ontology catalog quality. We also share the lessons learned regarding ontology best practices and provide suggestions for ontology improvements with a set of software tools.

Keywords: Semantics-based Smart Cities, Ontology Catalogs, Knowledge Directory, Semantic Data Interoperability, Ontology Best Practices, Ontology Improvement, Ontology Validation, Semantic Web Technologies, Reusable Knowledge

I. Introduction

The Internet of Things (IoT) aims at interconnecting surrounding devices (e.g., thermometers) to the Internet in order to send and process the data they generate [1]. The report from Gartner 1 predicts that more than 20 billion devices, also called Things, will be in use in 2020. IoT plays an ever-increasing role in enabling Smart City applications. Smart city infrastructures are expensive to design, create, deploy, and maintain. Interoperability is key to reducing cost and is needed at multiple levels, including (1) the system, (2) the architecture, (3) the workflow to process IoT data, (4) applications and services, and (5) reasoning on data. A semantic approach, especially one that is enabled by the use of relevant ontologies, can help deal with the variety associated with IoT and relevant complementary data types, and support interoperability. However, multiple ontology catalogs are relevant to IoT and smart cities, which in turn presents the challenge of selecting the proper catalog and ontologies.

Consider Spain’s Santander 2 smart city initiative, which deployed more than 20,000 sensors to measure air quality, monitor parking spaces, manage electricity, optimize garbage collection, and regulate light intensity [2]. Smart city applications rely on the efficient utilization of data generated by these devices and cover a variety of domains such as water management and irrigation, healthcare, transportation, energy management, and resource (e.g., parking space) utilization. Such applications (e.g., parking availability or bike sharing availability applications) are continuously redesigned in various cities, frequently exploiting similar datasets. For example, the CityPulse project listed 101 applications 3 and analyzed tens of them [3]. Smart city datasets are available on open source data portal platforms such as the Comprehensive Knowledge Archive Network (CKAN) 4. Such platforms encourage reusing datasets and even linking them with each other, following Linked Data principles [4]. The main shortcoming of such portals is the lack of links between the datasets and the data model used to structure them. Reusing ontologies designed for smart city applications would increase semantic interoperability between systems and cities and could reduce the development time of applications. For this reason, cities such as Santander are integrating semantic web technologies, as already demonstrated in the context of the FIESTA-IoT EU 2020 project 5 [5]. In France, more and more cities are releasing open data generated by sensors. OpenSensingCity 6, a project funded by the French National Research Agency (ANR), aims at unifying those datasets through the use of semantic web technologies. For instance, we organized a hackathon 7 to use datasets from five cities (Paris, Lyon, Nantes, Rennes, and Strasbourg) on different domains (pollution, weather, parking space, and bike availability) to build smart city applications.

This paper advocates the use of semantic web technologies for better data interoperability and integration in smart city applications. Ontologies allow developers to reuse and share application domain knowledge using a common vocabulary across heterogeneous systems, platforms, environments, etc. [6]. There is also a real need to encourage best practices when developing ontologies, in particular: (1) reusing existing ontologies as much as possible, and (2) aligning the ontologies to increase interoperability by reducing heterogeneity issues across models and to reduce development time.

Given that ontologies underpin semantic web technologies, an early step to consider is identifying a relevant ontology for reuse if one exists. An ontology is a set of concepts and categories in a specific domain, together with an explicit description of the relationships between them [7]. The work of Arumugam et al., in 2001, is one of the pioneering efforts encouraging the search for the most relevant set of ontologies for a given need [8].

In a more contemporary scenario, we advocate the reuse of models by investigating the usage of ontology catalogs, with a focus on OWL-based ontologies due to the broad adoption of OWL since it became a World Wide Web Consortium (W3C) recommendation in 2004.

The Semantic Sensor Networks (SSN) ontology [9] is one of the first initiatives to support semantic interoperability of data generated by sensors or devices. SSN became a W3C recommendation in October 2017 8, extending and improving the SSN ontology published in 2011 [10]. However, it has some limitations, such as the handling of real-time aspects and the lack of a taxonomy (i.e., a classification scheme). A taxonomy is needed to classify measurement units, context, quantity kinds (measurement types such as temperature), and the services provided by devices to expose sensor data. For this reason, developers still design new ontologies to meet their needs when developing smart city applications. We could take inspiration from the software engineering communities that provide online code sharing environments. Correspondingly, we could build an ontology catalog environment to encourage the reuse of ontologies, covering not only their design but also their implementations, by releasing the code online. To the best of our knowledge, the surveys regarding ontology catalogs do not report recent work and are not comprehensive for the IoT and smart city research field [11].

Ontology catalogs applied to the IoT and smart city domain are relevant for three user categories: (1) application developers to find, choose and reuse the ontologies that might fit their needs, (2) ontology developers to publish and share their ontologies for promoting reuse, and (3) developers and maintainers of the ontology catalogs.

A. Research Challenges

We address the following Research Challenges (RC):

RC 1: Which methodologies can assist ontology developers in reusing existing IoT and smart city ontologies?

RC 2: What methodology would help choose the ontology fitting our needs among a set of similar ontologies?

RC 3: How could the state-of-the-art analysis be shared in an innovative way to reduce the learning curve of investigating, studying, and classifying it?

RC 4: How can existing IoT and smart city ontologies be analyzed efficiently?

RC 5: What would be the set of criteria and best practices to compare ontologies and ontology catalogs?

RC 6: How can developers collect the set of ontologies relevant for IoT and smart cities?

RC 7: How can ontology designers stay updated with the latest ontologies designed for smart cities?

RC 8: How could we guide developers and ontology engineers to evaluate ontologies?

B. Main Contributions

We enumerate our contributions. Each contribution is matched to the RCs presented above. Further, each contribution is matched to specific sections in the paper. The contributions and the novelty of this paper are as follows:

Designing ontology catalogs for smart cities: OpenSensingCity and LOV4IoT ontology catalogs address challenges RC 1, RC 3, RC 4, and RC 7 and are introduced in Section III.

A set of criteria and relevant tools to improve the quality of ontologies (explained in Section VI) addresses challenges RC 2, RC 5 and RC 8. Additional dissemination regarding semantic web methodologies, best practices and recommendations is needed to go beyond the IERC Cluster Semantic Interoperability Best Practices and Recommendations (IERC AC4) [12] by suggesting and integrating ontology quality tools.

A methodology to enrich ontology catalogs (explained in Section V), implemented within LOV4IoT, addresses the challenge RC 6.

The comparison of four ontology catalogs for IoT and Smart Cities (Ready4SmartCities, LOV, LOV4IoT and OpenSensingCity) explained in Section III addresses the challenge RC 6.

An analysis of the most relevant ontologies for smart cities (Section IV) addresses the challenge RC 3.

A user evaluation of the LOV4IoT catalog (explained in Section VI-C) addresses the challenge RC 8.

C. Structure of the Paper

The rest of the paper is structured as follows: Section II investigates related work on semantics-based smart city projects and ontology catalogs. Section III compares existing ontology catalogs for smart cities. Section IV describes smart city ontologies. Section V focuses on the methodology to enrich ontology catalogs. Section VI evaluates the LOV4IoT ontology catalog and the quality of ontologies and provides a use case. Section VII concludes the paper and Section VIII shares our vision regarding future work. The paper has an appendix section with all figures, code examples, tables, etc.

II. Related Work

We review the related work on semantics-based smart cities in Section II-A. Section II-B is dedicated to schema catalogs. Section II-C focuses on surveys of IoT and smart city ontologies. Section II-D introduces work regarding semantic interoperability. Section II-E summarizes the limitations of the existing literature.

A. Semantics-based Smart City Projects

In this section, we review papers having the “smart city” keyword with an interest in integrating semantic web technologies. Projects that have already designed an ontology (CityPulse, KM4City, etc.) are covered in Section IV-A.

Alkandari et al. provide a survey about smart cities [13]. However, the paper lacks the classification of semantics-based smart city projects that we would expect.

Zanella et al. design a proof-of-concept of the Italian Padova smart city [14]. The paper focuses on architecture, web services, data format (XML and EXI) but without employing semantic web technologies. It highlights heterogeneity issues (e.g., communications and devices) within the application layer, transport layer and link layer technologies from the OSI model. The paper highlights the main application domains for smart cities: (1) smart building, (2) waste management, (3) air quality, (4) noise monitoring, (5) traffic congestion, (6) city energy consumption, (7) smart parking, and (8) smart lighting. This paper demonstrates that smart city applications cover numerous domains.

The SEN2SOC (SENsor measurements and SOCial interactions) project is based on the FP7 EU SmartSantander project, which provides real sensor data. SEN2SOC integrates SmartSantander sensor data with social network data (Twitter, Flickr, and Foursquare) [15] to add value to the data. The semantic data annotation is done on the SmartSantander project side. The paper neither explains nor references the ontologies used for the semantic annotation. Further, the real-time aspect is introduced, but no ontologies satisfying the real-time requirement are mentioned. The work seems close to the “Cyber-Physical Systems (CPS)” research field, but it is not compared with this research topic. Smart city applications such as a temperature heatmap have been built as a proof-of-concept.

Zhang et al. design a scalable framework to deal with variety, volume, and real-time data generated within smart cities [16]. The framework employs semantic web technologies combined with machine learning techniques. The semantics-based framework has been used for two use cases in smart cities: pollution detection and traffic pattern decision.

Open Agile Smart Cities (OASC) 9 is an initiative towards designing a unified system for smart cities by focusing on: (1) a common API, (2) an open data platform, and (3) data models. This is precisely the focus of this paper: where can we find data models reusable for smart cities?

OntoPhil is an ontology matching algorithm specifically designed for smart cities [17] and has been evaluated with the Ontology Alignment Evaluation Initiative (OAEI) benchmark. OntoPhil is adapted to two requirements: modular ontology size and a lightweight matching process. OntoPhil has also been used to match 39 agent ontologies that need to interact with the smart city SOFIA ontology. The main shortcoming of the work is that the ontology matching system has not been evaluated on smart city ontologies, but only on the ontologies from the OAEI initiative. It demonstrates the need for a benchmark for smart city ontologies.

Conclusion:

Finding existing semantics-based smart city projects and smart city ontologies is challenging and time-consuming. Mechanisms are missing to encourage the reuse and the evaluation of those ontologies. Frequently, ontologies must be improved before they can be loaded into ontology quality or ontology matching tools. A deeper analysis of ontologies for smart cities is given in Section IV 10. This concise survey provides an overview of the main semantics-based smart city projects, but it is by no means comprehensive, as it is not the main focus of this paper.

B. Schema Catalogs

A survey of eleven ontology libraries by d’Aquin et al. was published in 2012 [11], including ontology libraries for domains other than IoT and smart cities. In our paper, we prefer the term ontology catalog. The survey from d’Aquin et al. does not mention any catalogs for IoT and smart cities, which makes the respective coverage (see Section III) a key contribution of this paper.

Ontology libraries are categorized as follows: (1) Purpose and coverage explains that ontologies can be limited to a particular domain and vary in size and type. (2) Library content explains how new ontologies are inserted within the library and which quality controls are applied before adding an ontology. (3) Size of the ontology library. (4) Ontology metadata provides the ontology name, domain, creators, date of creation and modification, version, license, etc. In this paper, we count the number of ontologies referenced within each ontology catalog. We also encourage describing ontology metadata so that automatic mechanisms such as discovery can be designed.
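As a minimal sketch of how the metadata fields listed above could support an automatic mechanism, the Python record below stores those fields and reports which ones are still missing for a given ontology. The class name `OntologyMetadata`, the helper `missing_fields`, and the example values are hypothetical; they illustrate the idea only and do not reproduce the data model of any catalog discussed in this paper.

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional


@dataclass
class OntologyMetadata:
    """Metadata fields named in the text (hypothetical record, not a catalog schema)."""
    name: Optional[str] = None
    domain: Optional[str] = None
    creators: List[str] = field(default_factory=list)
    created: Optional[str] = None
    modified: Optional[str] = None
    version: Optional[str] = None
    license: Optional[str] = None


def missing_fields(meta: OntologyMetadata) -> List[str]:
    """Return the names of metadata fields that are empty or unset."""
    return [f.name for f in fields(meta) if not getattr(meta, f.name)]


# Hypothetical entry, for illustration only.
example = OntologyMetadata(
    name="ExampleOntology",
    domain="smart city",
    creators=["A. Author"],
)
print(missing_fields(example))
```

A catalog maintainer could run such a check before accepting a submission, so that incomplete descriptions are flagged early.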

Schema.org is a schema catalog for use in structured data embedded in Web pages to describe locations, people, products, services, etc. [18]. The IoT Schema.org extension is under development 12 .

BioPortal [19] is an ontology catalog for biomedical ontologies. It provides a user-friendly interface for users and a REST API for developers. Numerous functionalities are provided, such as searching for a specific class, finding an ontology, and ontology statistics (the number of ontologies, the number of classes, etc.). BioPortal provides a set of tools: (1) Ontology Browser to browse the library of ontologies, (2) Search to look for a class in multiple ontologies, (3) Annotator to get annotations with specific ontologies, (4) Mapping between a selected ontology and all ontologies referenced within BioPortal, (5) Recommender for the most relevant ontologies, and (6) Resource Index to display all ontologies. Such an ontology catalog and its functionalities should be provided for smart cities and IoT. When we browse the BioPortal ontology browser with the “sensor” keyword, only the SSN ontology is found. For the “IoT” keyword, 0 results are found.

AberOWL Repository 13 references 570 biomedical ontologies [20].

WebProtégé 14 is a collaborative ontology development tool which references ontologies that have been built with it [21]. WebProtégé provides functionalities to discuss and annotate ontologies. The critical requirement is to provide a simple way for users to make their ontology available on the Web so that other people can browse it without the need to install any software.

Ontology Design Patterns (ODPs) 15 can be seen as a repository of ontologies [22]. Ontologies can be classified into the following categories: (1) Content ODPs, (2) Reengineering ODPs, (3) Alignment ODPs, (4) Logical ODPs, (5) Architecture ODPs, (6) Lexico-Syntactic ODPs, and (7) Exemplary ODPs. It is hard to search for a specific keyword such as “city” or “IoT” to retrieve ontologies referenced within the catalog. For instance, a request made in October 2017 for the IoT domain returned only one ontology 16.

Conclusion:

An analysis of IoT and smart city ontology catalogs has been lacking. Guidance regarding the demanding task of learning and reusing ontologies is also lacking. In quite a few cases, the documentation is missing and not referenced within ontology libraries. We would also expect a description of the methodology used to design ontology libraries and of how the inclusion of additional ontologies has been automated. A key contribution of this paper is to analyze four ontology catalogs (Ready4SmartCities, OpenSensingCity, LOV, and LOV4IoT) for IoT and smart cities (see Section III). These have not been studied in the ontology catalog/library surveys.

C. Existing Surveys for IoT and Smart City Ontologies

The Web of Things (WoT) [23] is considered as an extension of the IoT to easily send sensor data by exploiting Web technologies, and then exposing data to developers via websites and web services. Since sensor networks, IoT, and WoT technologies are considered as a basis to build smart cities, we introduce the existing surveys related to those topics in this section.

The Semantic Sensor Networks (SSN) ontology specification [24] was released as a W3C Recommendation in October 2017, and several surveys related to sensor ontologies 17 have used SSN as their basis. 23 ontologies have been compared: AEMET, aws, BCI, CF, DogOnt, Energy, iotlite, IoT-O, M3 Lite, OpenIoT, PEP-SSN Alignment, RAMI, SAN, SAO, SPITFIRE, VITAL, Geologic timescale IoT-O (SOSA), SAN (SOSA), FixO3, SEAS-SSN Alignment, and LSO Trajectory. Those ontologies have been compared according to the following criteria: (1) Imports SSN/SOSA, (2) Observations, (3) Actuations, (4) Samplings, (5) Features of Interest and Properties, (6) Results, (7) Procedures, and (8) Systems and their Deployments.

Further details regarding the survey of sensor ontologies can be found in the W3C SSN documentation. It is the continuation of the SSN ontology work published in 2012 [9]. An in-depth analysis of sensor ontologies 18 was carried out to build version 1 of the SSN ontology in 2012, as explained in [25].

In Szilagyi et al.’s survey [26], published in 2016, the authors design their own semantic web stack for IoT. The stack is interesting but insufficiently explained. Further, the architecture seems to be an extension of the work of Barnaghi et al. [27] and Serrano et al. [28], but this is not clearly stated. An IoT namespace is introduced in the paper without mentioning where it originates: from an existing ontology or from their own. The paper does not provide an in-depth analysis of existing ontologies.

A survey on IoT ontologies by Bajaj et al. was published on arXiv [29] in July 2017. It analyzes and classifies IoT ontologies, focusing mainly on those published since 2012 (after the first release of W3C SSN). The authors classify ontologies into the following categories: (1) sensor ontologies, (2) context-aware ontologies, (3) location ontologies, and (4) time-based ontologies. Each category is split into generic ontologies and domain-specific ontologies. The need to evaluate ontologies is highlighted and explained. What is missing is an explanation of the evolution of such ontologies and an in-depth comparison.

Conclusion:

A survey regarding ontology-based smart city projects is lacking. For this reason, one of the contributions of this paper is to introduce the most popular ontologies used within smart city projects, which is done in Section IV. Our paper also aims to disseminate and encourage best practices and ontology improvement, which is done in Section VI-A.

D. Semantic Interoperability

The semantic interoperability survey from Ganzha et al. [30] discusses the following popular ontologies: IoT, sensor ontologies, (e/m) Health, and port transportation/logistics. The main shortcoming of the paper is that the authors do not introduce the existing ontology catalogs for IoT and smart cities. They claim more work is needed to achieve semantic interoperability. From our point of view, there is a need to define a set of best practices for ontologies. No tools have been provided to facilitate the access to all ontologies. A set of tables to compare ontologies within the same domain according to the concepts covered is also missing.

A set of best practices and recommendations for semantic interoperability designed by the European Research Cluster on the Internet of Things (IERC) AC4 was released in March 2015 [12]. It mentions the need to overcome the following challenges: (1) a unified model to semantically annotate IoT data, (2) reasoning mechanisms, (3) a linked data approach, (4) horizontal integration with existing applications, (5) the design of lightweight versions for constrained environments, and (6) alignment between different vocabularies.

Conclusion:

IERC AC4 does not reference tools encouraging: (1) semantic web best practices, (2) the use of methodologies to ensure interoperability among ontology-based IoT applications, and (3) reuse of the domain knowledge already designed. For this reason, a set of concrete tools is provided in Section VI-A2.

E. Limitations of Current Approaches

Our detailed analysis of the related work demonstrates the need for the following:

An up-to-date analysis of ontology catalogs, since the last survey was published in 2012. The analysis is presented in Section III.

A focus on ontology catalogs for IoT and smart cities is explained in Section II-B. The comparison between existing ontology catalogs for smart cities is explained in Section III.

There is a lack of synthesis regarding existing smart city ontologies. We introduce those ontologies in Section IV.

There is a lack of methodologies to enrich ontology catalogs with new semantics-based projects. We design a methodology explained in Section V.

We expect all semantics-based smart city projects referenced in the related work section to be retrievable from ontology catalogs. The LOV4IoT ontology catalog is being updated, as explained in Section V-B.

Guiding ontology designers in reusing popular IoT or smart cities ontologies is explained in Section VI by defining a set of criteria to evaluate ontologies.

Facilitating the task of ontology designers in making better ontologies to encourage semantic interoperability is explained in Section VI. We suggest a set of easy-to-use tools to improve ontologies.

III. Ontology Catalog Analysis for Smart Cities

We describe four ontology catalogs relevant for IoT and smart cities in Section III-A and compare them in Section III-B.

A. Ontology Catalog Description

We describe four ontology catalogs relevant for IoT and smart cities: Ready4SmartCities, OpenSensingCity, LOV, and LOV4IoT. Figure 4 provides an interactive mind map, available online 19, to explore more ontology catalogs and semantic search engines. Mind maps offer numerous benefits such as fast thinking and learning [31]. Given the information overload, mind maps can help newcomers discover ontology catalogs interactively, as in a playground.

We surveyed ontology catalogs based on OWL ontologies since OWL is a W3C recommendation. Further, we selected ontology catalogs supporting the activity of ontology reuse. Our criteria to compare ontology catalogs are as follows (see Table I within Annexe Section X):

Ontology number counts the number of ontologies referenced within the catalog.

Maintenance of the system, which can be automatic, semi-automatic, or manual.

Ontology quality evaluates the ontologies referenced within ontology catalogs.

Ontology collection explains the way ontologies have been integrated within the catalogs.

Ontology metrics counts the number of classes or properties.

Datasets structured according to ontologies.

Integration with tools, which improves the reusability of ontologies. For instance, automatic ontology documentation, ontology visualization, and ontology alignment can be provided.

Ready4SmartCities was an FP7 EU project providing a catalog of ontologies for building smart cities [6] [32]. The Ready4SmartCities catalog focuses on seven domains: energy, climate, weather, environment, building, occupancy, user behavior, and characteristics. It also provides five transversal domains: temporal, organizational, statistical, spatial, and measurement. The project also includes the alignment of such ontologies. Unfortunately, the project no longer seems to be maintained, with the website indicating “latest revision July 2015”. The project classifies ontologies according to the following criteria: (1) ontology name, (2) online availability (RDF, HTML), (3) open license, (4) ontology language, (5) syntax, (6) domain, and (7) natural language (e.g., English).

Ontologies were collected by reviewing the literature and standards, consulting ontology catalogs and search engines (LOV, Watson, and Swoogle), investigating datasets, and involving stakeholders (contributors through an online form, populators who include new ontologies within the catalog, and metadata curators who review and improve ontologies).

The Ready4SmartCities ontology catalog designed an ontology to classify all ontologies. The ontology employs concepts and properties from several ontologies (VOAF, OMV, DC, and VANN) to describe ontology metadata. This ontology catalog integrates the OOPS ontology validation tool to improve ontology quality.

The OpenSensingCity catalog 20 has been designed for the ANR-funded OpenSensingCity project, which aims at fostering the usage of real-time open data in the context of smart cities by providing operating tools, including an ontology catalog for smart cities. OpenSensingCity aims at helping application developers take advantage of open data streams. The Smart City Artifacts (SCA) web portal 21 collects information about smart cities and provides web applications to visualize the list of existing projects, ontologies, and datasets. The SCA ontology has been designed to classify and describe smart city projects and artifacts [33]. A SPARQL endpoint is provided to query the RDF dataset developed according to the SCA ontology. The SCA ontology also reuses external ontologies: DC, DOAP, Prov-O, FOAF, sc, muto, fabio, dbowl, and OMV. The catalog references 124 ontologies, as shown in Table I. Ontologies are classified using 59 domains (e.g., energy, geography, sensors, transportation, tourism) and tags 22. When clicking on an ontology, statistics are provided (number of classes and properties, etc.). Ontologies can be automatically visualized with WebVOWL. The ontology syntax can be validated with TripleChecker and the ontology design with OOPS.

Linked Open Vocabularies (LOV) is an ontology catalog referencing more than 648 vocabularies (as of May 2018), but few of them are referenced for IoT and smart cities. We focus on the IoT tag 23, which has been added to the LOV catalog [34] upon request by the LOV4IoT team that we are part of. In May 2018, 27 ontologies were referenced with this tag. A tag such as smart cities would be relevant to easily retrieve such ontologies. For instance, when we search for the city keyword within LOV 24, only 4 ontologies are found: km4city, gci, turismo, and iso37120. LOV provides an interface for contributors to suggest ontologies. A bot checks some requirements, such as ontology metadata [35], before allowing the insertion of a new ontology.

Linked Open Vocabularies for Internet of Things (LOV4IoT) 25 references 448 ontologies (as of May 2018). Most of the referenced projects relate to an IoT application domain exploiting sensors and semantic web technologies. In this paper, we focus on IoT ontologies and smart city ontologies. LOV4IoT also classifies ontologies according to best practices. It provides a keyword search (browser search functionality) and a navigation mechanism (by domain) in a manually gathered collection of ontologies. Web services are also offered to select ontologies per domain and to query the LOV4IoT RDF dataset. The target audience is people involved in designing IoT and smart city applications or applications in any domain already referenced within the catalog (e.g., building automation, healthcare). The main difference with other ontology catalogs is that it provides the publications, which highlight the context of the ontology, and the ontology best practices status. According to the ontology library survey from d’Aquin et al., “the libraries where administrators are the only ones making decisions on what to include, usually do not have well-defined requirements”. Within LOV4IoT, we decided to insert all ontologies that have been mentioned within publications on IoT and smart city topics, but we also classify ontologies according to the best practices learned from the LOV community.
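The counts quoted in this section can be put side by side with a few lines of Python. This is only an illustrative sketch: the names `CatalogCount`, `iot_relevant`, and `rank` are hypothetical, and using the IoT-tagged subset for LOV is our assumption for the sake of a like-for-like comparison. The figures are those stated above (LOV references "more than 648" vocabularies, so 648 is a lower bound), and Ready4SmartCities is left without a value because no count is given in the text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class CatalogCount:
    name: str
    referenced: Optional[int]          # ontologies/vocabularies referenced overall
    iot_tagged: Optional[int] = None   # subset explicitly tagged IoT, if reported


CATALOGS = [
    CatalogCount("Ready4SmartCities", None),   # no count stated in the text
    CatalogCount("OpenSensingCity", 124),
    CatalogCount("LOV", 648, iot_tagged=27),   # lower bound; 27 carry the IoT tag
    CatalogCount("LOV4IoT", 448),
]


def iot_relevant(c: CatalogCount) -> Optional[int]:
    """Assumption: use the IoT-tagged subset when reported, else the overall count."""
    return c.iot_tagged if c.iot_tagged is not None else c.referenced


def rank(catalogs: List[CatalogCount]) -> List[str]:
    """Names of catalogs with a known count, largest IoT-relevant count first."""
    known = [c for c in catalogs if iot_relevant(c) is not None]
    return [c.name for c in sorted(known, key=iot_relevant, reverse=True)]


print(rank(CATALOGS))
```

Catalogs without a reported count are skipped rather than guessed.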

B. Ontology Catalogs Comparison

We compare four ontology catalogs because they reference IoT and smart city ontologies. One of the contributions of this paper is to enrich the survey from [11] with a focus on IoT and smart cities. Another way to find ontologies would be to use semantic search engines such as Swoogle, Watson, etc.

The ontology catalogs compared in Table I provide human-readable and machine-processable formats. Catalogs are published as HTML web sites for humans. The HTML interfaces rely on RDF datasets in the back end. RDF is a format processable by machines thanks to the URI discovery mechanism, which enables browsing datasets. None of the four ontology catalogs provides automatic inclusion of ontologies. All catalogs prefer a manual check before inserting new ontologies. To address RC 7 (how can ontology designers stay updated with the latest ontologies designed for smart cities?), users can check the year of the publications (e.g., 2017 or 2018) on the LOV4IoT HTML interface to be aware of the latest insertions. We also provide a web service to query smart city ontology URLs 26.
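The year-based check described above can be sketched as a simple filter over catalog records. The record type `CatalogEntry`, the function `latest_insertions`, and the sample entries with their `example.org` URLs are hypothetical placeholders; they do not reproduce the LOV4IoT data model or the interface of its web service.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class CatalogEntry:
    title: str
    domain: str
    year: int            # year of the publication describing the ontology
    ontology_url: str


def latest_insertions(
    entries: Iterable[CatalogEntry],
    since_year: int,
    domain: str = "smart city",
) -> List[CatalogEntry]:
    """Return entries of one domain published in or after since_year, newest first."""
    selected = [e for e in entries if e.domain == domain and e.year >= since_year]
    return sorted(selected, key=lambda e: (-e.year, e.title))


# Hypothetical records, for illustration only.
SAMPLE = [
    CatalogEntry("Ontology A", "smart city", 2015, "http://example.org/a"),
    CatalogEntry("Ontology B", "smart city", 2018, "http://example.org/b"),
    CatalogEntry("Ontology C", "healthcare", 2018, "http://example.org/c"),
    CatalogEntry("Ontology D", "smart city", 2017, "http://example.org/d"),
]

for entry in latest_insertions(SAMPLE, since_year=2017):
    print(entry.year, entry.title, entry.ontology_url)
```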

Table II provides for each ontology catalog: (1) its name, (2) the year of creation, (3) the scientific publications describing the catalog, and (4) the URL of the ontology catalog GUI. Table III provides the URL of the ontology designed for each ontology catalog referenced above. Finding the ontology designed for each catalog is challenging because the links are not easy to find through the portals.

C. Lessons Learned

LOV4IoT is innovative in that it provides a structured state of the art as a tool for IoT and smart city ontology practitioners. LOV4IoT references many more ontologies, and could be updated with any ontologies from OpenSensingCity and Ready4SmartCities that are not yet referenced on LOV4IoT. As already explained, the impact of LOV4IoT comes from its role as an ontology incubator: it assists ontology designers from communities not yet familiar with ontology quality in following best practices, so that their ontologies can later be referenced on LOV. Best practices are encouraged through the complementary PerfectO 27 project [36].

In the next section, we investigate and introduce the most popular smart city ontologies. When we discover new smart city ontologies, we update the LOV4IoT catalog. From this process, we extract a methodology, which is explained in Section V. We evaluate the quality of smart city ontologies in Section VI-A1.

IV. Ontologies for Smart Cities

We investigate existing smart city ontologies in Section IV-A. We define a set of criteria to compare ontologies in Section IV-B. We found those ontologies either on ontology catalogs presented previously, or by investigating the literature.

A. Existing Ontologies for Smart Cities

In this section, we investigate existing smart city ontologies. We encourage readers to use the LOV4IoT and OpenSensingCity ontology catalogs to get the ontology URLs. Table V summarizes smart city ontologies and provides ontology or documentation URLs for more technical details.

KM4City (Knowledge Model for City), an Italian national project, modeled an ontology designed for aggregating static or dynamic smart city data [37]. The authors reuse ontologies such as OWL-Time, DC Terms, FOAF, WGS84, GoodRelations, and Ontology Transportation Networks (OTN). The project is scalable: it handles 81 million triples, with a growth of 4 million triples per month. It provides a linked data graph, a visualization and exploration tool, and service map applications exploiting the aggregated data.

STAR-CITY (Semantic Traffic Analytics and Reasoning for CITY), an IBM project, is deployed in four smart cities: Dublin, Bologna, Miami, and Rio de Janeiro [38]. The project is focused on designing ontologies to diagnose and predict road traffic congestion. Data processing exploits six heterogeneous sources: (1) road weather conditions, (2) weather information, (3) Dublin bus stream, (4) social media feeds, (5) road works and maintenance, and (6) city events. SWRL rules have been designed to define conditions such as heavy traffic flow.

FIESTA-IoT (Federated Interoperable Semantic Internet of Things (IoT) testbeds and applications) is an H2020 European project [5]. The FIESTA-IoT ontology [39] is designed to unify existing IoT-related ontologies to structure data generated by testbeds. Testbeds such as the Smart Santander city or smart buildings produce real data, which is semantically annotated according to the ontology.

VITAL, an FP7 European project, designed an ontology to deal with heterogeneous data streams generated by devices within smart cities [40]. The ontology models sensors and their measurements (based on the SSN ontology V1), IoT systems and services, and smart city applications. VITAL is innovative since it provides an operating system for IoT to deal with service creation, orchestration, and protocols. VITAL has the following characteristics: it is virtualized, modular, standards-based (RDF and JSON-LD), loosely coupled, and open-source.

CityPulse, an FP7 European project, provides the Stream Annotation Ontology (SAO) to unify smart city datasets [41], [42]. SAO has been designed to address real-time aspects.

Smart City Ontology (SCO) is an ontology published in 2015 by Komninos et al. [43]. It reuses some ontologies such as SKOS, but it does not reuse the SSN ontology and does not follow best practices. For instance, the ontology is not shared in a proper way.

The smart city SOFIA2 ontology does not extend the SSN ontology but reuses the IoT.est ontology [17].

The PRISMA project designed an ontology [44] which reuses the WGS84, NeoGeo, and Collections ontologies. However, it mentions neither the use of data generated by devices nor the usage of the SSN ontology. The ontology is mainly designed to unify heterogeneous data: (1) GeoData from the Geographic Information System (GIS), data on lines and stops of the public transport bus system (REST web service in JSON format); (2) the public lighting system for the maintenance of the city (XML file); (3) the state of the roads, sidewalks, signs and markings (Microsoft SQL Server database); (4) historical data on municipal waste collection (Microsoft Excel file); and (5) historical data on the urban fault reporting service (MySQL Server database). The project provides the LODView tool for an HTML representation of RDF resources and the LODLive tool to browse the RDF graph. The paper does not focus on the description of the ontology, but introduces the need for this ontology to provide Linked Open Data, and implements web services, SPARQL endpoints, browsable features, and visualization on top of it.
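
To make the idea of unifying heterogeneous sources concrete, the sketch below maps a JSON payload and an XML payload onto the same triple structure. It is an illustration only and not the PRISMA implementation: the payload layouts, the field names, and the `http://example.org/city/` namespace are assumptions, and the W3C Basic Geo (WGS84) vocabulary is used as an assumed target vocabulary.

```python
import json
import xml.etree.ElementTree as ET

BASE = "http://example.org/city/"                 # hypothetical namespace
GEO = "http://www.w3.org/2003/01/geo/wgs84_pos#"  # W3C Basic Geo vocabulary


def triples_from_bus_json(payload: str) -> list:
    """Map bus stops from an assumed JSON layout to (subject, predicate, object)."""
    triples = []
    for stop in json.loads(payload):
        subject = f"{BASE}stop/{stop['id']}"
        triples.append((subject, GEO + "lat", str(stop["lat"])))
        triples.append((subject, GEO + "long", str(stop["lon"])))
    return triples


def triples_from_lighting_xml(payload: str) -> list:
    """Map street lamps from an assumed XML layout to the same triple shape."""
    triples = []
    for lamp in ET.fromstring(payload).iter("lamp"):
        subject = f"{BASE}lamp/{lamp.get('id')}"
        triples.append((subject, GEO + "lat", lamp.findtext("lat")))
        triples.append((subject, GEO + "long", lamp.findtext("long")))
    return triples


# Invented sample payloads, for illustration only.
bus_json = '[{"id": "s1", "lat": 37.5, "lon": 15.1}]'
lighting_xml = "<lamps><lamp id='l7'><lat>37.6</lat><long>15.0</long></lamp></lamps>"

graph = triples_from_bus_json(bus_json) + triples_from_lighting_xml(lighting_xml)
```

Once both sources share the same predicates, a single query over `graph` can retrieve the coordinates of bus stops and street lamps alike, which is the practical benefit of a unifying ontology.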

SCOnt (Smart City Ontology) has been designed by Beseiso et al. and used in a semantic-based framework to manipulate smart city data [45]. However, the ontology has not been shared online, which hinders the interoperability of smart city systems and the reuse of the ontology. The ontology reuses a population ontology, a geo-location ontology and the DBpedia ontology. Descriptions regarding the design of the ontology and semantic mapping are missing. The novelty of this work compared to existing smart city projects is not clearly explained. SCOnt is used to manipulate smart city data in an architecture comprising four layers: (1) The data scraping layer gathers and refines data, since it faces duplicated and incomplete meta-information as well as missing values. (2) The data adaptation layer provides ontology modeling and semantic mapping. (3) The data management layer stores and indexes data within a NoSQL database. Semantic Web services are mentioned, but neither links nor descriptions are provided or referenced. (4) The applications layer provides dashboards and APIs.

Conclusion:

We demonstrated in this section that smart city ontologies are regularly redesigned, which hinders semantic interoperability. More ontologies related to smart cities can be found on the LOV4IoT and OpenSensingCity ontology catalogs, as explained in Section III. To encourage ontology reuse, we define a set of criteria to compare smart city ontologies in the next section.

B. Criteria to Compare Smart Cities and IoT Ontologies

Based on our analysis of smart city ontologies in Section IV-A, we define a set of criteria to compare smart city ontologies, which can also be applied to IoT ontologies. These criteria mainly focus on the reusability of the ontologies. We take as a basis some criteria explained in [46] and add further criteria as follows:

Ontology goal should be clearly explained. Usually, the ontology is designed for a project or an application.

Ontology size shows the depth of the ontology. Small or lightweight ontologies would be easier to reuse.

Ontology documentation reduces the learning curve to understand and integrate the ontology, and encourages its reuse. A popular practice is to provide online HTML documentation. A publication, deliverable or any other documentation is necessary to explain the ontology and its impact in detail.

Ontology availability is strongly encouraged. The ontology should be shared on the web to encourage semantic interoperability. Ontology designers should make an effort to integrate previous ontologies and to be aware of the limitations of their ontology.

Ontology popularity demonstrates the impact of the ontology and its genericity when the ontology is used in other projects.

Ontology maintenance needs to be ensured. Usually, once a project is finished, its ontology is no longer maintained. However, ontology designers may remain responsive if they continue to work on the same research topic.

Ontology metadata is recommended by [6], [36], [35]. It is mainly required for building automatic mechanisms. To assist IoT and smart city ontology developers, Listing 1 shows an example of a vocabulary description. See Table X for the list of ontology namespaces required to implement ontology metadata.

All the namespaces are those available at http://prefix.cc/. Table X is a reminder for the most popular ontologies.
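
Because ontology metadata mainly serves automatic mechanisms, its presence can itself be checked automatically. The following is a minimal sketch of such a check. The set of required properties is an assumption chosen for illustration (common Dublin Core, VANN, and OWL terms); it is not the list from Table X or from [6], [35], [36], and the example description is invented.

```python
# Namespace URIs commonly bound to these prefixes.
PREFIXES = {
    "dcterms": "http://purl.org/dc/terms/",
    "vann": "http://purl.org/vocab/vann/",
    "owl": "http://www.w3.org/2002/07/owl#",
}

# Assumed minimal metadata set, for illustration only.
REQUIRED = [
    "dcterms:title",
    "dcterms:description",
    "dcterms:creator",
    "dcterms:license",
    "vann:preferredNamespacePrefix",
    "vann:preferredNamespaceUri",
    "owl:versionInfo",
]


def expand(curie: str) -> str:
    """Expand a prefixed name such as 'dcterms:title' into a full URI."""
    prefix, local = curie.split(":", 1)
    return PREFIXES[prefix] + local


def missing_metadata(description: dict) -> list:
    """Return the required properties absent (or empty) in `description`.

    `description` maps full property URIs to their values.
    """
    return [c for c in REQUIRED if not description.get(expand(c))]


# Invented ontology description, for illustration only.
example = {
    expand("dcterms:title"): "Example Smart City Ontology",
    expand("dcterms:creator"): "http://example.org/people/alice",
    expand("vann:preferredNamespacePrefix"): "ex",
    expand("vann:preferredNamespaceUri"): "http://example.org/ontology#",
}

missing = missing_metadata(example)
```

A catalog maintainer could run such a check before inserting an ontology and report the missing properties back to its designers.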

Conclusion:

We analyzed smart city ontologies and defined criteria to compare them. In Section V, we provide a methodology to enrich smart city catalogs with the new ontologies found.

V. Generic Methodology to Update Ontology Catalogs

We explain the methodology to enrich ontology catalogs in Section V-A. The methodology is used in our LOV4IoT catalog in Section V-B.

A. Methodology

We designed a generic methodology to enrich an ontology catalog with new ontology-based projects and with the knowledge needed to deduce meaningful information from sensor data. This methodology encourages interoperability among applications by reusing ontologies. We have defined the following steps to update ontology catalogs, as depicted in Figure 1 in Annex Section X.

STEP 1: Investigating a new IoT application domain. A new application domain can be integrated into the catalog if needed. For instance, we investigated the “energy”, “agricultural” and “smart city” domains for the needs of the projects that we are involved in. We study all projects that match these keywords or their synonyms and that have already designed ontologies, by searching web search engines (e.g., Google) or research paper catalogs (e.g., Google Scholar, IEEE library, ACM library, LNCS library).

STEP 2: Updating the dictionary to add the new domain. When a new domain needs to be added, we manually insert it within the M3 dictionary 28 , which is implemented as an ontology. We also have mechanisms to automatically integrate a new domain as demonstrated here 29 , but a manual check is preferred to handle synonyms, etc. LOV4IoT users can also suggest their ontologies through a suggestion form 30 where they can indicate a new domain. The application domain classification is a cornerstone component to automatically retrieve all ontologies, or compute the number of ontologies, for a specific domain.

STEP 3: Updating the RDF ontology catalog dataset. The repository is updated with a new RDF instance (see Listing 2).

STEP 4: Updating the HTML ontology catalog. Both the HTML web page and the RDF dataset are updated with the new project. The authors of the paper are also contacted by a bot that sends emails to encourage them to share the domain knowledge on the Web (e.g., ontologies, rules, etc.). The ontologies can be classified and visualized with a table view.

STEP 5: Applications based on the ontology catalog datasets. Applications can be developed to validate the ontologies referenced within the catalog, or to visualize them automatically. Other applications compute statistics such as the number of ontologies per domain.
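
Steps 2 to 5 can be sketched with a small in-memory stand-in for a catalog (Step 1, the literature search, remains manual). This is a minimal illustration under stated assumptions and not the LOV4IoT implementation: the class `OntologyCatalog`, its methods, the record fields, and the sample projects are all hypothetical, and a plain list of dictionaries stands in for the RDF dataset.

```python
from collections import Counter
from html import escape
from typing import Dict, Iterable, List, Set


class OntologyCatalog:
    """In-memory stand-in for an ontology catalog (hypothetical design)."""

    def __init__(self) -> None:
        self.domains: Dict[str, Set[str]] = {}  # canonical domain -> synonyms
        self.entries: List[dict] = []           # stands in for the RDF dataset

    def add_domain(self, name: str, synonyms: Iterable[str] = ()) -> None:
        """STEP 2: register a domain and the synonyms that map to it."""
        known = self.domains.setdefault(name.lower(), set())
        known.update(s.lower() for s in synonyms)

    def resolve_domain(self, keyword: str) -> str:
        """Map a keyword or synonym to its canonical domain."""
        key = keyword.lower()
        for name, synonyms in self.domains.items():
            if key == name or key in synonyms:
                return name
        # Unknown keywords are rejected so that a person can decide (manual check).
        raise KeyError(f"unknown domain: {keyword!r}")

    def add_project(self, title: str, keyword: str, ontology_url: str = "") -> dict:
        """STEP 3: add one project record under its canonical domain."""
        entry = {
            "title": title,
            "domain": self.resolve_domain(keyword),
            "ontology_url": ontology_url,
        }
        self.entries.append(entry)
        return entry

    def to_html_rows(self) -> str:
        """STEP 4: render the entries as rows of an HTML table view."""
        rows = []
        for e in self.entries:
            cells = (e["title"], e["domain"], e["ontology_url"] or "not shared")
            rows.append("<tr>" + "".join(f"<td>{escape(c)}</td>" for c in cells) + "</tr>")
        return "\n".join(rows)

    def count_per_domain(self) -> Counter:
        """STEP 5: compute the number of referenced projects per domain."""
        return Counter(e["domain"] for e in self.entries)


# Invented sample data, for illustration only.
catalog = OntologyCatalog()
catalog.add_domain("smart city", {"smart cities", "urban"})
catalog.add_domain("energy", {"smart grid"})
catalog.add_project("Project A", "Smart Cities", "http://example.org/a.owl")
catalog.add_project("Project B", "urban")
catalog.add_project("Project C", "smart grid", "http://example.org/c.owl")

counts = catalog.count_per_domain()
html_rows = catalog.to_html_rows()
```

Raising an error on an unknown keyword, rather than creating a domain on the fly, mirrors the preference for a manual check when handling synonyms.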

B. Use Case: Methodology applied to LOV4IoT

The methodology mentioned above has been used to design the Linked Open Vocabularies for Internet of Things (LOV4IoT) ontology catalog. LOV4IoT enables reusing background knowledge and facilitates semantic-based IoT application development. The LOV4IoT methodology has been implemented and provides a set of tools. Section V-B1 explains the main reason why this ontology catalog has been built. Section V-B2 highlights that LOV4IoT is an extension of the LOV catalog. Section V-B3 describes the LOV4IoT HTML user interface. Section V-B4 explains the way LOV4IoT is being maintained. Section V-B5 explains the RDF dataset, which can be used to build applications that recommend ontologies or research projects. Section V-B6 provides an example query on the RDF dataset, which is used to build some of our user interfaces.

1) The design of the LOV4IoT catalog:

We pursued a deeper analysis of domain knowledge involving sensors and semantic web technologies and came up with the following research questions: