SMC Browser is a web application developed in the context of CLARIN. It lets users explore the data space of the Component Metadata Infrastructure (CMDI), a framework for the harmonized creation and publication of language resource descriptions (metadata).
It visualizes the growing number of profiles, components and elements defined within CMDI and their semantic grounding in data categories. As such, it targets data modellers in projects who are tasked with developing new metadata schemas or reusing existing ones. It gives them a better overview of the existing CMD components and their usage, which helps to foster reuse and to prevent component proliferation.
SMC - Semantic Mapping Component
Semantic interoperability has been one of the main concerns addressed by CMDI, and appropriate provisions were woven into the underlying meta-model as well as into all modules of the infrastructure. Consequently, the infrastructure also foresees a dedicated Semantic Mapping module that exploits these mechanisms to enhance search capabilities (improve recall) over the heterogeneous collection of resource descriptions - the joint CLARIN metadata domain.
Originally, the SMC module was meant as a service to be used by exploitation tools (metadata browsers, catalogs) with two main functions: crosswalk lookup and query expansion. The crosswalk function delivers correspondences between data categories and/or fields in heterogeneous metadata schemas by combining information from the Component Registry, the Data Category Registries (ISOcat, Dublin Core) and the Relation Registry. Building on top of that, query expansion translates queries by expanding the search indexes. The service is available as a prototype and will go online within 2014. Meanwhile, the search engines in productive use (foremost the Virtual Language Observatory) have developed their own internal solutions for mapping between equivalent fields, although these are still based on data categories as pivotal points for sharing semantics.
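To illustrate the two functions, here is a minimal Python sketch. It is not the interface of the SMC service: all element paths, data category identifiers, the relation pair and the function names are hypothetical, and the query syntax is invented for the example. It only shows the idea of using data categories as pivots between fields, and then rewriting a query over all corresponding fields.

```python
from collections import defaultdict

# Hypothetical data: element path -> data category it is linked to
# (the kind of information a component registry would provide).
ELEMENT_TO_CATEGORY = {
    "ProfileA.Actor.Name": "isocat:DC-1",
    "ProfileB.Person.FullName": "isocat:DC-1",
    "ProfileC.creator": "dc:creator",
    "ProfileA.Language.Code": "isocat:DC-2",
}

# Hypothetical relation registry: pairs of data categories treated as equivalent.
RELATED_CATEGORIES = [("isocat:DC-1", "dc:creator")]


def equivalent_categories(category):
    """Return the category plus all categories reachable via relations."""
    neighbours = defaultdict(set)
    for a, b in RELATED_CATEGORIES:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, stack = {category}, [category]
    while stack:
        for nxt in neighbours[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def crosswalk(element):
    """Return all elements that share a data category, directly or via relations."""
    categories = equivalent_categories(ELEMENT_TO_CATEGORY[element])
    return sorted(e for e, c in ELEMENT_TO_CATEGORY.items() if c in categories)


def expand_query(index, term):
    """Rewrite 'index = term' as a disjunction over all corresponding elements."""
    return " OR ".join(f'{e} = "{term}"' for e in crosswalk(index))


print(crosswalk("ProfileA.Actor.Name"))
print(expand_query("ProfileA.Actor.Name", "Smith"))
```

In this sketch a search on one field also reaches the fields of other schemas that are grounded in the same or a related data category, which is how query expansion is meant to improve recall.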
At the same time, as the CMD data domain has grown in size and complexity, a need has arisen for better ways to explore it, leading to the development of the SMC Browser. Based on the information collected by the SMC service, it visualizes the CMD entities (profiles, components, elements, data categories) and their interrelations as an interactive graph. In particular, it enables the metadata modeller to examine the reuse of components or data categories in different profiles. The graph is accompanied by statistical information about individual nodes, e.g. how many elements a profile contains, or in how many profiles a given data category is used.
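The kind of node statistics mentioned above can be sketched in a few lines of Python. The graph below is a made-up toy example, and the node names, the "type:name" labelling and the function names are assumptions for illustration only; they do not reflect the data model actually used by the SMC Browser.

```python
# Hypothetical CMD graph as adjacency lists:
# profiles and components contain components or elements,
# and elements may point to a data category.
CHILDREN = {
    "profile:Corpus": ["component:Actor", "element:Title"],
    "profile:Lexicon": ["component:Actor", "element:EntryCount"],
    "component:Actor": ["element:Name", "element:Role"],
    "element:Title": ["datcat:title"],
    "element:Name": ["datcat:name"],
    "element:Role": [],
    "element:EntryCount": [],
}


def descendants(node):
    """Return every node reachable from the given node."""
    seen, stack = set(), [node]
    while stack:
        for child in CHILDREN.get(stack.pop(), []):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def count_elements(profile):
    """Count the distinct elements contained in a profile, at any depth."""
    return sum(1 for n in descendants(profile) if n.startswith("element:"))


def profiles_using(target):
    """List the profiles in which a component, element or data category occurs."""
    return sorted(
        p for p in CHILDREN
        if p.startswith("profile:") and target in descendants(p)
    )


print(count_elements("profile:Corpus"))
print(profiles_using("datcat:name"))
print(profiles_using("component:Actor"))
```

Note that this sketch counts each element once per profile, even if the component containing it were reused several times within that profile; a real implementation would have to decide which of the two counts is wanted.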
For more information, please refer to the published work on this topic, especially the thesis SMC4LRT, which also describes the interface of the SMC service.