NIDM-Terms: Techniques and Controlled Vocabularies for Annotating Datasets to Maximize Findability and Reuse
Nazek Queder, Sanu Ann Abraham, Karl G. Helmer, Theo G.M. van Erp, Sebastian G.W. Urchs, Jean-Baptiste Poline, Jeffrey S. Grethe, Satrajit Ghosh, David Keator
Presenting author:
David Keator
Introduction:
Replication and reuse in neuroscience rely heavily on being able to search datasets and understand their contents. The NIDM-Terms effort aims to improve searches across publicly available data by providing core structures, developing annotation tools, and integrating conceptual annotations using community-driven vocabularies. Here we report on the core features of NIDM-Terms.

Methods:
We developed a controlled vocabulary by adopting existing terms and curating them with additional properties that improve clarity (e.g., units, valueType, ranges). The combined information is represented in the JSON-LD linked data format and validated using the Shapes Constraint Language (SHACL). To facilitate searches across datasets, we added a property, “isAbout”, that links user-annotated study variables to broader concepts used in queries.
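To illustrate the kind of record described above, the following is a minimal sketch of one annotated study variable expressed as JSON-LD, together with a simple structural check. The context URL, identifiers, property spellings ("units", "minimumValue", "maximumValue"), and values are all assumptions made for illustration and are not taken from the NIDM-Terms repository. The check is written in plain Python as a stand-in for the idea of constraint validation; it is not SHACL.

```python
import json

# Hypothetical JSON-LD record for one annotated study variable.
# All URLs, property names, and values here are illustrative assumptions.
term = {
    "@context": {"@vocab": "http://example.org/nidm-terms#"},
    "@id": "http://example.org/dataset/age",
    "label": "age",
    "description": "Age of participant at time of scan",
    "units": "years",
    "valueType": "xsd:integer",
    "minimumValue": 0,
    "maximumValue": 120,
    # "isAbout" links the dataset-specific variable to a broader concept.
    "isAbout": {"@id": "http://example.org/concepts/Age"},
}

# Properties this sketch treats as required, with their expected Python types.
REQUIRED = {"label": str, "description": str, "valueType": str, "isAbout": dict}


def check_term(record):
    """Return a list of problems; an empty list means the record passed."""
    problems = []
    for key, expected in REQUIRED.items():
        if key not in record:
            problems.append(f"missing property: {key}")
        elif not isinstance(record[key], expected):
            problems.append(f"{key} should be {expected.__name__}")
    lo, hi = record.get("minimumValue"), record.get("maximumValue")
    if lo is not None and hi is not None and lo > hi:
        problems.append("minimumValue exceeds maximumValue")
    return problems


if __name__ == "__main__":
    print(json.dumps(term, indent=2))
    print(check_term(term))
```

In practice, constraints of this kind would be expressed as SHACL shapes and evaluated by a SHACL processor; the function above only mimics required-property and range checks.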

To make terminologies easily accessible and facilitate community involvement, we developed a JavaScript user interface (UI), hosted on GitHub. The UI presents a tree view of terms, concepts, and properties, and allows users to suggest new terms and export existing terms in various formats. New terms are submitted as GitHub pull requests to the NIDM-Terms repository for community discussion, curation, and inclusion in the domain-specific controlled vocabulary. This infrastructure is reusable and extensible for projects beyond our original scope. We have used these tools to annotate datasets from OpenNeuro, ABIDE, and ADHD200, and used the query functionality of NIDM to demonstrate the feasibility of searching across datasets using the methods developed.
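The cross-dataset search enabled by concept annotations can be sketched as follows. This is a simplified illustration, assuming each dataset's variables have already been annotated with a concept identifier via “isAbout”; the dataset names, variable names, and concept URLs are hypothetical, and the lookup is plain Python rather than the NIDM query functionality itself.

```python
# Hypothetical concept identifiers (assumptions for illustration only).
AGE = "http://example.org/concepts/Age"
SEX = "http://example.org/concepts/Sex"
IQ = "http://example.org/concepts/IntelligenceQuotient"

# Each dataset maps its own variable names to the concept they are about.
annotations = {
    "dataset_A": {"age": AGE, "sex": SEX},
    "dataset_B": {"age_at_scan": AGE, "full_scale_iq": IQ},
    "dataset_C": {"gender": SEX},
}


def find_variables(annotations, concept):
    """Return {dataset: [variable names]} for variables annotated with concept."""
    hits = {}
    for dataset, variables in annotations.items():
        matched = [name for name, about in variables.items() if about == concept]
        if matched:
            hits[dataset] = matched
    return hits


if __name__ == "__main__":
    # Variables with different local names are found through the shared concept.
    print(find_variables(annotations, AGE))
```

The point of the sketch is that a query phrased in terms of the concept finds differently named variables in different datasets, without the user needing to know each dataset's local naming.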

Conclusions:
We developed a set of procedures to manage controlled vocabularies along with associated tools to facilitate dataset annotation and query.

References
[1] NIDM-Terms. Available at: https://scicrunch.org/nidm-terms/about/project
[2] Shapes Constraint Language (SHACL). Available at: https://www.w3.org/TR/shacl/