Unified Biomedical Knowledge Graph

Glossary


Scope

This is a working glossary, not a formal or exhaustive terminology. When the entry for one term refers to another term in the glossary, the other term appears in bold italic.


Assertion

An assertion establishes a relationship between two entities.

BioPortal

A repository of biomedical ontologies maintained by NCBO.

Although many of the ontologies published in BioPortal follow OBO principles, compliance is optional.

Code

The identifier for a concept in a vocabulary or ontology in the UMLS. A code is unique within its ontology. For example, the SNOMED_CT and NCI Thesaurus vocabularies use different codes to represent the concept of “kidney”; however, because both codes are cross-referenced to the same UMLS concept, they share a CUI.

Concept

An entity in the UBKG. A concept is represented with a code in a source ontology and a CUI in the overall UBKG.

Concept Unique Identifier (CUI)

A unique identifier for a concept in the UBKG. A CUI can be cross-referenced by codes from many ontologies, allowing for associations between entities in different ontologies.

For example, the CUI for the concept of methanol in UMLS is cross-referenced by codes in a number of ontologies and vocabularies, such as SNOMED_CT and NCI. The use of the CUI allows for questions such as “How many ways do all the ontologies in the UBKG refer to methanol?” As the following illustration shows, a knowledge graph can reveal that terms from different ontologies include “methanol”, “Methyl Alcohol”, and “METHYL ALCOHOL”.

[Figure: How do all the ontologies in the UBKG refer to “methanol”?]
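The way a CUI acts as a shared key across ontologies can be sketched in a few lines of Python. This is a minimal illustration, not the UBKG implementation: the CUI and code values are placeholders rather than real UMLS identifiers, and the assignment of each term to a particular ontology is assumed for the example. Only the three term strings and the ontology names come from the text above.

```python
from collections import defaultdict

# Hypothetical (SAB, code, CUI, term) rows. The CUI and code values are
# placeholders, and the term-to-ontology assignment is assumed.
ROWS = [
    ("SNOMED_CT", "code-1", "CUI-METHANOL", "methanol"),
    ("NCI", "code-2", "CUI-METHANOL", "Methyl Alcohol"),
    ("NCI", "code-2", "CUI-METHANOL", "METHYL ALCOHOL"),
    ("SNOMED_CT", "code-3", "CUI-KIDNEY", "kidney"),
]


def terms_by_cui(rows):
    """Group (SAB, term) pairs under the CUI that their codes cross-reference."""
    grouped = defaultdict(list)
    for sab, _code, cui, term in rows:
        grouped[cui].append((sab, term))
    return dict(grouped)


# Answer "how do the ontologies refer to methanol?" by looking up one CUI.
methanol_terms = terms_by_cui(ROWS)["CUI-METHANOL"]
for sab, term in methanol_terms:
    print(f"{sab}: {term}")
```

Because every code points at a CUI, one lookup by CUI returns the terms from all ontologies without comparing the ontologies pairwise.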

Cross-reference

A link between a concept in one ontology and a concept in another. Cross-references can be described in one of two ways: as a reference to a UMLS CUI, or as a reference to a code in another ontology.

Because UMLS CUIs can be linked to codes in many ontologies, a cross-reference to a UMLS CUI is likely to be more useful than a cross-reference to a single code in another ontology.

Edge

A synonym for a relationship, used primarily for knowledge graphs.

Encode

To represent a concept in an ontology with a code. When possible, concepts should be encoded using codes from a standard, published source such as a vocabulary, a public database, or an OWL file.

For example, in the Protein Ontology, the entity 5’-AMP-activated protein kinase subunit gamma-1 is encoded with code 000013225.

Entity

An entity represents a member of an ontology. Entities associate with other entities in an ontology via relationships.

An example of an entity is 5’-AMP-activated protein kinase subunit gamma-1, which is a protein in the Protein Ontology (PR), encoded with code 000013225.

In a knowledge graph, an entity is represented by a node.

Equivalence Class

A cross-reference between a concept in one ontology and a concept in another ontology. The idea of “equivalence class” is used in OWL.

Ingest files

A set of files that describe the entities and relationships of an ontology that is to be integrated into the UBKG.

Inverse relationship

A relationship in an ontology has a direction: it starts with one node and “goes toward” another–e.g.,

(5’-AMP-activated protein kinase subunit gamma-1)→isa→(protein)

A relationship is considered the inverse of another relationship if it can be used to link the same nodes of the relationship in the opposite direction. For example, the inverse_isa relationship for a concept can be used to identify those concepts that have an isa relationship with the concept.

(protein)→inverse_isa→(5’-AMP-activated protein kinase subunit gamma-1)

The names of inverse relationships do not always follow a predictable pattern: for example, the inverse relationship of has_gene_product is not inverse_has_gene_product, but gene_product_of. To obtain the names of inverse relationships that cannot be derived from the name of the original relationship, the UBKG refers to the RO.
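The lookup described above can be sketched as follows. This is a hedged illustration: the table RO_INVERSES is a hypothetical stand-in for inverse names obtained from the RO, and falling back to an "inverse_" prefix when a relationship is not listed is an assumption suggested by the inverse_isa example, not a documented UBKG rule.

```python
# Hypothetical lookup table standing in for inverse names obtained from the RO.
RO_INVERSES = {
    "has_gene_product": "gene_product_of",
}


def inverse_of(relationship, ro_inverses=RO_INVERSES):
    """Return the RO-listed inverse name, else 'inverse_' + name (assumed fallback)."""
    if relationship in ro_inverses:
        return ro_inverses[relationship]
    return f"inverse_{relationship}"


def invert_edge(start, relationship, end):
    """Link the same two nodes in the opposite direction with the inverse name."""
    return (end, inverse_of(relationship), start)


print(inverse_of("isa"))
print(inverse_of("has_gene_product"))
print(invert_edge("5'-AMP-activated protein kinase subunit gamma-1", "isa", "protein"))
```

Checking the table first means that a relationship with an RO-defined inverse never receives a generated name.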

IRI

In an OWL file, an entity or a relationship can be identified with an Internationalized Resource Identifier (IRI). In OBO ontologies, an IRI is a PURL (Persistent Uniform Resource Locator) that refers to a published online resource.

UBKG recognizes entity IRIs in the format {OBO PURL}/{ontology identifier}_{code in ontology}. For example, the IRI for a protein entity in PR is http://purl.obolibrary.org/obo/PR_Q68D20.

UBKG recognizes relationship IRIs formatted in the same manner as for entities, provided that the OBO PURL is from the Relationship Ontology (RO)–e.g., http://purl.obolibrary.org/obo/RO_0002160.
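As an illustration of the IRI format, the following sketch splits an OBO-style IRI into its ontology identifier and code. The function name parse_obo_iri is hypothetical, and splitting at the first underscore is an assumption that holds for the two example IRIs above; it is not taken from the UBKG source code.

```python
OBO_PURL = "http://purl.obolibrary.org/obo/"


def parse_obo_iri(iri):
    """Split an OBO-style IRI into (ontology identifier, code in ontology)."""
    if not iri.startswith(OBO_PURL):
        raise ValueError(f"not an OBO PURL: {iri}")
    local_name = iri[len(OBO_PURL):]
    # Assumption: the identifier precedes the first underscore, the code follows it.
    ontology_id, separator, code = local_name.partition("_")
    if not separator or not ontology_id or not code:
        raise ValueError(f"expected identifier_code after the PURL: {iri}")
    return ontology_id, code


print(parse_obo_iri("http://purl.obolibrary.org/obo/PR_Q68D20"))
print(parse_obo_iri("http://purl.obolibrary.org/obo/RO_0002160"))
```

A caller could then treat an IRI whose identifier is RO as a relationship and any other identifier as an entity, in line with the two rules above.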

Knowledge Graph (database)

A knowledge graph is a representation of information characterized by the relationships between a set of entities. Whereas relational databases organize information into tables, rows, and fields, knowledge graph databases organize information into nodes (representing entities) and edges (representing relationships).

NCBO

The National Center for Biomedical Ontology (NCBO) maintains BioPortal, a repository of biomedical ontologies.

neo4j

The UBKG database is deployed in instances of the neo4j graph data platform. “A neo4j” or “the neo4j” is equivalent to “an instance of the UBKG database hosted in a neo4j server”.

Node

A synonym for an entity, used primarily in knowledge graphs.

Ontology

For the purposes of the UBKG, an ontology is the representation of a set of entities and the relationships between them.

OLS

The Ontology Lookup Service (OLS) is a repository of biomedical ontologies. The OLS allows for searches of relationship properties (relationships), including those from the RO.

OBO Foundry

The Open Biological and Biomedical Ontology (OBO) Foundry is a community that maintains a set of “interoperable ontologies for the biomedical sciences”, as well as a set of principles–i.e., best practices and standards for representing ontologies in OWL.

Many of the ontologies published in BioPortal follow OBO principles.

OWL files

The Web Ontology Language (OWL) is a language for representing ontologies in a standard format that can be interpreted by software applications. Biomedical ontologies can be published as OWL files in reference sites such as the NCBO BioPortal and the OBO Foundry.

Relationship

A relationship is an association between two entities in an ontology.

Types of relationships

Most of the relationships in biomedical ontologies can be characterized with one of two types:

In a knowledge graph, a relationship corresponds to an edge between two nodes.

Relationship Ontology (RO)

The Relationship Ontology (RO), published by the OBO Foundry as the Relation Ontology, is an ontology of the relationships that are used in other ontologies; that is, it describes how relationships themselves relate to one another.

A relationship between relationships that is important in the UBKG is the inverse relationship.

Relationships in RO can be reviewed in a number of ways, including:

SAB

The UBKG adopts the UMLS practice of identifying source ontologies with a Source Abbreviation (SAB). Examples of UMLS SABs include SNOMED_CT and UBERON. UBKG uses published acronyms for ontologies when possible–e.g., PATO.

Term (preferred, synonym)

A usually short text identifier for a code in an ontology. For example, a term for code 64033007 in SNOMED_CT is “kidney”.

A term can be a preferred term or a synonym.

Triple

A triple asserts a relationship between two entities (nodes) in an ontology.

A triple is in the format subject predicate object.

subject and object are both entities (nodes). predicate is a relationship (edge).
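The subject, predicate, object structure can be sketched with a small Python type. This is an illustrative representation only, assuming plain strings for entities and relationships; the names Triple and nodes_and_edges are hypothetical and do not reflect how the UBKG stores triples.

```python
from collections import namedtuple

# A triple: subject and object are entities (nodes), predicate is a relationship (edge).
Triple = namedtuple("Triple", ["subject", "predicate", "object"])

TRIPLES = [
    Triple("5'-AMP-activated protein kinase subunit gamma-1", "isa", "protein"),
]


def nodes_and_edges(triples):
    """Collect the distinct nodes and the directed edges asserted by the triples."""
    nodes = set()
    edges = []
    for triple in triples:
        nodes.update((triple.subject, triple.object))
        edges.append((triple.subject, triple.predicate, triple.object))
    return nodes, edges


nodes, edges = nodes_and_edges(TRIPLES)
print(sorted(nodes))
print(edges)
```

Each triple contributes at most two nodes and exactly one edge, which is why a list of triples is enough to describe a knowledge graph.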

Vocabulary

For the purposes of the UBKG, a vocabulary is similar to an ontology. Some vocabularies (e.g., SNOMED_CT) are also ontologies.

UMLS

The Unified Medical Language System (UMLS) consolidates a number of different biomedical ontologies, vocabularies, and standards.

The UBKG represents UMLS and other concept data with a set of nodes and edges, including nodes for

[Figure: nodes and edges that represent UMLS and other concept data in the UBKG]