# PubChem Ingest File Format



| Field | Corresponding element in UBKG | Accepted formats | Examples |
|---|---|---|---|
| subject | Code node | PUBCHEM PubChem CID | PUBCHEM 9549299 |
| predicate | relationships | For hierarchical relationships, the IRI or the string “isa” | |
| | | For non-hierarchical relationships, an IRI for a relationship property in RO | |
| | | Custom string | drinks milkshake of |
| object | Code node | same as for subject | |
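The three accepted predicate forms can be told apart with a small check. The following is a minimal sketch, not part of any UBKG tooling: `classify_predicate` is a hypothetical helper, and the regular expression assumes that RO relationship IRIs use the OBO PURL pattern. The hierarchical IRI is not matched here because the table does not spell it out; only the string "isa" is treated as hierarchical.

```python
import re

# Assumption: RO relationship properties are written as OBO PURLs,
# e.g. http://purl.obolibrary.org/obo/RO_0002449.
RO_IRI = re.compile(r"^http://purl\.obolibrary\.org/obo/RO_\d{7}$")


def classify_predicate(predicate: str) -> str:
    """Return a rough category for a predicate value (hypothetical helper)."""
    value = predicate.strip()
    if not value:
        raise ValueError("predicate must not be empty")
    if value.lower() == "isa":
        return "hierarchical"
    if RO_IRI.match(value):
        return "ro_property"
    # Anything else is treated as a custom relationship string.
    return "custom"


for example in ("isa", "http://purl.obolibrary.org/obo/RO_0002449", "drinks milkshake of"):
    print(example, "->", classify_predicate(example))
```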

## Relationships (predicates)

Defining relationships is the principal informatics task of assertion. Choosing appropriate concepts for the node_dbxrefs field of nodes.tsv links a node's assertions to the cross-referenced concepts.

## An example for PUBCHEM 9549299:

An EGFR inhibitor inhibits the expression of EGFR (UNIPROTKB ID P00533), so a possible assertion is

| subject | predicate | object |
|---|---|---|
| 1 | 2 | 3 |

RO_0002449 = directly inhibits
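As an illustration, the assertion for this example could be written out as one tab-separated row. This is a sketch under stated assumptions: the tab-separated layout and the subject, predicate, object column order follow the field table; the predicate is written as an OBO PURL for RO_0002449; and the object code `UNIPROTKB P00533` is assumed to follow the same "SAB code" pattern as the subject. `edge_row` is a hypothetical helper.

```python
import csv
import io


def edge_row(subject: str, predicate: str, obj: str) -> str:
    """Format one assertion as a tab-separated line (hypothetical helper)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow([subject, predicate, obj])
    return buffer.getvalue()


header = edge_row("subject", "predicate", "object")
row = edge_row(
    "PUBCHEM 9549299",
    "http://purl.obolibrary.org/obo/RO_0002449",  # assumed IRI form of RO_0002449
    "UNIPROTKB P00533",  # assumed code format for the object node
)
print(header + row, end="")
```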

Because UNIPROTKB is already integrated into the UBKG, any relationship with P00533 is also linked to HGNC 3236.




| Field | Corresponding element in UBKG | Accepted formats | Examples |
|---|---|---|---|
| node_id | Code node | PUBCHEM PubChem CID | PUBCHEM 9549299 |
| node_label | Term node, Preferred Term (PT) relationship | Text string for the Compound Name | EGFR Inhibitor |
| node_definition (optional) | Definition node, DEF relationship | Text string - IUPAC Name? | N-[3-[[6-[3-(trifluoromethyl)anilino]pyrimidin-4-yl]amino]phenyl]cyclopropanecarboxamide |
| node_synonyms (optional) | Term node; Synonym (SYN) relationship | Pipe-delimited list of synonyms | See example (pipes are also used to format table cells) |
| node_dbxrefs (optional) | Cross-references | Pipe-delimited list of references to cross-referenced concepts. Each cross-reference should be in format SAB:code or UMLS:CUI | UMLS:C5574906 |

Example of synonyms for EGFR inhibitor


i.e., 2.1.1 IUPAC Name|2.1.2 InChI|2.1.3 InChIKey|2.1.4 Canonical SMILES
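The node fields and their pipe-delimited lists can be assembled as follows. This is a minimal sketch rather than the project's own tooling: the column order is assumed to match the field table, `join_pipe` and `node_row` are hypothetical helpers, and the synonym values are placeholders, not real PubChem data.

```python
import csv
import io

# Assumption: columns appear in the same order as in the field table.
NODE_FIELDS = ["node_id", "node_label", "node_definition", "node_synonyms", "node_dbxrefs"]


def join_pipe(values):
    """Join values with '|', rejecting values that would break the delimiters."""
    for value in values:
        if "|" in value or "\t" in value:
            raise ValueError(f"value contains a reserved delimiter: {value!r}")
    return "|".join(values)


def node_row(node_id, label, definition="", synonyms=(), dbxrefs=()):
    """Build one nodes.tsv record as a dict (hypothetical helper)."""
    return {
        "node_id": node_id,
        "node_label": label,
        "node_definition": definition,
        "node_synonyms": join_pipe(synonyms),
        "node_dbxrefs": join_pipe(dbxrefs),
    }


iupac = "N-[3-[[6-[3-(trifluoromethyl)anilino]pyrimidin-4-yl]amino]phenyl]cyclopropanecarboxamide"
row = node_row(
    "PUBCHEM 9549299",
    "EGFR Inhibitor",
    definition=iupac,
    synonyms=["PLACEHOLDER_SYNONYM_1", "PLACEHOLDER_SYNONYM_2"],  # placeholders only
    dbxrefs=["UMLS:C5574906"],
)

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=NODE_FIELDS, delimiter="\t", lineterminator="\n")
writer.writeheader()
writer.writerow(row)
nodes_tsv = buffer.getvalue()
print(nodes_tsv, end="")
```

Rejecting values that contain a pipe or a tab keeps a single synonym from being split into several when the file is read back.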