PubChem Ingest File format

edges.tsv

Fields

Field Corresponding element in UBKG Accepted formats Examples
subject Code node PUBCHEM PubChem CID [PUBCHEM 9549299](https://pubchem.ncbi.nlm.nih.gov/compound/9549299
predicate relationships For hierarchical relationships, the IRI http://www.w3.org/2000/01/rdf-schema#subClassOf OR the string “isa” http://www.w3.org/2000/01/rdf-schema#subClassOf
    For non-hierarchical relationships, an IRI for a relationship property in RO http://purl.obolibrary.org/obo/RO_0002292
    Custom string drinks milkshake of
object Code node same as for subject  

Relationships (predicates)

The definition of relationships is the principle informatics task of assertion. An appropriate selection of concept in the node_dbxrefs field of nodes.tsv will associate cross-referenced assertions.

## An example for PUBCHEM 9549299:

An EGFR inhibitor inhibits the expression of EGFR (UNIPROTKB ID P00533), so a possible assertion is

subject predicate object
PUBCHEM 9549299 http://purl.obolibrary.org/obo/RO_0002449 UNIPROTKB P00533
1 2 3

RO_0002449 = directly inhibits

Because UNIPROTKB is already integrated into the UBKG, any relationship with P00533 would also get the link to HGNC 3236:

image

nodes.tsv

Fields

Field Corresponding element in UBKG Accepted formats Examples
node_id Code node PUBCHEM PubChem CID [PUBCHEM 9549299](https://pubchem.ncbi.nlm.nih.gov/compound/9549299
node_label Term node, Preferred Term (PT) relationship Text string for the Compound Name EGFR Inhibitor
node_definition (optional) Definition node, DEF relationship Text string - IUPAC Name? N-[3-[[6-[3-(trifluoromethyl)anilino]pyrimidin-4-yl]amino]phenyl]cyclopropanecarboxamide
node_synonyms (optional) Term node; Synonym (SYN) relationship Pipe-delimited list of synonyms See example (pipes are also used to format table cells)
node_dbxrefs (optional) Cross-references Pipe-delimited list of references to cross-referenced concepts. Each cross-reference should be in format SAB:code or UMLS:CUI UMLS:C5574906

Example of synonyms for EGFR inhibitor

N-[3-[[6-[3-(trifluoromethyl)anilino]pyrimidin-4-yl]amino]phenyl]cyclopropanecarboxamide|1S/C21H18F3N5O/c22-21(23,24)14-3-1-4-15(9-14)27-18-11-19(26-12-25-18)28-16-5-2-6-17(10-16)29-20(30)13-7-8-13/h1-6,9-13H,7-8H2,(H,29,30)(H2,25,26,27,28)|YOHYSYJDKVYCJI-UHFFFAOYSA-N|C1CC1C(=O)NC2=CC=CC(=C2)NC3=NC=NC(=C3)NC4=CC=CC(=C4)C(F)(F)F

i.e., 2.1.1IUPAC Name|2.1.2InChI|2.1.3InChIKey|2.1.4Canonical SMILES