Unified Biomedical Knowledge Graph (UBKG)
Instructions for deploying the Docker neo4j v5 Distribution
Scope
A number of instances of various UBKG contexts are available as Docker distributions. These distributions are designed to allow for turnkey deployment of a complete, fully-indexed, standalone UBKG instance running in a Docker container on a local or networked machine.
Prerequisites
A UBKG instance includes content from the Unified Medical Language System (UMLS), a repository of biomedical vocabularies maintained by the National Library of Medicine. The use of content from the UMLS is governed by the UMLS License Agreement.
Use of the UMLS content in the UBKG requires two licenses:
- The University of Pittsburgh distributes content originating from the UMLS by means of a distributor license.
- Consumers of the UBKG have access to UMLS content through the license that is part of their UMLS Technology Services (UTS) accounts.
Host machine
- Install Docker on the host machine.
- The host machine will require considerable disk space, depending on the distribution. As of December 2023, distribution sizes were approximately:
  - HuBMAP/SenNet: 9GB
  - Data Distillery: 20GB
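The host prerequisites above can be checked before downloading a distribution. The following is a minimal sketch, not part of the UBKG distribution: the function names has_free_space_gb and docker_installed are hypothetical, and the threshold you pass is only the approximate distribution size quoted above (the expanded database may need more).

```shell
#!/usr/bin/env bash
# Preflight sketch: confirm Docker is installed and the target directory has room.
# Both function names are illustrative and are not part of the UBKG scripts.

# Succeeds if the file system holding directory $1 has at least $2 GB free.
has_free_space_gb() {
  local dir="$1" needed_gb="$2" avail_kb
  # df -Pk prints POSIX-format output in 1024-byte blocks; column 4 is "Available".
  avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
  [ "$avail_kb" -ge $(( needed_gb * 1024 * 1024 )) ]
}

# Succeeds if the docker command is on the PATH.
docker_installed() {
  command -v docker >/dev/null 2>&1
}
```

For example, `docker_installed && has_free_space_gb . 20 || echo "host is not ready"` run from the intended distribution directory reports whether the host is likely to be able to hold the Data Distillery distribution.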
Simple Deployment
This deployment uses default settings.
- Expand the Zip archive. The expanded distribution directory will contain:
  - a directory named Data. This directory contains the UBKG neo4j database files.
  - a script named build_container.sh. This script builds a Docker container hosting the UBKG instance of neo4j.
  - a file named container.cfg.example. This is an annotated example of the configuration file required by build_container.sh.
- Copy container.cfg.example to a file named container.cfg. The container.cfg file is required by the distribution scripts. The container.cfg.example file assigns default configuration values to the distribution, including the password for the neo4j user account. It is recommended that you change the password for the neo4j user account in container.cfg. Passwords must have at least 8 characters, including at least one alphabetic and one numeric character.
- Start Docker Desktop.
- Open a Terminal session.
- Move to the distribution directory.
- Execute
./build_container.sh
- The build_container.sh script will run for a short time (1-2 minutes), and will be finished when it displays a message similar to
[main] INFO org.eclipse.jetty.server.Server - Started Server@16fa5e34{STARTING}[10.0.15,sto=0] @11686ms
The build_container.sh script creates a Docker container with properties that it obtains from container.cfg. The default properties are:
Property | Value |
---|---|
container name | ubkg-neo4j-<version> |
image name | hubmap/ubkg-neo4j |
image tag | current-release |
ports | 4000:7474 4500:7687 |
read-write | read-only |
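The ports row uses Docker's host:container notation, so with the defaults the neo4j browser is reached on host port 4000 and bolt on host port 4500. As a small illustration (the helper names host_port and container_port are hypothetical, not part of the distribution), a mapping can be split like this:

```shell
#!/usr/bin/env bash
# Split a Docker port mapping of the form host:container.
# These helpers are illustrative only.

# Print the host-side port of a mapping such as 4000:7474.
host_port() {
  printf '%s\n' "${1%%:*}"
}

# Print the container-side port of a mapping such as 4000:7474.
container_port() {
  printf '%s\n' "${1##*:}"
}
```

Once the container is running, `docker ps --filter "name=ubkg-neo4j" --format '{{.Names}}\t{{.Ports}}'` should list the container together with the mappings actually in effect, assuming the default container name prefix was kept.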
- Open a browser window. Enter
http://localhost:<port>/browser/
, where <port> is the neo4j browser port (ui_port) specified in the configuration.
- The neo4j browser window will appear. Enter connection information:
Property | Value |
---|---|
Connect URL | bolt://localhost:<port>, where <port> is the bolt port (bolt_port) specified in the configuration |
Database name | blank (the default). Note that this field may not appear on the page. |
Authentication Type | Username/Password |
Username | neo4j |
Password | password from container.cfg |
- Select Connect.
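The simple deployment steps can be sketched in shell as follows. This is an illustration under stated assumptions rather than part of the distribution: the helper names prepare_config, browser_url, and bolt_url are hypothetical, and the ports 4000 and 4500 are the defaults from the table above.

```shell
#!/usr/bin/env bash
# Sketch of the simple deployment steps. Helper names are illustrative only.

# Copy container.cfg.example to container.cfg in directory $1,
# unless a container.cfg already exists there (edits are never overwritten).
prepare_config() {
  local dir="$1"
  if [ ! -f "$dir/container.cfg" ]; then
    cp "$dir/container.cfg.example" "$dir/container.cfg"
  fi
}

# Print the neo4j browser URL for host port $1.
browser_url() {
  printf 'http://localhost:%s/browser/\n' "$1"
}

# Print the bolt Connect URL for host port $1.
bolt_url() {
  printf 'bolt://localhost:%s\n' "$1"
}

# Typical use from the distribution directory (left commented out because
# these lines need Docker and the distribution files):
#   prepare_config .        # then edit the neo4j password in container.cfg
#   ./build_container.sh
#   curl -fsS -o /dev/null "$(browser_url 4000)" && echo "neo4j browser is reachable"
```

If build_container.sh keeps running in the foreground after the server starts, run the curl check from a second Terminal session.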
Custom Deployment
Changes to Docker configuration
To modify the Docker configuration, change values in the configuration file. If a value is left commented out, the script uses its default.
Value | Purpose | Recommendation |
---|---|---|
container_name | Name of the Docker container | accept default |
docker_tag | Tag for the Docker container | accept default |
neo4j_password | Password for the neo4j user | minimum of 8 characters, including at least one letter and one number |
ui_port | Port used by the neo4j browser | number other than 7474 to prevent possible conflicts with local installations of neo4j |
bolt_port | Port used by neo4j bolt (Cypher) | number other than 7687 to prevent possible conflicts with local installations of neo4j |
read_mode | Whether the neo4j database is read-only or read-write | accept default (read-only) |
db_mount_dir | Path to the external neo4j database | accept default (/data) |
all others | Not used for deployment; values will be ignored | accept default |
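The neo4j_password recommendation in the table can be checked before running the build script. The function below is a minimal sketch of that rule (at least 8 characters, with at least one letter and one number); the name valid_neo4j_password is hypothetical, and the build script may apply its own checks.

```shell
#!/usr/bin/env bash
# Check a candidate password against the rule stated in the table.
# The function name is illustrative and not part of the UBKG scripts.

# Succeeds if $1 has at least 8 characters, one letter, and one digit.
valid_neo4j_password() {
  local pw="$1"
  [ "${#pw}" -ge 8 ] || return 1                           # minimum length
  case "$pw" in *[A-Za-z]*) ;; *) return 1 ;; esac  # at least one letter
  case "$pw" in *[0-9]*) ;; *) return 1 ;; esac     # at least one digit
  return 0
}
```

For example, `valid_neo4j_password 'abc12345' && echo ok` prints ok, while a password with no digit or fewer than 8 characters is rejected.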
Rename configuration file
To use a configuration file with a different name, execute the command ./build_container.sh external -c <your configuration file>