The Bioregistry project (https://bioregistry.io, https://github.com/biopragmatics/bioregistry, https://www.nature.com/articles/s41597-022-01807-3) promotes data integration by cataloging resources that assign persistent identifiers to biomedical concepts.

It supports many linked open data and semantic web users by producing a harmonized and comprehensive prefix map and providing standardized tooling for working with prefixes, uniform resource identifiers (URIs), and compact URIs (CURIEs). The Bioregistry in turn is used by tools like LinkML, data standards like SSSOM, web applications like the EBI Ontology Lookup Service (OLS), and projects like the OBO Foundry and Monarch Initiative (see https://biopragmatics.github.io/bioregistry/usages/).
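To make the relationship between prefixes, URIs, and CURIEs concrete, here is a minimal sketch of how a prefix map converts between the two forms. It uses only the Python standard library and a small hand-written map. It does not call the Bioregistry Python package, and the names `PREFIX_MAP`, `expand`, and `compress` are hypothetical, chosen for illustration.

```python
# Illustrative two-entry prefix map (an assumption for this sketch);
# a real harmonized prefix map covers far more resources.
PREFIX_MAP = {
    "CHEBI": "http://purl.obolibrary.org/obo/CHEBI_",
    "GO": "http://purl.obolibrary.org/obo/GO_",
}


def expand(curie: str) -> str:
    """Turn a CURIE such as 'GO:0008150' into a full URI."""
    prefix, sep, local_id = curie.partition(":")
    if not sep or prefix not in PREFIX_MAP:
        raise ValueError(f"cannot expand {curie!r}")
    return PREFIX_MAP[prefix] + local_id


def compress(uri: str) -> str:
    """Turn a full URI back into a CURIE using the longest matching URI prefix."""
    matches = [(p, u) for p, u in PREFIX_MAP.items() if uri.startswith(u)]
    if not matches:
        raise ValueError(f"cannot compress {uri!r}")
    prefix, uri_prefix = max(matches, key=lambda pair: len(pair[1]))
    return f"{prefix}:{uri[len(uri_prefix):]}"


uri = expand("CHEBI:15377")
print(uri)
print(compress(uri))
```

Picking the longest matching URI prefix in `compress` is a design choice that avoids ambiguity when one URI prefix is itself a prefix of another.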

This two-part workshop consists of a lecture and a hackathon.

First, we will give an introduction to the Bioregistry that includes the following:

  1. An overview of the data model, database, web application, and Python package
  2. A practical example of using the Bioregistry for data standardization and integration
  3. Maintenance of the Bioregistry
    1. Making a new prefix request
    2. Reviewing a prefix request
  4. An overview of curation tasks and guides for new contributors (https://biopragmatics.github.io/bioregistry/curation)

Second, we will host a hackathon open to veteran and new contributors. We will work together to address (some of) the following:

  1. Improve existing records, e.g., by adding contact people
  2. Resolve open issues requesting new prefixes or updates to existing prefixes (https://github.com/biopragmatics/bioregistry/issues)
  3. Improve harmonization with other registries
    1. Integrate Wikidata properties for several domains (taxonomy, bibliometrics, chemistry, etc.)
  4. Pilot a semi-automated workflow for suggesting new prefixes
  5. Address use case-specific data standardization and integration scenarios in a “bring your own data” setting

Based on this experience, we expect to write new contribution guidelines and tutorials that will enable additional contributors. The Bioregistry governance model stipulates that all material contributors to the resource are eligible for co-authorship on future papers. If we are able to make substantial contributions during the workshop, we would also like to write a short conference report and consider outlining an update paper as a follow-up to the original 2022 publication in Scientific Data.

Recording