
Change auto-update of rare slim to include children of rare diseases #1217

Draft: wants to merge 1 commit into base: main


allenbaron
Collaborator

This updates the make rule that automatically adds diseases to the DO_rare_slim. The original version (still currently active) adds only diseases that have ORDO or GARD xrefs. This change would also add any disease that is a child of one of those diseases.
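For illustration only, the difference between the two versions of the rule can be sketched in Python. This is not the make rule itself. The function name, the `parents`/`xrefs` dictionaries, the `ORDO:`/`GARD:` prefix format, and the toy IDs are all assumptions made for the example, and "child" is read here as any descendant, which may be broader than what the actual rule does.

```python
from collections import defaultdict, deque

# Assumed xref prefix format; the real data may differ.
RARE_PREFIXES = ("ORDO:", "GARD:")


def rare_slim_members(parents, xrefs, include_children=True):
    """Return the set of disease IDs a rule like this would put in the slim.

    parents: dict mapping a disease ID to an iterable of its parent IDs.
    xrefs:   dict mapping a disease ID to an iterable of xref strings.
    """
    # Original behaviour: only diseases with an ORDO or GARD xref.
    seeds = {
        disease
        for disease, refs in xrefs.items()
        if any(ref.startswith(RARE_PREFIXES) for ref in refs)
    }
    if not include_children:
        return seeds

    # Invert the parent map so the hierarchy can be walked downward.
    children = defaultdict(set)
    for child, parent_ids in parents.items():
        for parent in parent_ids:
            children[parent].add(child)

    # Proposed behaviour: breadth-first walk adding every descendant.
    members = set(seeds)
    queue = deque(seeds)
    while queue:
        current = queue.popleft()
        for child in children[current]:
            if child not in members:
                members.add(child)
                queue.append(child)
    return members


# Hypothetical toy hierarchy: D1 <- D2 <- D3, with D4 unrelated.
toy_parents = {"D2": ["D1"], "D3": ["D2"], "D4": []}
toy_xrefs = {"D1": ["ORDO:0001"], "D2": [], "D3": [], "D4": ["MESH:0004"]}

xref_only = rare_slim_members(toy_parents, toy_xrefs, include_children=False)
with_children = rare_slim_members(toy_parents, toy_xrefs, include_children=True)
```

In this toy case one xref'd disease pulls in two more, which is the kind of growth that makes the result worth reviewing before adopting the change.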

While this seems theoretically correct, it adds a surprisingly large number of diseases, and it seems likely that not all of them are rare (particularly the cancers).

Instead of implementing this as is, it might be wise to first accomplish one or more of the following and then revise the theory of what is considered rare:

  • Formulate a model for identifying rare diseases based on global and regional criteria.
  • Review the diseases currently in the rare disease subset and remove those that are no longer considered rare by Orphanet and GARD.
  • Create an exclude list to avoid re-adding curated diseases that are not rare. This might be particularly relevant if we choose to keep Orphanet and/or GARD xrefs on diseases even after they deprecate them due to non-rarity in their region.
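As a rough sketch of the exclude-list idea in the last item, the snippet below filters a candidate set against a curated list of IDs. The one-ID-per-line format with `#` comments, the function names, and the toy IDs are hypothetical; no such file or format exists in the source.

```python
def read_exclude_list(lines):
    """Parse one disease ID per line, ignoring blanks and '#' comments."""
    ids = set()
    for line in lines:
        entry = line.split("#", 1)[0].strip()
        if entry:
            ids.add(entry)
    return ids


def apply_exclude_list(candidates, exclude):
    """Drop curated non-rare diseases from the candidates for the slim."""
    return set(candidates) - set(exclude)


# Hypothetical exclude list and candidate set.
exclude_lines = [
    "# curated as not rare",
    "D2  # kept its xref after deprecation upstream",
    "",
]
excluded = read_exclude_list(exclude_lines)
kept = apply_exclude_list({"D1", "D2", "D3"}, excluded)
```

This version removes only the listed IDs. Whether excluding a disease should also exclude its children is a separate decision that the sketch does not make.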

While theoretically a correct approach, this seems to include
too many diseases in practice.
@allenbaron added the enhancement and data engineering (Indicates code development to improve data/build automation) labels on Jun 30, 2023