GestaltMatcher Database - a FAIR database for medical imaging data of rare disorders.

medRxiv : the preprint server for health sciences(2023)

引用 3|浏览40
暂无评分
摘要
The value of computer-assisted image analysis has been shown in several studies. The performance of tools with artificial intelligence (AI), such as GestaltMatcher, is improved with the size and diversity of the training set, but properly labeled training data is currently the biggest bottleneck in developing next-generation phenotyping (NGP) applications. Therefore, we developed GestaltMatcher Database (GMDB) - a database for machine-readable medical image data that complies with the FAIR principles and improves the openness and accessibility of scientific findings in Medical Genetics. An entry in GMDB consists of a medical image such as a portrait, X-ray, or fundoscopy, and machine-readable meta information such as a clinical feature encoded in HPO terminology or a disease-causing mutation reported in HGVS format. In the beginning, data was mainly collected by curators gathering images from the literature. Currently, clinicians and individuals recruited from patient support groups provide their previously unpublished data. For this patient-centered approach, we developed a digital consent form. GMDB is a modern publication medium for case reports that complements preprints, e.g., on medRxiv. To enable inter-cohort comparisons, we implemented a research feature in GMDB that computes the pairwise syndromic similarity between hand-picked cases. Through a community-driven effort, we compiled an image collection of over 7,533 cases with 792 disorders in GMDB. Most of the data was collected from 2,058 publications. In addition, about 1,018 frontal images of 498 previously unpublished cases were obtained. The web interface enables gene- and phenotype-centered queries or infinite scrolls in the gallery. Digital consent has led to increasing adoption of the approach by patients. The research app within GMDB was used to generate syndromic similarity matrices to characterize two novel phenotypes (CSNK2B, PSMC3). GMDB is the first FAIR database for NGP, where data are findable, accessible, interoperable, and reusable. It is a repository for medical images that cannot be included in medRxiv. That means GMDB connects clinicians with a shared interest in particular phenotypes and improves the performance of AI.
更多
查看译文
关键词
medical imaging database,rare disorders
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要