Algorithms for Comparing Pedigree Graphs
Clinical Orthopaedics and Related Research(2010)
Abstract
Pedigree graphs, which represent family relationships, are often constructed
by collecting data from genealogical records to determine which pairs of people
are parent and child. This process is expensive, and small mistakes in data
collection--for example, one missing parent-child relationship--can cause large
differences in the pedigree graphs created. In this paper, we introduce a
simple pedigree definition based on a different type of data which is
potentially easier to collect. This alternative characterization describes a
pedigree as a list of the descendants of each individual,
rather than a list of parent-child relationships. We then introduce an
algorithm that generates the pedigree graph from this list of descendants. We
also consider the problem of comparing two pedigree graphs, which could be
useful to evaluate the differences between pedigrees constructed via different
methods, and in particular to evaluate pedigree reconstruction
methods. We define the edit distance between two pedigrees and prove that
calculating this edit distance is APX-hard. Our new characterization of a
pedigree allows us to introduce a fast heuristic for the edit distance between
pedigrees. In addition, we introduce several exact algorithms for calculating
distances in restricted and general cases.
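
The abstract does not spell out how the pedigree graph is generated from the list of descendants, so the following is only a minimal sketch of one plausible approach, not the authors' algorithm. It assumes the input is a mapping from each individual to the full set of that individual's descendants, and it treats a descendant as a child exactly when that descendant is not also a descendant of another descendant (a transitive reduction). The function name `parent_child_edges` and the sample family are hypothetical. Under this assumption, a pedigree in which someone is both a child and a more distant descendant of the same person would not be recovered correctly.

```python
from typing import Dict, List, Set, Tuple


def parent_child_edges(descendants: Dict[str, Set[str]]) -> List[Tuple[str, str]]:
    """Recover (parent, child) pairs from per-individual descendant sets.

    Assumption (not from the paper): a descendant d of p is a child of p
    exactly when d is not a descendant of any other descendant of p.
    """
    edges = []
    for person, desc in descendants.items():
        # Collect everyone reachable through at least one intermediate descendant.
        indirect: Set[str] = set()
        for d in desc:
            indirect |= descendants.get(d, set())
        # Whatever remains is reachable only directly, so it is treated as a child.
        for child in desc - indirect:
            edges.append((person, child))
    return sorted(edges)


# Hypothetical family: A and B are the parents of C, and C is the parent of D.
example = {
    "A": {"C", "D"},
    "B": {"C", "D"},
    "C": {"D"},
    "D": set(),
}
edges = parent_child_edges(example)
```

In this sketch D is dropped from the children of A and B because D already appears among the descendants of C, which leaves the three parent-child pairs A to C, B to C, and C to D.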
Keywords
edit distance,data collection