WIT Press


Data Mining And Population Genetics Of Birth Defects: Preliminary Investigation

Price

Free (open access)

Volume

33

Pages

8

Published

2004

Size

368 kb

Paper DOI

10.2495/DATA040401

Copyright

WIT Press

Author(s)

B. Little

Abstract

Population-level inbreeding can be estimated from mating patterns in marriage records. Birth defect registries separately collect information on the frequency and types of birth defects. The U.S. national census provides population structure data down to the county or postal code level in another separate database. State health departments maintain vital statistics on marriage, reproduction, birth defects, and mortality in yet other databases. More than 6 million marriages in Texas were analysed to estimate inbreeding at county, regional, and state levels. Three types of birth defects were analysed: (1) one known to be associated with an autosomal recessive inheritance pattern (Ventricular Septal Defect, VSD); (2) one of unknown aetiology, speculated to be associated with a very rare autosomal recessive gene (Ebstein’s Anomaly); and (3) one known not to be related to parental consanguinity (Fetal Alcohol Syndrome, FAS). A significant relationship between estimated local consanguinity and VSD was found. Ebstein’s Anomaly, the birth defect of unknown aetiology but suspected to be recessively inherited, showed a strong relationship to the estimated population level of inbreeding, suggesting a major recessive gene influence. For FAS, a birth defect known to be caused by environmental rather than genetic factors, no relation to estimated inbreeding was found. In conclusion, data mining revealed patterns of birth defects in very large databases (VLDBs) of population genetic data once they were merged into a data structure suited to data mining. In this instance, a viable hypothesis was derived for the cause of an extremely rare birth defect of unknown aetiology. Hence, data mining of population genetic VLDBs can yield new information that may be useful in guiding genomic and clinical research directions.
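
The workflow summarised in the abstract (estimate inbreeding per county from marriage records, merge it with birth defect counts, then test for a relationship) might be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which inbreeding estimator was used, so the sketch substitutes marital isonymy (F = P/4, where P is the proportion of marriages between partners sharing a surname), and the function names, record layout, and choice of a Pearson correlation are all hypothetical rather than taken from the paper.

```python
from collections import defaultdict
from math import sqrt


def isonymy_inbreeding(marriages):
    """Estimate inbreeding per county as F = P / 4.

    `marriages` is an iterable of (county, groom_surname, bride_maiden_surname).
    P is the share of marriages in which both surnames match.
    ASSUMPTION: isonymy is used here for illustration only; the abstract
    does not name the estimator applied to the Texas marriage records.
    """
    total = defaultdict(int)
    same = defaultdict(int)
    for county, groom_surname, bride_maiden_surname in marriages:
        total[county] += 1
        if groom_surname.strip().lower() == bride_maiden_surname.strip().lower():
            same[county] += 1
    return {county: (same[county] / total[county]) / 4 for county in total}


def merge_by_county(inbreeding, defect_cases, live_births):
    """Inner-join three per-county tables (hypothetical layout).

    Returns rows of (county, F, defect prevalence per 10,000 live births),
    keeping only counties present in all three inputs.
    """
    rows = []
    for county in sorted(inbreeding):
        if county in defect_cases and live_births.get(county, 0) > 0:
            prevalence = 10_000 * defect_cases[county] / live_births[county]
            rows.append((county, inbreeding[county], prevalence))
    return rows


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)
```

In use, the merged rows would supply the two columns passed to `pearson_r`, one run per birth defect type. A real analysis of county-level data would also need to account for small counts and differing population sizes, which this sketch ignores.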

Keywords