Using Statistical and Computational Methods to Identify Genetic Variants in Large-scale Genomic Data
Recent advances in sequencing technologies make it possible to sequence large numbers of subjects and test many genetic variants. Using statistical and computational methods, my goal is to identify regions of the genome that influence several disorders, a phenomenon often called “pleiotropy”. More precisely, the term describes a single genetic variant influencing multiple traits of an organism; identifying such variants can help us better understand disease pathology. The identification and characterization of pleiotropy are therefore crucial for a comprehensive biological understanding of complex traits and disease states. Within this broad topic, I address three questions: a) Which loci in the genome govern the co-occurrence of disorders? b) Through what mechanisms do genetic variants influence pairs of traits? c) What statistical models are best suited to identifying pleiotropic variants from large-scale genetic data?