|
|
Linkage disequilibrium
|
I have long been interested in how to measure linkage
disequilibrium (LD) and in how LD varies across the genome and
across populations.
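As a minimal illustration of what measuring LD between two markers involves, the sketch below computes three standard pairwise measures (D, D' and r^2) for two biallelic loci. These are textbook quantities, not the homozygosity-based or volume measures developed in the papers listed here. The function name `ld_measures` and its arguments are hypothetical, chosen only for this example.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab -- frequency of the haplotype carrying allele A at locus 1
            and allele B at locus 2
    p_a  -- frequency of allele A at locus 1
    p_b  -- frequency of allele B at locus 2
    """
    # D: deviation of the haplotype frequency from independence.
    d = p_ab - p_a * p_b

    # D' rescales D by its largest attainable magnitude given the allele
    # frequencies; the bound depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0

    # r^2: squared correlation between the two allele indicators.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0

    return d, d_prime, r2


# Two loci in complete LD: the A and B alleles always occur together.
d, d_prime, r2 = ld_measures(p_ab=0.5, p_a=0.5, p_b=0.5)
```

With haplotype frequencies equal to the product of the allele frequencies (for instance `ld_measures(0.25, 0.5, 0.5)`), all three measures are zero.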
Sabatti, C. and N. Risch (2002) "Homozygosity and linkage
disequilibrium," Genetics 160: 1707-1719. Preprint
Sabatti, C. (2002) "Measuring dependence with volume tests," The American Statistician
50: 191-195. Preprint
Ayers, K., C. Sabatti, and K. Lange (2006)
"Reconstructing ancestral haplotypes with a dictionary model,"
Journal of Computational Biology, 3, 3: 767-785.
Wang, H., C. Lin, S. Service, The international collaborative group on isolated populations, Y. Chen, N. Freimer, C. Sabatti (2006)
"Linkage disequilibrium and haplotype homozygosity in population
samples genotyped at a high marker density," Human Heredity
62: 175-189.
Chen, Y., C. Lin, C. Sabatti (2006) "Volume measures for linkage
disequilibrium," BMC Genetics
7: 54.
|
Association mapping
|
We are generally interested in association mapping.
I have contributed to developing
a Bayesian method for haplotype mapping, and have been quite
interested in the problem of multiple comparisons in association genome scans.
Furthermore, I am interested in how association mapping can be
combined with multiple phenotype analysis, and studies of population structures.
We have NIH funding for these projects, in cooperation with the laboratory of Professor Freimer.
Liu, J., C. Sabatti, J. Teng, B. Keats, and N. Risch (2001) "Bayesian analysis of haplotypes for linkage disequilibrium mapping," Genome Research 11: 1716-24. Preprint
Sabatti, C., S. Service, and N. Freimer (2003) "False discovery rate in linkage and association genome screens for complex disorders," Genetics 164: 829-833. Reprint
Freimer, N. and C. Sabatti (2003) "The human phenome project,"
Nature Genetics 34: 15-21.
Reprint
Freimer, N. and C. Sabatti (2004) "Pedigree, sib-pair, and
association studies of common diseases; genetic mapping and
epidemiology," Nature Genetics 36:
1045-1051.
Reprint
Sabatti, C. (2006) "Comment on the `Likelihood-Based
Inference on haplotype effects in genetic association studies' by Lin
and Zeng," Journal of the American Statistical
Association 101: 104-106. (Invited contribution.)
Service, S., The international collaborative group on isolated
populations, C. Sabatti, N. Freimer (2007)
"Tag SNPs chosen from HapMap perform well in several population
isolates," Genetic Epidemiology, Epub ahead of print.
Freimer, N. and C. Sabatti (2007) "Human genetics: variants
in common diseases." Nature 445: 828-30. (Invited contribution.)
Ayers, K., C. Sabatti and K. Lange (2007) "A dictionary model for
haplotyping, genotype calling, and association mapping,"
Genetic Epidemiology 31: 672-683.
Currently we are investigating genetic epidemiology in the Northern Finland Birth Cohort.
|
High density SNP genotyping
|
We are developing models for intensity values of the Affymetrix and
Illumina genotyping arrays to be used in genotype calls, linkage studies, and loss of heterozygosity studies.
We have NIH funding for these projects, in cooperation with Professors
Ken Lange, Stan Nelson, and Roel Ophoff.
Sabatti, C. and K. Lange (2005) "Bayesian Gaussian mixture models for high density genotyping arrays," UCLA
Stat preprint 421,
to appear in JASA.
Wang, H., Y. Lee, S. Nelson, and C. Sabatti (2005) "Inferring genomic loss and location of tumor suppressor genes from high density genotypes," UCLA
Stat preprint 423,
Journal of the French Statistical Society, 146:
153-171.
Wang, H., C. Lin, S. Service, The international collaborative group on isolated populations, Y. Chen, N. Freimer, C. Sabatti (2006)
"Linkage disequilibrium and haplotype homozygosity in population
samples genotyped at a high marker density," Human Heredity
62: 175-189.
|
DNA copy number reconstruction
|
Raw data from high density genotyping arrays can be used to
reconstruct DNA copy number. We are involved both in method
development and data analysis projects.
Wang, H., J. Veldink, H. Blaw, R. Ophoff, C. Sabatti (2008) "Detecting copy
number variation using Illumina genotyping technology." UCLA
Stat Preprint 533
Stefansson, H. et al. (2008) "Large recurrent microdeletions associated
with schizophrenia," Nature 455: 232-6.
|
High Throughput Screens
|
In collaboration with Koppany Visnyei and Harley Kornblum we are
developing methods for the analysis of high throughput screen
data. The statistics student Denise Ferrari has put together an R
software package that implements our suggested pre-processing.
Sabatti, C., K. Visnyei, H. Kornblum (2008) "Statistical
challenges in High-throughput Screens." UCLA
Stat Preprint 532
|
Gene expression array denoising
|
Gene expression arrays are a formidable tool, as they allow
investigation of thousands of genes at the same time. However, in order
to fully exploit their potential, one has to deal successfully with the statistical issues involved in their analysis.
We have suggested a de-noising approach based on thresholding.
Using a Bayesian hierarchical model and an approach to multiple
comparison that is inspired by the False Discovery Rate, we denoise the signal coming from multiple array experiments with the specific goal of identifying the genes that are up-regulated or down-regulated in a given condition.
Our model is flexible and can be used for clustering.
This project is partially funded by NSF and NASA.
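To make the link between thresholding and the False Discovery Rate concrete, here is a minimal sketch of the standard Benjamini-Hochberg step-up procedure applied to per-gene p-values. It is a frequentist stand-in for illustration only, not the Bayesian hierarchical model described above; the function name `benjamini_hochberg` and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Flag the hypotheses rejected by the Benjamini-Hochberg procedure.

    p_values -- list of per-gene p-values
    q        -- target false discovery rate
    Returns a list of booleans aligned with p_values.
    """
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Step-up rule: find the largest rank k with p_(k) <= q * k / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= q * rank / m:
            k_max = rank

    # Reject every hypothesis ranked at or below k_max, which amounts to
    # a data-dependent threshold on the p-values.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            rejected[idx] = True
    return rejected


# Hypothetical p-values for five genes.
calls = benjamini_hochberg([0.001, 0.8, 0.015, 0.04, 0.5], q=0.05)
```

The threshold adapts to the data: the more small p-values there are, the more liberal the cutoff becomes, which is what makes FDR-style rules attractive when the true signal is sparse.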
Sabatti, C., S. Karsten, and D. Geschwind (2002) "Thresholding rules for recovering a sparse signal from microarray
experiments,"
Mathematical Biosciences 176: 17-34. Preprint
Erickson, S. and C. Sabatti (2005) "Empirical Bayes estimation of a sparse vector of gene expression," Statistical Applications in
Genetics and Molecular Biology 4: 22.
|
Genomic scale identification of promoter binding sites
|
One of the best understood mechanisms of transcription regulation is the action of regulatory proteins that bind to the upstream region of a gene and act as either promoters or suppressors.
We have developed a stochastic dictionary model to identify the position of known binding sites on a genomewide scale. We use this information to improve the clustering of array experiments and to reconstruct the regulatory network.
Our model organism for these investigations has been E. coli.
This project is in cooperation with the laboratories of Professors Lange and Liao.
It is partially funded by NSF and NASA.
Sabatti, C. and K. Lange (2002) "Genomewide motif identification using a dictionary model," IEEE Proceedings 90: 1803-1810. Preprint
Sabatti, C., L. Rohlin, K. Lange, and J. Liao (2005) "Vocabulon: a dictionary model approach for reconstruction and localization of transcription factor binding sites," Bioinformatics 21: 922-931. Preprint
|
Gene regulation networks
|
To recover the dynamic behavior of regulatory proteins and their
pathway of influence on cell behavior, we combine
sequence analysis with results of gene expression array
experiments.
We develop a sparse hidden component model to link transcription
factor activity to gene expression. We use the Vocabulon algorithm to
search for binding sites of regulatory proteins in the genome and
inform our prior distribution on network structure.
This project is in cooperation with the laboratories of Professors Liao
and Roychowdhury and with Gareth James.
It is partially funded by NSF and NASA.
Sabatti, C., L. Rohlin, M. Oh, and J. Liao. (2002) "Co-expression pattern from DNA microarray experiments as a tool for operon prediction,"
Nucleic Acids Research 30: 2886-2893. Reprint
Liao, J., R. Boscolo, Y. Yang, L. Tran, C. Sabatti, and
V. Roychowdhury (2003) "Network component analysis: reconstruction of
regulatory signals in biological systems," Proceedings of the
National Academy of Sciences 100: 15522-15527. Reprint
Kao, K., Y. Yang, R. Boscolo, C. Sabatti, V. Roychowdhury, and J. Liao (2004) "Determination of multiple transcription regulator activities in Escherichia coli using network component analysis," Proceedings of the National Academy of Sciences 101: 641-646. Reprint
Sabatti, C. and G. James (2006)
"Bayesian sparse hidden components analysis for transcription regulation networks,"
Bioinformatics 22: 739-746.
|
|