Homework #1 Due: January 30th.
The purpose of this exercise is to download and become familiar with the dChip software.
1. Go to the dChip webpage (www.dchip.org) and register for the software. Download the dChip software. Either download the manual or read the web-based version of the manual.
2. Find a dataset. The data files you will need for dChip are the .cel files, which contain the image information from the scan of the Affymetrix GeneChip. There are a number of sources for data. The dChip website has links to two public data sets. If you have your own data you are welcome to use it. If you are having trouble downloading the data from the internet or want a different data set, Dr. Elashoff will provide one. Note: to analyze the data in dChip, you must have the .cdf file that corresponds to the type of chip being used.
3. Enter the data into dChip. Examine the image file and note any anomalies in the data. Normalize the data and run the model-based expression analysis. You can output the expression values in Excel format along with their standard errors and the absolute calls using the “Export” tab.
4. Now run “Compare Samples” or “Filter genes” to generate a list of “interesting” genes. You can decide what to define as “interesting”. Generally these genes will be ones that are differentially expressed across the samples in the data set. Experiment with the various parameters to identify genes. Modify the parameters until you obtain a small number of genes (fewer than 30). Attach the printout of the Excel file for these genes.
5. Investigate one of these “interesting” genes and a gene that is labeled as an array outlier in one of the samples using the PM/MM data. Export the PM/MM data and attach a printout.
Write up your findings in a 1-2 page report. Discuss data quality, outliers, your experimentation with finding lists of “interesting” genes (including the method and justification for the final gene list) and comment on what is gained by looking at the PM/MM data.
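Optional: if you want to cross-check the gene list from step 4 outside dChip, the short Python sketch below may help. It applies a fold-change threshold and an absolute-difference threshold to a table of exported expression values. This is only an illustration and does not reproduce dChip's own filtering code; the column names, the sample values, and the default thresholds are all made up for the example, so adjust them to match whatever your exported file actually contains.

```python
import csv
import io

# Hypothetical tab-delimited export. The column names and values are
# assumptions for illustration, not dChip's actual export headers.
SAMPLE_EXPORT = (
    "probe_set\tbaseline_1\tbaseline_2\texperiment_1\texperiment_2\n"
    "1000_at\t120.0\t130.0\t510.0\t490.0\n"
    "1001_at\t300.0\t310.0\t305.0\t295.0\n"
    "1002_at\t800.0\t820.0\t190.0\t210.0\n"
    "1003_at\t15.0\t12.0\t60.0\t70.0\n"
)


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def filter_genes(text, baseline_cols, experiment_cols,
                 min_fold=2.0, min_diff=100.0):
    """Return (probe_set, direction) for genes passing both thresholds.

    A gene passes when the ratio of the larger group mean to the smaller
    is at least min_fold AND the absolute difference of the group means
    is at least min_diff. Both thresholds are illustrative defaults.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    selected = []
    for row in reader:
        base = mean([float(row[c]) for c in baseline_cols])
        expt = mean([float(row[c]) for c in experiment_cols])
        high, low = max(base, expt), min(base, expt)
        # Guard against division by zero for non-positive expression values.
        fold = high / low if low > 0 else float("inf")
        if fold >= min_fold and (high - low) >= min_diff:
            direction = "up" if expt > base else "down"
            selected.append((row["probe_set"], direction))
    return selected


if __name__ == "__main__":
    genes = filter_genes(
        SAMPLE_EXPORT,
        baseline_cols=["baseline_1", "baseline_2"],
        experiment_cols=["experiment_1", "experiment_2"],
    )
    for probe_set, direction in genes:
        print(probe_set, direction)
```

Requiring an absolute difference in addition to a fold change keeps low-intensity genes (such as the last row of the sample table) from entering the list on the basis of a large ratio between two small, noisy values. Tightening either threshold is one way to bring the list under 30 genes.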