FROM GENE EXPRESSION TO MOLECULAR PATHWAYS


Thesis submitted for the degree of “Doctor of Philosophy”
by Dana Pe’er
Submitted to the Senate of the Hebrew University
November 2003

This work was carried out under the supervision of Nir Friedman

Abstract

Molecular networks involving interacting proteins, RNA, and DNA molecules underlie the major functions of living cells. DNA microarrays probe, at a genome-wide scale, how gene expression changes to perform complex coordinated tasks in adaptation to a changing environment. In this dissertation we address the challenge of reconstructing molecular pathways and gene regulation from gene expression data. Our goal is to automatically infer regulatory relations between genes, as well as other types of molecular interactions. To answer this challenge, we develop probabilistic graphical models of the biological system. We offer three such models and algorithms to automatically learn these from gene expression data. Our models and learning algorithms are based on the assumption that statistical correlation might indicate molecular or genetic interaction. We offer systematic evaluation for each of the methods presented, culminating in experimental validation of novel predictions automatically generated by one of our models.

Acknowledgements

I would like to express my deepest gratitude to my mentor, Nir Friedman. Nir is a genuine role model, and while I have had many teachers, Nir’s mark is the most profound. Nir initiated me into the discipline of machine learning in graphical models and continuously taught me the most important scientific skills: how to dive deep into messy data and surface with simple models that address the question at hand, always striving to understand the connection between data, model and reality. Few have these skills and I was privileged to learn from a true master; I leave Nir with much yet to learn. Nir’s contribution to the research in this thesis is fundamental, from the basic idea of coupling Bayesian networks with gene networks to little comments that made my presentation so much clearer.

I have spent a total of ten terrific years as a student at the Hebrew University and this has been a significant chapter in my life. During this time, many teachers have molded me into the researcher I am today. I would like to thank Avi Wigderson for patiently teaching me the rigors of problem solving. Avi is a mental giant and I was most privileged to brainstorm with him and learn how he takes a hard problem apart into little bits he can understand. I would like to thank Shmuel Peleg for teaching me that research should first and foremost be fun. Shmuel taught me that if one does not find enjoyment and passion in the problem at hand, it is probably the wrong problem to be working on. I rarely left his office without a smile. I was especially fortunate to have an adopting “mother” and “father”, Daphna Weinshall and Noam Nisan, in the Computer Science department. While never my official mentors, they took me under their wing, providing guidance, many rewarding discussions and emotional support. In addition, I would like to thank Noam for bringing to my attention α-modular functions and their connection to the MinReg algorithm.
I would like to thank Daphna for actively fighting to make my years at the university more comfortable, be it easing the prerequisites when I transferred from mathematics or easing my TA workload as a new mother.

Good science is always the joint effort of many people and the research in this thesis is no exception. This thesis could never have happened without Aviv Regev, my scientific partner, biology tutor and dearest friend. My research is the result of a close and synergistic collaboration with Aviv, working with whom is an absolute joy and pleasure. Aviv transformed me from a naïve computer scientist to a semi-biologist, teaching me so much more than biology along the way. In addition to sharing her wisdom and many unique insights, Aviv gave me endless support and backing. During the toughest and lowest points, Aviv was always there to stop me from quitting by infecting me with her energetic enthusiasm and leading me to believe in myself. There are simply no words to express my gratitude to her.

I would like to give many thanks to all my co-authors on the works presented here. Michal Linial, the first biologist who dared believe our ideas might have merit. Iftach Nachman, who shared with me the first steps of this research. Gal Elidan, who brought order and efficiency to the chaos in which I used to work. It was a wonderful pleasure to work with Amos Tanay on our ‘underground’ MinReg project, and the speed with which he programmed some of our ideas never ceased to surprise me. Amos has great scientific vision and I cherish the many hours we spent brainstorming over coffee.

My intense collaboration with Eran Segal has been very fruitful and led to great science. Eran, I very much admire your ability and stamina. I feel very privileged to have worked with Daphne Koller, a brilliant scientist; I learned much from our many insightful discussions.

Lots of thanks to all my lab mates at the Computational Biology group and the Machine Learning group at the Hebrew University. It was marvelous to belong to a group with such great academic cooperation and social atmosphere: full of seminars, reading groups, or just hallway discussions; beach parties, dinners, and hiking trips. Specifically, I would like to thank my office mate Matan Ninio, who fed me well, almost as often as he distracted me. Matan was always helpful, from the countless times he aided me with system related issues to the laborious work of printing this thesis and submitting it for me.

I would also like to thank the many people who gave me the support and technical backing so I could focus on my research. I thank the Ministry of Science, Israel, for the Eshkol fellowship awarded to me and the Higher Education Council, Israel, for additional financing. I thank the System group at the Computer Science Department for consistently providing the best and most reliable computer support possible. I thank the administrative staff at the Computer Science department for all their help and support, shielding me from the bureaucratic jungle that lay beyond our department. I would also like to thank Laura Garwin. Sometimes help comes unexpectedly: when my laptop crashed at critical stages of writing this thesis, Laura (at the time a stranger), out of pure kindness and generosity, lent me her personal laptop and hosted me in a wonderful office at the Bauer Center for Genomic Research.

During the course of my PhD studies, the two most important events of my life occurred, the births of Inbar and Carmel.
I would like to thank my two most beloved daughters for distracting me and granting me joy and happiness of a magnitude I never knew before. I apologize to them; it is Inbar and Carmel that have paid the heaviest price for this thesis, during the endless hours I worked away from them. I hope you understand and forgive. I thank Rocha, my mother-in-law, for the countless hours she took care of the girls, giving me more time to work. While Rocha was with my girls, I could peacefully work, knowing they were getting the best of love and care. I thank my brother Michael for caring so much and for his constant reminders that there is so much more to life than research. I would like to thank Bat-Sheva for being available at any hour of the day or night for a relaxing walk and an opportunity to wind down.

I am grateful to both my parents, Mara and Aaron, for being such wonderful and supportive parents. They nurtured my curiosity, creativity and passion for understanding from the earliest age. I started my studies in the Mathematics department at the Hebrew University, where both my parents met, and the completion of this thesis gives me a great feeling of fulfillment. Dad, thank you for attempting to teach me Cantor’s diagonal proof from preschool (that was a wee bit early), carefully correcting the English for this entire thesis and everything in between. Mom, you have been and will always be my role model; you are my very inspiration to excel, and I aspire to be like you.

Last, I dedicate this thesis to my better half, Itsik. I am endlessly indebted and grateful to Itsik for everything. My love, thank you for helping me in all aspects of my research. I thoroughly enjoyed our scientific discussions that occurred at all times of the day and in all forms of dress. Many of your comments have been invaluable to my work. Thank you for your help with all my manuscripts including this one. Thank you for your unconditional love in my worst moments and for being a strong pillar of support in my most desperate moments. Thank you for making my victories more memorable by sharing them with you; this victory could have never happened without all your encouragement and help.

Contents

1 Introduction
  1.1 Biological Background
  1.2 Microarrays
  1.3 Previous work
  1.4 Our approach
  1.5 Road Map

2 Bayesian Networks Primer
  2.1 Model Semantics
  2.2 The Graph structure: Independence, Dependence and Causality
    2.2.1 d-separation
    2.2.2 Equivalence Classes
    2.2.3 Causality
  2.3 Learning Bayesian Networks
    2.3.1 Parameter Estimation
    2.3.2 Structure Learning

3 Bayesian Network Models for Biological Interactions
  3.1 Overview
  3.2 Illustrative Example
  3.3 Extracting Features
  3.4 In Silico Experiment
  3.5 Biological Features
    3.5.1 Gene Mates
    3.5.2 Separators
    3.5.3 Hubs
  3.6 Subnetworks
    3.6.1 Constructing Subnetworks
    3.6.2 Biological Subnetworks
  3.7 Systematic Evaluation
    3.7.1 Statistical Robustness
    3.7.2 Comparison to Literature
    3.7.3 Comparison to Other Methods
  3.8 Discussion

4 Computational Methods for Learning Bayesian Networks
  4.1 The “Sparse Candidate” Algorithm
    4.1.1 Outline of Algorithm
    4.1.2 Choosing Candidate Sets
    4.1.3 Learning with Small Candidate Sets
    4.1.4 Empirical Results
  4.2 Modeling Mutations
    4.2.1 Modeling an Intervention
    4.2.2 Scoring with Mutations
    4.2.3 Inferring causality with mutational data
    4.2.4 Empirical results
    4.2.5 Discussion

5 Focusing on Regulation - MinReg
  5.1 A Regulation Graph
  5.2 Learning
    5.2.1 Optimization Problem
    5.2.2 MinReg Algorithm
  5.3 Technical Details
    5.3.1 Performance Guarantee
    5.3.2 MinReg Implementation
  5.4 Annotating Regulators
  5.5 Biological Results
  5.6 Systematic Evaluation
    5.6.1 Robustness and Cross Validation
    5.6.2 The Importance of Candidate Regulators
    5.6.3 Simulated Data
  5.7 Discussion

6 Module Networks - Reconstructing Regulatory Modules
  6.1 From Bayesian Network to Module Network
  6.2 From Module Network to Regulatory Module
    6.2.1 Algorithmic Overview
  6.3 Biological Results
    6.3.1 Selected Modules
    6.3.2 Global View
    6.3.3 Experimental Validation
  6.4 Definition and Scoring
    6.4.1 Formal Definition
    6.4.2 Bayesian Scoring
    6.4.3 Likelihood Function
    6.4.4 Priors and the Bayesian Score
  6.5 Learning Algorithm
    6.5.1 Structure Search Step
    6.5.2 Module Assignment Search Step
    6.5.3 Algorithm Summary
    6.5.4 Learning with Regression Trees
  6.6 Systematic Evaluation
    6.6.1 Cross Validation
    6.6.2 Gene Assignments
  6.7 Discussion

7 Discussion
  7.1 Summary
  7.2 Comparing the Methods
  7.3 From Gene Expression to Transcriptional Regulation
  7.4 Future Prospects

Bibliography


Chapter 1
Introduction

Molecular networks involving interacting proteins, RNA, and DNA molecules underlie the major functions of living cells. Different metabolic, signaling and transcriptional levels are integrated to maintain a working cell. Deciphering the organization of molecular networks, their function and behavior under different conditions is a major goal of molecular cell biology. The availability of complete genomic sequences, combined with robotics, computing and material sciences, has led to the development of high-throughput assays that probe cells at a new, genome-wide scale. For instance, DNA microarrays [55, 90] can measure the mRNA levels of an entire genome in a single experiment. A major promise of such high-throughput methods is that they will enable us to reconstruct how tens of thousands of genes and proteins work together in interconnected networks to orchestrate the basic functions of life.

In this dissertation we address the challenge of reconstructing molecular pathways and gene regulation from gene expression data. Our goal is to automatically infer regulatory relations between genes, as well as other types of molecular interactions. To answer this challenge, we develop probabilistic models of the biological system. A model is a simplification of the underlying system that captures the primary phenomena we are interested in and explains how these lead to the observations we make through our assays. We focus on probabilistic models that use stochasticity to account for measurement noise, variability in the biological system, and aspects of the system that are not captured by the model.
In this thesis we formulate a number of such models, develop algorithms to learn the structure of these models from data and provide a systematic biological analysis for our resulting models.

1.1 Biological Background

We begin with a brief overview of the basic concepts of molecular biology - the interested reader is referred to molecular biology textbooks [2] for more information. Cells are the fundamental working units of every living system. To a large extent, cells are made of proteins, which determine the shape and structure of the cell. In addition, other proteins serve as machines that perform many of life’s functions, including molecular recognition and catalysis.

Figure 1.1: The central dogma of molecular biology

DNA is the organism’s blueprint: it contains the instructions for the synthesis and regulation of proteins. Instructions for a particular protein are coded on a segment of DNA called a gene. The central dogma of molecular biology states that information flows from DNA through RNA to protein (see Figure 1.1). Thus, protein is synthesized from DNA in the following two step process:

1. DNA → RNA: Transcription is the process by which RNA polymerase copies a gene into an mRNA (messenger RNA) sequence using the DNA sequence as a template. This process by which genes are transcribed into mRNA, present and operating in the cell, is termed gene expression.
2. RNA → Protein: In the subsequent process, called translation, a protein factory called the ribosome synthesizes the protein according to the information coded in the mRNA.

A key observation is that while each cell contains the same copy of the organism’s DNA, the gene expression (and subsequently protein expression) can drastically vary, both temporally and spatially. To control gene expression, specialized proteins called transcription factors bind to the DNA and either enhance or inhibit the transcription of specific genes. These transcription factors often work together in different combinations, to ensure the correct amount of each gene is being transcribed. We note that transcription factors are themselves proteins and are thus subject to transcriptional control.

Transcription factors are by no means the only control over gene expression. Biological regulation is extremely diverse and involves different mechanisms at many layers: Before transcription occurs, proteins regulate the structure of the DNA itself and determine whether a transcription factor can bind to the gene specific regulatory sites or not. Once the mRNA molecule is transcribed, other mechanisms regulate its editing and transport to the ribosome, thus controlling whether it gets translated into protein or not. For a given gene, the total amount of mRNA is regulated not only by transcription (creation) of mRNA, but also by the degradation of mRNA. Regulation continues even after the protein is translated: a large part of biological regulation is via post-translational modifications that determine a protein’s activity.

1.2 Microarrays

In recent years, technical breakthroughs in spotting hybridization probes and advances in genome sequencing led to the development of DNA microarrays, which consist of many species of probes, either oligonucleotides or cDNA, that are immobilized in a predefined organization on a solid surface. By using DNA microarrays researchers are now able to measure the abundance of thousands of mRNA targets simultaneously [26, 64], providing a “genomic” viewpoint of gene expression.

Microarray technology is based on DNA hybridization: a process in which a DNA strand binds to its unique complementary strand. A set of probes (known sequences) are fixed to a surface and are placed in interaction with a set of fluorescently tagged targets (unknown sequences). After hybridization, the fluorescently lit spots indicate the identity of the targets, and the intensity of the fluorescence signal correlates with the amount of each target. Due to different hybridization affinities between clones and the fact that an unknown amount of cDNA is fixed for each probe, we cannot directly associate the hybridization level with a quantitative amount of transcript.
Instead, cDNA microarray experiments compare a reference pool and a target pool. Typically, green is used to label the reference pool, representing the baseline level of expression, and red is used to label the target sample in which the cells were treated with some condition of interest. We hybridize the mixture of reference and target pools and read a green signal in case our condition reduces expression level and a red signal in case our condition increases expression level (see Figure 1.2).

Figure 1.2: An image of a microarray. Each spot represents a different gene. (Panel labels: Reference DNA, Target DNA, Label, Hybridize.)

A genome wide measurement of transcription is called an expression profile and provides us with a complete list of genes whose transcription level is affected in our condition. Biologically speaking, what we measure is how the expression of each gene changes to perform complex coordinated tasks in adaptation to a changing environment. In our context, while transcriptional regulation directly changes the measured mRNA levels, other factors, such as proteins and their activity, are not observed by microarrays. Furthermore, due to biological variation and a multi-step experimental protocol, these data are very noisy, and fluctuate up to two-fold between repeated experiments.

In order to obtain a wide variety of profiles, reflecting different active pathways, various perturbations (e.g. mutations [51]) and treatments (e.g. heat shock [40]) are employed. The outcome is a matrix that gives the expression level for each gene (row) and condition (column). In our setting, this expression matrix contains thousands of genes and hundreds of conditions. Our goal is to uncover molecular interactions, most notably regulation, from these data.

1.3 Previous work

The first attempts to analyze these data identified a list of differentially expressed genes for each condition or treatment. Since current technology is very noisy and typical datasets contain only 2-5 repeats of each condition, even this simple task is not trivial. Early works [49] defined differential expression as a two-fold or greater change in expression. Developing statistically robust tests to determine which genes are differentially expressed remains an active area of research.

Currently, the most popular analysis method is clustering. Clustering of the genes is used to identify sets of genes that behave similarly (i.e. have similar expression patterns) over a set of experiments [3, 30] (see Figure 1.3). Clustering provides an intuitive way to organize and visualize the data. Furthermore, clustering facilitates the functional annotation of uncharacterized genes. If an uncharacterized gene belongs to a cluster dominated by genes of some function, the unknown gene could possibly have a similar function. While clustering has successfully expanded our understanding of important biological processes (including cell cycle [30], cancer [3], metabolism [51]), it does not address our challenge to uncover the underlying gene network of interactions.

Previously, a number of regulatory models have been suggested. The most realistic of such models are stochastic networks [68]. While these directly model many of the actual details of the regulatory machinery, they are extremely complex and can only deal with small scale networks. For more global applications, simplified and abstract models are required. A few such models have been suggested, all based on the following basic idea: The regulatory network is a directed graph G. Each node in G corresponds to a specific gene that behaves according to some deterministic function of its parents in G. These include Boolean network models [89, 1], where each gene is either on or off depending on some Boolean function of its parents, and linear models [99, 27], where each gene is modeled as a continuous linear function of its parents. In order to simplify the complexity of such models, it is typically assumed that G is acyclic and of bounded indegree. While these methods have had partial success on simulated data, none of them have had any success when applied to real biological data.

Figure 1.3: Clustering gene expression data: Each row corresponds to a gene and each column corresponds to a microarray sample, i.e., all the spots on the microarray in Figure 1.2 appear as a column in this figure. To the left is the unclustered input matrix. To the right is the matrix after clustering reordered the rows and columns. (Axis labels: Genes, Experiments.)

1.4 Our approach

Recall that our goal is to reconstruct molecular networks representing processes such as gene regulation. To answer this challenge, we adopt a systems perspective of the cell and its components, and attempt to build models of this system. Our measurements observe the system at different states, which can be defined in terms of the concentration of active proteins and metabolites in the various compartments, the concentration of different mRNA molecules in the cytoplasm, etc. Our basic assumption is that the components in the cell do not work in isolation. Rather, they affect each other through a wide variety of interactions. The key point is that the components affect each other in a consistent fashion. Thus, if we consider a random sampling of the system, some states are more probable than others. For example, Gal4 is a transcription factor which strongly activates the galactose pathway genes; therefore, if Gal4 is overexpressed in some state, it is likely that other galactose pathway genes are also overexpressed.

We treat measurements of the cell’s components (e.g. gene expression measurements) as random variables, and thus the likelihood of a cell state can be specified by the joint probability distribution on these variables. Representing measurements as random variables allows us to account for measurement noise, variability in the biological system, and aspects of the system that are not captured by the model.

In this dissertation, due to issues of data availability, we only observe the level of mRNA expression for each of the genes. Therefore, we resort to a partial view which projects the activity of the entire cell onto gene expression profiles. In our model, each gene is associated with a random variable that represents the measurement of its expression. We use the term genes, interchangeably, to represent both the biological genes and the random variables that represent them in our model. We stress that the basic approach described here for gene expression data can be easily extended to other data types (e.g. protein levels) as these become available. For example, when more direct measurements of transcription factor activity become available, these can be easily incorporated as random variables in the model and can greatly enhance the resulting reconstruction.

Our goal is to estimate the joint probability distribution over gene expression and understand its structural features from data. Our reconstruction of pathway structure is based on the following idea: molecular interactions between the genes sometimes generate corresponding statistical dependencies between the random variables that represent them. Using Gal4 as an example: Gal4 activates the transcription of other galactose genes, thus creating a correlation in their expression.

The learning algorithms presented in this dissertation detect consistent statistical dependencies and reconstruct a model that explains them, i.e., a model that could have generated the observed data. Our approach is global: we fit a model to data by studying the joint probability distribution over the entire gene set. Once we define such a model, its interpretation is as important an issue as the learning algorithm.
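
The idea that a regulatory interaction can leave a statistical footprint in expression data can be illustrated with a small simulation. The sketch below is not the method developed in this dissertation: it assumes a hypothetical linear relation between a regulator and its target, uses synthetic Gaussian data, and measures dependence with a plain Pearson correlation. The function names, the effect size of 0.8 and the noise level are illustrative assumptions only.

```python
import math
import random


def simulate_expression(n_samples=500, effect=0.8, noise=0.5, seed=0):
    """Draw synthetic expression values for three hypothetical genes.

    `regulator` stands in for a transcription factor such as Gal4,
    `target` for a gene it activates (assumed linear effect plus noise),
    and `unrelated` for a gene with no connection to the regulator.
    """
    rng = random.Random(seed)
    regulator = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    target = [effect * r + rng.gauss(0.0, noise) for r in regulator]
    unrelated = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    return regulator, target, unrelated


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


regulator, target, unrelated = simulate_expression()
r_target = pearson(regulator, target)
r_unrelated = pearson(regulator, unrelated)
print(f"regulator vs. target:    r = {r_target:.2f}")
print(f"regulator vs. unrelated: r = {r_unrelated:.2f}")
```

Under these assumptions the regulator and its target are strongly correlated, while the unrelated gene is not. Pairwise correlation is only the simplest stand-in for a statistical dependency; it says nothing about the direction of an interaction or about whether the dependency is mediated by other genes.
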
An important question that will be repeatedly addressed throughout the dissertation is: What type of molecular relations create statistical dependencies in gene expression profiles?

A large part of this dissertation focuses on regulatory relations. Our ability to detect regulatory relations relies on the assumption that the

important scientic skills: how to dive deep into messy data and surface with simple models that address the question at hand, always striving to understand the connection between data, model and reality. Few have these skills and I was privileged to lear

