Inference of gene regulatory networks from Microarray time series data by dynamic Bayesian networks

Number of pages: 87 File Format: word File Code: 31027
Year: 2012 University Degree: Master's degree Category: Computer Engineering

Tags/Keywords: artificial intelligence - Bayesian networks - Dynamic Bayesian networks - Genetic regulatory networks - Microarray - Microarray Time Series Data - Network - Networks of gene regulation

Summary of Inference of gene regulatory networks from Microarray time series data by dynamic Bayesian networks

Master's Thesis in Computer Engineering-Artificial Intelligence

Abstract

Deduction of Gene Regulation Networks from Microarray Time Series Data by Dynamic Bayesian Networks

Gene regulatory networks are sets of gene-gene relationships that establish cause-and-effect relationships in gene activity. Knowledge of these networks plays an important role in understanding biological processes and can lead to new methods for treating complex diseases and producing effective drugs. Many methods have been proposed for inferring gene regulatory networks. Among them, dynamic Bayesian networks have particular advantages that have attracted a great deal of attention. Despite the research done in this field, reverse engineering of gene regulatory networks with dynamic Bayesian networks is by no means straightforward. Often, the number of samples available for training the model is much smaller than the number of unknowns in the problem. In addition, the high complexity of these models and their low accuracy are among their most important shortcomings.

One of the main ways to increase the accuracy of inferred networks is to use prior knowledge about gene regulatory networks. A major source of this prior knowledge is what is known about their overall structure. Research shows that the number of edges in these networks is small. There is also considerable evidence that the out-degree distribution in gene regulatory networks follows a power law; in other words, these networks are scale-free in out-degree. Despite this evidence, existing methods for learning dynamic Bayesian networks either treat such networks as having a random structure or only control the complexity of the network. The proposed method has polynomial time complexity and can be used to infer networks with a large number of nodes. Experiments comparing the proposed algorithm with previous network learning methods show that, when it is used to infer scale-free networks, it significantly improves the quality of the inferred network, especially when the training data is insufficient.

Keywords: dynamic Bayesian networks, gene regulatory networks, scale-free structure

Chapter One

Introduction

In every cell of a living organism, thousands of genes interact at every moment to make complex biological processes possible. Gene regulatory networks are sets of DNA segments in the cell that interact indirectly (through their RNA or protein products) with each other and with other substances in the cell, thereby controlling the rate at which genes are transcribed into mRNA. Each mRNA molecule produces a specific protein with a specific function. Some proteins serve only to turn genes on or off. Such proteins are called transcription factors and play the main role in the gene regulatory network. In other words, a gene regulatory network is a set of gene-gene connections that creates cause-and-effect relationships in gene activity. Knowledge of these networks plays an important role in understanding biological processes and can lead to new methods for treating complex diseases and producing effective drugs. Therefore, the inference and reverse engineering of gene regulatory networks has become one of the most important research fields [1]. Microarray technology makes it possible to measure the mRNA expression levels of thousands of genes simultaneously and can provide information about gene relationships at the genome level [2]. However, there is no simple way to infer gene regulatory networks from microarray data. In most cases, the number of unknowns is very large while the amount of available data is small. In addition, the measurement error is often high, and measurements for some variables may be missing.

Microarray data can be divided into two types: static and time series. Static data are a snapshot of gene expression at a specific moment and under a specific condition. Time series data measure gene expression during an intracellular process over time and therefore reflect intracellular dynamics. Most of the early methods used to analyze microarray time series data were in fact designed for static data. In the last few years, methods have been proposed specifically for time series data; they address the problems particular to this kind of data and also exploit its unique features. However, working with time series data requires more care and precision than working with static data, and reverse engineering gene regulatory networks is more difficult in this setting.
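To make the distinction concrete, the following minimal Python sketch shows one way a time series expression matrix could be turned into lagged (earlier, later) sample pairs, which is the form of data a model with time delays is typically trained on. The function name `lagged_pairs` and the toy matrix are illustrative assumptions and do not come from the thesis.

```python
from typing import List, Tuple

Matrix = List[List[float]]  # rows = time points, columns = genes


def lagged_pairs(series: Matrix, lag: int = 1) -> List[Tuple[List[float], List[float]]]:
    """Pair the expression vector at time t with the one at time t + lag."""
    if lag < 1:
        raise ValueError("lag must be at least 1")
    return [(series[t], series[t + lag]) for t in range(len(series) - lag)]


# Hypothetical toy data: 4 time points, 3 genes.
toy = [
    [0.1, 0.9, 0.4],
    [0.2, 0.7, 0.5],
    [0.6, 0.3, 0.5],
    [0.8, 0.2, 0.1],
]

pairs = lagged_pairs(toy)
print(len(pairs))  # 4 time points yield 3 consecutive transitions
```

A static data set, by contrast, would consist of independent rows with no ordering, so no such pairs could be formed from it.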

Many methods have been proposed for inferring gene regulatory networks, the most important of which are Boolean networks [3], probabilistic Boolean networks [4], differential equations [5] and Bayesian networks [6]. Among these, Bayesian networks, which express cause-and-effect relationships between variables through probabilistic relationships, have attracted a great deal of attention. Because microarray data are noisy, probabilistic models can considerably improve the quality of the resulting model. Despite their relative success, Bayesian networks cannot contain cycles, which limits their usefulness in many cases because feedback loops are common in real gene regulatory networks. Therefore, when dealing with time series data, dynamic Bayesian networks are a suitable modeling option [7,8,9]. Dynamic Bayesian networks are a more general form of Bayesian networks that can model data with time delays, and they have three particular advantages. First, they represent cause-and-effect relationships between variables directly and can make use of available information about those relationships. Second, they are stochastic. The processes involved in gene regulation are stochastic, and even if they were inherently deterministic, the large measurement error would make them appear random from our point of view. Third, these networks can track how variables change over time.
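The point about feedback loops can be illustrated with a small Python sketch. It assumes a first-order model in which an edge from gene a to gene b is read as "a at time t influences b at time t + 1"; the helper names `has_cycle` and `unroll` and the two-gene example are hypothetical. Under that assumption, a feedback loop that would be illegal in a static Bayesian network becomes an acyclic graph once it is unrolled over time slices.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]


def has_cycle(nodes: Set[str], edges: List[Edge]) -> bool:
    """Return True if the directed graph contains a cycle (Kahn's algorithm)."""
    indegree: Dict[str, int] = {n: 0 for n in nodes}
    children: Dict[str, List[str]] = {n: [] for n in nodes}
    for src, dst in edges:
        children[src].append(dst)
        indegree[dst] += 1
    queue = [n for n in nodes if indegree[n] == 0]
    visited = 0
    while queue:
        node = queue.pop()
        visited += 1
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    # Nodes that were never visited lie on, or downstream of, a cycle.
    return visited < len(nodes)


def unroll(genes: Set[str], edges: List[Edge], slices: int) -> Tuple[Set[str], List[Edge]]:
    """Copy each gene once per time slice; an edge a -> b becomes a[t] -> b[t+1]."""
    nodes = {f"{g}[{t}]" for g in genes for t in range(slices)}
    unrolled = [
        (f"{a}[{t}]", f"{b}[{t + 1}]") for a, b in edges for t in range(slices - 1)
    ]
    return nodes, unrolled


genes = {"A", "B"}
feedback = [("A", "B"), ("B", "A")]  # hypothetical two-gene feedback loop

print(has_cycle(genes, feedback))               # the static graph is cyclic
print(has_cycle(*unroll(genes, feedback, 3)))   # the unrolled graph is not
```

Because every unrolled edge points from one time slice to the next, no path can return to an earlier slice, which is why the unrolled graph is always acyclic regardless of the loops in the gene-level graph.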

Despite these advantages, reverse engineering of gene regulatory networks from time series data with dynamic Bayesian networks is by no means straightforward. Often, the number of samples available for training the model is much smaller than the number of unknowns in the problem [10]. In addition, the measured values contain a great deal of error, and in some cases measurements are missing for some variables. Currently, dynamic Bayesian networks are mostly applied in experiments with a small number of genes or with simulated data. The high complexity of these models and their low accuracy are among their most important shortcomings. More research is needed to obtain models that can work with large data sets and to improve the quality of the resulting models.
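One way to get a feel for this imbalance is to count how many candidate parent sets a structure search would have to consider for a single gene and compare that with the number of observed transitions. The sketch below is only an illustration: the numbers (100 genes, 20 time points, at most 3 parents per gene) are hypothetical, and the parent-set count is used here as a rough stand-in for the "number of unknowns", not as the thesis's own definition.

```python
from math import comb


def candidate_parent_sets(n_genes: int, max_parents: int) -> int:
    """Number of parent sets of size 0..max_parents drawn from n_genes regulators."""
    return sum(comb(n_genes, k) for k in range(max_parents + 1))


# Hypothetical problem size, chosen only for illustration.
n_genes, n_timepoints, max_parents = 100, 20, 3

per_gene = candidate_parent_sets(n_genes, max_parents)
transitions = n_timepoints - 1  # consecutive (t, t + 1) pairs in one series

print(per_gene)     # 1 + 100 + 4950 + 161700 = 166751
print(transitions)  # 19
```

Even with a modest cap on the number of parents, the candidate structures per gene outnumber the available transitions by several orders of magnitude, which is why additional constraints or prior knowledge are attractive.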

One of the main ways to increase the accuracy of inferred networks and to compensate for the lack of training data during network learning is to use prior knowledge about gene regulatory networks [11]. One of the main sources of this prior knowledge is information about the general structure of gene regulatory networks. Research shows that these networks are sparse; in other words, the number of edges in them is small. There is also considerable evidence that the out-degree distribution in gene regulatory networks follows a power law [12,13], so these networks are scale-free in out-degree. The in-degree, by contrast, follows a Poisson distribution with a low mean [14,15,16].
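The practical difference between the two degree distributions can be seen by comparing their tails. The Python sketch below builds a truncated power-law distribution for the out-degree (output degree) and a Poisson distribution for the in-degree (input degree) and compares the probability of a node having degree 10 or more. The exponent 2.5, the mean 1.5 and the cutoff 50 are hypothetical values chosen for illustration; they are not taken from the thesis or from the cited studies.

```python
from math import exp, factorial
from typing import Dict


def power_law_pmf(gamma: float, k_max: int) -> Dict[int, float]:
    """P(k) proportional to k**(-gamma) for k = 1..k_max, normalised to sum to 1."""
    weights = [k ** -gamma for k in range(1, k_max + 1)]
    total = sum(weights)
    return {k: w / total for k, w in zip(range(1, k_max + 1), weights)}


def poisson_pmf(mean: float, k_max: int) -> Dict[int, float]:
    """Poisson probabilities for k = 0..k_max (the tail beyond k_max is dropped)."""
    return {k: exp(-mean) * mean ** k / factorial(k) for k in range(k_max + 1)}


# Hypothetical parameters, for illustration only.
out_degree = power_law_pmf(gamma=2.5, k_max=50)
in_degree = poisson_pmf(mean=1.5, k_max=50)

tail_out = sum(p for k, p in out_degree.items() if k >= 10)
tail_in = sum(p for k, p in in_degree.items() if k >= 10)

print(f"P(out-degree >= 10) = {tail_out:.2e}")
print(f"P(in-degree  >= 10) = {tail_in:.2e}")
```

With these parameters the power-law tail is far heavier than the Poisson tail: a few genes regulating many targets are plausible, whereas a gene with many regulators is very unlikely. This asymmetry is the kind of structural prior knowledge the paragraph above refers to.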
Contents & References of Inference of gene regulatory networks from Microarray time series data by dynamic Bayesian networks

List:

Chapter One: Introduction

The need to do the work

Overview of the thesis chapters

Chapter Two: Research background

2-1- Introduction

2-2- Biological basics

2-2-1- Genes

2-2-2- Gene expression

2-2-3- Gene regulatory networks

2-3- Methods of learning gene regulatory networks

2-3-1- Methods based on clustering

2-3-2- Methods based on regression

2-3-3- Methods based on mutual information

2-3-4- Method

2-3-5- Methods based on system theory

2-3-6- Bayesian methods

Chapter Three: Proposed method

3-1- Introduction

3-2- Dynamic Bayesian networks

3-3- Learning dynamic Bayesian networks

3-3-1- Bayesian scoring methods

3-3-1-1- Scoring by K2 method

3-3-1-2- Scoring by BDe method

3-3-2- Scoring methods based on information theory

3-3-2-1- Scoring by log-likelihood (LL) method

3-3-2-2- Scoring by BIC method

3-3-2-3- Scoring by AIC method

3-3-2-4- Scoring by MIT method

3-3-3- Time complexity of learning dynamic Bayesian networks

3-4- Random networks and scale-free networks

3-5- Proposed method

Chapter Four: Experimental results

4-1- Introduction

4-2- Scale-free network generation methods

4-3- Accuracy measurement methods for inferred networks

4-4- The first experiment: Using the full search method

4-5- The second experiment: A closer look at the performance of the proposed method

4-6- The third experiment: Using the greedy search

4-7- The fourth experiment: Recovering a part of the gene regulatory network in yeast

4-8- The fifth experiment: The performance of the proposed method in recovering random networks

Chapter Five: Summary

5-1- Conclusion

5-2- Suggestions for future work

Research sources

English

Source: English

[1] Sima, Chao, Jianping Hua, and Sungwon Jung. "Inference of gene regulatory networks using time-series data: a survey." Current genomics 10, no. 6 (2009): 416.

[2] Pham, Tuan D., Christine Wells, and Denis Crane. "Analysis of microarray gene expression data." Current bioinformatics 1, no. 1 (2006): 37-53.

[3] Akutsu, Tatsuya, Satoru Miyano, and Satoru Kuhara. "Identification of genetic networks from a small number of gene expression patterns under the Boolean network model." In Pacific Symposium on Biocomputing, vol. 4, pp. 17-28. Maui, Hawaii: World Scientific, 1999.

[4] Shmulevich, Ilya, Edward R. Dougherty, Seungchan Kim, and Wei Zhang. "Probabilistic Boolean networks: a rule-based uncertainty model for gene regulatory networks." Bioinformatics 18, no. 2 (2002): 261-274.

[5] De Hoon, Michiel, Seiya Imoto, Kazuo Kobayashi, Naotake Ogasawara, and Satoru Miyano. "Inferring gene regulatory networks from time-ordered gene expression data of Bacillus subtilis using differential equations." In Biocomputing 2003: Proc. Pacific Symposium, vol. 8, pp. 17-28. 2002.

[6] Friedman, Nir, Michal Linial, Iftach Nachman, and Dana Pe'er. "Using Bayesian networks to analyze expression data." Journal of computational biology 7, no. 3-4 (2000): 601-620.

[7] Perrin, Bruno-Edouard, Liva Ralaivola, Aurelien Mazurie, Samuele Bottani, Jacques Mallet, and Florence d'Alche-Buc. "Gene networks inference using dynamic Bayesian networks." Bioinformatics 19, no. suppl 2 (2003): ii138-ii148.

[8] Zou, Min, and Suzanne D. Conzen. "A new dynamic Bayesian network (DBN) approach for identifying gene regulatory networks from time course microarray data." Bioinformatics 21, no. 1 (2005): 71-79.

[9] Kim, Sun Yong, Seiya Imoto, and Satoru Miyano. "Inferring gene networks from time series microarray data using dynamic Bayesian networks." Briefings in bioinformatics 4, no. 3 (2003): 228-235.

[10] Husmeier, Dirk. "Sensitivity and specificity of inferring genetic regulatory interactions from microarray experiments with dynamic Bayesian networks." Bioinformatics 19, no. 17 (2003): 2271-2282.

[11] Hecker, Michael, Sandro Lambeck, Susanne Toepfer, Eugene van Someren, and Reinhard Guthke. "Gene regulatory network inference: Data integration in dynamic models - A review." Biosystems 96 (2009): 86-103.

[12] Shaw, Sandy. "Evidence of scale-free topology and dynamics in gene regulatory networks." In Proceedings of the ISCA 12th International Conference on Intelligent and Adaptive Systems and Software Engineering, pp. 37-40. 2003.

[13] Featherstone, David E., and Kendal Broadie. "Wrestling with pleiotropy: genomic and topological analysis of the yeast gene expression network." Bioessays 24, no. 3 (2002): 267-274.

[14] Babu, M. Madan, Nicholas M. Luscombe, L. Aravind, Mark Gerstein, and Sarah A. Teichmann. "Structure and evolution of transcriptional regulatory networks." Current opinion in structural biology 14, no. 3 (2004): 283-291.

[15] Klemm, Konstantin, and Stefan Bornholdt. "Topology of biological networks and reliability of information processing." Proceedings of the National Academy of Sciences of the United States of America 102, no. 51 (2005): 18414-18419.
