Identifying overlapping entities in dynamic networks

Number of pages: 82 File Format: word File Code: 31077
Year: 2013 University Degree: Master's degree Category: Computer Engineering
  • Part of the Content
  • Contents & Resources
  • Summary of Identifying overlapping entities in dynamic networks

    Master's Thesis in Computer Engineering-Artificial Intelligence

    Abstract

    Identifying Overlapping Organizations in Dynamic Networks

    Many complex natural and social structures can be considered as networks[1]. Roads, internet sites, social networks, organizational communication, kinship relationships, electronic mail exchange, telephone calls and financial transactions are just a few examples of these networks. Today, network analysis is one of the most popular and widely used research branches in the world, and it has attracted the attention of researchers from many different fields, including computer science and social science. The results of these researches provide useful tools and information for use in various fields, including: communications, security and business. One of the most widely used topics in the field of network analysis is the identification of organizations [2] in the network. Each formation can be considered as a dense mass of vertices which are connected with other formations through few edges. For example, people with similar interests and tendencies in social networks, pages with related content in the web space, articles with similar fields in the database of articles, are all examples of organizations in different networks. In recent years, a lot of work has been done in the field of identifying formations, and many algorithms and tools have been presented, but efforts are still being made to provide better methods from different perspectives, including speed, accuracy, and scalability.

    In this thesis, two proposed methods, one to increase the efficiency of detecting overlapping formations in static networks and the other for dynamic networks, along with the results of numerous tests conducted to evaluate their efficiency, are presented.

    Vocabulary Key words: dynamic networks, social networks, overlapping formation detection, label propagation method. Chapter 1: Introduction. Also, our ability to infer and understand the surrounding environment depends on a network of billions of nerve cells [2] in our brain. These complex systems play very important roles in various aspects of our lives. Understanding, describing, predicting and controlling these systems are among our biggest challenges in the modern world.

    Usually behind each of these complex systems, there is a huge network that defines the interactions between the components of these systems. For example: chemical interactions inside the body of living organisms, interactions between nerve cells of the brain, friendship, kinship and social relationships, the global Internet network, financial transactions, power transmission and distribution lines, land, air and sea communication routes, are all part of the things that can be described as a network. It can also be said that networks are the heart of many revolutionary technologies of today. Search engines [3], virtual social networks [4], global computer networks, global telecommunications network and mobile phones are just a few of these examples.

    Despite the huge difference and diversity in the nature, size, application, behavior and various characteristics of these systems and networks, whether natural or man-made, certain and similar principles and rules can be observed among them. For example: the network of chemical reactions whose components are very small molecules, the World Wide Web [5] in which web pages are connected to each other by web links [6], social networks that consist of relationships between people and many other cases, all can be described with similar structures and rules, and this is considered a great advantage. Because all these different natural and artificial systems can be described by similar mathematical and modeling tools.

    Given that many of these systems have been known for years, such as biological structures and reactions, communication ways, social relations, and the like, as well as the knowledge of studying systems and networks, the question that may be raised is why the importance of this issue has been revealed only in the last few decades? The answer that can be given is that in the past there were no suitable tools to collect, maintain and process this information, but today with the significant development of technologies such as computers and digital communication networks, it has become possible to collect, combine, share and analyze this information with ease, speed and high accuracy and at low cost. use them to better understand complex systems. In general, four characteristics can be listed for this knowledge, which are briefly mentioned below (1):

    Interdisciplinary nature: According to the way of dealing with the problem in the field of studying networks, this knowledge is not limited to a specific branch of science and can be used in various sciences such as: social sciences, biology, computer, physics, chemistry, information, economics, security and many other cases. For example, the method that is used in the field of social sciences to identify people and groups affecting the society, may be used in computer networks to manage network traffic.

    Practism and focus on data: Unlike graph theory, which focuses more on the abstract and mathematical aspects of problems, this knowledge focuses more on the field of practical application and problem data. For this reason, the tools and methods presented in this field are tested on real data and problems to determine their capability and efficiency. Quantitative and mathematical expression: The study of networks uses mathematical tools and methods to describe and study better and more accurately. For example: graph theory, statistics and probabilities, data mining[8], information theory[9], control and statistical physics are among the sciences that are used in this field.

    Processing and calculations: Since most of the issues raised in this field include a huge amount of information, an important part of the work is directed to the design and application of methods that can handle the required heavy calculations. For this purpose, the design of algorithms, databases and data mining are part of the software tools that are used a lot.

    Applications of network knowledge

    The efficiency and impact of each branch of science, in addition to theoretical achievements, is also examined in the field of practical applications. In this section, we briefly mention some of the applications of network knowledge.

    Economic applications

    Many of the most successful companies of today provide services and technologies that are based on networks. For example, we can refer to Google [10], Facebook [11], Cisco [12] and Apple [13]. Google company performs huge activities in the field of world wide web map mining and many of its services and products are based on this network. Also, Facebook, as the world's largest virtual social network, together with other companies such as Twitter[14], annually earn a lot of income from advertising activities.

    Health applications

    The Human Genome Study Project[15] was completed in 2001 and was able to provide a list of all human genes (1). However, work continues on a detailed map that can show the interaction between genes and other cellular structures. From breaking down food in cells to sensing environmental changes, it is done based on this molecular network. Any disturbance in this network can cause human diseases. For this purpose, the researches of a branch of biology, under the title bio-network[16], are dedicated to this issue. Also, in this direction, other researches are being conducted in pharmaceutical science under the name of network pharmacy[17]. The main goal pursued in this field is to produce drugs that have few side effects in addition to treating diseases. Today, networks play an important role in new pharmaceutical developments and large investments are made in this field. Security Applications Terrorism [18] is one of the most important security issues of the current century that has involved the world community. Network thinking and its study are increasingly used by security institutions to disrupt the financial activities of terrorist organizations and identify people and their facilities. Although most of the studies and activities in this field are classified, sometimes cases of the success of these methods are published.

  • Contents & References of Identifying overlapping entities in dynamic networks

    List:

    Chapter One: Introduction.. 1

    Introduction.. 1

    Network knowledge.. 2

    Applications of network knowledge.. 3

    Economic applications.. 3

    Health applications.. 4

    Security applications.. 5

    Applications in accidents. General.. 6

    Applications in research on the brain.. 6

    Management applications.. 6

    Research applications.. 7

    Other applications.. 8

    History.. 9

    Basic concepts.. 10

    Motive for doing this thesis.. 13

    Overview of the chapters of the thesis.. 14

    Chapter Two: Background of the research.. 16

    Introduction.. 16

    Static networks and dynamic networks.. 17

    Non-overlapping formations and overlapping formations.. 18

    Problem definition.. 19

    Methods available for Detection of overlapping formations in static networks. 21

    Catch penetration method.. 21

    Graph extraction and edge classification method.. 22

    Local expansion and optimization method.. 23

    Fuzzy detection method.. 24

    Dynamic and agent-based algorithms method.. 25

    Other methods.. 26

    Comparison of methods Detection of overlapping formations in static networks. 26

    Data set.. 27

    Evaluation criteria.. 29

    Test results.. 30

    Analysis of results.. 37

    Identification of overlapping formations in dynamic networks.. 38

    Summary.. 38

    Chapter three: presentation of solutions and methods Proposal.. 42

    Introduction.. 42

    A closer look at the label propagation method.. 42

    Algorithm.. 43

    Time complexity analysis.. 45

    Improving the efficiency of the label propagation method.. 46

    Algorithm.. 46

    Algorithm based on label propagation for detection Overlapping formations in dynamic networks. 48

    Algorithm.. 48

    Chapter Four: Experiments and Results.. 52

    Introduction.. 52

    Improving the efficiency of label propagation method in static networks. 52

    Implementation of the basic method.. 52

    Implementation of the proposed method.. 53

    Data set.. 53

    Evaluation criteria.. 54

    Test results.. 54

    Analysis of results.. 57

    Analysis of time complexity.. 58

    Recognition of overlapping formations in dynamic networks.. 58

    Data set.. 59

    Evaluation criteria.. 60

    Test results.. 60

    Analysis of results.. 63

    Analysis of time complexity.. 64

    Chapter five: discussion and conclusion.. 66

    Conclusion.. 66

    Suggestions for future work.. 67

    Resources.. 69

    Source:

    1. The Sequence of the Human Genome. al., J.C. Venter et. s.l. : Science, 2001, Vol. 291.

    2. Ronfeldt, J. Arquilla and D. Networks and Netwars: The Future of Terror, Crime, and Militancy. Santa Monica, CA: s.n., 2001.

    3. Seasonal transmission potential and activity peaks of the new influenza A(H1N1): a Monte Carlo probability analysis based on human mobility. D. Balcan, H. Hu, B. Goncalves, P. Bajardi, C. Poletto, J. J. Ramasco, D. Paolotti, N. Perra, M. Tizzoni, W. Van den Broeck, V. Colizza, and A. Vespignani. s.l. : BMC Medicine, 2009, Vol. 7.

    4. Understanding the spreading patterns of mobile phone viruses. P. Wang, M. Gonzalez, C. A. Hidalgo, and A.-L. Barab?si. s.l. : Science, 2009, Vol. 324.

    5. Mining Face-to-Face Interaction Networks using Sociometric Badges: Predicting Productivity in an IT Configuration Task. L. Wu, B. N. Waber, S. Aral, E. Brynjolfsson, and A. Pentland. s.l. : http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1130251.

    6. Community detection in graphs. Fortunato, S. s.l. : PHYSICS REPORTS, 2010, Vol. 486.

    7. [Online] http://en.wikipedia.org/wiki/Graph_theory.

    8. Barab?si, A.-L. Network Science. 2012.

    9. Overlapping Community Detection in Networks: the State of the Art and Comparative Study. J. Xie, S. Kelley and B. K. Szymanski. s.l. : ACM Computing Surveys, 2013.

    10. Uncovering the overlapping community structure of complex networks in nature and society. Palla, G., Der´enyi, I., Farkas, I., and Vicsek, T. 2005, Nature.

    11. Weighted network modules. Farkas, I.,., ´Abel, D., Palla, G., and Vicsek, T. s.l. : New J. Phys., 2007, Vol. 180.

    12. Link communities reveal multiscale complexity in networks. Ahn, Y.-Y., Bagrow, J. P., and Lehmann, S. s.l. : Nature, 2010, Vol. 466.

    13. Line graphs of weighted networks for overlapping communities. Evans, T. and Lambiotte, R. s.l. : Eur. Phys. J., 2010, Vol. 256.

    14. Detecting the overlapping and hierarchical community structure of complex networks. Lancichinetti, A., Fortunato, S., and Kert´esz, J. 2009. Phys.

    15. Finding communities by clustering a graph into overlapping subgraphs. Baumes, J., Goldberg, M., Krishnamoorthy, M., Magdon-Ismail, M., and Preston, N. 2005. IADIS.

    16. Fuzzy overlapping communities in networks. Gregory, S. s.l. : J. Stat. Mech, 2011.

    17. Fuzzy communities and the concept of bridgeness in complex networks. Nepusz, T., Petr´oczi, A., N´egyessy, L., and Baz´o, F. 2008. Phys.

    18. Near linear time algorithm to detect community structures in large-scale networks. Raghavan, U. N., Albert, R., and Kumara, S. s.l. : Phys. Rev., 2007, Vol. 76.

    19. Finding overlapping communities in networks by label propagation. Gregory, S. 2010, Phys.

    20. SLPA: Uncovering overlapping communities in social networks via a speaker-listener interaction dynamic process. Xie, J., Szymanski, B. K., and Liu, X. 2011. ICDM Workshop.

    21. A game-theoretic framework to identify overlapping communities in social networks. Chen, W., Liu, Z., Sun, X., and Wang, Y. 2010. Data Min. Knowl. Discov.

    22. Parallel community detection on large networks with proximity dynamics. Zhang, Y., Wang, J., Wang, Y., and Zhou, L. 2009. SIGKDD Conf.

    23. Optics: ordering points to identify the clustering structure. Ankerst, M., Breunig, M. M., Kriegel, H.-P., and Sander, J. 1999. SIGKDD Conf.

    24. Community detection algorithms: a comparative analysis. Lancichinetti, A. and Fortunato, S. 2009, Phys.

    25. A fast and reasonable method for community detection with adjustable extent of overlapping. Wu, Z., Lin, Y., Wan, H., and Tian, ??S. 2010. ISKE Conf.

    26. Finding statistically significant communities in networks. Lancichinetti, A., Radicchi, F., Ramasco, J. J., and Fortunato, S. 2011. PLoS ONE.

    27. Detecting highly overlapping community structure by greedy clique expansion. Lee, C., Reid, F., McDaid, A., and Hurley, N. 2010. SNAKDD Workshop.

    28. Detecting highly overlapping communities with model-based overlapping seed expansion. McDaid, A. and Hurley, N. 2010. ASONAM Conf.

Identifying overlapping entities in dynamic networks