Identifying hidden organizations based on links and content

Number of pages: 82 File Format: word File Code: 31078
Year: Not Specified University Degree: Master's degree Category: Computer Engineering
  • Summary of Identifying hidden organizations based on links and content

    Master's Thesis in Artificial Intelligence

    Abstract

    Today, social networks such as Facebook have become very popular because they allow people all over the world to communicate with their friends without physical contact, leave them messages, and express their opinions on various topics. Identifying organizations in social networks is useful in many fields, so the topic attracts researchers from many disciplines. Previous studies used only the structural information and links in the network and neglected other useful information available in it. Yet many social networks contain very useful user-generated data, such as the content of the texts each user produces. Placing this information alongside the network link structure makes it possible to interpret the interactions and communications between users. This study uses that information to show that users with close links fall within a similar field of work. More specifically, it presents a model for discovering organizations that first tries to identify organizations from the network link structure using a Bayesian approach. Then, using text mining tools, a user whose texts closely resemble the titles of the documents attributed to an organization is transferred to that organization. As a result, people in the same organization are also in a similar field of work. The results indicate that the proposed method is able to discover semantically meaningful organizations.

    Key words: social networks, organization, identification of organizations, text mining.

    First chapter: Introduction

    1-1- Social networks

    Human-computer interaction (HCI) has been of interest since the creation of the first computers and covers the study, planning and design of the relationship between users and computers. HCI is usually described as the intersection of computer science, behavioral science, design science and several other fields. The term was first proposed by Card and his colleagues in the book "The Psychology of Human-Computer Interaction", and it implies that the computer has countless uses and that these take place in an open-ended exchange between the computer and the user [1].

    Experts in this field initially sought to produce hardware with proper ergonomics. During the 1980s, the main focus was on producing user-friendly software, but a new perspective soon emerged in the 1990s, in which the computer was seen as a tool for creating human interactions. Under this approach, Internet social networks became a means of interaction between people in virtual space and grew very important [2]. Today, owing to the widespread growth of the Internet and of information and communication technologies, we are witnessing the formation of a virtual space alongside the real world, which has changed traditional patterns. This space has features such as being beyond time and place, not being limited by laws, freedom from physical and gender identity, and having its own cultural, economic, and political spheres. Today, virtual social networks play a very important role in creating this virtual space. Alongside their positive features, these spaces can cause considerable psychological and political harm to a society. Some researchers believe that social networks increase sociability, while others disagree and believe that today's social networks reduce communication with family [3].

    According to the definition given on Wikipedia, social networks are social structures made up of actors who are connected to each other through a specific type of interdependency, such as friendship, kinship, business, inspiration, ideas, web links, disease transmission (epidemiology), airline routes, or common interests. In other words, a social network is a set of actors that are connected to each other in some way. In recent years, as the use of digital media for communication between people has expanded, the concept of social networks has entered the computing world, and because of the large number of users in these networks, their analysis has become a topic of interest in most fields. A social network can normally be represented as a graph in which the nodes correspond to the users of the social network and the edges indicate the relationships between the actors. Depending on the structure of the social network and on whether connections are one-way or two-way, the corresponding graph can be directed or undirected. Also, if the strength of the connections between people is not uniform, the network corresponds to a weighted graph, where the weight of each edge reflects the strength of the connection [4].
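The graph representation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from the thesis: the function name `build_graph`, the user names, and the interaction weights are all hypothetical, and self-loops are assumed not to occur.

```python
from collections import defaultdict


def build_graph(interactions, directed=False):
    """Build a weighted adjacency map from (user_a, user_b, weight) triples.

    Hypothetical helper for illustration: nodes are users, edges are
    relationships, and edge weights are connection strengths.
    """
    adjacency = defaultdict(dict)
    for source, target, weight in interactions:
        # Accumulate the weight if the same pair appears more than once.
        adjacency[source][target] = adjacency[source].get(target, 0.0) + weight
        if directed:
            # One-way connection: make sure the target still exists as a node.
            adjacency.setdefault(target, {})
        else:
            # Two-way connection: mirror the edge in the other direction.
            adjacency[target][source] = adjacency[target].get(source, 0.0) + weight
    return dict(adjacency)


# Made-up interactions: (user, user, strength of the connection).
interactions = [
    ("alice", "bob", 3.0),
    ("bob", "carol", 1.0),
    ("alice", "bob", 2.0),
]

undirected = build_graph(interactions)
directed = build_graph(interactions, directed=True)
```

In the undirected graph every edge is stored in both directions, whereas in the directed graph an edge appears only under the user it starts from. A nested dictionary keeps the sketch dependency-free; a graph library would be the more practical choice for real network data.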

    1-2- Division of social networks

    Social networks are divided into two categories: virtual networks and non-virtual networks. Non-virtual networks are run by groups of interconnected users in social environments. Virtual social networks are collections of websites that let their users communicate regardless of time and place. By using a search engine and adding features such as audio and video transmission, chat, e-mail and so on, these websites let users share their interests, thoughts and activities with hundreds or even thousands of people around the world within a second.

    Weblogs, Facebook, Twitter and YouTube are examples of virtual social networks [5].

    1-3- The importance of social networks

    Today, social networks attract great interest and are important for many reasons; we explain the two main ones:

    The growth of social networks and of their number of users

    Although there are no reliable statistics on the number of users of online social networks [6], commercial research shows that the membership of these networks is increasing worldwide. This has encouraged many companies to invest in this sector. YouTube, an online social network that lets its users upload and watch short videos, has announced on its statistics site that it currently has more than 800 million unique visitors per month. Millions of members join this network every day. As of 2011, the network had been localized in 43 countries and was accessible in 60 languages [7].

    The change in the structure of social communication brought by the arrival and expansion of social networks

    One effect of this change is that much important and popular news is now published on social networks rather than through traditional media such as newspapers and television.

    The widespread effects of social networks on the formation of a new structure in the relationships between people have led many researchers, sociologists and even politicians to regard social networks as one of the most important tools for influencing public opinion [3]. Social network analysis can be used in many applications, including social network management, analysis of market trends, identification of influential people, and so on. In recent years, business requirements have drawn a great deal of academic attention to the analysis of social networks. Today, this powerful tool is of interest not only to information technology specialists but also to researchers in other fields such as educational sciences, biology, communication sciences, and economics, where social network analysis is used as a key technique [5].

    Different criteria and software are used for network analysis. Social network analysis software is used to identify, visualize and simulate vertices and edges. Network analysis tools allow researchers to examine networks of different sizes. By providing tools for applying mathematical and statistical procedures to the network model, and by representing social networks visually, this software greatly helps in understanding and analyzing the results.

    1-5- Networks and their characteristics

    Social networks [8], technical networks (such as the Internet [9]) and biological networks (such as neural networks [10]) are examples of networks.

  • Contents & References of Identifying hidden organizations based on links and content

    List:

    Chapter 1- Introduction

    1-1- Social networks

    1-2- Division of social networks

    1-3- The importance of social networks

    1-4- Analysis of social networks

    1-5- Networks and their characteristics

    1-6- Organizations in social networks

    1-7- The importance of identifying organizations

    1-8- The motivation for this thesis

    1-9- An overview of the thesis chapters

    Chapter 2- Overview of related work

    2-1- Introduction

    2-2- Presented methods

    2-3- Link-based methods

    2-3-1- Optimizing a global objective

    2-3-2- Without optimizing any criterion

    2-3-3- Model-based methods

    2-4- Content-based methods

    2-4-1- CUT method

    2-4-2- LTCA method

    Chapter 3- Presentation of proposed solutions and methods

    3-1- Introduction

    3-2- SBM method

    3-3- LDA method

    3-4- Proposed method

    3-4-1- CDBLC method

    3-5- Conclusion

    Chapter 4- Results

    4-1- Introduction

    4-2- Datasets

    4-2-1- Cora dataset

    4-2-2- Twitter dataset

    4-3- Evaluation criteria

    4-3-1- Modularity criterion

    4-3-2- Normalized Mutual Information criterion

    4-3-3- Perplexity criterion

    4-4- Results and analysis

    4-4-1- Cora dataset

    Chapter 5- Discussion and conclusion

    5-1- Conclusion

    5-2- Suggestions for future work

    List of sources

     

     

    Source:

    [1] B. A. Myers, “A brief history of human-computer interaction technology,” interactions, vol. 5, no. 2, pp. 44–54, 1998.

    [2] R. Harper, T. Rodden, Y. Rogers, and A. Sellen, “Being Human: HCI in the Year 2020,” 2008.

    [3] N. B. Ellison, “Social network sites: Definition, history, and scholarship,” J. Comput. Commun., vol. 13, no. 1, pp. 210–230, 2007.

    [4] S. Fortunato, “Community detection in graphs,” Phys. Rep., vol. 486, no. 3, pp. 75–174, 2010.

    [5] H. Zhang, B. Qiu, C. L. Giles, H. C. Foley, and J. Yen, “An LDA-based community structure discovery approach for large-scale social networks,” in Intelligence and Security Informatics, 2007 IEEE, 2007, pp. 200–207.

    [6] A. Celisse, J.-J. Daudin, and L. Pierre, “Consistency of maximum-likelihood and variational estimators in the stochastic block model,” Electron. J. Stat., vol. 6, pp. 1847–1899, 2012.

    [7] X. Cheng, C. Dale, and J. Liu, “Statistics and social network of youtube videos,” in Quality of Service, 2008. IWQoS 2008. 16th International Workshop on, 2008, pp. 229–238.

    [8] S. Wasserman and K. Faust, “Social network analysis in the social and behavioral sciences,” Soc. Netw. Anal. Methods Appl., vol. 1994, pp. 1–27, 1994.

    [9] M. Faloutsos, P. Faloutsos, and C. Faloutsos, “On power-law relationships of the internet topology,” in ACM SIGCOMM Computer Communication Review, 1999, vol. 29, no. 4, pp. 251–262.

    [10] D. J. Watts and S. H. Strogatz, “Collective dynamics of 'small-world' networks,” Nature, vol. 393, no. 6684, pp. 440–442, 1998.

    [11] S. Milgram, “The small world problem,” Psychol. Today, vol. 2, no. 1, pp. 60–67, 1967.

    [12] J. Guare, Six degrees of separation: A play. Random House LLC, 1990.

    [13] R. Albert, H. Jeong, and A.-L. Barabási, “Internet: Diameter of the world-wide web,” Nature, vol. 401, no. 6749, pp. 130–131, 1999.

    [14] M. E. J. Newman, “The structure and function of complex networks,” SIAM Rev., vol. 45, no. 2, pp. 167–256, 2003.

    [15] M. E. J. Newman, S. H. Strogatz, and D. J. Watts, “Random graphs with arbitrary degree distributions and their applications,” Phys. Rev. E, vol. 64, no. 2, p. 26118, 2001.

    [16] M. E. J. Newman, “Detecting community structure in networks,” Eur. Phys. J. B-Condensed Matter Complex Syst., vol. 38, no. 2, pp. 321–330, 2004.

    [17] L. Danon, A. Diaz-Guilera, J. Duch, and A. Arenas, “Comparing community structure identification,” J. Stat. Mech. Theory Exp., vol. 2005, no. 09, p. P09008, 2005.

    [18] S. E. Schaeffer, “Graph clustering,” Comput. Sci. Rev., vol. 1, no. 1, pp. 27–64, 2007.

    [19] J. Leskovec, D. Huttenlocher, and J. Kleinberg, “Predicting positive and negative links in online social networks,” in Proceedings of the 19th international conference on World wide web, 2010, pp. 641–650.

    [20] T. Schank and D. Wagner, Approximating clustering-coefficient and transitivity. Universität Karlsruhe, Fakultät für Informatik, 2004.

    [21] M. Girvan and M. E. J. Newman, “Community structure in social and biological networks,” Proc. Natl. Acad. Sci., vol. 99, no. 12, pp. 7821–7826, 2002.

    [22] P.-O. Fjällström, “Algorithms for graph partitioning: A survey,” Linköping Electron. Artic. Comput. Inf. Sci., vol. 3, no. 10, 1998.

    [23] M. E. J. Newman, “Modularity and community structure in networks,” Proc. Natl. Acad. Sci., vol. 103, no. 23, pp. 8577–8582, 2006.

    [24] S. Zhang, R.-S. Wang, and X.-S. Zhang, “Identification of overlapping community structure in complex networks using fuzzy c-means clustering,” Phys. A Stat. Mech. its Appl., vol. 374, no. 1, pp. 483–490, 2007.

    [25] M. E. J. Newman and M. Girvan, “Finding and evaluating community structure in networks,” Phys. Rev. E, vol. 69, no. 2, p. 26113, 2004.

    [26] U. Brandes and T. Erlebach, Network analysis: methodological foundations, vol. 3418. Springer, 2005.

    [27] B. W. Kernighan and S. Lin, “An efficient heuristic procedure for partitioning graphs,” Bell Syst. Tech. J., vol. 49, no. 2, pp. 291–307, 1970.

    [28] M. E. J. Newman, “Spectral methods for community detection and graph partitioning,” Phys. Rev. E, vol. 88, no. 4, p. 42822, 2013.

    [29] G. Palla, I. Derényi, I. Farkas, and T. Vicsek, “Uncovering the overlapping community structure of complex networks in nature and society,” Nature, vol. 435, no. 7043, pp. 814–818, 2005.

    [30] H.-W. Shen, X.-Q. Cheng, and J.-F. Guo, “Exploring the structural regularities in networks,” Phys. Rev. E, vol. 84, no. 5, p. 56111, 2011.

    [31] Z. Yin, L. Cao, Q. Gu, and J. Han, “Latent community topic analysis: Integration of community discovery with topic modeling,” ACM Trans. Intell. Syst. Technol., vol. 3, no. 4, p. 63, 2012.

    [32] T. Yang, Y. Chi, S. Zhu, Y. Gong, and R. Jin, “Detecting communities and their evolutions in dynamic social networks—a Bayesian approach,” Mach. Learn., vol. 82, no. 2, pp. 157–189, 2011.

    [33] E. M. Airoldi, D. M. Blei, S. E. Fienberg, E. P. Xing, and T. Jaakkola, “Mixed membership stochastic block models for relational data with application to protein-protein interactions,” in Proceedings of the international biometrics society annual meeting, 2006, p. I5.

    [34] P. W. Holland, K. B. Laskey, and S. Leinhardt, “Stochastic block models: First steps,” Soc. Networks, vol. 5, no. 2, pp. 109–137, 1983.

    [35] J. M. Hofman and C. H. Wiggins, “Bayesian approach to network modularity,” Phys. Rev. Lett., vol. 100, no. 25, p. 258701, 2008.

    [36] D. M. Blei, A. Y. Ng, and M. I. Jordan, “Latent dirichlet allocation,” J. Mach. Learn. Res., vol. 3, pp. 993–1022, 2003.

    [37] D. Li, B. He, Y. Ding, J. Tang, C. Sugimoto, Z. Qin, E. Yan, J. Li, and T. Dong, “Community-based topic modeling for social tagging,” in Proceedings of the 19th ACM international conference on Information and knowledge management, 2010, pp. 1565–1568.

    [38] S. Wasserman, Social network analysis: Methods and applications, vol. 8. Cambridge university press, 1994.

    [39] M. E. J. Newman, “The mathematics of networks,” New Palgrave Encycl. Econ., vol. 2, pp. 1–12, 2008.

    [40] Y. Gong and W.
