
Development of web mining techniques in order to personalize information in search engines

Number of pages: 190 File Format: word File Code: 31060
Year: 2014 University Degree: Master's degree Category: Computer Engineering

Tags/Keywords: Personalization of information - Personalization of information in search engines - Search engine - Search engine optimization - Snect - Web mining


Summary of Development of web mining techniques in order to personalize information in search engines

Computer Software Engineering Master's Thesis (M.Sc)

Abstract

The dynamic nature of the World Wide Web and its growing size have made accurate information retrieval difficult. Incorrect answers returned by search engines, especially for query terms with multiple meanings, have left web users who need accurate answers dissatisfied. Today, search engines try to work out what users are asking for by studying their search history, or even by involving users in the search process to clarify what they really need. This process is part of search engines' efforts toward personalization.

One well-defined and well-built personalized search engine is Snect [1], which relies on user participation in the personalization process. Building on the Snect personalization algorithm, this thesis presents the architecture of a new personalized search engine called PSEFiL. Through user intervention and link filtering, PSEFiL provides answers with minimal or no subject deviation in order to enrich the answer set. The answer set is also robust, because every link in the result set is either ranked highly by other search engines or shows minimal subject deviation after a careful manual scanning process. In addition, each link is clearly categorized under one of the available meanings of the query phrase. One goal of PSEFiL is to prepare and deliver accurate answers rather than a larger set of links whose content may be less accurate or inaccurate.

Keywords

Search engine, search engine optimization, search engine personalization, web structure mining, web content mining

Chapter 1

Overview

The web is a vast, diverse, and dynamic environment in which many users choose to publish their documents. Given the sheer volume of information and the development of information systems, data has become one of the most important resources of organizations. In recent years, therefore, the information society and its users have had a growing need for methods and techniques for efficient access to data, data sharing, and extraction of information from data. The importance of effectively managing and classifying various types of data, so that general users as well as academic staff [2] can use and analyze them efficiently, is evident to everyone. Meanwhile, the nature of the web poses many challenges that make data difficult to categorize and manage. These include the difficulty of finding required information on the web due to the low analytical accuracy of search engines, the lack of information privacy, the long response time perceived by the user, the user's dissatisfaction with the quality of the received response, the variety of data available on the web, and so on.

In a search engine [3], the user enters a keyword, and the search module searches its database and displays sites related to the user's topic. When the user submits a request to a search engine, the response is not limited to a list of results: most search engines also offer additional features that can be very useful in leading the user to what he is really looking for.

Different methods are used to retrieve information. They are mainly based on content and structure, and they use different algorithms for this purpose. Studies show that queries are short and varied, and that each user may intend a different meaning with the same query. The results presented are therefore not always what the user expects: users have different tastes, yet the search engine offers the same results to all of them. If users' preferences can be used in the search, more satisfactory results can be obtained; in such a structure, two users receive different results for the same query. One of the prominent and popular topics in information retrieval is recognizing the user's behavior [4] and using his history of viewed web pages so that the search engine's results are as close as possible to the user's tastes and lead to greater user satisfaction. Personalization [5] of the search engine to improve the user's search results is an open research field that has attracted many researchers and continues to yield valuable results.
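The idea that two users can receive different results for the same query can be illustrated with a minimal sketch. It assumes a simple bag-of-words profile built from pages the user has viewed and re-ranks results by cosine similarity to that profile. This is an illustrative technique chosen for the example, not the method of Snect or PSEFiL, and all names (`build_profile`, `personalize`, the example URLs) are hypothetical.

```python
from collections import Counter
import math


def term_vector(text):
    """Bag-of-words term counts for a lowercased, whitespace-split text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_profile(visited_pages):
    """Aggregate the terms of previously viewed pages into one profile vector."""
    profile = Counter()
    for page in visited_pages:
        profile.update(term_vector(page))
    return profile


def personalize(results, profile):
    """Re-rank results by similarity to the profile.

    sorted() is stable, so results with equal scores keep the engine's order.
    """
    return sorted(
        results,
        key=lambda r: cosine(term_vector(r["snippet"]), profile),
        reverse=True,
    )


# Hypothetical results for the ambiguous query "jaguar".
results = [
    {"url": "cars.example/jaguar", "snippet": "jaguar luxury car dealer prices"},
    {"url": "zoo.example/jaguar", "snippet": "jaguar big cat habitat rainforest"},
]

# Two hypothetical users with different browsing histories.
profile_a = build_profile(["car reviews engine prices", "used car dealer"])
profile_b = build_profile(["rainforest wildlife", "big cat conservation habitat"])

top_a = personalize(results, profile_a)[0]["url"]
top_b = personalize(results, profile_b)[0]["url"]
```

With these inputs, the first user sees the car page first and the second user sees the animal page first, even though both submitted the same query.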

Web mining [6], a specialized sub-branch of data mining, is the process of discovering unknown and useful information and knowledge from web data. It is used in various fields, and in recent years, along with the development of the web, it has been the focus of many researchers. Web mining does not simply mean applying data mining techniques [7] to the data stored in web pages; its algorithms are modified to meet users' demands of the web in terms of response time and analytical power.

In this thesis, the process of web mining, the personalization of search engines, and the methods and tools used in them are described first. Then, by combining structure mining and content mining and by examining the Snect search engine, the personalization of the search engine is addressed in order to achieve better results.

1-2 Statement of the problem and its importance

The expansion of the World Wide Web produces such a large amount of data that accessing it effectively will be impossible unless the data is properly organized and managed. The use of web mining techniques on the World Wide Web is therefore currently the focus of many researchers. Web mining is the process of discovering unknown information and knowledge from the data available on the web. It has turned the Internet into a practical environment in which users can find the information they need faster and more easily. This technique includes the discovery and analysis of data, documents, and multimedia from the web environment. Web mining uses the details and contents of documents and the structure of hyperlinks to give the user the information he needs. Data mining is performed offline, whereas web mining is performed online. Web mining turns data into knowledge through four steps: retrieving the desired documents from the web; selecting and pre-processing information; generalization, in which common patterns in one or more web sites are discovered automatically; and analysis, in which the patterns obtained in the previous step are validated and interpreted [41].
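The four web mining steps (retrieval, selection and pre-processing, generalization, analysis) can be sketched as a small pipeline. This is a toy illustration under stated assumptions: documents are held in memory instead of being crawled, the "pattern" is simply a term that occurs in many documents, and validation is a support threshold. The function names, the stopword list, and the threshold are all hypothetical choices made for this example.

```python
import re
from collections import Counter

# Assumed minimal stopword list, for illustration only.
STOPWORDS = {"the", "a", "of", "and", "in", "to", "is"}


def retrieve(corpus, query):
    """Step 1: retrieve documents that mention the query term."""
    return [doc for doc in corpus if query.lower() in doc.lower()]


def preprocess(doc):
    """Step 2: strip markup, lowercase, tokenize, and drop stopwords."""
    text = re.sub(r"<[^>]+>", " ", doc)
    tokens = re.findall(r"[a-z]+", text.lower())
    return {t for t in tokens if t not in STOPWORDS}


def generalize(token_sets):
    """Step 3: count in how many documents each term occurs."""
    doc_freq = Counter()
    for tokens in token_sets:
        doc_freq.update(tokens)
    return doc_freq


def analyze(doc_freq, n_docs, min_support=0.5):
    """Step 4: keep only terms whose document share reaches min_support."""
    return sorted(t for t, c in doc_freq.items() if c / n_docs >= min_support)


def mine(corpus, query, min_support=0.5):
    """Run the four steps in order and return the validated patterns."""
    docs = retrieve(corpus, query)
    if not docs:
        return []
    token_sets = [preprocess(d) for d in docs]
    return analyze(generalize(token_sets), len(docs), min_support)


corpus = [
    "<p>Web mining of the web graph</p>",
    "<b>Web</b> usage mining",
    "Cooking recipes",
]
patterns = mine(corpus, "web", min_support=1.0)
```

Here `patterns` holds the terms shared by every retrieved document. A real system would replace each stage with something far richer, but the flow of data between the stages stays the same.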

Web mining methods are divided into three categories based on the type of data explored:

Web content mining [8]: The process of extracting useful information from the content of web documents. This content can include text, image, video, sound, or structured records such as lists and tables. Among the related algorithms are decision trees and neural networks.

Web structure mining [9]: The web can be represented as a graph in which the nodes are documents and the edges are the links between them. Web structure mining is the process of extracting structural information from the web.

Web usage mining [10]: The application of data mining techniques to discover web usage patterns in order to better understand and meet users' needs. In effect, it is a method for predicting user behavior during interaction with the web. Web usage mining consists of pre-processing, pattern discovery, and pattern analysis [39,41].

A search engine searches in a document or database. On the Internet, the term refers to a web-based program that searches for keywords in files; some search engines also search World Wide Web documents, newsgroups, and FTP archives [11] [55].
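The graph view of the web described under web structure mining can be made concrete with a short sketch. It assumes the link graph is given as a dictionary that maps each page to the pages it links to, and it extracts two kinds of structural information: the number of incoming links per page, and a basic PageRank score computed by power iteration. PageRank is used here only as a familiar example of structure mining; the page names and parameter values are hypothetical.

```python
def in_degree(graph):
    """Count incoming links for every page in the link graph."""
    counts = {page: 0 for page in graph}
    for targets in graph.values():
        for target in targets:
            counts[target] = counts.get(target, 0) + 1
    return counts


def pagerank(graph, damping=0.85, iterations=50):
    """Basic PageRank by power iteration over an adjacency dictionary."""
    pages = set(graph) | {t for targets in graph.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page receives the same small "random jump" share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = graph.get(page, [])
            if targets:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page site: nodes are documents, edges are hyperlinks.
links = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}

incoming = in_degree(links)
ranks = pagerank(links)
best_page = max(ranks, key=ranks.get)
```

In this small graph, page `c` is linked from three other pages and also ends up with the highest score, which shows how link structure alone can indicate the relative importance of documents.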

Different methods are used to retrieve information. They are mainly based on content and structure, and they use different algorithms for this purpose. Studies show that queries are short and varied, and that each user may intend a different meaning with the same query. The results presented are therefore not always what the user expects: users have different tastes, yet the search engine provides the same results to all of them. If users' tastes can be used in the search, more satisfactory results can be obtained. This thesis investigates methods of personalizing search engines using web mining methods [2].

The importance and necessity of conducting research

In recent years, the growth of the World Wide Web has been greater than expected and the remarkable variety of web applications has made the retrieval of useful content a difficult process.
Contents & References of Development of web mining techniques in order to personalize information in search engines

List:

Abstract
Chapter 1 (General)
Introduction
Statement of the problem and its importance
2-2 Web mining
2-3 Historical evolution of web mining
2-4 Problems of users in using the web
2-5 Similarities and differences between web mining and data mining
2-6 Web mining algorithms
2-7 Classification of web mining
2-7-1 Web content mining
2-7-1-1 Web content mining views
2-7-1-2 Web content mining data
2-7-1-3 Approaches and techniques of web content mining
2-7-1-4 Types of web content mining
2-7-2 Web structure mining
2-7-2-1 Web structure mining categories based on structural data type
2-7-2-2 Web structure representation models
2-7-2-3 Web structure analysis applications
2-7-3 Web usage mining
2-7-3-1 Phases of web usage mining
2-7-3-2 Data types of web usage mining
2-9 Challenges of web mining
2-10 Search engine
2-11 History of search engines
2-12 Search engines in terms of financial support and manpower
2-12-1 Experimental search engines
2-12-2 Commercial search engines
2-13 General architecture of search engines and their operation
2-13-1 Inside crawler
2-13-2 Control inside crawler
2-13-3 Page storage
2-13-4 Index module
2-13-5 Collection analysis module
2-13-6 Utility index
2-13-7 Query engine
2-13-8 Ranking module
2-14 Importance of search engines
2-15 Problems of search engines in providing results
2-16 Search engine optimization
2-17 The purpose of SEO
2-18 The advantage of website optimization for search engines
2-19 Search engine optimization process
2-20 Results
Chapter 3 (Personalization of search engines)
3-1 Introduction
3-2 The reason for search engine personalization
Definition of personalization
Personalization steps
3-4-1 User recognition
3-4-1-1 Methods to help users search the web
3-4-1-2 Web-ready code clustering
3-4-1-2-1 Flat clustering
3-4-1-2-1-1 Single words and flat clustering
3-4-1-2-1-2 Sentences and flat clustering
3-4-1-2-2 Hierarchical clustering
3-4-1-2-2-1 Single words and hierarchical clustering
3-4-1-2-2-2 Sentences and hierarchical clustering
3-4-1-3 Introduction of Snect
3-4-1-4 Description of Snect architecture
3-4-1-4-1 Sentence selection and ranking
3-4-1-4-2 Hierarchical clustering
3-4-1-4-3 Personalization of search results
3-4-1-5 Browsing hierarchy documents to extract information
3-4-1-6 Hierarchy document review to select results
3-4-1-7 Query modification
3-4-1-8 Personalized ranking
3-4-1-9 Personalized web mediation
3-4-1-10 Experimental results
3-4-1-10-1 User surveys
3-4-1-10-2 Snect data collection and anecdotal evidence
3-4-1-10-3 Snect evaluation
3-4-1-10-3-1 Advantages of using DMOZ
3-4-1-10-3-2 Advantages of using strong text index
3-4-1-10-3-3 Advantages of using multiple engines
3-4-1-10-3-4 Advantages of using spaced sentences as folder tags
3-4-1-10-3-5 Number of web-ready codes available in
3-4-2 User modeling
3-4-2-2-1-1-1 Personal retrieval model
3-4-2-2-1-1-2 Personal presentation style
3-4-2-2-1-1-3 Personal interest topic
3-4-2-2-1-2 System implementation
3-4-2-2-1-2-1 Ranking
3-4-2-2-1-2-2 Hierarchical classification of retrieved web pages
3-4-2-2-1-3 User study
3-4-2-2-1-3-1 Test
3-4-2-2-1-3-2 Test 2
3-4-2-2-3 Personalization of page ranking algorithm
3-4-2-2-4 LTIL algorithm
3-4-2-2-5 Method IA
3-4-3 Implementation of personalization system
3-4-3-1 Deterministic method
3-4-3-2 Fuzzy method
3-4-3-3 Personalization of search engines using fuzzy conceptual networks and data mining tools
3-4-3-3-1 Background
3-4-3-3-2 Proposed method
3-4-3-3-3 System evaluation and review of the obtained results
3-5 Conclusion
Chapter 4 (Proposed model for search engine personalization and results obtained from experiments)
4-1 Introduction
4-2 Description of experiments and problem analysis
4-3 Conclusion
Chapter 5 (Search engine user interface)
5-4 Conclusion
Chapter 6 (Conclusion)
6-1 Introduction
6-2 Review of previous chapters
6-3 PSEFiL personalized search engine
6-4 Conclusion
6-5 Proposals and future studies
Articles extracted from the thesis
List of sources
English abstract

Source:

Persian sources

[1] Arzanian, B., Moradi Dolatabadi, P., Akhlikian, F., 2018, "Personalization of search engines using fuzzy conceptual networks and data mining tools", 3rd Data Mining Conference, pp. 1-6.

[2] Bostan, S., Qasimzadeh, M., 2013, "A review of search engine personalization algorithms using users' interests", Khavaran Institute of Higher Education, pp. 1-7.

[3] Saniei Abadeh, M., Mahmoudi, S., Taher Paror, M., 2013, "Applied Data Mining", Niaz Danesh Publications, Chapter 1, pp. 19-42.

[4] Kamijani, A., 1381, "Indexing structure in web search engines", Journal of Information Processing and Management, Volume 17, No. 3 and 4, p. 44.

[5] Melkian, A., 1358, "Principles of Internet Engineering", Nass Publications, pp. 482-487.

[6] Yaqoubi, M., Mohammadzadeh, M., 1390, "Review on the personalization of search engine results with intelligent methods", First Regional Conference on Modern Approaches in Computer Engineering and Information Technology, pp. 1-6.


Cluster optimization using evolutionary algorithms for web personalization

Number of pages: 79 Category: Computer Engineering

Master's Thesis Field: Computer Engineering Major: Software Abstract Information overload is a major problem in the current web. To deal with this problem, web personalization systems have been provided that adapt the content and services of a website to people based on their interests and browsing behavior. A fundamental component of any web personalization system is ...

Optimizing the link importance detection method in the link database and its application in the architecture of search engines

Number of pages: 119 Category: Computer Engineering

Master's Thesis of Computer-Software Engineering (M.Sc) Abstract In the age of information, the web has become one of the most powerful and fastest means of communication and interaction between people. Search engines as web applications automatically navigate the web and receive a set of available documents. The process of receiving, storing, classifying and indexing is done ...

Creating a recommender system on the web using user profiles and machine learning methods

Number of pages: 85 Category: Computer Engineering

Computer Engineering Master's Thesis Abstract Web development that lacks an integrated structure creates many problems for users. Not finding the information needed by users in this huge warehouse is one of the problems of web users. In order to deal with these problems, web personalization systems have been provided, which by finding the behavior patterns of users without their ...

Detection of web spam using data mining techniques

Number of pages: 95 Category: Computer Engineering

Master's thesis (M.sc) Abstract: Nowadays, spam [1] is one of the main problems of search engines, because they make the quality of search results unfavorable. In recent years, there have been many advances in detecting fake pages, but new spamming techniques have also emerged in response. It is necessary to improve anti-spam techniques to overcome these attacks. A common ...

Identifying overlapping entities in dynamic networks

Number of pages: 82 Category: Computer Engineering

Master's Thesis in Computer Engineering-Artificial Intelligence Abstract Identifying Overlapping Organizations in Dynamic Networks Many complex natural and social structures can be considered as networks [1]. Roads, Internet sites, social networks, organizational communication, kinship relationships, electronic mail exchange, telephone calls and financial transactions are just a ...

Using users with high predictive accuracy in collaborative filtering systems

Number of pages: 91 Category: Computer Engineering

Computer Engineering (Artificial Intelligence) Master's Dissertation Abstract Recommender systems are software tools and techniques that suggest items according to the user's needs. Content-based methods and collaborative filtering are successful solutions in recommender systems. The content-based method is defined based on the characteristics of the items. This method checks ...

Optimization of link prediction in social networks with the help of fuzzy logic

Number of pages: 88 Category: Computer Engineering

Non-continuous master's degree in computer engineering Abstract Today, the popularity of social networking sites among people is undeniable, sites that provide users with many possibilities for communication between people. One of the basic problems in analyzing these types of networks is predicting new connections between people in the network. Fuzzy method, as one of the ...

Consensus clustering on heterogeneous distributed data

Number of pages: 120 Category: Computer Engineering

Master's Thesis in Computer Engineering - Software Orientation Abstract Clustering can be considered one of the most important steps in data analysis. Many clustering methods have been developed and presented so far. One of these methods that has been studied in recent studies is consensus clustering method. The goal of consensus clustering is to combine several initial ...

Optimizing the execution and response of C2C and B2C programs in the cloud with distribution, sharing and pre-processing methods: a case study of the Nginx and Varnish systems

Number of pages: 134 Category: IT Information Technology Engineering

Master's Thesis of Information Technology Engineering Abstract In today's world, the Internet and its most important service, the Web, has caused many changes and transformations in human life. The Internet provides all the needs of people to communicate with each other, to obtain information in any field, play and entertainment, education and any field that comes to mind. The ...

Analytical and numerical study of propulsion vector orientation by non-aligned fluid method

Number of pages: 147 Category: Facilities - Mechanics

Dissertation for Master's Degree in Energy Conversion Mechanical Engineering Abstract: Fluid propulsion vector orientation has emerged as an important technology for high performance air vehicles. This technology can improve the maneuverability of the aircraft by changing the nozzle flow and its deviation from its axial direction. The purpose of this study is to investigate the ...
