Using users with high predictive accuracy in collaborative filtering systems

Number of pages: 91 File Format: word File Code: 31026
Year: 2013 University Degree: Master's degree Category: Computer Engineering
  • Summary of Using users with high predictive accuracy in collaborative filtering systems

    Computer Engineering Master's Thesis (Artificial Intelligence)

    Abstract

    Recommender systems are software tools and techniques that suggest items according to the user's needs. Content-based methods and collaborative filtering are two successful approaches to building them. The content-based method relies on the characteristics of the items: it examines which features the user's favorite items have, then suggests items with similar features. Collaborative filtering works by identifying similar items or similar users; these variants are called item-based and user-based collaborative filtering, respectively. This thesis presents a method that combines collaborative and content-based filtering and can be regarded as a form of user-based collaborative filtering. To find users whose tastes are similar to the active user's, that is, users with high prediction accuracy, the method uses the content features of items to increase the influence of the ratings that users have given to similar items. In other words, two users are similar if they have given the same ratings to items that are similar in content. To this end, when the similarity of two users is measured, the rating given to each item is weighted according to that item's similarity to the target item.
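    The weighting idea described in the abstract can be illustrated with a small sketch. This summary does not give the thesis's actual formulas, so the Python code below is only an illustration under stated assumptions: each item is described by a set of content features (for example genres), content similarity is measured with the Jaccard index, and user similarity is a weighted Pearson correlation. The function names `jaccard` and `weighted_user_similarity` are hypothetical.

```python
from math import sqrt


def jaccard(a, b):
    """Content similarity of two items given as sets of features (assumed measure)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def weighted_user_similarity(ratings_u, ratings_v, features, target):
    """Weighted Pearson correlation between two users (illustrative assumption).

    ratings_u, ratings_v: dict mapping item -> rating
    features: dict mapping item -> set of content features
    target: the item whose rating is to be predicted
    """
    # Only items rated by both users, other than the target itself, are compared.
    common = [i for i in ratings_u if i in ratings_v and i != target]
    # Each co-rated item is weighted by its content similarity to the target item.
    weights = {i: jaccard(features[i], features[target]) for i in common}
    total = sum(weights.values())
    if total == 0:
        return 0.0
    mean_u = sum(weights[i] * ratings_u[i] for i in common) / total
    mean_v = sum(weights[i] * ratings_v[i] for i in common) / total
    cov = sum(weights[i] * (ratings_u[i] - mean_u) * (ratings_v[i] - mean_v)
              for i in common)
    var_u = sum(weights[i] * (ratings_u[i] - mean_u) ** 2 for i in common)
    var_v = sum(weights[i] * (ratings_v[i] - mean_v) ** 2 for i in common)
    if var_u == 0 or var_v == 0:
        return 0.0
    return cov / sqrt(var_u * var_v)


# Made-up example data: two users agree on action movies but not on a romance.
features = {
    "A": {"action"},
    "B": {"action", "scifi"},
    "C": {"romance"},
    "T": {"action", "scifi"},  # target item
}
user_u = {"A": 5, "B": 4, "C": 1}
user_v = {"A": 5, "B": 4, "C": 5}
similarity = weighted_user_similarity(user_u, user_v, features, "T")
```

    In this made-up example the two users disagree only on item C, which shares no features with the target item T and therefore receives weight zero. The weighted similarity is close to 1.0, whereas an unweighted Pearson correlation over the same three ratings would be negative.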

    Chapter One

    Introduction

    Foreword

    The emergence of the Internet and the World Wide Web has made a huge amount of information available on every imaginable subject, which users can draw on to meet their information needs. The ever-growing volume of information has caused the problem of information overload, and users cannot meet their needs on their own, because they would have to search through all the pages online to find the part they need. For this reason, search engines were created so that users can reach the information they want without having to check a large number of pages. When the user enters a request as keywords, the search engine searches among billions of web pages and helps the user find the information he is looking for. With this tool, the speed and accuracy of search increased greatly, and users were able to get the best results simply and in the shortest time. Different companies have built many search engines, the most famous of which are Bing, Yahoo and Google (Figure 1).

    Search engines fall into two general categories: crawler-based search engines and manually completed lists. Combined search engines result from combining these two types. There are also newer search engines called super search engines. Each of these is briefly explained below.

    Users then search this information for what they want. If a web page changes, crawler-based search engines find the changes automatically and apply them to their listings. Examples of crawler-based search engines are Google and Yahoo.

    1-2-2- Manually completed lists

    Manually completed lists depend on the people who fill them in. Either a user registers the desired page in the list along with a short description, or the list's editors do so. In this case, the search is performed only on the registered description, and a change to the web page is not reflected in the list. An example of a manually completed list is the Open Directory.

    In addition, combined search engines can give priority to the results of one type. For example, the MSN search engine prioritizes results from manually completed lists, but for complex requests it also checks crawler-based search results.

    1-2-4- Super search engines

    This newer type of search engine, also known as a metasearch engine, combines and displays the results of several search engines. In other words, it submits the user's request to several search engines, then merges the results and presents a single overall result to the user. For example, the Dogpile search engine combines the results of the Google, Yahoo, MSN and Ask search engines and presents them to the user.
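    The merging step can be sketched in a few lines of Python. The text does not say how Dogpile or any other engine actually combines results, so the scoring rule here is an assumption chosen for illustration: each engine's ranked list contributes 1/rank to a result's score, and results are returned by descending total score. The function name `merge_results` and the example lists are hypothetical.

```python
from collections import defaultdict


def merge_results(result_lists):
    """Merge several ranked result lists into one (assumed reciprocal-rank scoring)."""
    scores = defaultdict(float)
    first_seen = {}
    for results in result_lists:
        seen_here = set()
        for rank, url in enumerate(results, start=1):
            if url in seen_here:
                continue  # count each result once per engine
            seen_here.add(url)
            scores[url] += 1.0 / rank
            # Remember first appearance so ties are broken deterministically.
            first_seen.setdefault(url, len(first_seen))
    return sorted(scores, key=lambda u: (-scores[u], first_seen[u]))


# Made-up ranked lists standing in for the answers of three search engines.
engine_1 = ["a", "b", "c"]
engine_2 = ["b", "a", "d"]
engine_3 = ["b", "c"]
merged = merge_results([engine_1, engine_2, engine_3])
```

    With these lists, result "b" comes first because two engines rank it at the top, and results returned by only one engine fall to the end of the merged list.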

    Recommender systems

    Recent studies have shown that most search engines have a low success rate, measured by how often an average user receives relevant results. For example, one study [1] examined more than 20,000 search requests and found that, on average, the user found at least one result worth selecting in only 48% of cases. In other words, in 52% of cases the user did not select any of the returned results. Of course, this problem depends as much on the user's knowledge of how to search as it does on the search engine, because a search request can be ambiguous and rarely expresses the searcher's need clearly. In these cases, the user is faced with a list of results that cannot satisfy his information need, and he usually changes or modifies his request to get the desired result.

    In [2] it is shown that 10% of the income of people who work with information is lost to time wasted on searching. In the worst case, a significant percentage of searchers may fail to find the information they need. These issues show that web search is much less efficient than expected. Moreover, as the number of web pages grew, the number of Internet users also rose sharply. Users wanted to satisfy their information needs and also to produce and share their own information, interests and needs. As a result, social networks such as Facebook and Twitter were established, and sites such as YouTube were launched as places to share and watch videos.

    Meanwhile, recommender systems were created to address the inefficiencies of search engines and to meet the needs of users.

    Recommender systems play a significant role in selecting and providing the information users need. These systems can suggest a number of items to the user, or provide the information he needs, even without a search request. Items can be movies, music, web pages and so on (Table 1). The user can also receive suggestions through a smart search. Recommender systems therefore save considerable time and help the user reach his goal, because he can obtain the part he needs from the large volume of available information. This also keeps the user from becoming confused when making a decision.

    As the amount of information grows, the need for these systems is felt more strongly. They generate suggestions using various kinds of knowledge and data collected about users and items, and by examining past transactions such as the feedback users have given. In the simplest form, the suggestions are presented to the user as a list ordered according to his interests and needs. In [3], recommender systems are classified as collaborative filtering, content-based, demographic, utility-based, knowledge-based and hybrid.

  • Contents & References of Using users with high predictive accuracy in collaborative filtering systems

    Table of contents:

    Chapter 1: Introduction

    1-1- Preface

    1-2- Search engines

    1-2-1- Crawler-based search engines

    1-2-2- Manually completed lists

    1-2-3- Combined search engines

    1-2-4- Super search engines

    1-3- Recommender systems

    1-3-1- Recommender systems based on collaborative filtering

    1-3-2- Content-based recommender systems

    1-3-3- Demographic recommender systems

    1-3-4- Utility-based recommender systems

    1-3-5- Knowledge-based recommender systems

    1-3-6- Hybrid recommender systems

    1-4- Review of the MovieLens website

    1-5- Objectives of the thesis

    1-6- Structure of the thesis

    Chapter 2: Collaborative filtering method

    2-1- Preface

    2-2- An overview of related work

    2-3- Basics of collaborative filtering

    2-4- Tasks of collaborative filtering

    2-4-1- Recommendation

    2-4-2- Prediction

    2-5- Classification of collaborative filtering methods

    2-5-1- Memory-based collaborative filtering

    2-5-1-1- Memory-based collaborative filtering with user-based prediction

    2-5-1-2- Memory-based collaborative filtering with item-based prediction

    2-5-1-3- The difference between user-based and item-based collaborative filtering

    2-5-2- Model-based collaborative filtering

    2-6- How to identify users' interests

    2-6-1- Identifying interests explicitly

    2-6-2- Identifying interests implicitly

    2-7- Similarity calculation

    2-7-1- Pearson correlation measure

    2-7-2- Cosine similarity measure

    2-8- Neighbor selection

    2-8-1- Using a threshold

    2-8-2- Selecting a fixed number of neighbors

    2-9- Predicting and estimating ratings

    2-9-1- Using raw ratings

    2-9-2- Using normalized ratings

    2-10- Problems of collaborative filtering

    2-10-1- Data sparsity

    2-10-2- Scalability

    2-10-3- Similar items

    2-10-4- Gray sheep

    2-11- Examining how the Amazon website works

    Chapter 3: Content-based method

    3-1- Preface

    3-2- Workflow of the content-based method

    3-2-1- Content analyzer

    3-2-2- Profile learner

    3-2-3- Filtering component

    3-3- Advantages of the content-based method

    3-3-1- User independence

    3-3-2- Transparency

    3-3-3- New item

    3-4- Disadvantages of the content-based method

    3-4-1- Lack of content

    3-4-2- Over-specialization

    3-4-3- New user

    Chapter 4: The proposed method

    4-1- Preface

    4-2- An overview of related work

    4-3- Introduction to the proposed method

    4-4- The proposed method

    4-4-1- Preprocessing

    4-4-1-1- Preprocessing of the MovieLens dataset

    4-4-1-2- Preprocessing of the EachMovie dataset

    4-4-2- Weighting items

    4-4-3- Neighborhood selection

    4-4-4- Prediction

    Chapter 5: Experiments and results

    5-1- Datasets used

    5-2- Implementing the proposed method on the MovieLens dataset

    5-3- Implementing the proposed method on the EachMovie dataset

    5-4- Evaluation criteria

    5-4-1- Mean absolute error

    5-4-2- Precision and recall

    5-4-3- F1 measure

    5-5- Evaluation of the proposed method with the introduced criteria

    Chapter 6: Discussion and conclusion

    6-1- Discussion

    6-2- Conclusion

    6-4- Suggestions

    References

     

    References:

    M. Coyle and B. Smyth, "Information recovery and discovery in collaborative web search", Proceedings of the European Conference on Information Retrieval, pp. 356-367, 2007.

    S. Feldman and C. Sherman, "The High Cost of Not Finding Information", IDC White Paper, IDC Group, 2000.

     

    R. Burke, "Hybrid recommender systems: Survey and experiments", Journal of User Modeling and User-Adapted Interaction, vol. 12, no. 4, pp. 331-370, 2002.

    R. Bell and Y. Koren, "Lessons from the Netflix prize challenge", Journal of SIGKDD Explorations, vol. 9, no. 2, pp. 75–79, 2007.

     

    http://www.Movielens.org/

     

    D. Goldberg, D. Nichols, B.M. Oki and D. Terry, "Using collaborative filtering to weave an information tapestry", Communications of the ACM, vol. 35, no. 12, pp. 61-70, 1992.

    P. Resnick, N. Iacovou, M. Suchak, P. Bergstrom and J. Riedl, "GroupLens: An open architecture for collaborative filtering of netnews", Proceedings of the 1994 Conference on Computer Supported Cooperative Work, USA, pp. 175-186, 1994.

    J. Konstan, B. Miller, D. Maltz, J. Herlocker, L. Gordon and J. Riedl, "Applying collaborative filtering to Usenet news", Communications of the ACM, vol. 40, no. 3, pp. 77-87, 1997.

     

    U. Shardanand and P. Maes, "Social information filtering: Algorithms for automating word of mouth", Proceedings of ACM CHI'95 Conference on Human Factors in Computing Systems, ACM Press, USA, pp. 210-217, 1995.

    W. Hill, L. Stead, M. Rosenstein and G.W. Furnas, "Recommending and evaluating choices in a virtual community of use", Proceedings of ACM CHI'95 Conference on Human Factors in Computing Systems, ACM Press, USA, pp. 194-201, 1995.

    J. Bennett and S. Lanning, "The Netflix Prize", Proceedings of KDD Cup and Workshop, USA, pp. 8-25, Aug 2007.

     

    J.S. Breese, D. Heckerman and C. Kadie, "Empirical analysis of predictive algorithms for collaborative filtering", Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence, USA, pp. 43-52, May 1998.

    D.M. Pennock, E. Horvitz, S. Lawrence and C.L. Giles, "Collaborative filtering by personality diagnosis: A hybrid memory- and model-based approach", Proceedings of the Sixteenth Annual Conference on Uncertainty in Artificial Intelligence, USA, pp. 473-480, 2000.

     

    I. Im and B.H. Kim, "Personalizing the Settings for CF-Based Recommender Systems", Proceedings of the Fourth ACM Conference on Recommender Systems, New York, pp. 248-254, 2010.

    Y. Ge, H. Xiong, A. Tuzhilin and Q. Liu, "Collaborative Filtering with Collective Training", Proceedings of the Fifth ACM Conference on Recommender Systems, New York, pp. 281-284, 2011.

    R. Hu and P. Pu, "Enhancing Collaborative Filtering Systems with Personality Information", Proceedings of the Fifth ACM Conference on Recommender Systems, New York, pp. 197-204, 2011.

    J. Bobadilla, F. Ortega, A. Hernando and J. Bernal, "Generalization of recommender systems: Collaborative filtering extended to groups of users and restricted to groups of items", Journal of Expert Systems with Applications, vol. 39, no. 1, pp. 172-186, 2012.

    "JacUOD: A New Similarity Measurement for Collaborative Filtering", Journal of Computer Science and Technology, vol. 27, no. 6, pp. 1252-1260, 2012.

    J.F. Huete, J.M. Fernández-Luna, L.M. de Campos and M.A. Rueda-Morales, "Using past-prediction accuracy in recommender systems", Information Sciences, vol. 199, no. 7, pp. 78-92, 2012.

    M.A. Ghazanfar, A. Prügel-Bennett and S. Szedmak, "Kernel-Mapping Recommender system algorithms", Information Sciences, vol. 208, pp. 81-104, 2012.

    G. Takács, I. Pilászy, B. Németh and D. Tikk, "Major components of the gravity recommendation system", ACM SIGKDD Explorations Newsletter, vol. 9, no. 2, pp. 80-83, 2007.

    Y.
