Contents & References of Detecting plagiarism using graphs in Persian texts
List:
Introduction
1-1 Problem statement
1-2 Solutions
1-3 Problems in algorithm implementation
1-4 Thesis structure
Research background
2-1 Plagiarism detection
2-2 Dimensions of plagiarism detection
2-2-1 Grammar-based method
2-2-2 Meaning-based methods
2-2-3 Combined methods
2-2-4 External plagiarism detection method
2-3 Methods for calculating graph similarity
2-3-1 Maximum common subgraph / minimum common supergraph method
2-3-2 Method based on state space search
2-3-3 Probabilistic methods
3-1 Plagiarism detection
3-1-1 Matching n-grams
3-1-2 Expression weighting
3-1-3 Generalization of the expression
3-2 Dependency graphs
3-2-1 Dependencies
3-3 Graph edit distance
3-3-1 Edit operations
3-3-2 The assignment problem
3-3-3 Cost matrix
3-3-4 Assignment algorithms
4-1 Architecture
4-2 Text preprocessing
4-2-1 Finding sentences
4-2-2 Stemming words
4-2-3 Building the dependency graph
4-3 Candidate extraction
4-3-1 Sentence indexing
4-3-2 Extraction of candidate sentences
4-4 Detailed analysis
4-4-1 Graph edit distance algorithm for two graphs
4-4-2 GED-based plagiarism detection proposed in this project
5-1 Detecting plagiarism by word reordering and sentence structure change
5-1-1 10% structural changes
5-1-2 50% structural changes
5-1-3 100% structural changes
5-2 Detecting semantic plagiarism
5-2-1 10% semantic changes
Conclusions and suggestions
References
References:
Fankhauser, S., K. Riesen, and H. Bunke. Speeding up graph edit distance computation through fast bipartite matching. Graph-Based Representations in Pattern Recognition, (2011)
Suchomel, S., J. Kasprzak, and M. Brandejs (2012). Three way search engine queries with multi-feature document comparison for plagiarism detection. See Forner et al. (2012).
Grman, J. and R. Ravas. Improved implementation for finding text similarities in large sets of data - Notebook for PAN at CLEF 2011. See Petras et al. (2011).
Asim M. El Tahir Ali, Hussam M. Dahwa Abdulla, and Václav Snášel. Overview and Comparison of Plagiarism Detection Tools, Dateso 2011, pp. 161-172, ISBN 978-80-248-2391-1.
A. S. Bin-Habtoor and M. A. Zaher "A Survey on Plagiarism Detection Systems", International Journal of Computer Theory and Engineering Vol. 4, No. 2, April 2012
Sindhu.L, Bindu Baby Thomas, Sumam Mary Idicula A Study of Plagiarism Detection Tools and Technologies, IJART, Vol. 1 Issue 1, 2011,64-70.
Schleimer, S., Wilkerson, D. and Aiken, A. (2003) Winnowing: Local Algorithms for Document Fingerprinting. SIGMOD 2003, San Diego, 9-12 June 2003, 76-85.
J. A. Malcolm and P. C. R. Lane, Tackling the PAN'09 External Plagiarism Detection Corpus with a Desktop Plagiarism Detector, 3rd PAN Workshop. Uncovering Plagiarism, Authorship and Social Software Misuse, 2009, p. 29.
C. Basile, G. Cristadoro, D. Benedetto, E. Caglioti, and M. Degli Esposti, A plagiarism detection procedure in three steps: selection, matches and "squares", 3rd PAN Workshop. Uncovering Plagiarism, Authorship and Social Software Misuse, 2009, p. 19.
Adam Schenker, Horst Bunke, Mark Last, and Abraham Kandel, Graph-Theoretic Techniques for Web Content Mining, World Scientific Publishing, USA, 2005.
Ahmed Hamza Osman, Naomie Salim, and Mohammed Salem Binwahlan, Plagiarism Detection Using Graph-Based Representation, Journal of Computing, Volume 2, Issue 4, ISSN 2151-9617, April 2010.
H. Bunke, On a relation between graph edit distance and maximum common subgraph, Pattern Recognition Letters, 1997.
H. Bunke and K. Shearer, A graph distance metric based on the maximal common subgraph, Pattern Recognition Letters, Vol. 19, 1998.
J. T. L. Wang, K. Zhang, and G.-W. Chirn, Algorithms for Approximate Graph Matching, Information Sciences, Vol. 82, 1995.
R. C. Wilson and E. R. Hancock, Structural Matching by Discrete Relaxation, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 19, No. 6, June 1997
R. Myers, R. C. Wilson, and E. R. Hancock, Bayesian Graph Edit Distance, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 22, No. 6, June 2000.
Papineni, K., S. Roukos, T. Ward, and W. Zhu (2002). Bleu: a method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting on Association for Computational Linguistics. Association for Computational Linguistics.
Stamatatos, E. Plagiarism detection using stopword n-grams. Journal of the American Society for Information Science and Technology (2011).
Jones, K. A statistical interpretation of term specificity and its application in retrieval. Journal of Documentation (1972).
Marcus, M., M. Marcinkiewicz, and B. Santorini. Building a large annotated corpus of English: The Penn Treebank. Computational Linguistics (1993).
Riesen, K. and H. Bunke Approximate graph edit distance computation by means of bipartite graph matching. Image and Vision Computing (2009).
Porter, M. F. An algorithm for suffix stripping. Program, pp. 130-137. (1980).
Megerdoomian, K. (2004). Finite-state morphological analysis of Persian. In Proceedings of the Workshop on Computational Approaches to Arabic Script-based Languages, University of Geneva, Switzerland.
Sheykhzadegan, J. and M. Bijankhan (2006). The speech databases of Persian language. In Proceedings of the 2nd Workshop on Persian Language and Computing, the University of Tehran, Tehran, Iran, pp. 247-261.
Taghva, Beckley and Sadeh. A stemming algorithm for the Farsi language. IEEE ITCC, pp. 158 - 162. 2005.
Anvari, H. & Ahmadi Givi, H. (2006). Persian Language Grammar (2nd Ed.). Tehran: Fatemi Publication.
A. A. Sharifloo and M. Shamsfard, "A Bottom up Approach to Persian Stemming", Proceedings of the Third International Joint Conference on Natural Language Processing, 2008.