ARPN Journals


ARPN Journal of Engineering and Applied Sciences

A fingerprinting structure model for Arabic document plagiarism detection

Author Yahya Ali Adelrahman Ali
e-ISSN 1819-6608
On Pages 1140-1151
Volume No. 19
Issue No. 17
Issue Date December 10, 2024
DOI https://doi.org/10.59018/092444
Keywords similarity, Arabic language, preprocessing, plagiarism detection, fingerprinting, hash-value, precision, recall, F-measure.


Abstract

Plagiarism is a significant problem in academia worldwide, and it is particularly challenging to detect in Arabic because of the language's complex structure. Methodology: This work presents ADPDM, a model and framework for detecting plagiarism in Arabic documents within academic contexts. By organizing documents logically into paragraphs, sentences, and words, the model aims to provide a robust system that can identify duplicated content and search for similar documents within the corresponding document sets. In particular, the study examines preprocessing techniques such as stop-word removal, stemming, and root extraction, followed by content-based methods that use fingerprinting and heuristic algorithms tailored to the features of the Arabic language. For efficient detection, the BKDR hash function is used for chunk hashing. To reduce computation time, heuristic algorithms are applied at different levels of document representation, using metrics such as Longest Common Substring (LCS). ADPDM is evaluated on a corpus of 100 documents, which includes datasets from AraPlagDet and the Decision Support System (DSS). Its performance is compared with that of the WCopyFind plagiarism detection tool, which runs faster than ADPDM. ADPDM achieves recall, precision, and F-measure values of 0.78035, 0.994264, and 0.865688, respectively, which indicates a notable ability to detect plagiarized content in Arabic documents. ADPDM is therefore a successful anti-plagiarism solution for Arabic text, even though it requires a longer processing time than WCopyFind.
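
The chunk-hashing step mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the commonly cited form of the BKDR hash (multiply by a seed, add the next byte, mask to 31 bits), and the seed 131, the three-word chunk size, the UTF-8 byte encoding, and the set-overlap score are all illustrative choices that the abstract does not specify. The function names are hypothetical.

```python
def bkdr_hash(text: str, seed: int = 131) -> int:
    """BKDR string hash in its commonly cited form (seed is an assumption)."""
    h = 0
    # Hash the UTF-8 bytes so that Arabic text is handled like any other string.
    for byte in text.encode("utf-8"):
        h = (h * seed + byte) & 0x7FFFFFFF  # keep the value within 31 bits
    return h


def fingerprint(words: list[str], chunk_size: int = 3) -> set[int]:
    """Hash every window of `chunk_size` consecutive words into a set."""
    if not words:
        return set()
    if len(words) < chunk_size:
        # Short input: treat the whole sequence as a single chunk.
        return {bkdr_hash(" ".join(words))}
    return {
        bkdr_hash(" ".join(words[i:i + chunk_size]))
        for i in range(len(words) - chunk_size + 1)
    }


def resemblance(fp_a: set[int], fp_b: set[int]) -> float:
    """Share of chunk hashes common to both fingerprints (Jaccard overlap)."""
    if not fp_a or not fp_b:
        return 0.0
    return len(fp_a & fp_b) / len(fp_a | fp_b)


if __name__ == "__main__":
    # Hypothetical, already preprocessed token lists (stop words removed, stemmed).
    suspicious = ["كتب", "طالب", "بحث", "علم", "جديد"]
    source = ["طالب", "بحث", "علم", "جديد", "مهم"]
    score = resemblance(fingerprint(suspicious), fingerprint(source))
    print(f"fingerprint overlap: {score:.2f}")
```

Comparing sets of integer hashes rather than raw text keeps the per-document comparison cheap, which is the usual motivation for fingerprinting. A real system would also have to handle hash collisions and choose the chunk size empirically.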


Copyrights
© 2024 ARPN Publishers