RNfinity
Humanities and Arts

Automatic Detection of Plagiarism in Writing


Mahshad Davoodifard


  Peer Reviewed


© attribution CC-BY

338 Views

Added on

2024-12-25

Doi: https://doi.org/10.52214/salt.v21i2.9058

Related Subjects
History
Music
Language
Philosophy
Classics
Art

Abstract

This paper reports on preliminary steps toward an external plagiarism detection tool. Using the PAN-PC-11 data sets, I extracted tf-idf scores for the text documents and computed cosine similarity between source and suspicious documents to find text overlap. The model successfully created the vectors and computed the similarity measures. However, the algorithm was not extended to automatically retrieve related documents and continue through the pipeline (converting texts to n-grams for detailed analysis, identifying the best match as the source of plagiarism, and evaluating the accuracy of the model). The model produced a matrix of cosine similarities for all the documents, which I used to manually retrieve documents and check for overlap using online tools. While extending the algorithm along the suggested pipeline would allow a more accurate evaluation of the model, manual comparison of sample documents provided some evidence of validity for the model developed in the present study.

Key Questions about Plagiarism Detection in Writing

What methods were used to detect plagiarism in the study?

The study utilized tf-idf (term frequency-inverse document frequency) scores to represent text documents and cosine similarity measures to assess the similarity between source and suspicious documents. This approach enabled the identification of text overlap, indicating potential plagiarism.
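To make the two measures concrete, here is a minimal sketch in plain Python. It is not the study's code: the whitespace tokenizer, the smoothed idf variant, the function names, and the three toy documents are all assumptions chosen for illustration, and the study itself worked on the PAN-PC-11 corpus.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; the study's
    # preprocessing is not described on this page.
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse tf-idf vector (dict: term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))
    # Assumption: smoothed idf, one common variant among several.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1 for t, c in df.items()}
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: count * idf[t] for t, count in tf.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy documents (hypothetical, not taken from PAN-PC-11).
docs = {
    "source": "the quick brown fox jumps over the lazy dog",
    "suspicious-1": "a quick brown fox jumps over a lazy dog",
    "suspicious-2": "stock markets fell sharply on monday",
}
vectors = dict(zip(docs, tfidf_vectors(list(docs.values()))))

for name in ("suspicious-1", "suspicious-2"):
    print(name, round(cosine(vectors["source"], vectors[name]), 3))
```

In this toy example the near-copy scores well above the unrelated text, which shares no terms with the source and therefore scores zero. A high score only flags a candidate pair; it does not by itself show which passages overlap.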

What were the outcomes of the plagiarism detection model?

The model successfully created vectors and measured similarity metrics, producing a matrix of cosine similarity for all documents. This matrix facilitated manual retrieval of documents to check for overlap using online tools. While the algorithm was not extended to automatically retrieve related documents, the manual comparisons provided some validation of the model's effectiveness.
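The manual retrieval step could, in principle, be automated by ranking the rows of the similarity matrix. The sketch below is one possible way to do that and is not part of the study: the function name `top_candidates`, the cutoff `k`, the `threshold` value, and the matrix entries are hypothetical.

```python
def top_candidates(sim_matrix, source_ids, k=2, threshold=0.1):
    """For each suspicious document (one row of sim_matrix), return up to k
    (source_id, score) pairs with score >= threshold, best first."""
    results = []
    for row in sim_matrix:
        ranked = sorted(zip(source_ids, row), key=lambda p: p[1], reverse=True)
        results.append([(sid, score) for sid, score in ranked[:k] if score >= threshold])
    return results


# Hypothetical cosine similarities: rows are suspicious documents,
# columns are source documents. These numbers are made up.
source_ids = ["src-1", "src-2", "src-3"]
sim_matrix = [
    [0.82, 0.05, 0.31],  # suspicious document A
    [0.02, 0.04, 0.01],  # suspicious document B
]

candidates = top_candidates(sim_matrix, source_ids)
print(candidates)
```

With these made-up scores, document A would be paired with src-1 and src-3 for closer inspection, while document B would return no candidates. The threshold is a tuning choice that would have to be set against labelled data.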

What are the limitations of the current plagiarism detection model?

The primary limitation is that the algorithm was not extended to automatically retrieve related documents for detailed analysis. Completing the pipeline, by converting texts to n-grams for in-depth analysis and by evaluating the accuracy of the model, could make it more effective at detecting plagiarism.
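As an illustration of what an n-gram comparison might look like, the following sketch measures how many of a suspicious text's word trigrams also occur in a candidate source. The containment measure, the trigram size, and the function names are assumptions made for this example; the page does not say which n-gram method the author intended.

```python
def word_ngrams(text, n=3):
    """Return the set of word n-grams in text (lowercased, split on whitespace)."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def containment(suspicious, source, n=3):
    """Fraction of the suspicious text's n-grams that also appear in the source."""
    susp = word_ngrams(suspicious, n)
    if not susp:
        return 0.0
    return len(susp & word_ngrams(source, n)) / len(susp)


# Toy texts (hypothetical).
source = "yesterday the quick brown fox jumps over the fence"
copied = "the quick brown fox jumps"
partly_copied = "the quick brown cat sleeps"
unrelated = "stock markets fell sharply on monday"

for label, text in [("copied", copied), ("partly copied", partly_copied), ("unrelated", unrelated)]:
    print(label, containment(text, source))
```

Unlike cosine similarity over tf-idf vectors, this comparison is sensitive to word order, so it can point to the specific shared sequences once a candidate source has been retrieved.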

How can the plagiarism detection model be improved?

The model could be improved by extending the algorithm to automatically retrieve related documents and to convert texts to n-grams for detailed analysis. This would allow a more accurate evaluation of the model and help identify the best match as the source of plagiarism.

What is the significance of this study in the field of plagiarism detection?

This study contributes to the field by providing preliminary steps towards creating an external plagiarism detection tool. It highlights the potential of using tf-idf scores and cosine similarity measures in identifying text overlap, offering a foundation for developing more advanced plagiarism detection systems.

By addressing these questions, the study offers insights into the development and evaluation of plagiarism detection tools, emphasizing the importance of accurate and efficient methods in maintaining academic integrity.


Article usage: Dec-2024 to May-2025

Month            Manuscript   Video Summary
2025 May         51           51
2025 April       70           70
2025 March       71           71
2025 February    63           63
2025 January     71           71
2024 December    12           12
Total            338          338
