karimmerhom/Detecting-Redundancies-in-Code
Code cloning refers to the duplication of source code. It is the most common way for software developers to reuse existing code fragments across projects, as some developers resort to copy-paste programming instead of implementing new code from scratch. The copy-paste method saves a lot of time and energy and helps spread good coding practices and patterns. However, if a bug is detected in one copy, every replicated fragment must be checked for the same bug. A clone detection tool is therefore required to save time and energy and to ease the process of reusing source code between software developers. The tool also detects clones within a single project or across multiple projects, which eases bug fixing and overall software maintenance. The main purpose of this paper is to gain insight into the research available in the area of clone detection and management, as well as to identify the research gaps to work upon. Despite a decade of active research, there is a noticeable lack of clone detectors that scale to very large repositories of source code, specifically for detecting near-miss clones, where significant editing may have taken place in the cloned code. This paper presents a token-based clone detection tool that targets all the clone types using the Python Sequence Matcher. The tool is tested on detecting clones within a single project as well as across multiple projects, using different data sets. Its output shows correct detection of cloned methods, whether they are syntactically identical, similar, or dissimilar clones, within a project or across several projects, according to a preset threshold.
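The token-based comparison described above can be illustrated with a minimal sketch. It is not the repository's implementation: it assumes that "Python Sequence Matcher" means `difflib.SequenceMatcher` from the standard library, and the helper names `tokens`, `similarity`, and `is_clone`, the identifier/literal normalization, and the `0.8` threshold are all hypothetical choices made for illustration.

```python
import io
import keyword
import tokenize
from difflib import SequenceMatcher

# Token types ignored here because they carry layout, not content (an assumption).
_SKIP = {tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
         tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER}


def tokens(source, normalize=True):
    """Turn Python source text into a flat list of token strings.

    With normalize=True, identifiers become "ID" and literals become "LIT",
    so that renamed variables do not hide an otherwise identical method.
    """
    out = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in _SKIP:
            continue
        if normalize and tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            out.append("ID")
        elif normalize and tok.type in (tokenize.NUMBER, tokenize.STRING):
            out.append("LIT")
        else:
            out.append(tok.string)
    return out


def similarity(src_a, src_b, normalize=True):
    """Return a ratio in [0, 1] for how alike two token sequences are."""
    a = tokens(src_a, normalize)
    b = tokens(src_b, normalize)
    # autojunk=False keeps frequent tokens (such as "ID") from being discarded.
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def is_clone(src_a, src_b, threshold=0.8):
    """Flag a pair as a clone when similarity reaches the preset threshold."""
    return similarity(src_a, src_b) >= threshold


method_a = "def add(x, y):\n    return x + y\n"
method_b = "def plus(a, b):\n    return a + b\n"            # same structure, renamed
method_c = "def f(items):\n    for i in items:\n        print(i)\n"

print(is_clone(method_a, method_b))  # renamed copy of method_a
print(is_clone(method_a, method_c))  # structurally different method
```

Normalizing identifiers is one way to catch clones that differ only in naming; raising or lowering the threshold trades missed near-miss clones against false positives.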
Python · GPL-3.0