PyReprism is a Python framework that helps researchers and developers with source code preprocessing. With PyReprism, you can easily match, extract, count, and remove comments, whitespace, operators, numbers, and other language-specific constructs across more than 150 programming languages and file extensions.
pip install PyReprism
Use case 1: Removing comments
from PyReprism.languages import Python
# from PyReprism.languages import Java
source = """
# single line comment
x = 5 + 6
'''
multiline
comment
'''
print(x)
"""
source = Python.remove_comments(source)
# expected output
x = 5 + 6
print(x)
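To make the match, extract, count, and remove operations concrete, here is a minimal standard-library sketch of the same idea. It is not PyReprism's implementation or API: the pattern and the helper names (`COMMENT_PATTERN`, `extract_comments`, `count_comments`, `strip_comments`) are hypothetical and exist only for illustration. The pattern is deliberately naive and treats triple-quoted blocks as comments, as the example above does.

```python
import re

# Illustrative pattern only (an assumption, not PyReprism's own):
# a triple-quoted block, or '#' up to the end of the line.
# It does not understand string literals, so a '#' inside a string
# would be matched as well.
COMMENT_PATTERN = re.compile(r"'''[\s\S]*?'''|\"\"\"[\s\S]*?\"\"\"|#[^\n]*")


def extract_comments(source: str) -> list:
    """Return every substring matched by the illustrative pattern."""
    return COMMENT_PATTERN.findall(source)


def count_comments(source: str) -> int:
    """Count the matches found by extract_comments."""
    return len(extract_comments(source))


def strip_comments(source: str) -> str:
    """Delete the matches, then drop lines left empty by the deletion."""
    without_comments = COMMENT_PATTERN.sub("", source)
    kept = [line for line in without_comments.splitlines() if line.strip()]
    return "\n".join(kept)


sample = """
# single line comment
x = 5 + 6
'''
multiline
comment
'''
print(x)
"""

print(count_comments(sample))   # 2
print(strip_comments(sample))   # x = 5 + 6 / print(x) on two lines
```

A hand-written pattern like this has to be redone for every language, which is the repetitive work a per-language module such as `PyReprism.languages.Python` is meant to take over.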
Use case 2: Removing whitespace
from PyReprism.utils.normalizer import Normalizer
source = """
x = 5 + 6
print(x)
"""
source = Normalizer.remove_whitespaces(source)
# expected output
x=5+6
print(x)
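For comparison, the expected output above can be reproduced with a few lines of standard-library Python. This is a sketch under the assumption that normalization means "delete whitespace inside each line and drop empty lines"; `squeeze_whitespace` is a hypothetical helper, not part of PyReprism, and it may not match how `Normalizer.remove_whitespaces` treats every input.

```python
import re


def squeeze_whitespace(source: str) -> str:
    """Remove all whitespace within each line and skip lines left empty."""
    lines = []
    for line in source.splitlines():
        compact = re.sub(r"\s+", "", line)
        if compact:
            lines.append(compact)
    return "\n".join(lines)


snippet = """
x = 5 + 6
print(x)
"""

print(squeeze_whitespace(snippet))
```

Note that such a naive approach also joins tokens that need a separator: `def f(): pass` becomes `deff():pass`. Output of this kind suits token-level comparison or similarity analysis, not re-execution.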
Read the docs for more usage examples.
NB: The beta versions of PyReprism are still unstable, but we are working 24/7 to ensure the tool is usable.
We invite you to help us build this tool and make it more extensive. Contributions are open to the OSS community.
$ git clone https://github.com/unlv-evol/PyReprism.git
$ cd PyReprism
(Optional) We suggest using a virtual environment. To set one up, run the following before installing the requirements:
$ python3 -m venv venv
$ source venv/bin/activate
Then, install the requirements:
$ pip install -r requirements.txt
For more information on how to contribute, read our contributing guidelines.
If you experience any issues, feel free to report them.