attardi/wikiextractor

--json flag is unrecognised

odebroqueville opened this issue · 3 comments

I'm using wikiextractor 3.0.4.

[Screenshot: 2021-10-13 at 17 52 26]

Btw, is this project still in the works or has it been abandoned?

@odebroqueville
You can modify the code to set toJson = True in the Extractor() class in extract.py.

class Extractor():
    """
    An extraction task on a article.
    """
    ##
    # Whether to preserve links in output
    keepLinks = False

    ##
    # Whether to preserve section titles
    keepSections = True

    ##
    # Whether to output text with HTML formatting elements in <doc> files.
    HtmlFormatting = False

    ##
    # Whether to produce json instead of the default <doc> output format.
    toJson = True

    ##
    # Whether to expand templates
    expand_templates = True
...

Modifying the code to set toJson = True in the Extractor() class in extract.py did not seem to work.

Change the variable name from toJson to to_json, i.e. set to_json = True, then it works.
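
A likely reason the rename matters, though nobody in this thread confirmed it against the 3.0.4 source, is that the output code reads an attribute called to_json, so setting toJson just creates an extra attribute that nothing looks at. The sketch below illustrates that with a toy class. ToyExtractor, render() and render_with_flag() are hypothetical names made up for this example and are not part of wikiextractor.

```python
import json


class ToyExtractor:
    """Hypothetical stand-in for Extractor; NOT the wikiextractor source."""

    # Assumption: this is the attribute name the output code actually reads.
    to_json = False

    def __init__(self, doc_id, title, text):
        self.doc_id = doc_id
        self.title = title
        self.text = text

    def render(self):
        # Only `to_json` is consulted here; an attribute named `toJson`
        # would be silently ignored.
        if self.to_json:
            return json.dumps(
                {"id": self.doc_id, "title": self.title, "text": self.text}
            )
        return '<doc id="%s" title="%s">\n%s\n</doc>' % (
            self.doc_id, self.title, self.text
        )


def render_with_flag(flag_name):
    """Set the class attribute `flag_name` to True on a fresh subclass and render."""
    patched = type("Patched", (ToyExtractor,), {flag_name: True})
    return patched("1", "Example", "Some text").render()


# Setting the name nobody reads leaves the default <doc> output in place.
misspelled = render_with_flag("toJson")

# Setting the name that render() checks switches the output to JSON.
correct = render_with_flag("to_json")

print(misspelled)
print(correct)
```

If this is what happens in your installed copy, searching extract.py for both spellings should show which one the output code checks.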