LLNL/scraper

Call in Python not command line

stuchalk opened this issue · 5 comments

Can you add documentation on how to call scraper from Python code rather than from the command line? This would be very helpful for a new project I am working on that goes even further than this - which is awesome by the way...:)

I don't know how useful or generally desired that is as the tool is intended to be called from the command line.

If you want to call it from another Python program, you can look at the main method in the package and invoke the parts you would like to use.

Thanks for the quick reply. I took a quick look at main and I will see what I can do. Would you consider a PR to add this functionality?

Definitely, I think a PR that refactors this to be more program callable would be nice. It should be possible to refactor this and keep the command line functionality the same.

OK, I will fork the repo and work on it.

@stuchalk just to chime in with @leebrian -- yes, would definitely consider a PR that changes things around, but at a glance if you didn't want to do that, I think this is the meat you'd be looking for:

code_json = code_gov.process_config(config_json)
code_gov.force_attributes(code_json, config_json)
logger.info("Number of Projects: %s", len(code_json["releases"]))
output_filepath = args.output_filename
if output_path is not None:
    output_filepath = os.path.join(output_path, output_filepath)
with open(output_filepath, "w", encoding="utf-8") as fp:
    logger.info("Writing output to: %s", output_filepath)
    fp.write(code_json.to_json())
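
For anyone who wants to try this before a refactoring PR lands, the snippet above could be wrapped in a small function along these lines. This is only a sketch, not part of scraper: the helper name `write_code_json`, its parameters, and the `"code.json"` default are made up for illustration, and it assumes `config_json` is the already-loaded config dictionary. The `code_gov` module is passed in as an argument so the helper does not depend on any particular import path.

```python
import logging
import os

logger = logging.getLogger(__name__)


def write_code_json(code_gov, config_json, output_filename="code.json", output_path=None):
    """Hypothetical wrapper around the snippet quoted above.

    code_gov        -- the module (or any object) that provides
                       process_config() and force_attributes()
    config_json     -- the loaded scraper config (assumed to be a dict)
    output_filename -- name of the file to write (default is an assumption)
    output_path     -- optional directory to place the file in
    """
    # Same two calls as in the quoted snippet.
    code_json = code_gov.process_config(config_json)
    code_gov.force_attributes(code_json, config_json)
    logger.info("Number of Projects: %s", len(code_json["releases"]))

    # Replaces args.output_filename from the command line version.
    output_filepath = output_filename
    if output_path is not None:
        output_filepath = os.path.join(output_path, output_filepath)

    with open(output_filepath, "w", encoding="utf-8") as fp:
        logger.info("Writing output to: %s", output_filepath)
        fp.write(code_json.to_json())

    # Return the path so the caller knows where the file ended up.
    return output_filepath
```

Calling it would then look something like `write_code_json(code_gov, config_json, output_path="out")`, where `code_gov` is whatever the main method imports under that name and the `out` directory already exists.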