materialsproject/fireworks

update_launchpad_data does not apply to tasks collection

janosh opened this issue · 5 comments

Currently update_launchpad_data applies only to ["launches", "fireworks", "workflows"] but omits the "tasks" collection. Is this a deliberate decision? Any reason "tasks" should not be changed?

def update_launchpad_data(lp, replacements, **kwargs):
    """
    If you want to update a text string in your entire FireWorks database with a replacement, use this method.
    For example, you might want to update a directory name preamble like "/scratch/user1" to "/project/user2".
    The algorithm does a text replacement over the *entire* BSON document. The original collection is backed up within
    the database with extension "_xiv_{Date}".
    :param lp (LaunchPad): a FireWorks LaunchPad object
    :param replacements (dict): e.g. {"old_path1": "new_path1", "scratch/":"project/"}
    :param kwargs: Additional arguments accepted by the update_path_in_collection method
    """
    for coll_name in ["launches", "fireworks", "workflows"]:
        print("Updating data inside collection: {}".format(coll_name))
        update_path_in_collection(lp.db, coll_name, replacements, **kwargs)
    print("Update launchpad data complete.")
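
For readers unfamiliar with what "a text replacement over the *entire* BSON document" means in practice, here is a minimal in-memory sketch. It is not the FireWorks implementation: the helper name `replace_in_document` is hypothetical, and the standard-library `json` module stands in for BSON serialization, so it only handles plain JSON types (real documents may contain ObjectIds and datetimes, which would need a BSON-aware serializer).

```python
import json


def replace_in_document(doc, replacements):
    """Return a copy of doc with every old -> new substring replaced.

    The document is serialized to one string, edited, and parsed back, so
    the replacement touches nested values and keys alike.
    """
    # ensure_ascii=False keeps non-ASCII text unescaped so it can still match.
    text = json.dumps(doc, ensure_ascii=False)
    for old, new in replacements.items():
        text = text.replace(old, new)
    return json.loads(text)


doc = {
    "launch_dir": "/scratch/user1/run_001",
    "spec": {"inputs": ["/scratch/user1/POSCAR", "/home/user1/notes.txt"]},
}
updated = replace_in_document(doc, {"/scratch/user1": "/project/user2"})
print(updated["launch_dir"])  # /project/user2/run_001
```

Because the replacement runs on the serialized text, it also rewrites matching dictionary keys, and strings containing characters that JSON escapes (quotes, backslashes) will not match in their raw form.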

The tasks collection is not created or managed by FireWorks; it is created by atomate for managing VASP calculations and the like. FireWorks is a general-purpose code, so it is agnostic about what other collections are present in the database.

Traditionally, we've set up our databases with one database used exclusively by FireWorks and another for simulation artifacts like tasks, but more recently we have been running both out of a single database for convenience.

I remembered after creating this issue that lpad itself doesn't write to the 'tasks' collection. So given the function name update_launchpad_data, it makes sense. But would you be open to adding a kwarg to handle additional collections? Also, I could add regex support while I'm at it.

@mkhorton I built and tested a version of update_launchpad_data that takes arbitrary collection names and also allows for regex replacements. Let me know if you'd like a PR for that. If not, this can be closed.
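
The two proposed features can be sketched roughly as follows. This is an illustration only, not the version mentioned above: the names `DEFAULT_COLLECTIONS`, `collections_to_update`, `apply_replacements`, and the `additional_collections` and `regex` parameters are all hypothetical, and the sketch works on plain strings rather than calling into a database.

```python
import re

# The three collections that update_launchpad_data currently loops over.
DEFAULT_COLLECTIONS = ("launches", "fireworks", "workflows")


def collections_to_update(additional_collections=()):
    """Return the default collections followed by any extras, without duplicates."""
    names = list(DEFAULT_COLLECTIONS)
    for name in additional_collections:
        if name not in names:
            names.append(name)
    return names


def apply_replacements(text, replacements, regex=False):
    """Apply each old -> new replacement, as literal strings or regex patterns."""
    for old, new in replacements.items():
        if regex:
            # In regex mode, backslashes in `new` are read as group references.
            text = re.sub(old, new, text)
        else:
            text = text.replace(old, new)
    return text


print(collections_to_update(["tasks"]))
print(apply_replacements("/scratch/user1/run", {r"/scratch/user\d+": "/project/user2"}, regex=True))
```

Keeping literal replacement as the default would leave existing calls unchanged, since keys such as "scratch/" contain no characters that need escaping only when they are not treated as patterns.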

Not my decision @janosh, @computron is the maintainer here :-)

I think adding the kwarg sounds sensible.