iterative/dvc

Allow forced freeze/unfreeze operation in dvc.

legendof-selda opened this issue · 4 comments

I have a stage that uses an environment variable to download files from a data source. I don't want this to run at times, so I would like to freeze it.
If I use dvc freeze, I get this error:

> dvc freeze data/dvc.yaml:download
ERROR: failed to freeze 'data/dvc.yaml:download' - cannot dump a parametrized stage: 'data/dvc.yaml:download'

Currently I resolve this by manually setting the freeze flag to True in the yaml file.

Can we have a -f argument so that we override the ERROR?

@legendof-selda I don't think our CLI is intended (nor would it be able) to provide a full set of operations for manipulating dvc.yaml in every possible way. dvc.yaml, like a GitHub Actions workflow file, has a complex structure, and it would be impossible to provide helpers for all cases. Users are expected to edit and modify it with their IDEs / editors.

Is there a strong reason you would prefer an additional helper like that vs just editing the file? Is it part of some automation? Can you try to do this with sed and bash, for example?

Also, please share the stage. What does it look like? How is it parametrized?

Duplicate of #6070

The freeze command should just work based on dvc.yaml rather than on the internal implementation logic of a Stage. But this would be a substantial change to the implementation of freeze.

Hey,
apologies for the delay, I was travelling.

This is the stage that I am working with.

    download_parquet:
      desc: Download the exported parquet files from azure blob storage
      cmd: >
        DATA_DIR="\${DATA_DIR:<default_path>}"
        mkdir -p data/watchtower_filtered_output &&
        AZCOPY_AUTO_LOGIN_TYPE="\${AZCOPY_AUTO_LOGIN_TYPE:-DEVICE}" azcopy sync
        https://<blob_url>$DATA_DIR --recursive=true
        --delete-destination=true data/output/; fi
      wdir: ..
      always_changed: true
      outs:
      - data/output:
          persist: true
      frozen: true

@shcheklein
Yes, currently I am using a script to toggle the frozen parameter in the YAML file. It would be better if we could do this via dvc freeze -f, though, as it's easier and would also conform to the YAML styling used by dvc. Currently, dvc freeze changes the parameter in the YAML file but raises an explicit error if it detects that the command is parameterized. Users should be given the choice to explicitly freeze/unfreeze a stage, and the error could instead be a warning. This would also provide compatibility with future releases of dvc.
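
For reference, a toggle script of the kind mentioned above could look roughly like the sketch below. It is not part of DVC and does not use any DVC API: `set_frozen` is a hypothetical helper that edits the dvc.yaml text line by line, so the rest of the file (including parametrized `${...}` expressions) is left untouched. It assumes block-style YAML indented with spaces, a stage body that contains at least one key, and a stage name that first appears as the stage header.

```python
import re


def set_frozen(text: str, stage: str, frozen: bool) -> str:
    """Return dvc.yaml text with the `frozen` key of `stage` set or updated.

    Hypothetical helper: a plain-text edit, not a YAML parser.
    """
    lines = text.splitlines()
    header = re.compile(r"^(\s*)" + re.escape(stage) + r":\s*$")
    start = next((i for i, line in enumerate(lines) if header.match(line)), None)
    if start is None:
        raise KeyError(f"stage {stage!r} not found")
    stage_indent = len(header.match(lines[start]).group(1))

    # The stage body is every following line indented deeper than the header.
    last_content = start
    for i in range(start + 1, len(lines)):
        line = lines[i]
        if not line.strip():
            continue  # blank lines do not end the block
        if len(line) - len(line.lstrip()) <= stage_indent:
            break
        last_content = i
    if last_content == start:
        raise ValueError(f"stage {stage!r} has no body")

    # Keys of the stage share the indentation of the first body line.
    first = next(l for l in lines[start + 1 : last_content + 1] if l.strip())
    key_indent = len(first) - len(first.lstrip())
    new_line = " " * key_indent + "frozen: " + ("true" if frozen else "false")

    key = re.compile(r"^ {%d}frozen:" % key_indent)
    for i in range(start + 1, last_content + 1):
        if key.match(lines[i]):
            lines[i] = new_line  # update the existing key in place
            break
    else:
        if frozen:
            # No existing key: append it at the end of the stage body.
            lines.insert(last_content + 1, new_line)

    return "\n".join(lines) + ("\n" if text.endswith("\n") else "")
```

To apply it to a file, read the text with `pathlib.Path("data/dvc.yaml").read_text()`, pass it through `set_frozen(text, "download", True)`, and write the result back. Unfreezing writes `frozen: false` rather than deleting the key, which keeps the edit minimal.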