oVirt/ovirt-imageio

ovirt-img: Machine readable progress

aesteve-rh opened this issue · 9 comments

Support machine-readable progress, usable by a program running ovirt-img.

When running ovirt-img with a command line option, it will output progress info to stdout
in jsonlines format:

{"transferred": 249561088, "size": 261095424, "elapsed": 1.234567, "description": "Copying disk"}
{"transferred": 256901120, "size": 261095424, "elapsed": 1.345678, "description": "Copying disk"}
{"transferred": 261095424, "size": 261095424, "elapsed": 1.456789, "description": "Finalizing transfer"}
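
One way a tool could emit this format is sketched below. `JsonProgress` and its methods are hypothetical names chosen for illustration, not the actual ovirt-img implementation; the sketch only shows that each update becomes exactly one JSON object on its own line, with `size` omitted until it is known.

```python
import json
import sys
import time


class JsonProgress:
    """Hypothetical reporter writing one JSON object per line (jsonlines)."""

    def __init__(self, description, size=None, output=sys.stdout):
        self.description = description
        self.size = size  # unknown until the server reports it
        self.transferred = 0
        self._output = output
        self._start = time.monotonic()

    def update(self, n):
        """Account for n more transferred bytes and emit a progress line."""
        self.transferred += n
        self.emit()

    def emit(self):
        msg = {
            "transferred": self.transferred,
            "elapsed": round(time.monotonic() - self._start, 6),
            "description": self.description,
        }
        if self.size is not None:
            msg["size"] = self.size
        # json.dumps() never emits a raw newline, so a message stays on one line.
        self._output.write(json.dumps(msg) + "\n")
        # Flush so a program reading a pipe sees the line immediately.
        self._output.flush()
```

Flushing after every line matters when stdout is a pipe, since otherwise the reader may see progress only in large buffered bursts.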

A program using ovirt-img will read stdout line by line, parsing the JSON message from each line.
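
The reading side can be sketched in a few lines. The helper names `iter_progress` and `watch` are hypothetical, and `watch` takes the complete command line as an argument because the option that enables this output is not named here.

```python
import json
import subprocess


def iter_progress(lines):
    """Yield one dict per non-empty line of jsonlines input."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        yield json.loads(line)


def watch(cmd):
    """Run cmd (the full ovirt-img command line) and yield progress messages."""
    with subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True) as proc:
        yield from iter_progress(proc.stdout)
    # Leaving the "with" block waits for the process, so returncode is set here.
    if proc.returncode != 0:
        raise subprocess.CalledProcessError(proc.returncode, cmd)
```

Because `iter_progress` accepts any iterable of lines, it can be exercised with a plain list of strings without running ovirt-img at all.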

Fields:

  • size: int64, optional. The size is known only after connecting to the imageio server. During the first few seconds, while the size is unknown, no progress value can be reported.
  • transferred: int64, required
  • elapsed: float, required
  • description: string, required
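
A consumer might model these fields as a small typed record. This is only a sketch: the `Progress` class is a hypothetical name, and the only behavior taken from the list above is that `size` is optional while the other three fields are required.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Progress:
    transferred: int
    elapsed: float
    description: str
    size: Optional[int] = None  # absent until the image size is known

    @classmethod
    def from_dict(cls, msg):
        # Required keys raise KeyError when missing; "size" may be absent.
        size = msg.get("size")
        return cls(
            transferred=int(msg["transferred"]),
            elapsed=float(msg["elapsed"]),
            description=str(msg["description"]),
            size=None if size is None else int(size),
        )
```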

MTV use case

It can work like this:

  • Add progress info to the CRD related to the import job

    Example yaml fragment:

    status:
      progress:
        transferred: 249561088
        size: 261095424
        elapsed: 1.234567
        description: "Copying disk"
    
  • The UI watches the CR during the transfer

  • The program running ovirt-img reads each progress line and updates the import CR status

  • The UI updates the progress bar based on CR change events
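
The middle step of this flow, turning progress lines into CR status updates, could look roughly like the sketch below. `status_patch`, `publish_progress`, and the `update_status` callback are hypothetical; the callback stands in for whatever client call actually patches the CR, which is not specified here.

```python
import json


def status_patch(msg):
    """Build a status fragment shaped like the yaml example above."""
    progress = {
        "transferred": msg["transferred"],
        "elapsed": msg["elapsed"],
        "description": msg["description"],
    }
    if "size" in msg:
        progress["size"] = msg["size"]
    return {"status": {"progress": progress}}


def publish_progress(lines, update_status):
    """Call update_status(patch) once per progress line read from ovirt-img."""
    for line in lines:
        if line.strip():
            update_status(status_patch(json.loads(line)))
```

For example, passing `patches.append` as `update_status` collects the patches in a list, which is a convenient way to check the mapping without a cluster.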

nirs commented

@bennyz @yaacov please check.

I think the little Go program running ovirt-img can read the json log lines from stdout
and update the progress info in the cr status. The UI can watch the cr status and show
progress.

The UI can watch the cr status and show
progress.

yes, the easiest way for the UI to track progress will be using the cr status (via some condition or a specific status field)

An actual sample of an early implementation:

./ovirt-img  download-disk -c engine --json <disk_id> download.raw
{"progress": 0, "description": "creating transfer", "elapsed": "0.00s", "data": "0 bytes", "rate": "0 bytes/s"},
{"progress": 0, "description": "downloading image", "elapsed": "1.82s", "data": "0 bytes", "rate": "0 bytes/s"},
{"progress": 0, "description": "downloading image", "elapsed": "1.87s", "data": "0 bytes", "rate": "0 bytes/s"},
{"progress": 0, "description": "downloading image", "elapsed": "1.87s", "data": "960.00 KiB", "rate": "512.84 KiB/s"},
{"progress": 2, "description": "downloading image", "elapsed": "1.87s", "data": "124.56 MiB", "rate": "66.54 MiB/s"},
{"progress": 6, "description": "downloading image", "elapsed": "1.87s", "data": "380.44 MiB", "rate": "203.22 MiB/s"},
{"progress": 10, "description": "downloading image", "elapsed": "1.87s", "data": "628.81 MiB", "rate": "335.89 MiB/s"},
...
{"progress": 98, "description": "downloading image", "elapsed": "6.99s", "data": "5.88 GiB", "rate": "861.86 MiB/s"},
{"progress": 99, "description": "downloading image", "elapsed": "7.14s", "data": "5.99 GiB", "rate": "858.72 MiB/s"},
{"progress": 100, "description": "downloading image", "elapsed": "7.15s", "data": "6.00 GiB", "rate": "858.77 MiB/s"},
{"progress": 100, "description": "finalizing transfer", "elapsed": "7.19s", "data": "6.00 GiB", "rate": "854.81 MiB/s"},
{"progress": 100, "description": "download completed", "elapsed": "16.41s", "data": "6.00 GiB", "rate": "374.40 MiB/s"},
{"progress": 100, "description": "download completed", "elapsed": "16.41s", "data": "6.00 GiB", "rate": "374.40 MiB/s"}

Printed to stderr. How does that look?

nirs commented
./ovirt-img  download-disk -c engine --json <disk_id> download.raw
{"progress": 0, "description": "creating transfer", "elapsed": "0.00s", "data": "0 bytes", "rate": "0 bytes/s"},

This is an API for a program, so we don't need human-readable size and rate. It is better to report values in bytes and bytes per second. The program consuming this info can convert
the values to human-readable form when displaying them to the user.

I suggest this format:

  • size: number of bytes to transfer, available only once the client gets the size of the remote image. (int)
  • description: description of current step (str)
  • done: number of bytes transferred (int)
  • elapsed: seconds since start (float)

The program consuming this API can compute the rate (done / elapsed) or the progress (done / size * 100). This way the program controls how values are displayed. For example, it can display the progress as an integer or float percentage, and the rate in bytes/s, KiB/s, or GiB/s depending on the value, or always in the same unit (engine style).
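
The derived values are simple arithmetic on the raw fields. The helpers below are a hypothetical sketch of what a consumer might do; they guard against a zero `elapsed` and against a `size` that is not yet known.

```python
def rate(done, elapsed):
    """Average rate in bytes per second (0.0 before any time has passed)."""
    return done / elapsed if elapsed > 0 else 0.0


def percent(done, size):
    """Progress in percent, or None while the size is unknown."""
    if size is None:
        return None
    if size == 0:
        return 100.0
    return done / size * 100


def humanize(n):
    """Format a byte count using binary units, e.g. 1536 -> '1.50 KiB'."""
    if n < 1024:
        return f"{n} bytes"
    value = float(n)
    for unit in ("KiB", "MiB", "GiB", "TiB"):
        value /= 1024
        # Stop at the first unit that fits, or at the largest unit we know.
        if value < 1024 or unit == "TiB":
            return f"{value:.2f} {unit}"
```

A consumer that prefers a fixed unit can skip `humanize` and format `rate(done, elapsed)` directly.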

nirs commented

Some notes I missed in the previous comment:

  • Output should go to stdout, not stderr
  • No "," at the end of the line, we want to use jsonlines format

nirs commented

@bennyz can you check this issue and make sure it will work for you?

Trying to use this before we release will be best.

I agree with #154 (comment)

  • use int instead of strings for bytes and time
    "data": "5.88 GiB" may be easier to parse if it's "dataBytes": 6313601925
  • "data" may also imply "totalDataToTransfer" , I prefer more verbose name maybe "transferredBytes" ?

nirs commented
  • "data" may also imply "totalDataToTransfer" , I prefer more verbose name maybe "transferredBytes" ?

This style is not consistent with the project style. We use the term "transferred"
when reporting the number of bytes transferred in GET /tickets/ticket-id. Using the same
term may work best.

The k8s resource can have more verbose names matching other names in k8s. The tool
running ovirt-img and publishing updates to the CR can translate the names.
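
Such a translation can be a plain lookup table. In this sketch only "transferredBytes" comes from the suggestion quoted above; the other CR-side names are invented for illustration.

```python
# Hypothetical mapping from ovirt-img field names to more verbose CR field names.
FIELD_NAMES = {
    "transferred": "transferredBytes",
    "size": "totalBytes",          # assumed name
    "elapsed": "elapsedSeconds",   # assumed name
    "description": "description",
}


def to_cr_fields(msg):
    """Rename known fields and drop anything the CR does not define."""
    return {FIELD_NAMES[k]: v for k, v in msg.items() if k in FIELD_NAMES}
```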

I'd vote for transferred too, partially for consistency, partially because I disfavor camel case :P

But I agree that "data" was missing information and could be misinterpreted.