what is the canonical way to mimic commands like `gpg --decrypt test.xz.gpg | xz -d`?

Question

braindevices opened this issue 2 years ago · 1 comments

I have some very large gpg files (> 20 GB), and I usually pipe the decrypted output to another program, for example gpg --decrypt test.xz.gpg | xz -d. I would like to generalize this in Python code, which requires decrypting to a file-like object that another function can then read in a different thread or subprocess. In gpgme, decrypt() accepts a sink argument, which can be any file-like object. A simple example looks like this:

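import subprocess

# ctx is assumed to be an existing gpgme context object.
with open("/tmp/test.dec", "wb") as out_f:
    # xz reads the decrypted stream on stdin and writes the result to out_f.
    p = subprocess.Popen(['xz', '-d', '-'], stdin=subprocess.PIPE, stdout=out_f)
    with open("/tmp/test.xz.gpg", "rb") as in_f:
        deret = ctx.decrypt(in_f, sink=p.stdin)
    p.stdin.close()  # signal end of input so xz can finish
p.wait()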

However, in python-gnupg I did not see any way to provide a file object as the output.

Is there a canonical way to achieve the same behaviour? I know I can use a named pipe in some cases:

import os
import subprocess

import gnupg

ctx = gnupg.GPG(gnupghome="/tmp/gpgtmp")
namedpipe = "/tmp/gpgdecrypt.pipe"
os.mkfifo(namedpipe)  # POSIX only
try:
    with open("/tmp/test.dec", "wb") as _out:
        # xz reads the decrypted stream from the named pipe.
        p = subprocess.Popen(['xz', '-d', namedpipe, '--stdout'], stdout=_out)
    with open("/tmp/test.xz.gpg", "rb") as _in:
        ctx.decrypt_file(_in, output=namedpipe)
    p.wait(timeout=10)
finally:
    os.unlink(namedpipe)

However, Windows has no mkfifo. It would be nicer if python-gnupg supported a file-like object as output, wouldn't it?

Answer 1 · 2022-10-05T22:27:34.000Z

You can set the on_data attribute of a GPG instance to a callable that gets passed chunks of gpg's output. Search the documentation for all instances of on_data to see how it is used, but the key bits are:

New in version 0.4.1: Instances of the GPG class now have an additional on_data attribute, which defaults to None. It can be set to a callable which will be called with a single argument - a binary chunk of data received from the gpg executable. The callable can do whatever it likes with the chunks passed to it - e.g. write them to a separate stream. The callable should not raise any exceptions (unless it wants the current operation to fail).

Changed in version 0.4.4: The on_data callable will now be called with an empty chunk when the data stream from gpg is exhausted. It can now also return a value: if the value False is returned, the chunk will not be buffered within python-gnupg. This might be useful if you want to do your own buffering or avoid buffering altogether. If any other value is returned (including the value None, for backward compatibility) the chunk will be buffered as normal by python-gnupg.
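
As a rough sketch of how the on_data hook described above could replace the shell pipeline from the question: the callable below forwards each chunk to a child process's stdin, closes stdin when the empty end-of-stream chunk arrives, and returns False so that python-gnupg does not keep the plaintext in memory. ProcessSink and decrypt_through are hypothetical names made up for this illustration, not part of python-gnupg; the sketch assumes python-gnupg 0.4.4 or later and, for the usage shown afterwards, an xz executable on PATH.

```python
import subprocess


class ProcessSink:
    """Hypothetical helper: a callable that feeds chunks to a child process."""

    def __init__(self, argv, stdout):
        # The child reads the decrypted stream on stdin and writes to `stdout`.
        self.proc = subprocess.Popen(argv, stdin=subprocess.PIPE, stdout=stdout)

    def __call__(self, chunk):
        if chunk:
            self.proc.stdin.write(chunk)
        else:
            # Empty chunk: the data stream from gpg is exhausted (0.4.4+).
            self.proc.stdin.close()
        # Returning False asks python-gnupg not to buffer the chunk itself.
        return False

    def wait(self):
        # Close stdin ourselves if the end-of-stream chunk never arrived.
        if not self.proc.stdin.closed:
            self.proc.stdin.close()
        return self.proc.wait()


def decrypt_through(gpg, encrypted_path, argv, output_path):
    """Decrypt encrypted_path and stream the plaintext through argv.

    `gpg` is assumed to be a gnupg.GPG instance (anything with an on_data
    attribute and a decrypt_file(fileobj) method works for this sketch).
    """
    with open(output_path, "wb") as out, open(encrypted_path, "rb") as src:
        sink = ProcessSink(argv, out)
        gpg.on_data = sink
        try:
            result = gpg.decrypt_file(src)
        finally:
            gpg.on_data = None  # restore the default for later operations
            returncode = sink.wait()
    return result, returncode
```

With these assumptions, the pipeline from the question would look something like decrypt_through(gnupg.GPG(gnupghome="/tmp/gpgtmp"), "/tmp/test.xz.gpg", ["xz", "-d", "-"], "/tmp/test.dec"). Because the callable returns False, the returned result object should not hold the decrypted data, so check its status fields and the child's return code rather than its data. This approach needs no named pipe, so it should not depend on mkfifo being available.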