python vs. python3 in line 96 of /detector/server.py
AndrewBarfield opened this issue · 5 comments
This is a simple problem. Just posting so others are aware.
To get the Web-based GPT-2 Output Detector to work I had to change "python" to "python3" in line 96 of /detector/server.py. See:
gpt-2-output-dataset/detector/server.py, line 96 at commit 12459ab
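The change amounts to the following (the original line also appears verbatim in the traceback below):

# before
num_workers = int(subprocess.check_output(['python', '-c', 'import torch; print(torch.cuda.device_count())']))
# after
num_workers = int(subprocess.check_output(['python3', '-c', 'import torch; print(torch.cuda.device_count())']))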
System:
OS: Ubuntu 19.10 eoan
Kernel: x86_64 Linux 5.3.0-19-generic
Uptime: 13d 6h 1m
Packages: 2125
Shell: bash 5.0.3
Resolution: 2560x1440
DE: GNOME
WM: GNOME Shell
WM Theme: Adwaita
GTK Theme: Yaru-dark [GTK2/3]
Icon Theme: Yaru
Font: Ubuntu 11
CPU: Intel Core i7-8809G @ 8x 4.2GHz [27.8°C]
GPU: AMD VEGAM (DRM 3.33.0, 5.3.0-19-generic, LLVM 9.0.0)
RAM: 6278MiB / 32035MiB
Behavior before the change:
~/Projects/AI/gpt-2-output-dataset/detector$ python3 -m server detector-large.pt
Loading checkpoint from detector-large.pt
Starting HTTP server on port 8080
Traceback (most recent call last):
File "", line 1, in
ImportError: No module named torch
Traceback (most recent call last):
File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"main", mod_spec)
File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/drew/Projects/AI/gpt-2-output-dataset/detector/server.py", line 120, in
fire.Fire(main)
File "/home/drew/.local/lib/python3.7/site-packages/fire/core.py", line 138, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/home/drew/.local/lib/python3.7/site-packages/fire/core.py", line 471, in _Fire
target=component.__name__)
File "/home/drew/.local/lib/python3.7/site-packages/fire/core.py", line 675, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "/home/drew/Projects/AI/gpt-2-output-dataset/detector/server.py", line 96, in main
num_workers = int(subprocess.check_output(['python', '-c', 'import torch; print(torch.cuda.device_count())']))
File "/usr/lib/python3.7/subprocess.py", line 411, in check_output
**kwargs).stdout
File "/usr/lib/python3.7/subprocess.py", line 512, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['python', '-c', 'import torch; print(torch.cuda.device_count())']' returned non-zero exit status 1.
Behavior after the change (as expected):
~/Projects/AI/gpt-2-output-dataset/detector$ python3 -m server detector-large.pt
Loading checkpoint from detector-large.pt
Starting HTTP server on port 8080
[] Process has started; loading the model ...
[] Ready to serve
[] "GET / HTTP/1.1" 200 -
[] "GET /favicon.ico HTTP/1.1" 200 -
[] "GET /?This%20is%20an%20online%20demo%20of%20the%20GPT-2%20output%20detector%20model.%20Enter%20some%20text%20in%20the%20text%20box;%20the%20predicted%20probabilities%20will%20be%20displayed%20below.%20The%20results%20start%20to%20get%20reliable%20after%20around%2050%20tokens. HTTP/1.1" 200 -
Good catch - thanks! I'll replace that with sys.executable so that it is not dependent on the executable name.
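A minimal sketch of that fix, assuming the rest of main() stays the same:

import subprocess
import sys

# Use the interpreter that is running the server, rather than
# whatever "python" happens to resolve to on PATH.
num_workers = int(subprocess.check_output(
    [sys.executable, '-c', 'import torch; print(torch.cuda.device_count())']))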
@jongwook even when using sys.executable, it will not work in the case of virtualenvs and aliases. On macOS you will get python, while I'm running with python3. Unfortunately we cannot use argv[0] in this case...
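A quick way to see what sys.executable resolves to in a given environment:

import sys

# Prints the full path of the running interpreter; in a virtualenv this
# points inside the venv, and the binary name may be python or python3.
print(sys.executable)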
Hmm.. I haven't thought about the case of virtualenv; in the conda environments that we're using, the executable has always been python.
I assume you're not doing multi-GPU training since you're on a Mac, so you may simply use:
if torch.cuda.is_available():
    num_workers = int(torch.cuda.device_count())
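In context, that could look something like this (a sketch; get_num_workers is a hypothetical helper, not in the repo):

import torch

def get_num_workers() -> int:
    # On CUDA machines, report the device count; on CPU-only machines
    # such as macOS, fall back to a single worker.
    if torch.cuda.is_available():
        return torch.cuda.device_count()
    return 1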
The whole subprocess fiddle was to avoid a CUDA error that may happen in multi-process multi-GPU training (see #13 for details).
@jongwook ah yes, that's one way to do it, thank you!
Question: does num_workers refer to GPUs only? I mean, in TF I can do something like:
import os
import multiprocessing
import tensorflow as tf

N_CPU = multiprocessing.cpu_count()

# OMP_NUM_THREADS controls MKL's intra-op parallelization;
# default to the number of available cores.
os.environ['OMP_NUM_THREADS'] = str(max(1, N_CPU))

config = tf.ConfigProto(
    device_count={'GPU': 1, 'CPU': N_CPU},
    intra_op_parallelism_threads=0,
    inter_op_parallelism_threads=N_CPU,
    allow_soft_placement=True,
)
config.gpu_options.allow_growth = True
config.gpu_options.per_process_gpu_memory_fraction = 0.6
so that I can use at least 8-core parallelism on macOS, etc.
Yeah, in single-node CPU training you shouldn't need to do multiprocessing, since the multithreading capability of the OMP/MKL backend should be sufficient.
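For example, a minimal sketch of controlling those threads directly in PyTorch (the count of 8 is just an illustrative value):

import torch

# Intra-op parallelism: size of the OMP/MKL thread pool used inside ops.
torch.set_num_threads(8)
# Inter-op parallelism: must be set before any parallel work starts.
torch.set_num_interop_threads(8)
print(torch.get_num_threads())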