How to use webscreenshot from inside a python script?
The documentation states:
pip install webscreenshot and then directly use webscreenshot
How does one directly use webscreenshot?
My python script contains:
import webscreenshot
Now, how do I call webscreenshot directly from the script? The documentation doesn't provide any examples. It does for calling the script from the commandline and passing arguments, but I want to call it directly from inside my python script.
webscreenshot.take_screenshot(list_of_urls)
doesn't seem to work.
Hello,
You do indeed need to call that function.
But first you need a proper options variable with all of its parameters set: launch the tool with the -vv option and you will see the structure of that variable.
Cheers.
Hello,
Here is a more precise answer:
import argparse
from webscreenshot.webscreenshot import *
# url list to screenshot
url_list = ['http://google.fr', 'http://google.com']
# defining options manually
options = argparse.Namespace(URL=None, cookie=None, header=None, http_password=None, http_username=None, input_file=None, log_level='DEBUG', multiprotocol=False, no_xserver=False, output_directory='/tmp/screenshots', port=None, proxy=None, proxy_auth=None, proxy_type=None, renderer='phantomjs', renderer_binary=None, ssl=False, timeout=30, verbosity=2, window_size='1200,800', workers=4)
# actually launching the function
take_screenshot(url_list, options)
I admit that this use case deserves a better approach.
Cheers
I'm getting this error when using the above code snippet:
[+] 2 URLs to be screenshot
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/usr/lib/python3.8/multiprocessing/pool.py", line 125, in worker
result = (True, func(*args, **kwds))
File "/usr/lib/python3.8/site-packages/webscreenshot/webscreenshot.py", line 421, in craft_cmd
output_format = options.format if options.renderer == 'phantomjs' else 'png'
AttributeError: 'Namespace' object has no attribute 'format'
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/aditya/GIT/Web/test.py", line 11, in <module>
take_screenshot(url_list, options)
File "/usr/lib/python3.8/site-packages/webscreenshot/webscreenshot.py", line 525, in take_screenshot
taken_screenshots = [r for r in pool.imap(func=craft_cmd, iterable=izip(url_list, itertools.repeat(options)))]
File "/usr/lib/python3.8/site-packages/webscreenshot/webscreenshot.py", line 525, in <listcomp>
taken_screenshots = [r for r in pool.imap(func=craft_cmd, iterable=izip(url_list, itertools.repeat(options)))]
File "/usr/lib/python3.8/multiprocessing/pool.py", line 865, in next
raise value
AttributeError: 'Namespace' object has no attribute 'format'
@ss2sfcollege, have you followed the instructions?
If so, that is odd, as the format option is declared in the code sample.
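The AttributeError above means the Namespace handed to take_screenshot has no format attribute. Here is a minimal sketch of one way to catch that before calling the library: build the Namespace from a dictionary of defaults and check for missing names. DEFAULT_OPTIONS, make_options and missing_options are hypothetical helpers, not part of webscreenshot. The 'png' value for format is an assumption, and the defaults repeat only a few of the fields from the snippet, so this is not a complete option set.

```python
import argparse

# Hypothetical defaults: only a subset of the fields used in this thread.
# 'format' is the attribute the traceback reports as missing; 'png' is an
# assumed value, not one confirmed by the webscreenshot documentation.
DEFAULT_OPTIONS = {
    'renderer': 'phantomjs',
    'timeout': 30,
    'workers': 4,
    'window_size': '1200,800',
    'format': 'png',
}


def make_options(**overrides):
    # Merge caller overrides on top of the defaults and wrap them in a Namespace.
    merged = dict(DEFAULT_OPTIONS)
    merged.update(overrides)
    return argparse.Namespace(**merged)


def missing_options(options, required):
    # Return the names in `required` that the Namespace does not define.
    return [name for name in required if not hasattr(options, name)]


options = make_options(timeout=60)
print(options.format, options.timeout)

# A Namespace built by hand without 'format' is reported as incomplete.
incomplete = argparse.Namespace(renderer='phantomjs')
print(missing_options(incomplete, ['renderer', 'format']))
```

Checking with missing_options before the call turns a failure deep inside a worker process into an explicit list of the fields still to be filled in.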
Hello,
I get the following error when I run the above program:
C:\Users\sandeep\PycharmProjects\sparkflow_validation\venv\Scripts\python.exe C:/Users/sandeep/PycharmProjects/sparkflow_validation/take_screenshot.py
[+] 2 URLs to be screenshot
[+] 2 URLs to be screenshot
[+] 2 URLs to be screenshot
[+] 2 URLs to be screenshot
[+] 2 URLs to be screenshot
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Users\sandeep\AppData\Local\Programs\Python\Python38-32\lib\multiprocessing\spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "C:\Users\sandeep\AppData\Local\Programs\Python\Python38-32\lib\multiprocessing\spawn.py", line 125, in _main
prepare(preparation_data)
File "C:\Users\sandeep\AppData\Local\Programs\Python\Python38-32\lib\multiprocessing\spawn.py", line 236, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "C:\Users\sandeep\AppData\Local\Programs\Python\Python38-32\lib\multiprocessing\spawn.py", line 287, in _fixup_main_from_path
main_content = runpy.run_path(main_path,
File "C:\Users\sandeep\AppData\Local\Programs\Python\Python38-32\lib\runpy.py", line 262, in run_path
return _run_module_code(code, init_globals, run_name,
File "C:\Users\sandeep\AppData\Local\Programs\Python\Python38-32\lib\runpy.py", line 95, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "C:\Users\sandeep\AppData\Local\Programs\Python\Python38-32\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "C:\Users\sandeep\PycharmProjects\sparkflow_validation\take_screenshot.py", line 11, in <module>
take_screenshot(url_list, options)
File "C:\Users\sandeep\PycharmProjects\sparkflow_validation\venv\lib\site-packages\webscreenshot\webscreenshot.py", line 523, in take_screenshot
pool = multiprocessing.Pool(processes=int(options.workers), initializer=init_worker)
File "C:\Users\sandeep\AppData\Local\Programs\Python\Python38-32\lib\multiprocessing\context.py", line 119, in Pool
return Pool(processes, initializer, initargs, maxtasksperchild,
File "C:\Users\sandeep\AppData\Local\Programs\Python\Python38-32\lib\multiprocessing\pool.py", line 212, in __init__
self._repopulate_pool()
File "C:\Users\sandeep\AppData\Local\Programs\Python\Python38-32\lib\multiprocessing\pool.py", line 303, in _repopulate_pool
return self._repopulate_pool_static(self._ctx, self.Process,
File "C:\Users\sandeep\AppData\Local\Programs\Python\Python38-32\lib\multiprocessing\pool.py", line 326, in _repopulate_pool_static
w.start()
File "C:\Users\sandeep\AppData\Local\Programs\Python\Python38-32\lib\multiprocessing\process.py", line 121, in start
self._popen = self._Popen(self)
File "C:\Users\sandeep\AppData\Local\Programs\Python\Python38-32\lib\multiprocessing\context.py", line 326, in _Popen
return Popen(process_obj)
File "C:\Users\sandeep\AppData\Local\Programs\Python\Python38-32\lib\multiprocessing\popen_spawn_win32.py", line 45, in __init__
prep_data = spawn.get_preparation_data(process_obj._name)
File "C:\Users\sandeep\AppData\Local\Programs\Python\Python38-32\lib\multiprocessing\spawn.py", line 154, in get_preparation_data
_check_not_importing_main()
File "C:\Users\sandeep\AppData\Local\Programs\Python\Python38-32\lib\multiprocessing\spawn.py", line 134, in _check_not_importing_main
raise RuntimeError('''
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
The same error repeats in a loop and the program never terminates.
@poornasandeep, can you paste the code you are using to call webscreenshot here?
@maaaaz I am getting the same error as @poornasandeep. Here is the code I am using (taken from the FAQ):
import argparse
from webscreenshot.webscreenshot import *
url_list = ['http://google.com']
options = argparse.Namespace(URL=None, cookie=None, header=None, http_password=None, http_username=None, input_file=None, log_level='DEBUG', multiprotocol=False, no_xserver=False, output_directory='./screenshots', port=None, proxy=None, proxy_auth=None, proxy_type=None, renderer='phantomjs', renderer_binary=None, ssl=False, timeout=30, verbosity=2, window_size='1200,800', workers=4)
take_screenshot(url_list, options)
It does not terminate and keeps printing [+] 1 URLs to be screenshot forever.
I am using Python 3.8.3 on Windows 10 (2004 update), with version 2.92 of the webscreenshot package.
Here is the stack trace:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Program Files\Python38\lib\multiprocessing\spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "C:\Program Files\Python38\lib\multiprocessing\spawn.py", line 125, in _main
prepare(preparation_data)
File "C:\Program Files\Python38\lib\multiprocessing\spawn.py", line 236, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "C:\Program Files\Python38\lib\multiprocessing\spawn.py", line 287, in _fixup_main_from_path
main_content = runpy.run_path(main_path,
File "C:\Program Files\Python38\lib\runpy.py", line 265, in run_path
return _run_module_code(code, init_globals, run_name,
File "C:\Program Files\Python38\lib\runpy.py", line 97, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "C:\Program Files\Python38\lib\runpy.py", line 87, in _run_code
exec(code, run_globals)
File "d:\upwork\Nikhil Parekh\SMTP\mail with html\utilities.py", line 11, in <module>
take_screenshot(url_list, options)
File "C:\Users\yusuf\AppData\Roaming\Python\Python38\site-packages\webscreenshot\webscreenshot.py", line 535, in take_screenshot
pool = multiprocessing.Pool(processes=int(options.workers), initializer=init_worker)
File "C:\Program Files\Python38\lib\multiprocessing\context.py", line 119, in Pool
return Pool(processes, initializer, initargs, maxtasksperchild,
File "C:\Program Files\Python38\lib\multiprocessing\pool.py", line 212, in __init__
self._repopulate_pool()
File "C:\Program Files\Python38\lib\multiprocessing\pool.py", line 303, in _repopulate_pool
return self._repopulate_pool_static(self._ctx, self.Process,
File "C:\Program Files\Python38\lib\multiprocessing\pool.py", line 326, in _repopulate_pool_static
w.start()
File "C:\Program Files\Python38\lib\multiprocessing\process.py", line 121, in start
self._popen = self._Popen(self)
File "C:\Program Files\Python38\lib\multiprocessing\context.py", line 326, in _Popen
return Popen(process_obj)
File "C:\Program Files\Python38\lib\multiprocessing\popen_spawn_win32.py", line 45, in __init__
prep_data = spawn.get_preparation_data(process_obj._name)
File "C:\Program Files\Python38\lib\multiprocessing\spawn.py", line 154, in get_preparation_data
_check_not_importing_main()
File "C:\Program Files\Python38\lib\multiprocessing\spawn.py", line 134, in _check_not_importing_main
raise RuntimeError('''
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
Thanks for reporting. This seems related to the way Python 3.8 now behaves with multiprocessing.
I think the pool creation (the multiprocessing.Pool(...) call in take_screenshot) should be moved into the main() function, as suggested for similar cases.
In the meantime, try running your code with Python 3.7 instead of 3.8.
I confirm the bug. I tried to fix it but have failed so far in the face of this madness.
I do understand the technical reasons, but I regret that users calling webscreenshot from their own scripts will have to handle multiprocessing themselves instead of webscreenshot doing it for them.
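The RuntimeError above names the idiom itself: guard the entry point of the main module with if __name__ == '__main__':. Below is a minimal sketch of that idiom using a plain multiprocessing pool rather than webscreenshot. square and main are hypothetical names. The assumption is that, in a calling script, the same guard would go around the take_screenshot(url_list, options) call, which has not been verified here against every webscreenshot version.

```python
import multiprocessing


def square(x):
    # Worker function; it lives at module level so child processes can import it.
    return x * x


def main():
    # Creating the pool inside main() keeps it out of module import time.
    with multiprocessing.Pool(processes=2) as pool:
        print(pool.map(square, [1, 2, 3]))


if __name__ == '__main__':
    # With the "spawn" start method (the default on Windows) every child
    # re-imports this file. The guard stops the children from creating pools
    # of their own, which is the situation the RuntimeError complains about.
    multiprocessing.freeze_support()  # only needed for frozen executables
    main()
```

Without the guard, each spawned child would run the module's top-level code again, which matches the repeated "[+] ... URLs to be screenshot" lines reported earlier in this thread.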
An alternative is to run it as a subprocess, which works fine for me on Python 3.8:
import subprocess
# Pass the command as an argument list so it also runs without a shell on non-Windows systems.
subprocess.run(['webscreenshot', 'google.com', '--window-size', '800,600'])
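The subprocess approach can be made a little more defensive. The sketch below checks that the webscreenshot executable is on PATH, passes the arguments as a list, and surfaces a non-zero exit status or a timeout. build_command and run_webscreenshot are hypothetical helper names, and the 120-second timeout is an arbitrary assumption. The only CLI flag used is --window-size, taken from the command above.

```python
import shutil
import subprocess


def build_command(url, window_size='800,600'):
    # Argument-list form: no shell quoting needed, works on Windows and POSIX.
    return ['webscreenshot', url, '--window-size', window_size]


def run_webscreenshot(url, window_size='800,600', timeout=120):
    # Fail early with a clear message if the CLI is not on PATH.
    if shutil.which('webscreenshot') is None:
        raise FileNotFoundError('webscreenshot executable not found on PATH')
    # check=True raises CalledProcessError on a non-zero exit status;
    # timeout raises TimeoutExpired if the tool hangs.
    return subprocess.run(
        build_command(url, window_size),
        check=True,
        capture_output=True,
        text=True,
        timeout=timeout,
    )


if __name__ == '__main__':
    try:
        result = run_webscreenshot('google.com')
        print(result.stdout)
    except (FileNotFoundError,
            subprocess.CalledProcessError,
            subprocess.TimeoutExpired) as exc:
        print(f'webscreenshot failed: {exc}')
```

Because the tool runs in its own process, the calling script never creates a multiprocessing pool itself, which is why this route avoids the bootstrapping error discussed above.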