uncleguanghui/pyflink_learn

pyflink 关于任务执行的问题

Closed this issue · 1 comments

pyflink任务的py脚本启动执行方式很少有文档提及,自己摸索感觉到不太对,想请大佬们帮忙看看,在此谢过

环境:
flink==1.15
python==3.7
apache-flink==1.15

问题1:使用flink run执行py脚本时无法指定python环境
官网执行py任务命令:./bin/flink run --python examples/python/table/word_count.py
但是python项目都是有自己的virtualenv,看官方文档没有相关指定python环境的参数,是pyflink不支持吗?

问题2:直接使用python来执行py脚本可以成功,并看到输出,但是这种方式可行吗?
直接使用虚拟环境的python来执行py脚本可以成功执行,但是在这个过程中没有任何地方去指定flink,在flink中也没有看到任务的信息,感觉是这个py任务脱离了flink

问题3:windows环境执行py脚本,在创建环境时就报错,但是没有看到相关讨论的文章
步骤:直接使用python来运行py脚本

Traceback (most recent call last):
  File "F:/PyDev/exercise/DB/pyflinker.py", line 14, in <module>
    stream_execution_environment=StreamExecutionEnvironment.get_execution_environment(),
  File "D:\.env\py37_data\lib\site-packages\pyflink\datastream\stream_execution_environment.py", line 802, in get_execution_environment
    gateway = get_gateway()
  File "D:\.env\py37_data\lib\site-packages\pyflink\java_gateway.py", line 62, in get_gateway
    _gateway = launch_gateway()
  File "D:\.env\py37_data\lib\site-packages\pyflink\java_gateway.py", line 106, in launch_gateway
    p = launch_gateway_server_process(env, args)
  File "D:\.env\py37_data\lib\site-packages\pyflink\pyflink_gateway_server.py", line 326, in launch_gateway_server_process
    stdin=PIPE, preexec_fn=preexec_fn, env=env)
  File "C:\Program Files\Python37\lib\subprocess.py", line 800, in __init__
    restore_signals, start_new_session)
  File "C:\Program Files\Python37\lib\subprocess.py", line 1207, in _execute_child
    startupinfo)
FileNotFoundError: [WinError 2] 系统找不到指定的文件。

指定python环境的参数找到了 ( ̄▽ ̄*)

## 指定python环境
# ./bin/flink  run  --python examples/python/table/word_count.py  -pyexec ./venv/bin/python3.7

#  Python Table API  指定python环境
table_env.get_config().set_python_executable("./venv/bin/python3.7")

# #  Python DataStream API   指定python环境
# stream_execution_environment.set_python_executable("./venv/bin/python3.7")