Bad Performance using Python
raeudigerRaeffi opened this issue · 1 comment
Hi, we are using a self-hosted version of Piston and we encountered some major limitations with regard to runtime for Python. Given that Piston advertises itself as efficient and fast, I assume the issue is with us and not with the software.
Our setup is the following:
We use the Piston Docker image with the CLI to install Python. Then we run sudo /piston/packages/python/3.12.0/bin/pip3 install statsmodels plotly plotly-express scikit-learn
to install custom libraries.
The following environment variables are set:
- PISTON_RUN_TIMEOUT=80000
- PISTON_STDERR_LENGTH=800000
- PISTON_MAX_PROCESS_COUNT=124
- PISTON_MAX_FILE_SIZE=100000
- PISTON_OUTPUT_MAX_SIZE=250000
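For reference, a minimal sketch of how these variables might be passed to the container; the image name and port mapping here are assumptions, adjust them to your actual deployment:

```shell
# Hypothetical invocation -- substitute your image tag and port mapping.
docker run \
  -e PISTON_RUN_TIMEOUT=80000 \
  -e PISTON_STDERR_LENGTH=800000 \
  -e PISTON_MAX_PROCESS_COUNT=124 \
  -e PISTON_MAX_FILE_SIZE=100000 \
  -e PISTON_OUTPUT_MAX_SIZE=250000 \
  -p 2000:2000 \
  ghcr.io/engineer-man/piston
```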
Using this setup, the code displayed below takes around 20 secs to execute for 50 data points in os.environ["data"] (on my machine it takes less than a second).
import os
import json
import pandas as pd
import plotly
import numpy as np
import plotly.express as px
data = json.loads(os.environ["data"])
df = pd.DataFrame(data)
df['order_date'] = pd.to_datetime(df['order_date'], format='%d/%m/%Y %H:%M')
fig = px.scatter(df, x='order_date', y='sales', trendline='ols')
graph_json = plotly.io.to_json(fig)
print({"type": "plot", "variable": graph_json})
Are you running Piston on the same system as your local test? That could be one factor in the slow performance, though it shouldn't have too large an impact.
I'm thinking this might have to do with Python not caching .pyc files for these libraries.
This is by design to ensure complete isolation of code with no persistent files across runs.
I would try seeing which lines of code are causing the performance bottleneck. My bet would be on one of the imports.
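To narrow it down, here is a small sketch that times each import individually; the module names in the list are stdlib placeholders, so substitute the heavy ones from the script (pandas, plotly.express, etc.) when running it inside Piston:

```python
import importlib
import time


def time_import(module_name):
    """Return the wall-clock seconds spent importing a module.

    If the module (or its .pyc cache) is already loaded, this measures
    the warm path; in a fresh Piston run every import is cold.
    """
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start


# Placeholders -- swap in "pandas", "plotly", "plotly.express", etc.
for name in ["json", "os"]:
    print(f"{name}: {time_import(name):.3f}s")
```

CPython also ships a built-in import profiler: running `python -X importtime script.py` prints a per-module import timing breakdown to stderr, which would show directly whether the imports dominate the 20 seconds.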