jupyter-server/jupyter-scheduler

Scheduled job executing more than 1 time

TallGibbs opened this issue · 27 comments

Description

I scheduled a script to run at 12:00 PM on each weekday. When the next scheduled time came around, the job appears to have executed 3 separate times, with the run times overlapping each other. I noticed this because my script is set up to send an automatic email, and I received 3 copies of the email, each with different results from executing my script. Then I checked the JupyterLab "notebook job" tab and saw 3 results of "completed" for the same script. However, in my "notebook job definitions" area, the script was only scheduled 1 time.

I'm wondering if the environment selected to run the job in affected this. I scheduled it to run in the "base" environment (anaconda3) but I actually built the script in "projects" environment. Also, I had previously scheduled this job, but then made revisions to the script. So I deleted the scheduled job, and then scheduled it again. Could this have affected the results?

image

Here is my revised scheduled job (sorry, this screenshot is taken from AFTER the original issue showed up and I made a new schedule before opening this ticket).
image

Reproduce

I'm not actually sure how to reproduce the issue. All I did was schedule the script to run and this issue happened.

Expected behavior

I wanted the job to only run 1 time at the scheduled time. My script includes an automatic email sent out and I expected to only see 1 email and 1 completed job in the "completed job" tab.

Context

  • Operating System and version:
    Browser = Chrome
  • Jupyter Server version: 2.10.0
  • Jupyter Lab version: 4.0.8
  • Jupyter Notebook version: 7.0.6
Troubleshoot Output
Paste the output from running `jupyter troubleshoot` from the command line here.
You may want to sanitize the paths in the output.

(I tried to paste my entire result but I got an error that body is too long). I have manually trimmed some info out of this result.

(base) C:>jupyter troubleshoot
$PATH:

sys.path:
C:\Users\XXXXX\AppData\Local\anaconda3\Scripts
C:\Users\XXXXX\AppData\Local\anaconda3\python311.zip
C:\Users\XXXXX\AppData\Local\anaconda3\DLLs
C:\Users\XXXXX\AppData\Local\anaconda3\Lib
C:\Users\XXXXX\AppData\Local\anaconda3
C:\Users\XXXXX\AppData\Roaming\Python\Python311\site-packages
C:\Users\XXXXX\AppData\Roaming\Python\Python311\site-packages\win32
C:\Users\XXXXX\AppData\Roaming\Python\Python311\site-packages\win32\lib
C:\Users\XXXXX\AppData\Roaming\Python\Python311\site-packages\Pythonwin
C:\Users\XXXXX\AppData\Local\anaconda3\Lib\site-packages
C:\Users\XXXXX\AppData\Local\anaconda3\Lib\site-packages\win32
C:\Users\XXXXX\AppData\Local\anaconda3\Lib\site-packages\win32\lib
C:\Users\XXXXX\AppData\Local\anaconda3\Lib\site-packages\Pythonwin

sys.executable:
C:\Users\XXXXX\AppData\Local\anaconda3\python.exe

sys.version:
3.11.5 | packaged by Anaconda, Inc. | (main, Sep 11 2023, 13:26:23) [MSC v.1916 64 bit (AMD64)]

platform.platform():
Windows-10-10.0.19045-SP0

where jupyter:
C:\Users\XXXXX\AppData\Local\anaconda3\Scripts\jupyter.exe
Jupyter Troubleshoot Results.docx


This job ran from its daily schedule and now I received 4 emails and it appears the job executed 4 times today (1 more time than yesterday). Is this going to continue to increase each day by 1?

Here you can see the notebook jobs completed:
image

Here is the current scheduled job:
image

Hi @TallGibbs. Thank you for opening this issue and proactively providing jupyter troubleshoot output, this is very helpful.

Could you please send a screenshot of "Run on a schedule" section of the Create Job screen for the job that this happened/happening with? For the sake of reproduction I'd like to understand if you define the schedule via Interval > Day or Interval > Custom schedule > cron expression option.

Screenshot 2024-01-25 at 11 18 26 AM

Here you go!

Creating the schedule:
image

Final output of the scheduled job from the "Job Definition" page:
image

Thank you for the details @TallGibbs. I was not able to reproduce this through clock manipulation, so I am now trying to reproduce it "organically" by waiting for 12 PM.

@andrii-i Today the job completed two times. However, yesterday afternoon I re-installed jupyter notebook version 6.5.4 through Anaconda Navigator GUI, then updated (back) to 7.0.6 and then deleted all my jobs in Jupyter lab. Then I scheduled this job again and now it resulted in the job running twice:

image

Here are my scheduled jobs:
image

I'm not going to touch anything today on the job scheduler and see what happens on Saturday, Sunday and Monday. It should NOT run on the weekend and then should run again on Monday.
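For reference, the expected behavior of a weekday-noon schedule can be sketched in plain Python. This is only an illustration of the intended timing, not how Jupyter Scheduler computes run times internally; the function name `next_weekday_noon` is made up for this example. In standard cron syntax, such a schedule would typically be written as something like `0 12 * * 1-5`.

```python
from datetime import datetime, time, timedelta


def next_weekday_noon(after: datetime) -> datetime:
    """Return the first Monday-to-Friday 12:00 that is strictly after `after`."""
    candidate = datetime.combine(after.date(), time(12, 0))
    if candidate <= after:
        # Today's noon has already passed, so start from tomorrow.
        candidate += timedelta(days=1)
    # datetime.weekday(): Monday is 0, Saturday is 5, Sunday is 6.
    while candidate.weekday() >= 5:
        candidate += timedelta(days=1)
    return candidate


# Friday 2024-01-26 at 12:30: the weekend is skipped, next run is Monday.
print(next_weekday_noon(datetime(2024, 1, 26, 12, 30)))  # 2024-01-29 12:00:00
```

Under this model there is exactly one run per weekday and none on Saturday or Sunday, so any additional runs at the same time must come from somewhere other than the schedule itself.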

I set up a daily job run on my Windows 10 machine that has run exactly once per day, every day, since I first configured it.

If you run jupyter server list on a command prompt in your same Conda environment, how many entries do you see under "Currently running servers"? If you have multiple Jupyter Server instances running, that might be related to this.
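If counting the entries by eye is awkward, the check could be scripted. The following is a minimal sketch that assumes `jupyter server list --json` prints one JSON object per running server, one per line; the helper names and the sample values (ports, PIDs, URLs) are hypothetical and only there to make the example self-contained.

```python
import json
import subprocess


def parse_server_list(output: str) -> list[dict]:
    """Parse the line-per-server JSON output into a list of dicts."""
    servers = []
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines and any non-JSON banner text.
        if line.startswith("{"):
            servers.append(json.loads(line))
    return servers


def running_servers() -> list[dict]:
    """Ask the `jupyter` CLI on PATH for its running servers (not called below)."""
    result = subprocess.run(
        ["jupyter", "server", "list", "--json"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_server_list(result.stdout)


# Hypothetical sample output, used instead of a live call.
sample_output = "\n".join(
    [
        json.dumps({"port": 8888, "url": "http://localhost:8888/", "pid": 1111}),
        json.dumps({"port": 8889, "url": "http://localhost:8889/", "pid": 2222}),
    ]
)
servers = parse_server_list(sample_output)
print(len(servers), "server(s) running on ports", [s["port"] for s in servers])
```

Calling `running_servers()` in the same Conda environment would give the live count; anything greater than 1 is worth investigating.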

Here is the update for today: on Saturday and Sunday the job did NOT run (which is a good thing; only scheduled Mon through Fri). Today though (Monday), the job ran 3 times which is once more than what happened on Friday.

Somehow the scheduler is running the job once more each time? Is there any way that my code can be causing this? I'm assuming NO because I see on Jupyter Lab's "Notebook jobs" tab indicating the whole job ran (and not that my code just looped through it).

Here is the job's result from today:
image

I'm going to leave everything the same for tomorrow and see if it runs 4 times. Then, I'm going to make a new schedule that is NOT at 12:00 PM and see if that is related to the issue. @andrii-i

Thanks! For what it's worth, my schedule was at 8:00 am, and the run time typically started within 10 seconds of the hour.

> I set up a daily job run on my Windows 10 machine that has run exactly once per day, every day, since I first configured it.
>
> If you run jupyter server list on a command prompt in your same Conda environment, how many entries do you see under "Currently running servers"? If you have multiple Jupyter Server instances running, that might be related to this.

I just ran that code on my activated environment and I see 5 servers running (I'm not sure if it is safe to post screenshot of full server addresses). It may be confusing though because I also currently have Jupyter Notebook open with some code I am working on, while also running Jupyter Lab.

I'm guessing a better test is to close everything down, then open just an Anaconda Prompt and run that code (after activating my environment)?

If you have multiple Jupyter Servers open with the Jupyter Scheduler server extension running, that might be related to the behavior you're seeing. If you keep only one such server running, does that fix the problem?

(This may be an enhancement opportunity to run jobs only once, even if multiple servers are running.)
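One way such an enhancement could work, sketched here purely as an illustration and not as anything Jupyter Scheduler currently implements: before creating a job for a scheduled run, each server tries to insert a claim row into a shared database, and a uniqueness constraint lets only the first server succeed. The table name, column names, and server IDs below are all hypothetical.

```python
import os
import sqlite3
import tempfile


def open_claims(path: str) -> sqlite3.Connection:
    """Open the shared database and make sure the claims table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS run_claims ("
        " definition_id TEXT NOT NULL,"
        " scheduled_for TEXT NOT NULL,"
        " claimed_by TEXT NOT NULL,"
        " PRIMARY KEY (definition_id, scheduled_for))"
    )
    conn.commit()
    return conn


def try_claim(conn: sqlite3.Connection, definition_id: str,
              scheduled_for: str, server_id: str) -> bool:
    """Return True only for the first server to claim this scheduled run."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO run_claims VALUES (?, ?, ?)",
                (definition_id, scheduled_for, server_id),
            )
        return True
    except sqlite3.IntegrityError:
        # Another server already holds the claim for this run.
        return False


# Three servers, each with its own connection, race for the same run.
with tempfile.TemporaryDirectory() as tmp:
    db_path = os.path.join(tmp, "claims.db")
    results = {}
    for server_id in ("server-8888", "server-8889", "server-8890"):
        conn = open_claims(db_path)
        results[server_id] = try_claim(
            conn, "weekday-noon-job", "2024-01-29T12:00", server_id
        )
        conn.close()

print(results)
```

Only the server whose claim succeeds would go on to create the job; the others would skip that run. A real implementation would also need to handle a claiming server that crashes before the job completes.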

What do you think is most valuable for testing/documentation purposes? I see 3 options I could do right now.

Option 1 = Do nothing, and see if script runs 1 additional time tomorrow at noon.
Option 2 = Close all jupyter servers, then proceed with essentially option 1 (let scheduled job run at Noon).
Option 3 = Close all jupyter servers, make a new scheduled job that runs today at 3:00 PM and see what happens

@TallGibbs please try option 2. Please make sure you have only 1 instance of the jupyter server with jupyter_scheduler extension installed running and see if the duplication goes away. I would expect this to solve the problem. I tried the same schedule on windows with 1 instance of the jupyter server running and every job is created only once:
no_duplicates

@andrii-i Ok, I physically closed all open tabs with anything Jupyter related and then opened an Anaconda Prompt and ran the command: jupyter server list

Here are the results from that:
image

It appears there are 3 servers still running even though I do not physically have any open browsers related to Jupyter. So, to try and shut them down I ran this command: jupyter notebook stop 8888

image

I get the response "Could not stop server on 8888" within my Anaconda Prompt. I get the same response for all 3 servers when I try the different ports. It makes me think there are some conflicts between my currently installed Anaconda and previous versions? I had to uninstall and reinstall several times in the past few weeks due to some package conflicts.

@andrii-i sorry for the additional post: it looks like I had to completely close Anaconda Navigator and now I show 0 servers running.

image

Today's update: with all the servers shut down, the scheduled job did not run at all. Is that the expected response? Is there a way to execute a scheduled job without a server running?

Now that we verified no job ran, with no servers running, I'm going to schedule the file to run this afternoon and see what happens.

Looks good today, job ran successfully and only ran 1 time. Now, I will leave it untouched through tomorrow's scheduled job and verify it still only runs once.

image

@TallGibbs thank you for testing and confirming that there is no duplication when a single server is running.

> Today's update: with all the servers shut down, the scheduled job did not run at all. Is that the expected response? Is there a way to execute a scheduled job without a server running?

It is an expected response. Jupyter Scheduler is composed of the jupyter_scheduler server extension and the @jupyterlab/scheduler lab extension. The jupyter_scheduler server extension owns and manages job execution. So with no instances of the server running, no jobs are created, and with multiple instances of the server running, each instance creates its own job for the same schedule.

> Looks good today, job ran successfully and only ran 1 time. Now, I will leave it untouched through tomorrow's scheduled job and verify it still only runs once.

Sounds good. Please let us know if the problem goes away.

It seems like your issue has been resolved, so I'll close this for now. Feel free to continue discussion if needed however! 👋

Thank you everyone for all the help.

I'm not sure this is the right place for this question, but thought I would ask anyway, even if just for a redirect: is there a way for me to have a Jupyter server session running without Anaconda Navigator open? We are trying to find a replacement for Alteryx Server running scheduled jobs (at my company) and I'm hoping I can find a way with Jupyter where a scheduled job can run even when my work computer is shut down. I'm not an expert in computer networks/servers (clearly), so I am kind of figuring this out as I go...

I run JupyterLab by running jupyter lab from an Anaconda-enabled command prompt, and I haven't run Anaconda Navigator at all. You'll need to have a server running all the time, ideally something running in a data center or in a hosted environment (disclosure: I work for AWS), that stays up so that jobs get rerun.

@JasonWeill awesome info, thank you that is helpful and gives me some ideas.

Another follow-up closely related to this issue. Is this normal that using the command "jupyter server stop 8888" gives back the response "Could not stop server on 8888" but then when you run "jupyter server list" the server on port 8888 no longer shows running? Seems like a flaw in the code when it tells me "could not stop server on 8888."

Here is my screenshot of this exact sequence:
image

Jupyter Server has a separate issue queue: https://github.com/jupyter-server/jupyter_server/issues

@TallGibbs below are some good places to ask questions if you don't want to create an issue: