OxygenFramework/Oxygen.jl

Hot module reloading

Opened this issue Β· 43 comments

Is there a way to restart the server on file changes within the directory?

I was thinking something like BetterFileWatcher.jl might be useful for something like this but I'm not sure how to implement it

I tried to take a little look at this myself. What I found was that Revise and includet of my main module alone were not sufficient. Possibly invokelatest needs to be used somewhere in the request/response loop? Or else listen for the revise event and restart the server altogether?

You might get some idea by looking at Genie's code:

https://github.com/search?q=repo%3AGenieFramework%2FGenie.jl%20revise&type=code

Happy to help test/review a PR.

Hi @Dale-Black & @frankier

Thanks for opening this ticket and for the suggestions. This is a good next issue to tackle which would definitely help speed up local development.

I'll definitely try to follow what Genie.jl is doing to get this working, but I'm open to any and all suggestions. I did find a similar approach in this issue: JuliaWeb/HTTP.jl#587

Also I'll let you know when I have something worth testing & reviewing @frankier, thanks for the help!

That http.jl code looks great. So something like HTTP.listen with Revise should work?

Is there any progress in this direction?

@JanisErdmanis,

There was some progress made in a PR & discussion about a month ago - but I haven't worked on it since. I've been very busy this December.

@ndortega I see in #134 you mentioned a "server flapping issue". Where is that, and is there anything we can help with?

I use Revise and to do it I have:

include("routes.jl")
__revise_mode__ = :eval
Revise.track("routes.jl")
function ReviseHandler(handle)
    req -> begin
        Revise.revise()
        invokelatest(handle, req)
    end
end

Hmm, and that works? Can you share a bit more of your code and structure? Where are you serving the app from?

Hi guys,

Sorry about the inactivity on this issue. This issue is now on the top of my todo list. As far as direction and approaches, I'm all ears. Like most things, this is new territory for me and I'm open to any and all suggestions on how to tackle this problem.

@asjir thanks for your snippet. Id also like to hear more about your project setup and workflow. I really like how you embedded this feature as a middleware function, which would be a super clean way to integrate behind the scenes

Hmm, and that works? Can you share a bit more of your code and structure? Where are you serving the app from?

Ofc, though it's very basic: in my repo I have sveltekit projects and one folder for julia server X, in X I have 5 files: Project.toml, Manifest.toml, main.jl, routes.jl, util.jl
at the top of main.jl I have cd(@__DIR__) (so imports run from REPL) and then include("routes.jl") etc.
at the bottom of main.jl I have:

run(; port=2227, async=true) = (!isdefined(Main, :s) || !isopen(Main.s)) && (Main.s = serve(middleware=[CorsHandler, ReviseHandler]; port, async))
run()

this function starts an async server that I can close(s) but reruns of it don't try to start a new server which would overwrite my s variable

and inside routes.jl there's

include("util.jl")
Revise.track("util.jl")

so changes to util are also immediately reflected.

While I'm at it I'll show an example route:

post("/make-qch") do req
    data = json(req, MyStruct)
    saved[] = data
    f(data)  # returns NamedTouple
end

which allows me to call f(saved[]) from the REPL and Infiltrator.@inflitrate inside this function (can't use it in an async call of server response).

It looks like Fons' new project has built in hot module reloading. I haven't worked with it yet but something to look into
https://github.com/JuliaPluto/PlutoPages.jl

I can report that I have had some success with ReviseHandler, which use as follows:

using ModuleA
using Oxygen

function ReviseHandler(handle)
    req -> begin
        Revise.revise()
        invokelatest(handle, req)
    end
end

server = ModuleA.serve(port=2345; middleware=[ReviseHandler])

The ModuleA is defined as:

module ModuleA

using Oxygen; @oxidise

@get "/inside" function(req::Request)
    return "Hello from Inside"
end

end 

where the @oxidise macro will be available after the pull request #158 will be merged.

That looks clean

The pull request is now merged on master. It would be great to get a feedback on the Revise workflow before the release.

Amazing, works very nicely! (https://github.com/Dale-Black/HTMLStrings.jl/tree/main/examples/TodoApp)

Now I have more motivation to keep exploring "full stack" Julia. Excited to build out a better TodoApp with pure Julia as an example

@Dale-Black, is there any reason why you have chosen to register routes into the todo() function? For me, it looks like a bizarre thing to do, as the routes are registered at runtime.

No there isn't. I just changed that

I am following this with great interest. Unfortunately I cannot get the approach from #122 (comment) to work, possibly because I don't understand how to setup folder structures and modules properly. Specifically, I am doing:

server.jl:

includet("submodule.jl")

using ..ModuleA
using Oxygen

function ReviseHandler(handle)
    req -> begin
        Revise.revise()
        invokelatest(handle, req)
    end
end

server = ModuleA.serve(port=8080; middleware=[ReviseHandler])

and submodule.jl:

module ModuleA

using Oxygen; @oxidise

@get "/greet" function(req::Request)
    return "Hello from Inside, changed"
end

end 

Server works, but doesn't reload. What is strange, though, is that I need to restart Julia for changes to submodule.jl to register. A simple Ctrl-C and rerun of the server.jl (using run in REPL from in vscode) is not sufficient.

Is it required to have ModuleA in a proper package that is added as a dev dependency?

The approach I outlined earlier works only when the module is loaded statically, which Revise then picks up. To get the setup working, use ] generate ModuleA, put the code in src/ModuleA.jl, use @oxidise macro and activate the project with ] activate . which makes ModuleA available. Then, in the REPL, start the service with ReviseHandler. I haven’t managed to get an includet approach to work. Perhaps it is related to the way globals are revised.

Thanks. I got bit by not understanding modules vs packages and the functionality of ] generate.

I also managed to get it working using PkgTemplates, but this is maybe just more work for little gain, compared to your approach.

using PkgTemplate
t = Template(dir=".")
t("ServerPackage")

The development then happens in this ServerPackage.

Separate from this, a new server.jl resides in a project that has ServerPackage as a development dependency. (]dev ./ServerPackage) I have it in the root folder.


using Revise

import ServerPackage

function ReviseHandler(handle)
    req -> begin
        Revise.revise()
        invokelatest(handle, req)
    end
end

server = ServerPackage.serve(port=8080; middleware=[ReviseHandler])

And then, (just because it is possible), I run a client in separate process that repeatedly pings the API.

I've also managed to get the Revise-before-service-request approach working. Thanks a lot! It is very nice in its simplicity, however, it is in a way a "just-too-late" approach. Revising can take time, and it moves the process from when the changes are ready in the source to when you want the page to be rendered, meaning longer waits/increase latency.

I seem to be getting reasonably results with a combined approach where we have Revise watch the file system and runs asynchronously with the server process and then in-addition run Revise before service request. This, in combination with invokelatest appears to make the changes visible in the server thread. It does not appear to be strictly needed to also have Revise running in the server task, but I added this to try and ensure the request is blocked until revising is finished for longer revisions.

using Revise
using Oxygen
using ServerPackage: ServerPackage

function ReviseHandler(handle)
    req -> begin
        if !isempty(Revise.revision_queue)
            @time "Sync revision" Revise.revise()
        end
        invokelatest(handle, req)
    end
end

ServerPackage.serve(; host="127.0.0.1", port=8001, handler=ws_handler, middleware=[ReviseHandler], async=true)
while true
    wait(Revise.revision_event)
    @info "Starting async revision"
    @time "Async revision" Revise.revise()
end

The asynchronous Revise looks interesting. I see one race condition when Revise.revise() takes a long time, and the source files are modified within that time. For the HTTP request not to reflect the old state of the codebase, it is necessary to put back Revise.revise() in the ReviseHandler. Also, note that you may also miss Revise.revision_event in case it runs on a different thread from your event loop.

I see that it could be useful when developing a web server with live reloading. By sending a refresh event to the browser after the asynchronous revision, it could be a cool feature to have in development.

I see one race condition when Revise.revise() takes a long time, and the source files are modified within that time.

Good catch! To be perfectly honest I wasn't even thinking about whether revision_event was edge or level triggered. This code was fairly carelessly cribbed from Revise.entr(...), so it looks like the problem may also exist in Revise.entr(...). Luckily it is likely to be very rare in practice, and the synchronous Revise papers over it in terms of correctness. (Robustness was my aim in including both anyway.)

Also, note that you may also miss Revise.revision_event in case it runs on a different thread from your event loop.

Can you please explain this one for me a bit more? Revise.revision_event is already fired from another thread. Is it because Condition isn't threadsafe? At least it claims not to be and that Threads.Condition should be used with locking instead. I don't know how big a problem this is, but if it is a problem, it is a problem also with Revise.jl.

In case multi-threaded Revise is not possible, might we be able to force all the Oxygen.jl + Revise.jl code onto a single thread somehow?

By sending a refresh event to the browser after the asynchronous revision, it could be a cool feature to have in development.

This is even a further step of "live" but it's not clear when it's wanted and when not, so anything like this should be somehow "opt-in". Personally I'm already used to refreshing the page so don't condition it a problem. Bonito's interactive_server https://simondanisch.github.io/Bonito.jl/stable/api.html#Bonito.interactive_server is a LiveView-style implementation of this approach.

Another point of information is that just-in-time (or just-too-late) Revise is what is used for REPLs. See:

https://github.com/timholy/Revise.jl/blob/13a5eb7986ee1239ff938a92043c13fee04579cc/src/packagedef.jl#L1326-L1356

So maybe it's not such a problem after all. I may be having getting slow Revise on 1.11rc2 for some reason or it might just be that a slow loading webpage is somehow a lot more irritating than pending output on a REPL.

I don't think thread safety is a problem in practice here since the file watching tasks run in the main thread (they are started as sticky Tasks) which is the same as the file-change trigger Revise loop. I'm probably making the race condition worse by having yield() in my loop (edited it away for now), but I think the interpreter may be able to yield during Revise.revise() so the problem seems to still be possible.

Can you please explain this one for me a bit more? Revise.revision_event is already fired from another thread. Is it because Condition isn't threadsafe? At least it claims not to be and that Threads.Condition should be used with locking instead. I don't know how big a problem this is, but if it is a problem, it is a problem also with Revise.jl.

The thread safety simply means that the condition can be waited from another thread without risking putting internal state of the condition in inconsistent state. It doesn't fix race conditions when a Condition is notified from one thread while on other it is no longer suspended on wait and hence is skipped.

This problem is not present when the notifier and waiter are on the same thread. I guess this is the case here if the task for watching file changes runs on the main thread or the notifications are channeled through one.

I guess first entr(...) should be fixed first if this is a problem. See this issue timholy/Revise.jl#837

Any updates on this? Or is there any elegant way to perform some pseudo-hot-reloading (e.g. restart the server by an external script that monitor the git history)

@mzy2240 It is, in my opinion, a solved issue, apart from proper documentation. Server restarts are unnecessary, but to use hot reloading, one must (a bit of a strong word here) develop their service in a package together with @oxidise macro. The solution #122 (comment) is still the best way to start.

that sounds great! if that is the case, I have two sincere suggestions:

  1. Can we add this hot reloading approach to the doc? Imo it is so important that should be documented, even if it is something a bit hacky or "experimental". We can always refine the method later.
  2. Programmatic server restart is still a desired feature to improve the overall service reliability. No matter it is a scheduled restart or controlled by external script, it definitely helps improve the experience (and maybe performance) of Oxygen.jl in the production.

It is, in my opinion, a solved issue, apart from proper documentation.

I don't entirely agree with this. It seems to me that if possible it should be possible to get good reloading behavior with a single function call, macro call or keyword option. We could try and design this so that it would only be usable in the case the user code was actually revisable (i.e. their project was correctly structured). I give another go at put together another pull request, but I would like to hear from @ndortega about whether they agree and what type of API they would like.

The solution #122 (comment) is still the best way to start.

Why is this a better place to start than the lower latency version I added? The race condition is benign and exists also in Revise.jl.

I don't entirely agree with this. It seems to me that if possible it should be possible to get good reloading behavior with a single function call, macro call or keyword option.

I don't really follow. From my perspective, the ReviseHandler approach already supports this. Regarding the additional features and background precompilation for revision, those, in my opinion, belong to separate issues themselves as those are enhancements. In my opinion, it's not clear enough from a user experience standpoint, for instance, whether they want to conserve their laptop battery or prefer to see knowledge delay as an enactment of their hard work. The race condition is also tricky to address, as it would occur more often for long recompilations, which is also the situation where a background recompilation could be useful. It is a tricky thing that needs to be deliberately discussed in a separate issue.

One thing that could though improve user experience is an automatic addition of ReviseHandler to the middleware in case Revise is already loaded. I suppose that could be done with package extensions. This would also make it easier to write documentation part of the Revise workflow.

Also, I forgot there should be a clear way to communicate to the user that Revises no longer works, for instance, when a type is redefined or a constant change is made. The server could exit, return some HTTP error code, or put some colour in the terminal logging as it is already in Julia REPL.

Programmatic server restart is still a desired feature to improve the overall service reliability. No matter it is a scheduled restart or controlled by external script, it definitely helps improve the experience (and maybe performance) of Oxygen.jl in the production.

A generic programmatic service seems to belong to a separate issue, but it is an interesting feature to ask for in the scope of this issue. Using Revise in production seems like a wild idea, but why not? A big issue is that restarts erase the internal service state. So, to implement service restarts, there would need to be some savestate and loadstate hooks, which seems doable.

A generic programmatic service seems to belong to a separate issue, but it is an interesting feature to ask for in the scope of this issue. Using Revise in production seems like a wild idea, but why not? A big issue is that restarts erase the internal service state. So, to implement service restarts, there would need to be some savestate and loadstate hooks, which seems doable.

I really like the idea of persistent state but I think it should be page 2 or 3. Do we have an existing restart solution (scheduled or external triggered) even at the risk of losing state? That would be already quite useful in some simple but reliability-first applications.

Not one I am aware of. Perhaps @ndortega can comment on the automatic restarts and whether they could be integrated within Oxygen. @mzy2240 I think you should open a new issue on the restarts and perhaps cross-reference with this one when necessary to keep this issue focused.

From my perspective, the ReviseHandler approach already supports this.

In that case I misunderstood you. It seems that we are probably agreed that this will be "solved issue" when it's actually solved by adding this or something similar to Oxygen (i.e. the issue is solved-in-principle).

In my opinion, it's not clear enough from a user experience standpoint, for instance, whether they want to conserve their laptop battery or prefer to see knowledge delay as an enactment of their hard work.

In terms of user experience, a delay in a REPL due to compiling is more expected than a delay in page loading, which has a significantly less responsive feel. Browsers can actually time out too I think for an especially long revise.

Another alternative or parallel solution could be to push some kind of spinner to the browser due to revising and refresh the page when it's done -- making sure to preserve whatever POST/GET data from the request.

The race condition

Maybe I'm misunderstanding you again. Which race condition are you referring to? That the revise event can be lost in case there are changes during a revision? This isn't a big problem since it will get revised later before the request. Maybe you can outline the exact scenario where you foresee problems.

clear way to communicate to the user that Revises no longer works

I agree that it would be a better user experience for those hacking and testing against a browser for some stuff to be pushed "in-band", but this is an argument for a special servedebug(...) mode, which would be not for production, but could in addition show stack traces from exceptions.

With the race condition, I meant the effect of the Revise limitation of being unable to cancel running recompilation. Let's say $T$ is the time it takes to recompile the project for some change. The probability that a change will happen within a recompilation time is proportional to $P \propto T$. If such a change happens, the user can wait $2T$ until they see their last change take effect. Hence, there is a tradeoff.

this is an argument for a special servedebug(...) mode, which would be not for production, but could in addition show stack traces from exceptions.

It would be better to make APIs deep by adding a keyword argument. Also, it would be more intuitive for the user if Oxygen recognised that Revise is loaded and added the handler accordingly.

I'm not entirely sure which approach would be best. At the end of the day, I think Oxygen needs auto-reload feature that works out of the box with as little "gotchas" as possible.

I personally don't use Revise.jl all that much, but I know a huge portion of the julia community does and has great success with it. I'm definitely open to either adding it as a dependency / extension to get this feature working. On the other hand, most other frameworks do complete server restarts to ensure the current running application is up to date. This has been a tried-and-true method we could use to solve this problem as well.

I haven't used Genie in a while, but I know they have a similar feature where they use Revise to watch all project files and are able to upload the handlers in real time. Granted, it's a lot easier to do this when you have a predefined folder structure to monitor. This package enforces near-zero architectural restrictions on where code comes from so we can't make as many assumptions about where their code will be stored.

I'm not sure which approach would be best, so I'd like to hear more from you all on which approach you'd prefer based on your experience you have with other web frameworks)

On the other hand, most other frameworks do complete server restarts to ensure the current running application is up to date. This has been a tried-and-true method we could use to solve this problem as well.

It is my guess that the restarting function would need to contain Revise.revise() and invokelatest to reflect the changes made within a package code. Hence, the handler approach for the code seems significantly more straightforward. It would also have the same behaviour for type redefinitions.

I haven't used Genie in a while, but I know they have a similar feature where they use Revise to watch all project files and are able to upload the handlers in real time. Granted, it's a lot easier to do this when you have a predefined folder structure to monitor.

Genie had to bolt in Revise with a watcher as it cannot develop the code within a package structure due to globals being erased at compilation time, much like Oxygen before the @oxidise macro. I don't think they would have made it that way if they could say the code could be developed with Revise straightforwardly in a package.

This package enforces near-zero architectural restrictions on where code comes from so we can't make as many assumptions about where their code will be stored.

What you may be able to do is to create a sandbox module within Oxygen and then export an includeox command which could put include statement with eval in that module. Then, the server could start from the internal module and get Revise to work that way. It also seems possible to recreate an internal module when type redefinitions happen and combine it with a server restart, but at the cost of erasing the state.

In terms of "complete restart", it may be possible to replace the process with a whole new Julia process with exec. This would be a lot slower probably less smooth than Revise. It might possible to use SO_REUSEADDR to make things a bit smoother. I think for most people this would be a second choice, but it could be combined with the Revise approach when Revise fails (e.g. struct definitions for now). One consequence here is that when we execute everything again and there is an exception introduced in top level code (e.g. a syntax error) it might be difficult to recover and keep things executing when it is fixed without also having a parent process monitoring everything and restarting.

I'm not sure if there's perfect behavior here, more like a convenience which helps improve coding flow a bit 90% of the time, which is why I'm tempted to think Revise is the sweet spot despite a few caveats, especially since it's behaviour is reasonably well understood within the broader Julia community.

My preference would therefore be to have a debug server mode that is either only available when @oxidise is used, or make it available in other cases, but issue a warning that Revise will only affect code in modules and therefore not the main route handler code.

Shall I put together another PR as a strawman, or does someone else want to? It feels like that might get us further at this point. (Asking because @ndortega was the last to express interest.)

My preference would therefore be to have a debug server mode that is either only available when @oxidise is used, or make it available in other cases, but issue a warning that Revise will only affect code in modules and therefore not the main route handler code.

I would be in favour of implicit behavior using Revise loading in the main as condition under which the handler is added. This condition could be explicitly disabled with keyword argument when needed. A good way to do this seems to be is to check Main.Revise binding to be defined and add the coresponding handler then.

Shall I put together another PR as a strawman, or does someone else want to? It feels like that might get us further at this point. (Asking because @ndortega was the last to express interest.)

@frankier, please feel free to open up a pr here. I agree that going with a Revise-first approach makes the most sense right now - if it really isn't sufficient for most workflows then we could look into adding in a server-restart later.

Personally, I'm interested in getting this functionality added directly to the code (instead of through a Revise package extension) so that people could toggle the revision of their code out of the box with a dev / reload / revise flag. Adding revise as a dependency would go well with the next minor version upgrade where I'll be focusing on upgrading the min julia version to 1.10 (new LTS) and upgrading the existing package extensions (without Requires.jl). Any dependency change will require a minor version bump anyways so I might as well do them both at the same time.

Whatever the implementation looks like, having a clean way to toggle this behavior is my biggest priority. So far, I've avoided building in this direction because my early attempts felt super hacky and required way too much work from the user's perspective.