💬 Determine Extension Architecture
jasonplatts opened this issue · 45 comments
Before moving too far forward with the Extension API and marketplace, we need to decide what type of environment will be used to run extensions.
JavaScript Core, https://developer.apple.com/documentation/javascriptcore, is an option. It does not support ESM, but as some have mentioned, there are ways to work around these limitations using build tools.
Initially, a Swift extension API is being developed.
However, we ultimately plan to take an approach similar to Raycast that was described by mathieudutour here, #180 (comment). The architecture involves using node and communicating with the CodeEdit application using JSON-RPC. A react-reconciler will allow extensions to create a custom UI with React components.
Thank you to everyone who contributed to the conversation! Please feel free to continue to add feedback and/or suggestions.
First Swift, then JS. Swift is the easiest for us to implement.
I would prefer using Swift for extensions, but having JavaScript API would be nice to have also, considering that the app may be ported to iPadOS later.
So, I think we need to provide Swift Extension API and prioritize it right now, but also export it to JavaScript.
I'd imagine more JavaScript devs would be using this than Swift devs, because Swift devs would use Xcode for Swift development for the most part, correct? I think both are important and I agree with our current path.
May I propose Lua as another option? Lua is small, portable, and fast. Actually neovim has a quite mature plugin ecosystem in Lua (see nvim-lua-guide). We can attract those plugin developers if we provide a Lua plugin interface.
That's an interesting option @henryhchchc. I think at this point we are probably best sticking with Swift and/or JavaScript as our primary focus.
In my opinion, considering the large VS Code user base, most extensions developers are going to be looking to develop using JavaScript/TypeScript. Initially it will be important to encourage as many of those developers as possible to contribute and add features that aren't in the core application. I believe extensions are key to making CodeEdit usable for the majority of developers.
Since this is a Mac-centric application, it would be really cool to have Swift as an option, but as @austincondiff mentioned, I don't think that is what most extension developers will be looking to use. However, if that is what is practical to begin with, I'm all for it!
Ultimately, I wish we could design this in a way that lets extension developers use whatever language they feel most comfortable. I don't want to discourage anyone from contributing. But, it probably isn't realistic without providing some form of language conversion. Additional extension language support is something we could maybe revisit further down the road?
To the Swift developers, @pkasila, @MarcoCarnevali, @underthestars-zhy, and any others, how could we go about writing this API in Swift and then packaging/compiling/submitting the extensions in an approachable way?
Please correct me if I am wrong, but I think VS Code uses the command line. Nova just takes care of everything via the GUI. There is a menu option to "Submit extension" and everything simply happens in the background. Not all developers like working in the command line, so it would be nice to let developers build/submit extensions from the GUI. Maybe we could require advanced builds to be performed from the command line?
So, in my opinion, Swift Extension Architecture can be implemented the way described below.
Swift API
So, let's say we have a CodeEditExtensionAPI package that declares protocols to work with the API (the main one is CodeEditAPIProtocol). Both the CodeEdit app and extensions link to this package. Then, the API class, conforming to CodeEditAPIProtocol, is implemented in the CodeEdit app. Then, an object of this class should be passed to the extension, so it can interact with the API.
Loading extensions
All extensions are loaded using dlopen at runtime from dynamic libraries.
Each dynamic library should have a createExtension function as follows:
@_cdecl("createExtension")
public func createExtension() -> UnsafeMutableRawPointer {
    // Hand a retained builder to the host as an opaque pointer
    return Unmanaged.passRetained(HelloWorldExtensionBuilder()).toOpaque()
}
The API object is passed to the ExtensionBuilder.build function.
More information about plugin system using dynamic libraries in Swift can be found here
Compiling and packaging extension
Extensions should be compiled as dynamic libraries (dylibs).
The minimal extension's package structure should be as follows:
HelloWorld.ceext
| - manifest.json
| - plugin.dylib
- manifest.json stores the manifest of the extension
- plugin.dylib is the dynamic library with the extension
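To make the role of manifest.json a bit more concrete, here is a minimal TypeScript sketch of how a manifest could be checked before the package is loaded. The field names (name, version, entry) are assumptions for illustration only; the proposal above does not define the manifest schema.

```typescript
// Hypothetical manifest shape; the thread does not specify these fields.
interface ExtensionManifest {
  name: string;
  version: string;
  entry: string; // file inside the .ceext package, e.g. "plugin.dylib"
}

// Parse manifest.json text and return the manifest, or a list of problems.
function parseManifest(text: string): ExtensionManifest | string[] {
  let raw: unknown;
  try {
    raw = JSON.parse(text);
  } catch {
    return ["manifest.json is not valid JSON"];
  }
  if (typeof raw !== "object" || raw === null || Array.isArray(raw)) {
    return ["manifest.json must contain a JSON object"];
  }
  const obj = raw as Record<string, unknown>;
  const problems: string[] = [];
  for (const key of ["name", "version", "entry"]) {
    const value = obj[key];
    if (typeof value !== "string" || value.length === 0) {
      problems.push(`missing or empty string field: ${key}`);
    }
  }
  if (problems.length > 0) return problems;
  return {
    name: obj["name"] as string,
    version: obj["version"] as string,
    entry: obj["entry"] as string,
  };
}
```

Returning the list of problems instead of throwing would let a "submit" step show every issue at once.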
Submitting extensions
I think there are 2 ways:
- We simply create a Git repository for developers to submit their extensions using PRs (by providing a manifest and a link to the GitHub Release page or somewhere else as a JSON file): it will be pulled by the CodeEdit app and indexed on the device
- We create a custom catalog service (marketplace) where (and there are 2 options)
- both extensions' manifests and packages are submitted by the developer and stored by us
- only extensions' manifests are stored by us and packages are stored by the developer somewhere else (e.g. GitHub Releases)
Other languages
- If we are speaking about using JavaScriptCore then we can conform this API to be exported to the JavaScriptCore
- If we are speaking about any other language then there should be more or less similar way of exporting API
Awesome @pkasila!
Compiling and packaging extension
Extensions should be compiled as dynamic libraries (dylibs).
The minimal extension's package structure should be as follows:
HelloWorld.ceext
| - manifest.json
| - plugin.dylib
- manifest.json stores the manifest of the extension
- plugin.dylib is the dynamic library with the extension
Are we able to automate and simplify this process from the extension developer's perspective through CodeEdit? Maybe a "Package and Submit Extension…" menu item?
Submitting extensions
- both extensions' manifests and packages are submitted by the developer and stored by us
I am in favor of this option if we can make it work. I understand there will be costs involved and it probably depends on community support. I think it's the most reliable choice.
If extension developers provide their own storage, whether it be GitHub or something else, there is the possibility that these services might be down, causing some extensions to work and others not, which could confuse users. Our storage might go down too, but I would think it would likely affect the entire extension library, and we would at least have the ability to troubleshoot and/or explain the outage.
We also run the risk of extensions being deleted, but still appearing in the extension library.
Are we able to automate and simplify this process from the extension developer's perspective through CodeEdit? Maybe a "Package and Submit Extension…" menu item?
I think we can determine whether an extension project is open. Any extension's project can be a Swift Package (with a dynamic library) with a manifest.json file, so if CodeEdit (actually, a specific extension for extension developers) detects that this is an extension's project, it adds a target for publishing the extension, and this target handles packaging and submitting the extension to the store.
I am in favor of this option if we can make it work. I understand there will be costs involved and it probably depends on community support. I think it's the most reliable choice.
If extension developers provide their own storage, whether it be GitHub or something else, there is the possibility that these services might be down, causing some extensions to work and others not, which could confuse users. Our storage might go down too, but I would think it would likely affect the entire extension library, and we would at least have the ability to troubleshoot and/or explain the outage.
We also run the risk of extensions being deleted, but still appearing in the extension library.
I think that I'll try to play around with how API for the extensions store and the store itself can be implemented a little bit. And I'll write about what I think suits our needs better later.
Thanks @pkasila. Sounds to me like a solid starting point. I know there will be details to figure out and a lot of trial and error as we go.
Could I suggest to look at how Raycast is handling extensions? It's a delightful experience and it's attracting many developers.
Good suggestion @FezVrasta! For references:
Yeah, thanks @FezVrasta. This is great. I'd never heard of it before.
Compiling and packaging extension
Extensions should be compiled as dynamic libraries (dylibs).
The minimal extension's package structure should be as follows:
HelloWorld.ceext
| - manifest.json
| - plugin.dylib
- manifest.json stores the manifest of the extension
- plugin.dylib is the dynamic library with the extension
Maybe it would be a better option to bundle the dynamic library as a mac "Bundle" [1] instead of a custom file structure.
Apple has a documentation article [2] that describes how Apple would suggest building a plugin architecture. The article is a bit old, but it should be fairly straightforward to translate to Swift.
A bundle can also contain arbitrary resources (such as images, javascript, localized strings) and it's easy and well-defined how to load it and also unload it to release the resources.
For reference this is also what Dash does for its docsets [3].
1: https://developer.apple.com/documentation/foundation/bundle
2: https://developer.apple.com/library/archive/documentation/Cocoa/Conceptual/LoadingCode/LoadingCode.html#//apple_ref/doc/uid/10000052i
3: https://kapeli.com/docsets
@viktorstrate Looks good! Actually I forgot about Bundles when I was writing. I think they can fit all of our needs.
Again @FezVrasta thanks for the tip!
I have been doing some research on how Raycast has built their API. Maybe others can help me dig a little deeper, but this is what I have found so far.
From a Console Interview with their CTO, Petr Nikolaev:
Developers are the audience that will help us to bring the ecosystem to the level where we have a lot of extensions. Actually, the main product that we're building right now - API for extensions - is not yet available to the public. We want to make it super easy to build productivity tools without you needing to know how to build desktop apps or websites. [...]
We recently stopped adding new extensions so we can focus on building our JavaScript API that will allow our community to integrate other services with Raycast. [...]
The goal for the API is that it is consistent across all platforms for all extensions. We also want to move all our current extensions with third-party services to TypeScript as well at some point.
Their source is closed, but I gathered that they are using react-reconciler and React Native for macOS to allow React to render native views. I'd love to get further insight here, and if they are willing to shed some light on it, that would be awesome, as I think there is room to work together and help each other. I might reach out at some point.
I think this is the right direction for our API and architecture because, as mentioned in the past, developers that use our app will most likely be more acclimated to JS/TS and React as opposed to Swift even though it may be easier for us to write our API in Swift (maybe we write and maintain both though).
That is as far as I got. If anyone can help or has any expertise in this area, it would certainly be welcome!
Hey, I'm working on the extensions ecosystem at Raycast. I'm on paternity leave at the moment but happy to chat if you'd like.
they are using react-reconciler and react native for macOS to allow React to render native views
We are indeed using a custom react reconciler, but not RN for macOS (since it is itself a reconciler).
We really should write a blog post of the extensions architecture at some point but very roughly:
- we download the nodejs runtime at the start (if not already done)
- we launch a long-running node process which we communicate with via JSON-RPC on stdio
- the node process launches a new Worker thread to run an extension (to isolate its memory, etc.) and acts as a bridge between the app and the extension
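As a rough illustration of the "JSON-RPC on stdio" step above, here is a minimal TypeScript sketch of encoding requests and reassembling them from stream chunks. Raycast's real framing is not described in this thread, so the newline-delimited framing is an assumption, and the method name in the usage note is hypothetical. The message shape follows the public JSON-RPC 2.0 specification.

```typescript
// JSON-RPC 2.0 request shape (per the public JSON-RPC 2.0 specification).
interface RpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: unknown;
}

// Encode one message as a single line; newline framing is an assumption here.
function encodeMessage(message: RpcRequest): string {
  return JSON.stringify(message) + "\n";
}

// Accumulates raw stdio chunks and returns complete messages as lines arrive.
class LineDecoder {
  private buffer = "";

  push(chunk: string): RpcRequest[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last piece is either "" or an incomplete line; keep it for later.
    this.buffer = lines.pop() ?? "";
    return lines
      .filter((line) => line.trim().length > 0)
      .map((line) => JSON.parse(line) as RpcRequest);
  }
}
```

For example, pushing the output of encodeMessage({ jsonrpc: "2.0", id: 1, method: "extension.run" }) into a LineDecoder in two pieces returns nothing for the first piece and the full request for the second, which is the behaviour needed when stdout data arrives in arbitrary chunks.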
@mathieudutour we are happy you can join us in our discussion around the CodeEdit extension architecture. As you might have read, I am a huge fan of your work on the Raycast extension architecture and API. As we have explored different strategies and methods, we believe the same path you have taken will allow more developers to contribute extensions as you have seen with Raycast. The fact is, as much as we like Swift, there are more Javascript and React developers out there. Many similar macOS apps have gone a similar route in using Javascript including Sketch, Nova, and Craft.
Talking specifics and our path to implementing a similar architecture with editor specific APIs, I'd love to set up a chat with our maintainers, you and whoever else at Raycast that cares to join in a channel on our Discord server. It is really up to y'all how much you would like to share as I know you and others have put a lot of hard work into creating this amazing extension architecture. That said, we'd be more than appreciative for anything you can share or any insight you can give to help us on our path.
We aim to create the best editor for macOS with the community, for the community and we realize we need to support each other with each of our areas of expertise in order for the community to pull this together and keep it free for our users. So thank you for getting involved at whatever level you and Raycast are comfortable with!
If you wouldn't mind, join our discord, reach out to me, and we can go from there!
Hi @mathieudutour. Thanks for your post! I second what @austincondiff has said. We would love to chat with you in our Discord whenever you might be available.
Hi everyone! I am very excited for the project and I am fascinated by the progress that y'all are making. Thanks a lot for your work so far.
I personally have to agree that the Raycast extension system is great, especially because you can use npm packages directly due to it using node.js internally. For me this really makes it superior to a custom implementation on top of JavaScriptCore. I am not an expert in how VSCode internally works, but to my knowledge they have some kind of server - client architecture where the extension host communicates with the editor instance itself through a communication channel. This enables extension to run in containers or even on remote VMs making it really powerful (https://code.visualstudio.com/docs/remote/remote-overview).
I'd argue that the option of implementing something like that should be kept open for CodeEdit as well. That would require some sort of remote-procedure interface between the extension host and the editor. I have personally used gRPC in a lot of projects and think it might be an option here: the editor starts a gRPC server, which the extension host could authenticate with and connect to. Whenever an extension wants to do something, it would send a request to the editor.
For the simple case where the editor and the extension host are running on the same machine, we could also communicate over pipes / IPC sockets.
On top of that a react-based library like the one that raycast has is still very much possible.
Hi @lukasmoellerch. Thanks for the suggestions and perspective. Using node.js does seem to have some nice advantages over using JavaScript core.
I am not an expert in how VSCode internally works, but to my knowledge they have some kind of server - client architecture where the extension host communicates with the editor instance itself through a communication channel. This enables extension to run in containers or even on remote VMs making it really powerful (https://code.visualstudio.com/docs/remote/remote-overview).
I'd argue that the option of implementing something like that should be kept open for CodeEdit as well. That would require some sort of remote-procedure interface between the extension host and the editor. I have personally used gRPC in a lot of projects and think it might be an option here: the editor starts a gRPC server, which the extension host could authenticate with and connect to. Whenever an extension wants to do something, it would send a request to the editor.
For the simple case where the editor and the extension host are running on the same machine, we could also communicate over pipes / IPC sockets.
This is really interesting @lukasmoellerch! @pkasila, @lukepistrol, @MarcoCarnevali, @austincondiff, do you have any thoughts on this?
For reference, https://grpc.io.
@lukasmoellerch I think this is an excellent idea, thanks for sharing! I never considered having, as you say, a server/client architecture, but it makes complete sense given that extensions will be communicating with the app via JS.
@pkasila Does this fall in line with what you were working on? Any thoughts on this approach?
@mathieudutour are you doing anything like this at Raycast?
if you consider the app a server and the node process a client, yeah it's similar. We communicate via JSON-RPC on stdio.
Possible JSON RPC resources:
- Swift: JSONRPC by @mattmassicotte
- JavaScript: stdio-jsonrpc
if you consider the app a server and the node process a client, yeah it's similar. We communicate via JSON-RPC on stdio.
@mathieudutour Would you then send JSX over JSON-RPC to be rendered via the app (server)? It's probably not straight up JSX, but reconciled in some way. How would that work?
That's the job of the react reconciler: it transforms the react components into JSON that we send over to the app. So the app only receives known elements and can render them.
@mathieudutour, in this case, something like:
<List isLoading={true}>
  <List.Item title="My nice title" subtitle="Beautiful subtitle" />
</List>
would be transformed by reconciler into something like:
{
  "type": "list",
  "properties": {
    "isLoading": true
  },
  "children": [
    {
      "type": "list.item",
      "properties": {
        "title": "My nice title",
        "subtitle": "Beautiful subtitle"
      }
    }
  ]
}
then you receive it in the native side and render the native UI based on that JSON, which would lead us to something like Backend Driven UI, right?
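To illustrate the shape of that round trip, here is a small TypeScript sketch. It does not use react-reconciler itself; createNode is a hypothetical stand-in for what a custom reconciler would produce, and findUnknownTypes is an assumed check the native side could run so that it only renders element types it knows.

```typescript
// A rendered UI node in the shape discussed above: type, properties, children.
interface UiNode {
  type: string;
  properties: Record<string, string | number | boolean>;
  children: UiNode[];
}

// Hypothetical helper standing in for the output of a custom reconciler.
function createNode(
  type: string,
  properties: Record<string, string | number | boolean> = {},
  children: UiNode[] = []
): UiNode {
  return { type, properties, children };
}

// Serialize the tree to the JSON payload that would be sent to the app.
function toPayload(root: UiNode): string {
  return JSON.stringify(root);
}

// On the receiving side, list any node types the app cannot render.
function findUnknownTypes(node: UiNode, known: string[]): string[] {
  const unknown: string[] = known.indexOf(node.type) !== -1 ? [] : [node.type];
  for (const child of node.children) {
    unknown.push(...findUnknownTypes(child, known));
  }
  return unknown;
}

const tree = createNode("list", { isLoading: true }, [
  createNode("list.item", {
    title: "My nice title",
    subtitle: "Beautiful subtitle",
  }),
]);
```

Here toPayload(tree) yields JSON with the same type/properties/children layout as the example above (leaf nodes carry an empty children array), and the app could refuse a payload whenever findUnknownTypes returns a non-empty list.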
Three possible interesting links:
- https://github.com/facebook/react/tree/main/packages/react-reconciler
- https://agent-hunt.medium.com/introduction-to-react-native-renderers-aka-react-native-is-the-java-and-react-native-renderers-are-828a0022f433
last but not least:
- https://github.com/vadimdemedes/ink (this one implement an API for extensions)
@pkasila Does this fall in line with what you were working on? Any thoughts on this approach?
I was working on a different approach (but I was working on Swift-based extensions). We load the extension into the app and run its code inside the app. But that is about Swift-based extensions.
I considered that we can just run compiled JS extensions inside JavaScriptCore and let them communicate through a JSExported object.
I think that it would be better to compare running JS extensions inside JavaScriptCore and in the NodeJS runtime (+JSON-RPC), and find out what works best in terms of resource usage. I believe we can send JSON (anything) from the react reconciler through both JavaScriptCore's JSExported object and JSON-RPC. And obviously we can send API calls through both a JSExported object and JSON-RPC.
Still, I understand that there are some pros in using NodeJS runtime. For example, you can work with the system directly.
Still, I understand that there are some pros in using NodeJS runtime. For example, you can work with the system directly.
In addition, I wonder if by using Node.js we could allow users to add/remove extensions and even hot-swap during extension development all without having to restart the app or reopen your workspace.
something like would be transformed by reconciler into something like
Exactly yes. There are some shenanigans for callbacks (event handlers) because you can't pass JS functions across the bridge. So they are replaced by a unique identifier and kept in a map in the node process so that the app can call it by its id.
But that's some implementation details which don't really affect the overall architectures.
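A minimal TypeScript sketch of the callback trick described above: function-valued props are swapped for identifiers and kept in a map on the extension side, so the app can later call them by id. The class name, the "handler-N" id format, and the prop names are all assumptions for illustration, not Raycast's actual implementation.

```typescript
type EventHandlerFn = (...args: unknown[]) => void;

// Keeps event handlers on the extension side and hands out ids in their place.
class HandlerRegistry {
  private nextId = 1;
  private handlers: Record<string, EventHandlerFn> = {};

  // Store a function and return the identifier that crosses the bridge.
  register(handler: EventHandlerFn): string {
    const id = "handler-" + this.nextId++;
    this.handlers[id] = handler;
    return id;
  }

  // Called when the app reports that the handler with this id should run.
  invoke(id: string, args: unknown[]): boolean {
    const handler = this.handlers[id];
    if (handler === undefined) return false;
    handler(...args);
    return true;
  }
}

// Replace function-valued props with handler ids so the result is plain JSON.
function serializeProps(
  props: Record<string, unknown>,
  registry: HandlerRegistry
): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const key of Object.keys(props)) {
    const value = props[key];
    out[key] =
      typeof value === "function"
        ? registry.register(value as EventHandlerFn)
        : value;
  }
  return out;
}
```

With this, a prop like onAction is sent to the app as a string id, and when the user triggers the element the app sends that id back so the node process can call registry.invoke(id, args).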
it would be better to compare running JS extension inside JavaScriptCore and in NodeJS runtime (+JSON-RPC). And find out what works best in terms of resource usage.
For context, Raycast started its API with JSCore. But JSCore is neither Node nor a browser, which means that 50% (I don't actually know the proportion but it's quite high) of the npm packages just didn't work and it was breaking expectations for users. We quickly found out that it was quite a dealbreaker for a lot of people. So "resource usage" is really not the only comparison point you should consider.
For even more context, I also worked on the Sketch API, but didn't participate in its architecture. Sketch uses JSCore (and cocoascript). You could use webpack to bundle npm packages but the issue of npm packages not working was still present.
So I decided to expose a require global via JSCore, which would allow scripts to require other scripts, like you do in node. I also started to rewrite pretty much the entire node API in JSCore (via cocoascript) so that npm packages would work out of the box (see https://github.com/skpm/fs for example). But it was a huge task and I didn't have any resources, so it was only half done when I left. I don't think anyone has picked it up since.
So "resource usage" is really not the only comparison point you should consider.
Partly agree, but we shouldn't make it act too much like VSCode/Electron apps; otherwise the whole point of "native" is lost.
But this is my personal opinion.
Still, I understand that there are some pros in using NodeJS runtime. For example, you can work with the system directly.
In addition, I wonder if by using Node.js we could allow users to add/remove extensions and even hot-swap during extension development all without having to restart the app or reopen your workspace.
I think something like that should be possible. I am not sure whether this was mentioned here already, but VSCode also has something called "activation events", where extensions are only activated once they are actually used, e.g. the C++ extension would only become active once a C++ file is opened or a C++ command is used.
The same mechanism could be used for hot-reloading, where an extension is deactivated, its source code is replaced, and it is then activated again.
So "resource usage" is really not the only comparison point you should consider.
Partly agree, but we shouldn't make it act too much like VSCode/Electron apps; otherwise the whole point of "native" is lost. But this is my personal opinion.
I'd argue that extensions are usually just some thin layer above some other tool, thus the bottleneck is mostly something else - the messages / layout tree being created using JavaScript probably won't make the application feel less native. What makes VSCode feel slow is the renderer (I think).
Adding to my earlier comment about a server / client architecture: it usually makes more sense for the "server" to be the extension host and the editor instance to be a client of that server. That way the user only needs to be able to access the extension host, and the machine the user is editing on doesn't have to be reachable over the network.
The main issue of Electron apps is not node, it's the rendering, and the huge amount of resources needed by each app. Having a node process run in background of a native app would be totally fine in my opinion.
I think something like that should be possible. I am not sure whether this was mentioned here already, but VSCode also has something called "activation events", where extensions are only activated once they are actually used, e.g. the C++ extension would only become active once a C++ file is opened or a C++ command is used.
The same mechanism could be used for hot-reloading, where an extension is deactivated, its source code is replaced, and it is then activated again.
It's the same in Nova too. I mentioned this briefly in issue #76 in the "Extension Entry Point" section. Both VS Code and Nova also include activation information in the extension manifest.
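As a sketch of how activation events and the deactivate/replace/activate cycle could fit together, here is a small TypeScript example. The event strings are modeled on VS Code's "onLanguage:..." naming; the ActivationEntry shape and class are assumptions, not an existing CodeEdit API.

```typescript
// Hypothetical manifest entry: which events should wake the extension up.
interface ActivationEntry {
  extension: string;
  activationEvents: string[]; // e.g. "onLanguage:cpp"
}

// Tracks which extensions are active and activates them lazily on events.
class ActivationTracker {
  private entries: ActivationEntry[];
  private active: string[] = [];

  constructor(entries: ActivationEntry[]) {
    this.entries = entries;
  }

  // Returns the extensions newly activated by this event.
  fire(event: string): string[] {
    const started: string[] = [];
    for (const entry of this.entries) {
      const wanted = entry.activationEvents.indexOf(event) !== -1;
      const running = this.active.indexOf(entry.extension) !== -1;
      if (wanted && !running) {
        this.active.push(entry.extension);
        started.push(entry.extension);
      }
    }
    return started;
  }

  // Deactivate so the code can be replaced; the next matching event reactivates.
  deactivate(extension: string): void {
    this.active = this.active.filter((id) => id !== extension);
  }
}
```

Opening a C++ file would call fire("onLanguage:cpp") and start only the extensions that declared that event. For hot-reloading during development, the host would call deactivate, swap the source, and let the next event start the new code.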
Based on a prior conversation on Discord with @josephschmitt @mattmassicotte @jasonplatts @pkasila @avdept and a few others, with the announcement of ExtensionKit in the latest WWDC which has been written about here, we have arrived at the following conclusion.
We will use ExtensionKit and expose a Swift API, with the goal of exposing a JavaScript API later on. This will allow us to move fast and remain true to our overall mission statement, which is to create something as native as possible for a more performant editor. I invite anyone who cares to add to this, or to explore specifics around this chosen path, to do so. I feel we have arrived at a conclusion here, so we can close this issue once all the kinks have been ironed out, as long as nobody objects.
I just wanted to leave a quick update in our progress. We've been taking this real slow because we really need to get this right to begin with because it will affect everything we do going forward.
Our idea to execute on this is relatively simple. We will end up exposing a Swift API and a JS API in parallel, Swift to start because we will obviously need Swift functions to do anything. Then, either as we go or shortly after, we will expose a JS API.
To get JS and Swift to play nicely, we'd have a server/client model. We'd use Bun instead of Node for faster execution, especially because it is based on JSCore rather than Chromium's V8 JavaScript engine. JavaScript execution using Bun can reach near-native speeds, which is what we are looking for. Bun also supports modern JS and TypeScript out of the box.
We would spin up a JavaScript service on Bun at application start that houses all of the user's installed extensions. The app and each installed extension would communicate via XPC and/or JSON-RPC. I also found this project that can reconcile React JSX into JSON so that we can offer React components to render out our Swift views.
We need to prove out this theory with a simple POC before we bring any of this to CodeEdit. That might look like a simple "hello world" Swift application. A JavaScript extension might tell our Swift application to render some text, or a name input and a button; then, when the button is clicked, our JS extension will tell our Swift application to execute a specific function with a set of arguments it provides.
Once we get this POC done, we can start putting it into CodeEdit. We can tell it to add a new navigator, inspector, or debugger. We can tell it to add autosuggestions under certain conditions, add snippets, and add commands. Then we build from there.
Before we begin putting a POC together, does anyone have any feedback on this approach?
We also need to figure out the communication protocol. We have mentioned gRPC, JSON-RPC, and XPC. Any preference? Pros and cons?
This is what I know of the topic, so people, please correct me if I'm wrong.
JSON-RPC (Personally never used it)
Pros:
- Human readable
- Relatively easy to learn
- Language agnostic
Cons:
- Slower due to text encoding and decoding over the network
- Not type safe
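On the "not type safe" point: one common way to soften it is to validate incoming params at runtime before trusting them. Below is a minimal TypeScript sketch under stated assumptions; the "insert text" params shape and the function names are hypothetical, while error code -32602 ("Invalid params") comes from the JSON-RPC 2.0 specification.

```typescript
// Hypothetical params for an "insert text" request.
interface InsertTextParams {
  path: string;
  line: number;
  text: string;
}

// Type guard: narrows unknown JSON to InsertTextParams only if every field checks out.
function isInsertTextParams(value: unknown): value is InsertTextParams {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v["path"] === "string" &&
    typeof v["line"] === "number" &&
    typeof v["text"] === "string"
  );
}

// Build a JSON-RPC 2.0 response: a result on success, error -32602 otherwise.
function handleInsertText(id: number, params: unknown): string {
  if (!isInsertTextParams(params)) {
    return JSON.stringify({
      jsonrpc: "2.0",
      id,
      error: { code: -32602, message: "Invalid params" },
    });
  }
  return JSON.stringify({
    jsonrpc: "2.0",
    id,
    result: { inserted: params.text.length },
  });
}
```

This does not give the compile-time guarantees of generated gRPC stubs, but it keeps malformed requests from reaching editor code.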
gRPC (Tinkered with this)
Pros:
- Fast, low latency
- Widely used in micro-service architectures (So it's "battle-tested")
- Type safe
- Auto generated code based on .proto files
- Language agnostic
Cons:
- Not human readable, uses binary
- Steep learning curve (In my opinion)
XPC (Personally never used it, actually never heard of it prior)
So, I don't really know the pros and cons of this, but it does seem like very few developers use it. The issue I see is that it could take more time to develop the system if this protocol is chosen. Also, I have no idea how it would interact with the JavaScript API. The thing it has going for it is that it is Apple native, but I don't think that matters. So my conclusion is that I wouldn't recommend using XPC.
gRPC has my personal preference, but whether to use it depends on how many people in our community know how it works and/or want to learn it. As far as I know, it's mainly used in backend development, and I assume most of the CodeEdit community are frontend developers.
I've worked with gRPC for the last few years and it's a really good tool for communication between two servers. We don't really need human readability, since there's nobody in between to read request/response data. Also, I can't say it has a steep learning curve. It has some issues if you compile it from source for a specific platform, but if you use prebuilt binaries, the IDE should provide code completion for the generated code.
I have an app that lets you use a UI to compile your proto files into a specific language: https://github.com/avdept/Protobuf-GUI-Compiler
Perhaps we should discuss this topic in the next meeting, because it is high priority and I think we should make a decision on it.