Project idea: HTTP Proxy for JSON-RPC

Opened this issue 7 years ago · 18 comments

Motivation

I'd like to remove the HTTP server from the C++ Ethereum node. The node should expose RPC access only over the most primitive transports: Unix Sockets, and Named Pipes on Windows. The HTTP transport should be provided by an external tool that translates HTTP requests/responses to/from the protocol used over Unix Sockets and Named Pipes.

Notes about design

The proxy may be written in a language other than C++. Go looks friendly.
The proxy can provide transports other than HTTP, e.g. WebSocket, TCP.
The node can still support the --rpc flag. It should launch and configure the proxy as a separate process. This is quite common in the Unix world; it is how git supports SSH, etc.
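To make the idea concrete, here is a minimal sketch of such a proxy in Python. It is only an illustration under assumptions: the socket path /tmp/node.ipc and port 8545 are placeholders, the names forward, ProxyHandler and run_proxy are invented for this example, and the node is assumed to answer each request with exactly one JSON value on the same connection.

```python
import json
import socket
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical location of the node's IPC socket (an assumption).
IPC_PATH = "/tmp/node.ipc"


def forward(ipc_path: str, request: bytes) -> bytes:
    """Send one JSON-RPC request over a Unix socket and return one response."""
    decoder = json.JSONDecoder()
    data = b""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(ipc_path)
        sock.sendall(request)
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                raise ConnectionError("node closed the connection mid-response")
            data += chunk
            try:
                # The stream has no framing, so try to parse what has arrived.
                text = data.decode("utf-8").lstrip()
                _, end = decoder.raw_decode(text)
            except ValueError:
                continue  # incomplete UTF-8 or incomplete JSON: keep reading
            return text[:end].encode("utf-8")


class ProxyHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        response = forward(IPC_PATH, self.rfile.read(length))
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(response)))
        self.end_headers()
        self.wfile.write(response)


def run_proxy(host: str = "127.0.0.1", port: int = 8545) -> None:
    """Serve HTTP on host:port until interrupted (port is a placeholder)."""
    HTTPServer((host, port), ProxyHandler).serve_forever()
```

Calling run_proxy() would start the HTTP side. Opening a new IPC connection per request keeps the sketch simple; a real proxy would probably reuse connections.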

Privileges

Problem

We would like to have different (and configurable) permissions for RPC modules depending on the transport protocol. Let's assume there are 2 RPC modules: eth, with public blockchain data, and admin. We must allow access to both the admin and eth modules over Unix Sockets and Named Pipes, while at the same time not allowing access to admin over HTTP. The user should also be able to configure module access permissions per transport protocol.

This can be more complicated if we consider allowed HTTP hosts.

Solution 1: Blacklisting

This should match current geth behavior, where by default all modules can be accessed via Unix Sockets and only some modules can be accessed by HTTP.

In this solution all modules are accessible by default over Unix Sockets and Named Pipes. When the proxy process is started it may send a special message (it can be a JSON-RPC message) listing the modules that are to be disabled. The node must allow this message to be sent only once per connection.

Solution 2: Whitelisting

Similar to solution 1, but this time no module is enabled by default. The proxy must send a special message listing the modules to be enabled. The node must allow this message to be sent only once per connection.
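A rough sketch of the node-side bookkeeping for the whitelisting variant could look like the following. The method name rpc_enableModules, the Connection class and the error helper are hypothetical, and the module is assumed to be the prefix of the method name before the first underscore.

```python
import json

ALL_MODULES = {"eth", "admin"}  # the two modules from the example above


def error(msg: dict, code: int, text: str) -> dict:
    """Build a JSON-RPC 2.0 error response for the given request."""
    return {"jsonrpc": "2.0", "id": msg.get("id"),
            "error": {"code": code, "message": text}}


class Connection:
    """Per-connection permission state for the whitelisting variant."""

    def __init__(self):
        self.enabled = set()     # nothing is enabled by default
        self.configured = False  # the whitelist may be sent only once

    def handle(self, raw: str) -> dict:
        msg = json.loads(raw)
        method = msg["method"]
        if method == "rpc_enableModules":  # hypothetical method name
            if self.configured:
                return error(msg, -32600, "whitelist already set")
            self.enabled = set(msg["params"]) & ALL_MODULES
            self.configured = True
            return {"jsonrpc": "2.0", "id": msg["id"], "result": True}
        module = method.split("_", 1)[0]
        if module not in self.enabled:
            return error(msg, -32601, f"module '{module}' is not enabled")
        # Placeholder: a real node would dispatch to the module here.
        return {"jsonrpc": "2.0", "id": msg["id"], "result": None}
```

The blacklisting variant would differ only in starting with enabled = ALL_MODULES and subtracting the listed modules.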

This would also require changes to tools like ethereum-console and geth attach. They would also have to send the whitelist on startup.

Solution 3: Access token

I noticed that in C++ the admin RPC module requires a special token to be passed as part of the JSON-RPC request. I think the token is generated on every node startup.
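As a sketch of how a per-startup token check might work (not the actual C++ implementation, which is not shown here), assume the token is a random hex string and that it travels as the last positional parameter of admin_* requests. Both are assumptions, as are the function names.

```python
import hmac
import secrets


def new_session_token() -> str:
    """Generate a fresh random token, e.g. once per node startup."""
    return secrets.token_hex(16)


def is_authorized(method: str, params: list, token: str) -> bool:
    """Allow non-admin calls; require the token for admin_* calls."""
    if not method.startswith("admin_"):
        return True
    # Assumption: the token is the last positional parameter.
    if not params or not isinstance(params[-1], str):
        return False
    # Constant-time comparison avoids leaking the token through timing.
    return hmac.compare_digest(params[-1].encode(), token.encode())
```

With this approach the proxy needs no special handshake: a client without the token simply cannot use admin, regardless of transport.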

Answer 1 · 2017-09-27T15:29:37.000Z

What happens when we drop HTTP without adding the proxy? The Travis scripts on Solidity and Bamboo use Unix Sockets.

Answer 2 · 2017-09-27T15:32:07.000Z

What happens when we drop HTTP without adding the proxy? The Travis scripts on Solidity and Bamboo use Unix Sockets.

At the moment probably nothing. But I'd like to at least have a plan and a design how to add the support back in future. This just looks to me like a nice inter-team small project.

Answer 3 · 2017-09-27T15:48:24.000Z

What is the polling overhead of the different technologies? I heard that web sockets are quite good in that regard.

Answer 4 · 2017-09-27T15:50:06.000Z

I think it sounds like a good idea to have the http-rpc as a standalone thing. The interface could be a lot more refined if it was independent (custom certificates, multiplexing, access controls etc).

What is the polling overhead of the different technologies? I heard that web sockets are quite good in that regard.

Yup, websockets are essentially tcp-sockets, so as long as it's not closed, the server can push.

Answer 5 · 2017-09-28T10:44:28.000Z

I thought about something like this a couple of months ago, mainly for the reasons holiman mentioned. Such a proxy could also do account management and handle signing requests (provide a UI for the user to authorize a signing request). Nodes would only operate on public data.

Websockets begin their life as HTTP connections and are upgraded to websockets. Geth already supports this and allows a client to subscribe to events such as new headers and logs. The client will receive a notification that contains the event data. No polling required.
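For illustration, the subscription flow described above can be sketched without any networking. The eth_subscribe method, the newHeads event and the eth_subscription notification are geth's pub/sub names; the helper names and the subscription id used in the usage note are made up.

```python
import json


def subscribe_request(request_id: int, event: str = "newHeads") -> str:
    """Build the request a client sends once to start receiving events."""
    return json.dumps({"jsonrpc": "2.0", "id": request_id,
                       "method": "eth_subscribe", "params": [event]})


def route(raw: str, handlers: dict) -> bool:
    """Dispatch a pushed notification; return False for anything else."""
    msg = json.loads(raw)
    if msg.get("method") != "eth_subscription":
        return False  # an ordinary response, not a push notification
    params = msg["params"]
    handler = handlers.get(params["subscription"])
    if handler is None:
        return False  # notification for a subscription we do not track
    handler(params["result"])
    return True
```

A client would register a handler under the subscription id returned by the node (say "0x1") and then call route on every incoming message.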

Answer 6 · 2017-09-28T11:31:36.000Z

Such a proxy could also do account management and handle signing request (provide a UI to the user to authorize a signing request). Nodes would only operate on public data.

I was thinking about this as a separate project. Such an agent would have access to the accounts' keyfiles and be placed in the middle of the RPC communication. It would translate "personal" requests into "raw" requests, e.g. personal_sendTransaction into eth_sendRawTransaction.

But maybe it is not a bad idea to merge these 2 projects into a single one.
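The translation step of such an agent might look roughly like this. The sign callable is a stand-in for whatever would load the keyfile and produce the signed transaction; it and the translate function are hypothetical, and personal_sendTransaction is assumed to carry a transaction object followed by a passphrase.

```python
from typing import Callable


def translate(request: dict, sign: Callable[[dict, str], str]) -> dict:
    """Rewrite personal_sendTransaction into eth_sendRawTransaction.

    `sign` is a hypothetical callback that takes the transaction object
    and the passphrase and returns the signed transaction as a hex string.
    Any other request is passed through unchanged.
    """
    if request.get("method") != "personal_sendTransaction":
        return request
    tx, passphrase = request["params"]
    raw_tx = sign(tx, passphrase)
    return {"jsonrpc": "2.0", "id": request["id"],
            "method": "eth_sendRawTransaction", "params": [raw_tx]}
```

The node behind the agent then only ever sees already-signed transactions and never touches key material.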

Answer 7 · 2017-10-23T21:11:39.000Z

One of the downsides of the Unix socket approach is framing of the JSON (also not sure that named pipes allow multiple connections?)

The framing in Web Sockets (see https://tools.ietf.org/html/rfc6455#section-5.2) could be used, and that would make this proxy pretty transparent, as it would only need to add the HTTP framing on top.
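For reference, the base framing from RFC 6455 section 5.2 is small. The sketch below encodes a single unfragmented, unmasked frame and decodes one frame from a byte buffer; it ignores fragmentation, control frames and extensions, and the function names are made up.

```python
import struct


def encode_frame(payload: bytes, opcode: int = 0x1) -> bytes:
    """Encode one unmasked frame with FIN set (opcode 0x1 = text)."""
    header = bytes([0x80 | opcode])
    n = len(payload)
    if n < 126:
        header += bytes([n])
    elif n < (1 << 16):
        header += bytes([126]) + struct.pack(">H", n)  # 16-bit length
    else:
        header += bytes([127]) + struct.pack(">Q", n)  # 64-bit length
    return header + payload


def decode_frame(data: bytes):
    """Return (payload, remaining_bytes), or None if the frame is incomplete."""
    if len(data) < 2:
        return None
    masked = bool(data[1] & 0x80)
    n = data[1] & 0x7F
    pos = 2
    if n == 126:
        if len(data) < 4:
            return None
        (n,) = struct.unpack(">H", data[2:4])
        pos = 4
    elif n == 127:
        if len(data) < 10:
            return None
        (n,) = struct.unpack(">Q", data[2:10])
        pos = 10
    key = b""
    if masked:
        if len(data) < pos + 4:
            return None
        key = data[pos:pos + 4]
        pos += 4
    if len(data) < pos + n:
        return None
    payload = data[pos:pos + n]
    if masked:
        # Client-to-server frames are XOR-masked with a 4-byte key.
        payload = bytes(b ^ key[i % 4] for i, b in enumerate(payload))
    return payload, data[pos + n:]
```

Because the length is in the header, the receiver knows the message boundary without parsing the JSON inside.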

Answer 8 · 2017-10-23T21:19:43.000Z

Both Unix Sockets and Named Pipes servers recognize individual connections. Is that your question @axic?

I'm not sure what framing is about. Is it about reusing a single connection for multiple independent streams?

I don't like the Web Socket approach as the base transport for RPC, because the connection is available to every user of the machine and you need some additional authentication mechanism to be added. Am I right?

Answer 9 · 2017-10-24T08:53:35.000Z

Framing is about knowing where the message boundaries are. Current IPC relies on streaming JSON decoders to determine message boundaries. There are two long, heated threads about this though :)
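To show what relying on a streaming decoder means in practice, here is a small sketch that splits a buffer of back-to-back JSON values using Python's json.JSONDecoder.raw_decode. The function name is made up. Note that it cannot tell an incomplete message from a malformed one; both are left in the remainder.

```python
import json


def split_messages(buffer: str):
    """Return (complete_messages, unparsed_remainder) for an unframed stream."""
    decoder = json.JSONDecoder()
    messages = []
    pos = 0
    while True:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(buffer) and buffer[pos] in " \t\r\n":
            pos += 1
        if pos == len(buffer):
            break
        try:
            message, pos = decoder.raw_decode(buffer, pos)
        except json.JSONDecodeError:
            break  # incomplete (or invalid) tail: wait for more data
        messages.append(message)
    return messages, buffer[pos:]
```

The cost is that finding a boundary requires parsing the whole message, which is exactly what explicit framing avoids.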

Websockets don't define any authentication or encryption; that is provided by HTTP. I only mentioned Websockets' framing, though, which is the actual message-passing protocol used after the Websocket connection has been negotiated over HTTP (to avoid reinventing the wheel).

Answer 10 · 2017-10-24T09:06:59.000Z

So the goal is to know where the JSON message ends without parsing the JSON?
I've seen that libjson-rpc-cpp also has a TCP transport, and it uses a special char to delimit the messages. See https://github.com/cinemast/libjson-rpc-cpp/blob/master/src/jsonrpccpp/server/connectors/tcpsocketserver.h#L19-L22.

By default "new line" \n is used to delimit messages, probably because "new line" (and other control characters) are not allowed in JSON strings directly.

Answer 11 · 2017-10-24T09:10:31.000Z

Hm... I think I missed the fact that nicely formatted JSON contains new line chars.
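A quick Python sketch of the newline-delimited approach and its caveat: it works as long as the sender serializes compactly, because newlines inside JSON strings are always escaped, but it breaks on pretty-printed output. The function names are made up.

```python
import json


def encode_line(message: dict) -> bytes:
    """Serialize compactly and terminate with a single newline."""
    # Without `indent`, json.dumps emits no raw newlines; a newline
    # inside a string value is written as the two characters \ and n.
    return json.dumps(message).encode("utf-8") + b"\n"


def decode_lines(data: bytes):
    """Return (messages, remainder) by splitting on the newline delimiter."""
    *lines, rest = data.split(b"\n")
    return [json.loads(line) for line in lines if line.strip()], rest
```

Feeding decode_lines the output of json.dumps(message, indent=2) raises a decode error on the first line, which is the problem noted above.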

Answer 12 · 2017-11-22T14:54:45.000Z

Unix Socket Authentication

It is possible (but probably not in a portable way) to get the process and user id of a connection.

This information can be used to implement a blacklist: if we know that the HTTP proxy is process N, we can limit the privileges of connections from process N.

However, I'm not convinced by this solution if the default access level is unrestricted. It would be too easy for users to spin up proxies of their own without the restrictions applied.
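On Linux, the peer's credentials can be read with the SO_PEERCRED socket option; other systems use different mechanisms, which is the portability concern. A sketch in Python, with made-up function names and the eth/admin modules from the earlier example:

```python
import socket
import struct


def peer_credentials(conn: socket.socket):
    """Return (pid, uid, gid) of the peer, or None where unsupported.

    Uses the Linux-specific SO_PEERCRED option, which fills a
    `struct ucred { pid_t pid; uid_t uid; gid_t gid; }`.
    """
    if not hasattr(socket, "SO_PEERCRED"):
        return None
    fmt = "iII"
    raw = conn.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED,
                          struct.calcsize(fmt))
    return struct.unpack(fmt, raw)


def allowed_modules(peer_pid: int, proxy_pid: int) -> set:
    """Blacklist variant: restrict only connections made by the proxy."""
    if peer_pid == proxy_pid:
        return {"eth"}
    return {"eth", "admin"}
```

As noted above, this only restricts the one proxy process the node knows about; any other local process still gets the unrestricted default.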

Answer 13 · 2017-11-22T14:55:39.000Z

@karalabe if you have some free time, can you point us to the packages that are used for JSON RPC in geth?

Answer 14 · 2017-11-30T00:28:25.000Z

I created this PoC in Python: HTTP to Unix Socket proxy: https://github.com/chfast/json-rpc-proxy.

Answer 15 · 2017-11-30T00:47:32.000Z

All our rpc boilerplate is in the 'rpc' package/folder in our repo root.

Answer 16 · 2018-02-01T03:03:06.000Z

The scripts/jsonrpcproxy.py script is nice for enabling HTTP-based RPC interaction with the eth node.
However, it would have been nicer to keep options such as "--json-rpc" and "--json-rpc-port" in the source code, so that we would have more choices.

A compiler flag could have been introduced for turning the json-rpc option on and off.
There is such a compiler flag for MiniUPnP. Why not the same for json-rpc?

Answer 17 · 2018-03-09T03:11:14.000Z

The websocket/HTTP proxy server does the JSON decoding.
Unix and TCP sockets allow many connections.

Answer 18 · 2018-03-09T03:18:07.000Z

have not a request