ethereum/aleth

Project idea: HTTP Proxy for JSON-RPC

Motivation

I'd like to remove the HTTP server from the C++ Ethereum node. The node should only expose RPC access via the most primitive transports: Unix Sockets, and Named Pipes on Windows. The HTTP transport should be provided by an external tool translating HTTP requests/responses to/from the protocol required by Unix Sockets and Named Pipes.

Notes about design

  1. The proxy may be done in a language other than C++. Go looks friendly.
  2. The proxy can provide other transports than HTTP, e.g. WebSocket, TCP.
  3. The node can still support the --rpc flag. It should execute and configure the proxy as another process. This is quite common in the Unix world; git supports SSH this way, for example.
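
To make the idea concrete, here is a minimal sketch of such a proxy in Python, using only the standard library. It assumes one JSON-RPC request per HTTP POST and one short-lived Unix socket connection per request; the names `ipc_call`, `make_handler` and `serve`, and the socket path passed in, are illustrative and not part of any existing tool. Named Pipes, batching and error mapping are left out.

```python
import json
import socket
from http.server import BaseHTTPRequestHandler, HTTPServer


def ipc_call(path: str, request: bytes) -> bytes:
    """Send one JSON-RPC request over a Unix socket and return one reply."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(path)
        sock.sendall(request)
        buf = b""
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                raise ConnectionError("node closed the connection early")
            buf += chunk
            try:
                json.loads(buf)  # parses only once the reply is complete
                return buf
            except ValueError:
                continue  # reply still incomplete, keep reading


def make_handler(ipc_path: str):
    """Build a request handler bound to the given (assumed) socket path."""

    class ProxyHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            body = self.rfile.read(length)
            try:
                status, reply = 200, ipc_call(ipc_path, body)
            except OSError as exc:  # node unreachable or connection dropped
                status = 502
                reply = json.dumps({"error": str(exc)}).encode()
            self.send_response(status)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(reply)))
            self.end_headers()
            self.wfile.write(reply)

        def log_message(self, fmt, *args):
            pass  # keep the sketch quiet

    return ProxyHandler


def serve(ipc_path: str, port: int) -> None:
    """Run the proxy on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), make_handler(ipc_path)).serve_forever()
```

Calling `serve("/path/to/node.ipc", 8545)` would then forward each POST body to the node unchanged. Note that the sketch detects the end of a reply by trying to parse it as JSON, which is one possible choice, not a requirement of the design.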

Privileges

Problem

We would like to have different (and configurable) permissions for RPC modules depending on the transport protocol. Let's assume there are 2 RPC modules: eth, with public blockchain data, and admin. We must allow access to both the admin and eth modules via Unix Sockets and Named Pipes while at the same time not allowing access to admin via HTTP. The user should also be able to configure module access permissions per transport protocol.

This can be more complicated if we consider allowed HTTP hosts.

Solution 1: Blacklisting

This should match current geth behavior, where by default all modules can be accessed via Unix Sockets and only some modules can be accessed by HTTP.

In this solution all modules are accessible by default via Unix Sockets and Named Pipes. When the proxy process is started it may send a special message (it can be a JSON-RPC message) stating which modules are to be disabled. The node must allow this message to be sent only once per connection.

Solution 2: Whitelisting

Similar to solution 1, but this time no module is enabled by default. The proxy must send a special message listing the modules to be enabled. The node must allow this message to be sent only once per connection.

This would also require changes to tools like ethereum-console and geth attach. They would also have to send the whitelist on startup.
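
Solutions 1 and 2 differ only in the default set and the direction of the change, so the node-side bookkeeping could look roughly like the sketch below. The class name `Connection`, the `configure` method and the "disable"/"enable" mode strings are hypothetical; no such message exists in the node today. The module is taken from the method-name prefix, e.g. `admin` for an `admin_*` method.

```python
class Connection:
    """Per-connection module permissions kept by the node (hypothetical)."""

    def __init__(self, enabled):
        # Solution 1 starts with every module, Solution 2 with none.
        self.enabled = set(enabled)
        self.configured = False

    def configure(self, mode, modules):
        """Handle the special message; it is accepted once per connection."""
        if self.configured:
            raise PermissionError("permissions already set for this connection")
        if mode == "disable":    # blacklisting
            self.enabled -= set(modules)
        elif mode == "enable":   # whitelisting
            self.enabled |= set(modules)
        else:
            raise ValueError("unknown mode: %r" % (mode,))
        self.configured = True

    def allows(self, method):
        """Check a JSON-RPC method name such as 'admin_something'."""
        module = method.split("_", 1)[0]
        return module in self.enabled


# Blacklisting: the proxy connects and gives up the admin module.
proxy_conn = Connection({"eth", "admin"})
proxy_conn.configure("disable", ["admin"])

# Whitelisting: a console connects and asks for both modules.
console_conn = Connection(set())
console_conn.configure("enable", ["eth", "admin"])
```

Because the message is accepted only once, a client behind the proxy cannot send a second message to re-enable what the proxy gave up.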

Solution 3: Access token

I noticed that in C++ the admin RPC module requires a special token to be passed as part of the JSON-RPC request. I think the token is generated on every node startup.
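
A rough sketch of how such a token check could work is given below. How the C++ node actually passes and validates the token is not described here, so the convention used (token as the last positional parameter of `admin_*` calls) and the `AdminGate` class are assumptions made for illustration only.

```python
import hmac
import secrets


class AdminGate:
    """Guards admin_* methods with a token created at startup (hypothetical)."""

    def __init__(self):
        # A fresh random token for every node startup.
        self.token = secrets.token_hex(16)

    def authorize(self, request):
        method = request.get("method", "")
        if not method.startswith("admin_"):
            return True  # non-admin modules need no token in this sketch
        params = request.get("params")
        # Assumption: the token is the last positional parameter.
        if not isinstance(params, list) or not params:
            return False
        supplied = params[-1]
        if not isinstance(supplied, str):
            return False
        # Constant-time comparison avoids leaking the token via timing.
        return hmac.compare_digest(supplied.encode(), self.token.encode())


gate = AdminGate()
ok = gate.authorize({"method": "admin_example", "params": [True, gate.token]})
denied = gate.authorize({"method": "admin_example", "params": [True, "guess"]})
```

With this scheme an HTTP client that never learns the token cannot call admin methods, regardless of which transport it arrives through.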

What happens when we drop HTTP without adding the proxy? The Travis scripts on Solidity and Bamboo use Unix Sockets.

What happens when we drop HTTP without adding the proxy? The Travis scripts on Solidity and Bamboo use Unix Sockets.

At the moment probably nothing. But I'd like to at least have a plan and a design for how to add the support back in the future. This just looks to me like a nice small inter-team project.

What is the polling overhead of the different technologies? I heard that web sockets are quite good in that regard.

I think it sounds like a good idea to have the http-rpc as a standalone thing. The interface could be a lot more refined if it was independent (custom certificates, multiplexing, access controls etc).

What is the polling overhead of the different technologies? I heard that web sockets are quite good in that regard.

Yup, websockets are essentially tcp-sockets, so as long as it's not closed, the server can push.

I thought about something like this a couple of months ago, mainly because of the reasons holiman mentioned. Such a proxy could also do account management and handle signing requests (provide a UI to the user to authorize a signing request). Nodes would only operate on public data.

Websockets begin their life as http connections and are upgraded to websockets. Geth already supports it and allows a client to subscribe to events such as new headers and logs. The client will receive a notification that contains the event data. No polling required.
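
For illustration, the messages involved in such a subscription can be sketched without any websocket library. The `eth_subscribe` request and the `eth_subscription` notification follow geth's pub/sub format; the helper names `subscribe_request` and `classify` are made up for this sketch, and the transport itself is left out.

```python
import json


def subscribe_request(request_id, event):
    """Build the request a client sends once, e.g. for event 'newHeads'."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "eth_subscribe",
        "params": [event],
    })


def classify(raw):
    """Tell a pushed notification apart from an ordinary response."""
    msg = json.loads(raw)
    if msg.get("method") == "eth_subscription":
        params = msg["params"]
        # Pushed by the server: carries the subscription id and event data.
        return "notification", params["subscription"], params["result"]
    # Ordinary reply: matched to a request by its id.
    return "response", msg.get("id"), msg.get("result")


# The reply to eth_subscribe carries the subscription id ...
reply = classify('{"jsonrpc":"2.0","id":1,"result":"0xabc"}')
# ... and later notifications refer to it, with no request from the client.
pushed = classify(
    '{"jsonrpc":"2.0","method":"eth_subscription",'
    '"params":{"subscription":"0xabc","result":{"number":"0x10"}}}'
)
```

A proxy offering WebSocket would have to pass these server-initiated messages through, which a plain request/response HTTP proxy cannot do.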

Such a proxy could also do account management and handle signing requests (provide a UI to the user to authorize a signing request). Nodes would only operate on public data.

I was thinking about this as a separate project. Such an agent would have access to the accounts' keyfiles and be placed in the middle of the RPC communication. It would translate "personal" requests into "raw" requests, e.g. personal_sendTransaction into eth_sendRawTransaction.

But maybe it is not a bad idea to merge these 2 projects into a single one.
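
The translation step of such an agent might look like the sketch below. The actual signing is deliberately left to a `sign_transaction` callback, which is hypothetical here: it stands for whatever unlocks the keyfile and returns the signed transaction as a hex string. Only the request rewriting is shown.

```python
def translate(request, sign_transaction):
    """Rewrite personal_sendTransaction into eth_sendRawTransaction.

    `sign_transaction(tx, passphrase)` is an assumed callback that returns
    the signed transaction as a hex string; other requests pass through.
    """
    if request.get("method") != "personal_sendTransaction":
        return request
    tx, passphrase = request["params"]
    raw_tx = sign_transaction(tx, passphrase)
    return {
        "jsonrpc": "2.0",
        "id": request["id"],  # keep the id so the reply still matches
        "method": "eth_sendRawTransaction",
        "params": [raw_tx],
    }


def fake_signer(tx, passphrase):
    # Stand-in for real signing, used only to exercise the sketch.
    return "0xsigned"


incoming = {
    "jsonrpc": "2.0",
    "id": 3,
    "method": "personal_sendTransaction",
    "params": [{"from": "0x01", "to": "0x02", "value": "0x1"}, "secret"],
}
outgoing = translate(incoming, fake_signer)
```

The node then only ever sees the signed transaction, never the passphrase or the key.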

axic commented

One of the downsides of the Unix socket approach is framing of the JSON (also not sure that named pipes allow multiple connections?)

The framing in Web Sockets (see https://tools.ietf.org/html/rfc6455#section-5.2) could be used, and that would make this proxy pretty transparent as it would only need to add the HTTP framing on top.

Both Unix Sockets and Named Pipes servers recognize individual connections. Is that your question @axic?

I'm not sure what framing is about. Is it to reuse a single connection for multiple independent streams?

I don't like the Web Socket approach as the base transport for RPC, because the connection is available to every user of the machine and you need some additional authentication mechanism to be added. Am I right?

axic commented

Framing is about knowing where the message boundaries are. Current IPC relies on streaming JSON decoders to determine message boundaries. There are two long, heated threads about this though :)

Websockets don't define any authentication or encryption; that is provided by HTTP. I only mentioned Websockets' framing, though, which is the actual message passing protocol after Websockets has been negotiated over HTTP (to avoid reinventing the wheel).
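
To show what relying on a streaming JSON decoder means in practice, here is a small sketch using `json.JSONDecoder.raw_decode` from the Python standard library. The function name `split_messages` is illustrative; the sketch assumes every message is a JSON object or array, as JSON-RPC messages are.

```python
import json

_decoder = json.JSONDecoder()


def split_messages(buffer):
    """Extract complete JSON values from `buffer`.

    Returns (messages, rest), where `rest` is the unparsed tail that must be
    kept until more data arrives on the socket.
    """
    messages = []
    pos = 0
    while True:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(buffer) and buffer[pos].isspace():
            pos += 1
        if pos == len(buffer):
            break
        try:
            obj, pos = _decoder.raw_decode(buffer, pos)
        except json.JSONDecodeError:
            break  # incomplete message, wait for more data
        messages.append(obj)
    return messages, buffer[pos:]


# Two messages back to back, the second one cut off mid-way.
done, rest = split_messages('{"id":1,"result":"0x1"} {"id":2,"res')
```

The receiver has to attempt a parse to find each boundary, which is the cost that explicit framing avoids.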

So the goal is to know where the JSON message ends without parsing the JSON?
I've seen that libjson-rpc-cpp also has a TCP transport, and it uses a special char to delimit the messages. See https://github.com/cinemast/libjson-rpc-cpp/blob/master/src/jsonrpccpp/server/connectors/tcpsocketserver.h#L19-L22.

By default "new line" \n is used to delimit messages, probably because "new line" (and other control characters) are not allowed in JSON strings directly.

Hm... I think I missed the fact that nicely formatted JSON contains new line chars.
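
Newline delimiting still works as long as the sender serializes compactly instead of pretty-printing. The sketch below uses Python's `json.dumps`, which without an `indent` argument emits no raw newlines and escapes any newline inside a string as `\n`. The names `frame` and `unframe` are illustrative; this is not how libjson-rpc-cpp itself is implemented.

```python
import json


def frame(message):
    """Serialize compactly and terminate with a single newline byte."""
    text = json.dumps(message, separators=(",", ":"))
    return text.encode("utf-8") + b"\n"


def unframe(buffer):
    """Split a byte buffer into complete messages and the unfinished tail."""
    *lines, rest = buffer.split(b"\n")
    messages = [json.loads(line) for line in lines if line.strip()]
    return messages, rest


# A string value containing a newline does not break the framing,
# because the serializer escapes it.
wire = frame({"id": 1, "result": "line one\nline two"})
messages, rest = unframe(wire + b'{"id":2')
```

The receiver can find boundaries by scanning for one byte, at the price of requiring every sender to avoid pretty-printed JSON.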

Unix Socket Authentication

It is possible (but probably not in a portable way) to get the process and user id of a connection.

This information can be used to implement a blacklist: if we know that the HTTP proxy is process N, we can limit the privileges of connections from process N.

However, I'm not convinced by this solution in case the default access level is unrestricted. It would be too easy for users to spin up proxies of their own without the restrictions applied.
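
On Linux, the process and user id of the peer can be read with the `SO_PEERCRED` socket option; other systems use different mechanisms, which is the portability problem mentioned above. The sketch below is Linux-only and returns `None` elsewhere. The function names and the `proxy_pid` parameter are hypothetical, and the module sets reuse the eth/admin example from the issue.

```python
import os
import socket
import struct


def peer_credentials(conn):
    """Return (pid, uid, gid) of the peer of a Unix socket, or None.

    Uses SO_PEERCRED, which is Linux-specific.
    """
    if not hasattr(socket, "SO_PEERCRED"):
        return None
    # struct ucred holds three ints: pid, uid, gid.
    size = struct.calcsize("3i")
    raw = conn.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED, size)
    pid, uid, gid = struct.unpack("3i", raw)
    return pid, uid, gid


def modules_for(conn, proxy_pid):
    """Blacklist by process: connections from the proxy lose 'admin'."""
    creds = peer_credentials(conn)
    if creds is not None and creds[0] == proxy_pid:
        return {"eth"}
    return {"eth", "admin"}


# A socket pair created in this process reports this process as the peer.
left, right = socket.socketpair()
creds = peer_credentials(left)
```

As noted above, this only restricts the one proxy process the node knows about; a proxy started by the user would still get the unrestricted default.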

@karalabe if you have some free time, can you point us to the packages that are used for JSON RPC in geth?

I created this PoC in Python: HTTP to Unix Socket proxy: https://github.com/chfast/json-rpc-proxy.

All our rpc boilerplate is in the 'rpc' package/folder in our repo root.

The scripts/jsonrpcproxy.py is nice for enabling HTTP-based RPC interaction with the eth node.
However, it would have been nicer to keep options such as "--json-rpc" and "--json-rpc-port" in the source code, so that we would have more choices.

A compiler flag could have been introduced for turning the json-rpc option on/off.
There is such a nice compiler flag for MiniUPnP! Why not the same for json-rpc?

websocket http proxy server is JSON decoding
unix tcp socket many connections

have not a request