[technical exploration] js-ipfs bundle optimized
daviddias opened this issue · 22 comments
As of today, the js-ipfs dist version is considerably large (~4MB), which is a lot of JavaScript for a module designed for the browser. It is true that we have to inject a lot of code to support things like spdy, multiple transports, crypto channels and so on, but there are things that can be avoided/optimized, namely: shims everywhere for es6->es5 transpilation.
The 'promise' is that in the future, webpack's 'tree shaking' will remove duplicate code paths by reusing dependencies, but as far as I can tell, this won't work for CommonJS modules (anytime soon or ever).
Also, supporting several APIs natively, namely callbacks and promises, forces every user to require everything and pay the associated cost, even if they decide to use just one.
My intuition is that the benchmarking tests will also reveal that we can optimize js-ipfs by reducing the overhead introduced by shims, and that straight ES5 code will be better optimized than ES6. We will see.
So, here are some things that we might want to reconsider meanwhile:
- expose just callbacks by default and have a module to expose a promise interface
- use a subset of JS, within the ES5 scope to reduce the shims
- use a subset of JS, that matches the instructions by WebAssembly so that js-ipfs can be vastly optimised by the JS engines (idea suggested by @jbenet and @mikolalysenko)
I'm sure there will be other things we can do too.
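The first suggestion above could look roughly like the sketch below: a callback-only core plus an opt-in wrapper that adds promises. The names cat and promisify are hypothetical and only illustrate the split; they are not the js-ipfs API.

```javascript
'use strict'

// Hypothetical core function: callback-only, no Promise code paths.
function cat (hash, callback) {
  if (typeof hash !== 'string') {
    return callback(new Error('hash must be a string'))
  }
  callback(null, 'data for ' + hash)
}

// Hypothetical opt-in wrapper that could ship as a separate module.
// It turns any node-style callback function into one returning a Promise.
function promisify (fn) {
  return function () {
    const args = Array.prototype.slice.call(arguments)
    return new Promise(function (resolve, reject) {
      fn.apply(null, args.concat(function (err, result) {
        if (err) return reject(err)
        resolve(result)
      }))
    })
  }
}

// Only users who want promises need to load and apply the wrapper.
const catP = promisify(cat)
```

With this split, users who only want callbacks never load the wrapper, while others opt in and call something like catP('QmHash').then(...).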
I agree we should investigate how we can minimize the size of js-ipfs, but I strongly disagree with doing it at the cost of developer productivity.
- Tree shaking is a real possibility if we start moving to the ES2015 module specification instead of CommonJS and transpile those modules for Node.js. This would be quite straightforward for us, as we already have all the tooling in place. If we did something like this, we could quite easily start generating specialized builds, as async or lodash do, that allow users to install exactly the tools they are interested in.
- Removing shims will not buy us a large reduction in size, as it is nearly a constant overhead. Before we consider doing something like this, I would like to see concrete numbers showing what inside our builds is actually creating the major size issues, rather than guessing wildly.
- As long as we don't ship a Promise shim, the overhead in terms of size for the promise interface is so minimal that I don't see it warranting all the work of extracting it into its own module.
Using https://webpack.github.io/analyse/ and http://ipfs.io/ipfs/QmSoAbNz1QWwioCzGHtV97Mtx23u19r3oVmznhk2bMLxSp you can see and analyze the largest offenders in terms of size.
The ones that are immediately visible are
- forge.js (861 KiB)
- async.js (186 KiB)
- highland.js (126 KiB)
- 2 x bn.js (85 KiB)
- pako ( > 100 KiB spread over different modules)
Just for comparison, the core shim is 7 KiB large.
Some more work allowed me to bring the size down to 1.7M minimized and 3.3M non-minimized. I will push that update to aegir soon.
An analysis of module sizes in the new bundle is below.
Note: these numbers are before minification
libp2p-crypto: 871.31 KB (19.4%)
elliptic: 305.05 KB (6.78%)
spdy-transport: 230.43 KB (5.12%)
  readable-stream: 97.4 KB (42.3%)
  <self>: 133.03 KB (57.7%)
core-js: 197.72 KB (4.40%)
async: 186.07 KB (4.14%)
pako: 172.24 KB (3.83%)
bn.js: 170.29 KB (3.79%)
highland: 127.8 KB (2.84%)
asn1.js: 126.35 KB (2.81%)
readable-stream: 99 KB (2.20%)
  isarray: 120 B (0.118%)
  <self>: 98.88 KB (99.9%)
through2: 97.14 KB (2.16%)
  readable-stream: 95.05 KB (97.8%)
  <self>: 2.09 KB (2.15%)
hpack.js: 85.92 KB (1.91%)
  readable-stream: 49.61 KB (57.7%)
  <self>: 36.31 KB (42.3%)
hash.js: 80.62 KB (1.79%)
simple-peer: 65.92 KB (1.47%)
  readable-stream: 49.61 KB (75.3%)
  <self>: 16.31 KB (24.7%)
lodash.map: 64.24 KB (1.43%)
lodash.filter: 64.16 KB (1.43%)
interface-connection: 64.15 KB (1.43%)
  readable-stream: 49.61 KB (77.3%)
  duplexify: 11.35 KB (17.7%)
  end-of-stream: 1.98 KB (3.08%)
  <self>: 1.21 KB (1.88%)
browserify-aes: 61.74 KB (1.37%)
duplexify: 57.26 KB (1.27%)
  readable-stream: 49.61 KB (86.6%)
  end-of-stream: 1.98 KB (3.45%)
  <self>: 5.68 KB (9.91%)
simple-websocket: 55.37 KB (1.23%)
  readable-stream: 49.61 KB (89.6%)
  <self>: 5.76 KB (10.4%)
ipfs-bitswap: 54.4 KB (1.21%)
socket.io-parser: 53.84 KB (1.20%)
  json3: 42.28 KB (78.5%)
  isarray: 120 B (0.218%)
  <self>: 11.44 KB (21.2%)
sha.js: 52.65 KB (1.17%)
length-prefixed-stream: 52.56 KB (1.17%)
  readable-stream: 49.61 KB (94.4%)
  <self>: 2.95 KB (5.61%)
from2: 51.59 KB (1.15%)
  readable-stream: 49.61 KB (96.2%)
  <self>: 1.98 KB (3.84%)
block-stream2: 51.04 KB (1.14%)
  readable-stream: 49.61 KB (97.2%)
  <self>: 1.43 KB (2.79%)
bl: 46.6 KB (1.04%)
  readable-stream: 41.37 KB (88.8%)
  <self>: 5.23 KB (11.2%)
engine.io-client: 46.31 KB (1.03%)
lodash.isequalwith: 45.68 KB (1.02%)
buffer: 44.96 KB (1.00%)
ipfs-block-service: 42.1 KB (0.936%)
  async: 38.16 KB (90.6%)
  <self>: 3.94 KB (9.37%)
des.js: 40.69 KB (0.905%)
diffie-hellman: 35.13 KB (0.781%)
browserify-sign: 29.14 KB (0.648%)
socket.io-client: 25.56 KB (0.568%)
  component-emitter: 2.91 KB (11.4%)
  <self>: 22.65 KB (88.6%)
parse-asn1: 23.97 KB (0.533%)
ipfs-unixfs-engine: 22.24 KB (0.495%)
  ipfs-merkle-dag: 10.37 KB (46.6%)
  <self>: 11.87 KB (53.4%)
browserify-zlib: 21.81 KB (0.485%)
regenerator-runtime: 21.34 KB (0.475%)
protocol-buffers: 20.93 KB (0.465%)
create-hash: 20.75 KB (0.462%)
protocol-buffers-schema: 19.54 KB (0.434%)
public-encrypt: 17.62 KB (0.392%)
libp2p-swarm: 15.61 KB (0.347%)
engine.io-parser: 15.54 KB (0.346%)
  has-binary: 1.06 KB (6.81%)
  isarray: 120 B (0.754%)
  <self>: 14.36 KB (92.4%)
util: 15.4 KB (0.342%)
ipfs-merkle-dag: 15.25 KB (0.339%)
assert: 15.08 KB (0.335%)
multiaddr: 12.23 KB (0.272%)
lodash.range: 12.03 KB (0.268%)
ripemd160: 11.91 KB (0.265%)
db.js: 11.47 KB (0.255%)
is-property: 10.76 KB (0.239%)
ip: 9.96 KB (0.221%)
wbuf: 8.68 KB (0.193%)
obuf: 8.65 KB (0.192%)
heap: 8.39 KB (0.187%)
events: 8.13 KB (0.181%)
lru-cache: 8.03 KB (0.179%)
debug: 7.67 KB (0.171%)
string_decoder: 7.61 KB (0.169%)
create-ecdh: 7.49 KB (0.167%)
ipfs-repo: 6.99 KB (0.155%)
miller-rabin: 6.45 KB (0.144%)
browserify-cipher: 6.25 KB (0.139%)
cipher-base: 6.25 KB (0.139%)
utf8: 6.23 KB (0.138%)
path-browserify: 6.04 KB (0.134%)
core-util-is: 5.9 KB (0.131%)
pbkdf2: 5.54 KB (0.123%)
libp2p-ipfs-browser: 5.13 KB (0.114%)
crypto-browserify: 5.12 KB (0.114%)
idb-plus-blob-store: 4.81 KB (0.107%)
multistream-select: 4.54 KB (0.101%)
libp2p-webrtc-star: 4.51 KB (0.100%)
peer-id: 4.43 KB (0.0985%)
create-hmac: 4.03 KB (0.0896%)
evp_bytestokey: 3.88 KB (0.0862%)
vm-browserify: 3.71 KB (0.0825%)
libp2p-websockets: 3.69 KB (0.0821%)
libp2p-identify: 3.69 KB (0.0820%)
browserify-des: 3.63 KB (0.0806%)
multihashes: 3.55 KB (0.0789%)
stream-browserify: 3.54 KB (0.0788%)
browserify-rsa: 3.53 KB (0.0785%)
lodash._createwrapper: 3.48 KB (0.0774%)
base64-js: 3.24 KB (0.0721%)
buffer-shims: 3.16 KB (0.0702%)
util-deprecate: 3.15 KB (0.0701%)
process: 3.13 KB (0.0697%)
stable: 2.94 KB (0.0654%)
component-emitter: 2.93 KB (0.0652%)
length-prefixed-message: 2.74 KB (0.0609%)
  varint: 1.42 KB (51.8%)
  <self>: 1.32 KB (48.2%)
lodash._basecreatecallback: 2.67 KB (0.0594%)
mafmt: 2.55 KB (0.0567%)
brorand: 2.54 KB (0.0566%)
ipfs-unixfs: 2.51 KB (0.0558%)
end-of-stream: 2.31 KB (0.0514%)
lodash._basecreatewrapper: 2.29 KB (0.0510%)
ms: 2.28 KB (0.0506%)
blob: 2.15 KB (0.0477%)
lodash.contains: 2.12 KB (0.0472%)
timers-browserify: 2.06 KB (0.0457%)
process-nextick-args: 2.03 KB (0.0452%)
base-x: 2.02 KB (0.0449%)
randombytes: 2.01 KB (0.0448%)
ieee754: 2.01 KB (0.0446%)
fs-blob-store: 2 KB (0.0445%)
lodash._basebind: 1.98 KB (0.0441%)
lodash.forown: 1.94 KB (0.0432%)
is-ipfs: 1.89 KB (0.0421%)
signed-varint: 1.84 KB (0.0410%)
  varint: 1.42 KB (77.0%)
  <self>: 435 B (23.0%)
peer-info: 1.75 KB (0.0389%)
libp2p-spdy: 1.74 KB (0.0388%)
promisify-es6: 1.7 KB (0.0377%)
varint: 1.67 KB (0.0371%)
base64-arraybuffer: 1.66 KB (0.0370%)
lock: 1.59 KB (0.0354%)
hat: 1.56 KB (0.0348%)
run-parallel-limit: 1.43 KB (0.0319%)
lodash.isarray: 1.4 KB (0.0311%)
lodash._basecreate: 1.39 KB (0.0310%)
backo2: 1.37 KB (0.0304%)
yeast: 1.32 KB (0.0294%)
lodash.bind: 1.32 KB (0.0293%)
inherits: 1.31 KB (0.0292%)
has-binary: 1.3 KB (0.0288%)
  isarray: 120 B (9.04%)
  <self>: 1.18 KB (91.0%)
generate-function: 1.27 KB (0.0281%)
lodash._setbinddata: 1.26 KB (0.0281%)
lodash.keys: 1.22 KB (0.0271%)
lodash._shimkeys: 1.2 KB (0.0266%)
peer-book: 1.19 KB (0.0264%)
lodash._slice: 1.18 KB (0.0263%)
parseuri: 1.17 KB (0.0259%)
lodash.isobject: 1.16 KB (0.0258%)
multihashing: 1.14 KB (0.0253%)
lodash.support: 1.12 KB (0.0248%)
lodash.isstring: 1.08 KB (0.0241%)
webpack: 1.07 KB (0.0238%)
lodash._isnative: 1.06 KB (0.0236%)
lodash._baseindexof: 1 KB (0.0223%)
is-typedarray: 1016 B (0.0221%)
ipfs-block: 934 B (0.0203%)
os-browserify: 927 B (0.0201%)
<self>: 44.37 KB (0.987%)
Some interesting data here: https://nolanlawson.com/2016/08/15/the-cost-of-small-modules/
There is also: https://chrisbateman.github.io/webpack-visualizer/ which helps analyse dependencies from webpack a bit better.
In addition there is this article about using nsolid to do runtime analysis of dependencies https://nodesource.com/blog/is-guy-fieri-in-your-node-js-packages/
Using the webpack visualizer I found out that we are currently bringing in 12 versions of readable-stream (cough), which should be fixed after #403.
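A rough way to spot this kind of duplication without a visualizer is to group the module paths of a bundle report by install location. The sketch below assumes the paths contain node_modules/<package>/ segments (some webpack versions abbreviate this to ~, which this sketch does not handle); findDuplicateInstalls is a hypothetical helper, not part of any tool mentioned here.

```javascript
'use strict'

// Group the distinct install locations of each package. A package with more
// than one location ends up in the bundle more than once.
function findDuplicateInstalls (modulePaths) {
  const marker = 'node_modules/'
  const installs = new Map()
  for (const path of modulePaths) {
    const idx = path.lastIndexOf(marker)
    if (idx === -1) continue // our own source, not a dependency
    const rest = path.slice(idx + marker.length).split('/')
    // Scoped packages (@scope/name) span two path segments.
    const name = rest[0].startsWith('@') ? rest[0] + '/' + rest[1] : rest[0]
    const location = path.slice(0, idx + marker.length) + name
    if (!installs.has(name)) installs.set(name, new Set())
    installs.get(name).add(location)
  }
  const duplicates = {}
  for (const [name, locations] of installs) {
    if (locations.size > 1) duplicates[name] = Array.from(locations).sort()
  }
  return duplicates
}
```

Note that this counts install locations, not versions: two locations may hold the same version and still be bundled twice.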
@dignifiedquire I would love if there was a section explaining how you hook those tools to help you visualise, and if possible that they can be run with just a script :)
@diasdavid this cannot be put into a script, but
- for the visualizers, you need to generate a stats.json file and upload it to the linked webpages; they will then show you the analysis. To generate this file I need to add an option to aegir to expose it easily, as I currently use a hacked-together version to do this.
- for the nsolid one, the linked blog article goes into detail on how to do this.
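Once a stats.json exists, a per-package size table like the one earlier in this thread can be approximated with a few lines. This is a sketch that assumes the stats object has a modules array whose entries carry name and size (in bytes), which is the shape webpack's JSON stats output uses; sizeByPackage is a hypothetical helper. With the plain webpack CLI, such a file can usually be produced with webpack --json > stats.json.

```javascript
'use strict'

// Sum module sizes per top-level package from a webpack-style stats object.
// Nested dependencies are attributed to the top-level package that pulled
// them in, and modules outside node_modules are counted as '<self>'.
function sizeByPackage (stats) {
  const totals = {}
  for (const mod of stats.modules) {
    const match = /node_modules\/((?:@[^/]+\/)?[^/]+)/.exec(mod.name)
    const pkg = match ? match[1] : '<self>'
    totals[pkg] = (totals[pkg] || 0) + mod.size
  }
  // Largest packages first.
  return Object.keys(totals)
    .map((pkg) => ({ pkg: pkg, bytes: totals[pkg] }))
    .sort((a, b) => b.bytes - a.bytes)
}
```

A typical call would parse the file first, for example sizeByPackage(JSON.parse(fs.readFileSync('stats.json', 'utf8'))).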
@diasdavid can we change the name of the topic to "optimize the size of js-ipfs"?
Some feedback from elsewhere (cc @pelle). @diasdavid @dignifiedquire make sure to note the comment re. dependencies.
Primary reason for using browser-ipfs:
Lots of unnecessary dependencies cause fragility and huge integration problems both in browser and ReactNative.
Secondary reason is size, which isn't an issue in ReactNative but a very big issue on the web in particular mobile web:
js-ipfs-api
99% 2016-08-18 11:38:01 |2.3.0| Big-Corn-Island in ~/code/js-ipfs-api
± |master| ls -l dist/
total 8600
-rw-r--r-- 1 pelleb staff 1550487 Aug 18 11:36 index.js
-rw-r--r-- 1 pelleb staff 1925637 Aug 18 11:36 index.js.map
-rw-r--r-- 1 pelleb staff 920487 Aug 18 11:36 index.min.js
browser-ipfs
± |master origin {2}| ls -l dist/
total 8
-rw-r--r-- 1 pelleb staff 1592 Apr 25 21:53 ipfs.min.js
Perhaps this is not the right thread but since js-ipfs-api and js-ipfs are connected and we're on the topic of size optimization:
ConsenSys is using https://github.com/pelle/browser-ipfs for their IPFS wrapper. Their reason is simply the size: 900kb (js-ipfs-api) vs. 2kb (browser-ipfs).
@diasdavid @dignifiedquire this begs the question: why is js-ipfs-api so large? What are the dependencies that make it so large? If js-ipfs-api can't use fewer dependencies, should we talk about a light API lib, much like browser-ipfs?
@haadcode Thank you for bringing that up. It should go, however, into a js-ipfs-api issue. We will probably get a lot of savings from the dedup on js-ipfs-api.
This makes me think that it would be pretty dope if js-ipfs-api was modular inside, like async, so that devs can do require('ipfs-api/cat') and get just the bit they need.
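One possible shape for such a modular layout is sketched below, assuming every command only needs a transport function. The names commands, createClient and send are hypothetical, and in a real layout each factory would live in its own file (for example ipfs-api/cat.js) so that it can be required on its own.

```javascript
'use strict'

// Each command is a small factory that takes a transport function (send)
// and returns the user-facing function. Nothing else is shared.
const commands = {
  cat: (send) => (hash, callback) => send({ path: 'cat', args: [hash] }, callback),
  version: (send) => (callback) => send({ path: 'version', args: [] }, callback)
}

// The full client is just every factory applied to the same transport.
function createClient (send) {
  const client = {}
  for (const name of Object.keys(commands)) {
    client[name] = commands[name](send)
  }
  return client
}
```

A consumer that only needs cat would call commands.cat(send) and never pull in the other commands, which is the property that lets a bundler leave them out.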
Moving conversation re. js-ipfs-api to ipfs-inactive/js-ipfs-http-client#353
A lot of good things are in this article: https://pouchdb.com/2016/01/13/pouchdb-5.2.0-a-better-build-system-with-rollup.html. PouchDB has struggled with similar issues in terms of optimisation in size but also ensuring things are usable in node, webpack & browserify.
More interesting things about using a monorepo and many small packages from PouchDB: https://pouchdb.com/2016/06/06/introducing-pouchdb-custom-builds.html
JS IPFS is a large collection of modules that aim to implement IPFS in Node.js and the browser. As such, the distribution of these modules has a specific set of constraints.
Our current setup is not bad, and does generate bundles usable in Node.js and the browser, but there are some pain points that need work.
Current Pain Points
- Bundles are quite large
- Lots of dependencies are duplicated, for example readable-stream is included 12 times in the current js-ipfs browser bundle.
- Developers have to know very domain specific configurations to be able to use browserify or webpack.
- We break browserify and webpack compat without knowing about it until we get a bug report.
Optimization Goals
- Bundle Size
- Ease of use for developers embedding the library (i.e. Orbit)
- Ease of use for contributors
Module Formats
There are two different module formats for JavaScript modules in main use today.
- CommonJS
  - var dep = require('dependency')
  - Only native format in Node.js at the moment.
- ES2015 Modules
  - import dep from 'dependency'
  - You can read about it in detail in this article: http://www.2ality.com/2014/09/es6-modules-final.html
The current code base uses CommonJS.
Available Tooling
The tooling landscape is quite large today, with things developing and changing quite rapidly. The tooling currently relevant to us is listed below.
Module Bundlers
A module bundler can take in many JavaScript files and generate a bundle, which is usable in the browser.
- Webpack (CommonJS, ES2015)
- jspm (CommonJS, ES2015)
- Closure Compiler (CommonJS, ES2015)
- Rollup (CommonJS, ES2015)
- Browserify
- Babel (CommonJS, ES2015)
ES2015 Transpilers
Transpilers can transform code written with ES2015 features and output code that is usable in ES5 (and lower) environments.
A good comparison of the differences in size and runtime can be found in The cost of transpiling ES2015 in 2016.
Proposal
Given the set of constraints mentioned above, the following is a list of steps I suggest to improve and solve our current pain points.
1. Improve build artifacts
Similar to what PouchDB does, the end result for Node.js and the browser should be a single file.
If there are differences between Node.js and the browser, modules use two different entry points:
- src/index.js - Original source for Node.js
- src/index-browser.js - Original source for the browser
For the builds we target the same places as currently:
- dist/index.js - ES5 code for the browser
- lib/index.js - ES5 code for Node.js
but lib/index.js will be a single, fully transformed file rather than many files, so that tree shaking and processing by things like webpack loaders have already happened and the file is directly runnable in Node.js.
To make tooling aware of what is available, the following fields should be in package.json
{
  "main": "./lib/index.js",
  "jsnext:main": "./src/index.js",
  "browser": {
    "./lib/index.js": "./dist/index.js"
  },
  "jspm": {
    "main": "dist/index.js"
  }
}
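To illustrate how these fields are meant to be consumed, here is a deliberately simplified model of entry-point selection. The resolveEntry function and its target names are hypothetical; real bundlers implement richer rules, so treat this as a sketch only.

```javascript
'use strict'

// Simplified model: pick the entry file a given kind of tool would load.
function resolveEntry (pkg, target) {
  if (target === 'jspm' && pkg.jspm && pkg.jspm.main) {
    return pkg.jspm.main
  }
  if (target === 'es2015' && pkg['jsnext:main']) {
    return pkg['jsnext:main']
  }
  const main = pkg.main || './index.js'
  if (target === 'browser') {
    // "browser" may be a single replacement file or a file-to-file map.
    if (typeof pkg.browser === 'string') return pkg.browser
    if (pkg.browser && pkg.browser[main]) return pkg.browser[main]
  }
  return main
}

// The fields proposed above, as a plain object.
const examplePkg = {
  main: './lib/index.js',
  'jsnext:main': './src/index.js',
  browser: { './lib/index.js': './dist/index.js' },
  jspm: { main: 'dist/index.js' }
}
```

Under this model Node.js gets lib/index.js, browser bundlers get dist/index.js, and tools that understand ES2015 modules get the original source in src/index.js.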
Benefits
- Fully compatible out of the box, with the default configuration, with
  - browserify
  - webpack
  - jspm
  - rollup
Drawbacks
- Transpiled code in lib/index.js is a bit harder to read as it is now a single large file.
2. Test webpack & browserify in CI
- Build with the default configurations for browserify and webpack.
- Run the full test suite against these versions.
Benefits
- We can be sure that our builds are usable by other developers.
Drawbacks
- CI run time increases.
3. Move to ES2015 Modules
- Using tools like cjs-to-es6, this is pretty straightforward for our own modules.
- For dependencies that do not yet publish a build which uses ES2015
- Enable tree shaking in our webpack build.
Benefits
- Smaller module size, due to the availability of statically analyzable dependencies and so allowing us to use tree shaking
Drawbacks
- Not runnable in Node.js directly anymore until they integrate ES2015 modules or you use something like babel-register.
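One way to see why ES2015 imports are easier for tools to analyze than require calls: an import always names its dependency with a string literal, while require accepts any expression. The sketch below is a rough, regex-based scan (not a real parser, and scanDependencies is a hypothetical name) that separates dependencies known ahead of time from ones only known at run time.

```javascript
'use strict'

// Collect dependency names that are visible in the source text, and count
// require() calls whose argument is computed and therefore unknown until
// the code actually runs.
function scanDependencies (source) {
  const result = { static: [], dynamic: 0 }
  const importRe = /^\s*import\s+(?:[^'"]*?\s+from\s+)?['"]([^'"]+)['"]/gm
  const requireRe = /\brequire\s*\(\s*([^)]*?)\s*\)/g
  let m
  while ((m = importRe.exec(source)) !== null) {
    result.static.push(m[1])
  }
  while ((m = requireRe.exec(source)) !== null) {
    const literal = /^['"]([^'"]+)['"]$/.exec(m[1])
    if (literal) result.static.push(literal[1])
    else result.dynamic++
  }
  return result
}
```

Any dynamic require forces a bundler to be conservative about what it keeps, which is part of why tree shaking works best when dependencies are expressed as ES2015 imports.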
4. Carefully audit the dependency tree
- Look at all of them
- Migrate to ES2015 modules where needed and where large enough benefits are clear
- Major culprits that we know about
  - all shims for Node.js functionality in the browser
  - forge
  - web-crypto -> browserify-crypto
  - readable-stream and all users of it
Benefits
- Only include what we absolutely need
- Improves tree shaking if we can use dependencies that use ES2015 modules.
Drawbacks
- Takes time
Resources
Blog Posts
- https://nolanlawson.com/2016/08/15/the-cost-of-small-modules/
- https://pouchdb.com/2016/01/13/pouchdb-5.2.0-a-better-build-system-with-rollup.html
- https://pouchdb.com/2016/06/06/introducing-pouchdb-custom-builds.html
- https://github.com/samccone/The-cost-of-transpiling-es2015-in-2016
- http://www.2ality.com/2015/12/webpack-tree-shaking.html
Issues on IPFS
Very good suggestions. It would also be useful if we could create a lighter weight configuration of the library ideally just using the packager like browserify or webpack implementing only common use cases for the browser.
Another packager to add to the list is React Native's https://github.com/facebook/react-native/tree/master/packager
It presents a bunch of new headaches in that it doesn't go in and try to replace any node infrastructure. This causes many problems using libraries that assume something like browserify will automatically add Buffer and Crypto support. In most cases I've had to go in and browserify and derequire libraries to make them work.
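A library can reduce its reliance on bundler-injected globals by importing what it needs explicitly. The sketch below assumes a module named buffer is resolvable: in Node.js that is the core module, and elsewhere it would be the buffer package from npm listed as a dependency. The toHex helper is only there to show usage.

```javascript
'use strict'

// Import Buffer explicitly instead of assuming a global that the packager
// may or may not provide.
const bufferModule = require('buffer')

function toHex (text) {
  return bufferModule.Buffer.from(text, 'utf8').toString('hex')
}
```

This does not remove the need for a Buffer implementation in environments like React Native; it only makes the dependency visible to the packager instead of implicit.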
React Native may seem like a corner case, but I know many developers are starting to use it now so it will be increasingly common.
Thanks @pelle for bringing up react-native, the browser bundle in the above suggestion should work in react-native but it's important to test and check.
Pretty alarmed by this proposal. This type of optimization (removing code) should not be so complex. And I'm not convinced it has to be.
- Not being compatible with node is not an option.
- Using ES6 imports should be a last resort. They demonstrate very poor judgment in design and are a huge step backwards for code readability and simplicity (in the Rich Hickey sense of the word). They also run counter to the programming model of ipfs (immutable DAGs), and they are not something WE should promote and lead to, because they are going to make future code harder to use. (This is a long term tussle.) In particular, they counter the excellent functional paradigm that CJS and node.js created and that works so well with code in ipfs.
- optimizations should not require all this-- start by removing unused stuff. Look at the way people do it now-- proper static analysis.
- Why can't you throw Google Closure at it? That's a much more rigorous solution that removes all dead code and optimizes heavily. I seem to recall it reduces equal functions (or was going to). (It's not tree shaking, it's dagify: pick only the leaves you need, and compress.)
- There seems to be a lot of dependent thinking here, meaning there seems to be a very complicated path because of decision dependencies that are not made explicit here. I'm not confident that this is actually necessary, and I would really like to see a walk through of those decisions before claims like "we have to move to ES6 imports because we have to use webpack tree shaking because we have to use webpack" can be validated.
@dignifiedquire should move this proposal into its own thread to capture discussion there. It's going to get big.
I did some more investigation, and it looks like rollup offers quite large savings for us. I generated a bundle which transforms CommonJS to ES6 modules and then uses rollup, and got the unminified size down from 3.4M to 2.4M.