Write new “Introduction to IPFS” guide
Mr0grog opened this issue · 12 comments
We should write a new guide that largely replaces https://ipfs.io/docs/getting-started/ and the IPFS white paper. Rather than focusing on the CLI, it should focus on the concepts and workings of IPFS, using the CLI to demonstrate those ideas.
I’ve kept this separate from #8, where the discussion seems more focused around an introductory usage doc for each as described in the “later” section of #58. But if we should merge them together, let me know here!
First task here: a rough outline.
I can take a stab at the first task this week.
Great! This is the general outline I had started to think about (rougher near the end), but would love to hear your version:
- What exactly is IPFS?
- A way to store and access files, websites, and apps in a distributed way
- Wikipedia example: if you’re reading about Aardvarks, then you got the webpage directly from a computer that Wikipedia owns, somewhere across the country. With IPFS, the webpage might have come from your neighbor’s computer across the street or maybe someone else across town.
- Makes it hard for a website to go offline
- It’s possible for things to load faster because they might come from nearby
- Harder for authorities to block or censor content (e.g. Turkey + Wikipedia?)
- It’s participatory: you can also participate in sharing the Aardvark page and when someone else loads the Aardvark webpage, it might come from your computer.
- “Inter-planetary”
- Links can’t change
- Different content at the same address over time, e.g. your favorite recipe site used to be in customary, but recently switched to metric — and you want that old customary version
- Content moving to a different address, e.g. `http://mycompany.com/what_we_do` → `http://mycompany.com/services`
- Pages that are removed entirely
- Caution: even on IPFS, this doesn’t mean content can never disappear. A given link always points to the same content, but it’s a participatory network, so if nobody has that content, you still won’t be able to get it. (More on this below.) Content can never disappear so long as someone cares about it :)
- The traditional web of control and authorization vs. the distributed web of possession and participation
- You possess files/data and it’s up to you to make it available (and give possession to others)
- It’s a network of participation — it only works if people participate in sharing others’ content. The whole internet doesn’t fit on your computer, but it can fit in pieces across everybody’s computer
- Ok, so how does it work?
- It’s all about protocols, but we also have reference implementations
- Talk about and link to specs
- Reference implementations in Go and JS, but anybody could build another client that matches the specs
- Install IPFS
- Note that we also provide libraries for using it programmatically (link to them), but we’ll stick with the CLI as an example here
- Download and install `go-ipfs`
- `ipfs init`
- Addresses, e.g. `/ipfs/QmYwAPJzv5CZsnA625s3Xf2nemtYgPpHdWEz79ojWnPbdG/readme` (result of init above)
- Cryptographic hashes
- Identifies the content uniquely, even though it’s small
- Can contain references to other hashes (e.g. the `.../readme` → `/ipfs/QmPZ9gcCEpqKTo6aq61g2nXGUhM4iCL3ewB6LDXZCtioEB`)
- Talk about DAGs, Merkle trees, nodes/links here or later?
- Accessing content
- Try to `ipfs cat <Wikipedia Aardvark page hash>`, doesn’t work
- Start daemon
- `ipfs swarm` → see who you’re connected to
- Now cat that same hash
- Works, but HTML in the console is not that great
- Browse web content through your gateway `http://localhost:8080/ipfs/<hash>`
- Gateway is a little web server on your computer, part of the daemon
- Whenever you ask for `http://localhost:8080/<address>` it does `ipfs cat <address>` under the hood and sends that to your browser
- What actually happened here?
- Queried the swarm for providers
- Queried provider for addresses
- Asked for content from an address
- Light explanation of DHTs here?
- Publishing content
ipfs add <whatever>
- Check the public gateway (explain that public gateway is just like your local gateway but running on another computer somewhere else)
- This doesn’t mean your content is permanently available! Think back to the steps above — what happened this time?
- Adding the content told other people in the swarm that your computer has `<hash>`
- Gateway queried the swarm for providers, and swarm told it you said you did
- Gateway asked for your computer’s address
- Gateway asked your computer for the file
- Next time someone tries to get the file, the swarm will say there are two providers — your computer and the gateway. That person might get it from either place.
- But the gateway only has so much hard drive space and people ask for stuff from it all the time. It’ll probably be gone in a few hours. When you turn off your computer, it’ll be gone, too, and there’ll be nowhere to get the file from :(
- So how do you make sure content stays available?
- Someone else might do it because they thought your content was interesting or important
- But really, the only guarantees are the ones you make. Again, this is about participation.
- Coordinate sharing content with lots of friends
- Pay someone else to share it (pinning services)
- ipfs-cluster
- Run your own server
- Pinning
- Tells your IPFS daemon to make sure it holds onto a file and doesn’t eventually throw it away like the public gateway does
- `ipfs pin ls`: you already have some pins (when you `ipfs add <hash>`, it gets pinned)
- `ipfs pin add <hash>`
- How do I change things?
- I shared a document, but I want to update it
- IPNS
- DNSLink
- IPRS (???????)
- I shared a document, but I don’t want to share it anymore!
- Unpin it: `ipfs pin rm <hash>`
- If other people also have it (e.g. the gateway), you need to ask them to remove it the same way on their side
- `ipfs dht findprovs <hash>` (this is the querying for providers step above!)
- I want to build an app that uses IPFS
- Immutable content; thinking in terms of changes/transforms to a tree
- Pubsub
- CRDTs
- Mostly just name and summarize the concepts here, then link off to another doc (which… maybe doesn’t exist yet)
- What next?
- Share some useful content! https://archives.ipfs.io
- Put your web site on the distributed web
- Build an app
- Link to libraries
- Orbit.chat
- Peerpad
- Check out awesome-ipfs
- Other ways to use IPFS
- Desktop
- Companion
- Help us out if you’re a developer :)
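To make the "what actually happened here?" steps a bit more concrete, here is a rough Python sketch of the idea, not of real IPFS. Everything in it is an assumption made for illustration: `ToyNode`, `ToySwarm`, and `address_of` are hypothetical names, addresses are plain SHA-256 hex digests rather than real IPFS hashes, and "querying the swarm for providers" is just a scan over in-memory nodes instead of a DHT.

```python
import hashlib


def address_of(content: bytes) -> str:
    """Toy content address: the hex SHA-256 of the bytes (not a real IPFS hash)."""
    return hashlib.sha256(content).hexdigest()


class ToyNode:
    """Hypothetical stand-in for one computer: a block store plus a set of pins."""

    def __init__(self, name: str):
        self.name = name
        self.blocks = {}   # address -> content held on this node
        self.pins = set()  # addresses that garbage collection must keep

    def gc(self) -> None:
        # Throw away everything that is not pinned.
        self.blocks = {a: c for a, c in self.blocks.items() if a in self.pins}


class ToySwarm:
    """Hypothetical stand-in for the network: knows which nodes hold which address."""

    def __init__(self):
        self.nodes = {}

    def join(self, node: ToyNode) -> None:
        self.nodes[node.name] = node

    def providers(self, address: str) -> list:
        # "Query the swarm for providers" (a plain scan here, not a DHT lookup).
        return sorted(n.name for n in self.nodes.values() if address in n.blocks)

    def add(self, node: ToyNode, content: bytes) -> str:
        # Adding content stores it locally and pins it.
        address = address_of(content)
        node.blocks[address] = content
        node.pins.add(address)
        return address

    def cat(self, node: ToyNode, address: str) -> bytes:
        if address in node.blocks:
            return node.blocks[address]
        for name in self.providers(address):
            content = self.nodes[name].blocks[address]
            if address_of(content) != address:
                continue  # content does not match its address, so ignore it
            node.blocks[address] = content  # cached, but NOT pinned
            return content
        raise LookupError("no provider has " + address)


swarm = ToySwarm()
me, gateway = ToyNode("me"), ToyNode("gateway")
swarm.join(me)
swarm.join(gateway)

addr = swarm.add(me, b"Aardvarks are nocturnal.")
swarm.cat(gateway, addr)       # the gateway fetches the content from "me"
print(swarm.providers(addr))   # two providers: the gateway and me
gateway.gc()                   # the gateway never pinned it, so it is dropped
print(swarm.providers(addr))   # only "me" is left
```

If "me" then unpins the address and runs `gc()` too, `cat` raises `LookupError`, which is the "nowhere to get the file from" case in the outline.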
It seems like this would be a great backbone for something I just pitched to @flyingzumwalt:
I've been thinking a lot about the docs and the onboarding process (for both my product and the IPFS family too) - and would like to share an idea with you. I briefly mentioned it during our last call, and it has finally firmed up in my mind's eye. It isn't just documentation and it isn't just a build tool - but it is somewhere in between and can absolutely fulfil both roles.
It is a mixture of a "bot" that can not only get help for you (and even parse/merge help docs for you in your specific type of crisis), but also interactively help you set up a repository, a node, a swarm, a multiformat definition, an IPLD record, an ipfs-pack etc. I kind of think of it as a "choose your own adventure" kind of thing. Like ZORK meets a POSIX wizard controlled either in the CLI, from the Web or with a Discord bot.
https://gist.github.com/nothingismagick/d0980c2287a60023067e8a9ba9add8ce
Perhaps instead of writing docs, more time should be spent on leveraging specs for documentation generation. The more I think about it, with this kind of complex system, hand-written documentation is doomed to be out of date as soon as it is written - specs and the respective test-driven development, however - must be up to date or else the product can't even be shipped...
Then a human can come and write some "semantic sugar" to smoothen everything - you know, somebody that describes the basic principles, while letting the code describe itself.
something I just pitched to @flyingzumwalt… https://gist.github.com/nothingismagick/d0980c2287a60023067e8a9ba9add8ce
Neat! I think I agree with @flyingzumwalt that it would be good to see how this winds up working in practice for your own app and potentially use that as a basis for discussing how we might do it for IPFS.
Here's an outline version with a pivot towards use-cases and examples for the two primary audiences: end-users and developers. It focuses on IPFS while leaving Libp2p as a separate exercise. Although given the topics on discuss, it might make sense to just fold libp2p concepts (e.g., pubsub, crdts) in here.
My apologies for concepts that I'm misrepresenting as I'm still learning and have a bunch of newbie questions. Please call these errors out or highlight things that don't make sense. Thanks!
- What is IPFS?
Goal: Explain the 10,000 ft view of what problems IPFS is addressing.
(See Rob's Section 1)
- How do I use it?
Goal: Introduce the two primary user groups (end-users and developers) to relevant scenarios with working e2e examples and code samples.
- End-Users:
- Host your website on IPFS?
- P2P Filesharing (highlight large files)?
- Archiving?
- (Suggestions?)
- [Branch to example apps??]
- Developers:
- Lift principal building blocks from: Peerpad, Orbit.chat?
- IPLD example?
- Browser companion?
- [Branch to API Usage Guides, API documentation, Repos, Libp2p Getting started]
- Installation Instructions:
- Be a node on the network
- How does it work?
Goal: Educate on the high-level architecture, introducing key concepts and how they interrelate.
(I think it makes sense to explain each of the below with working examples. Bonus points if it's set up as an interactive tutorial so that the user could walk through the concepts and build on each, as if participating in a lab.)
- Merkle Links / Addressing
- Nodes and Swarms
- Gateways to HTTP
- Pinning / Garbage Collection
- Immutable vs. Mutable content
- Clustering
- Content Discovery? (how does the network know which nodes are hosting a given merkle link?)
- Reference the role of Libp2p in IPFS (or fold libp2p concepts in here)
- [Branch to Overview Youtubes, Specs, Whitepapers, Case Studies*, Libp2p Getting started]
(*Case studies could discuss a particular implementation in detail. They should be selected to educate on distributed data concepts - e.g. limitations, threat models.)
- What's next?
Goal: Communicate the release calendar, roadmap, and feature stability chart.
- Release Calendar
- Roadmap
- Feature Table covering the intention of the feature (i.e., what are the primary use cases), stable release version (what functionality is working vs. not working), preview release version (what functionality is being introduced/fixed), rough timeline for future feature work.
- [Branch to Contributing]
- FAQ
In the case that the FAQ is used as a tl;dr for navigating the rest of the doc, the list should include Q's that are already answered above and link back.
- Concepts:
- When I upload content, where does it go? (it's seeded, discoverable and accessible on the network…)
- How long will my content stay on IPFS? (if you're the only one hosting it, then while your node is connected until GC…)
- Links are difficult to type, how can we get memorable / human friendly names?
- How do I change my content (i.e., I want to publish non-static files)?
- Where can I talk with the team on specific questions? (discuss, irc)
- Why use IPFS instead of …. ?
(list needs more coverage and improved categorization)
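For the "Merkle Links / Addressing" and "Immutable vs. Mutable content" items, this is roughly the kind of small example I have in mind. It is a toy Python sketch under my own assumptions, not how IPFS encodes anything: `put`, `put_dir`, and `resolve` are hypothetical helpers, addresses are plain SHA-256 hex digests, a "directory" is just JSON mapping names to child addresses, and the `names` dict only stands in for the idea of a mutable pointer.

```python
import hashlib
import json


def address_of(data: bytes) -> str:
    """Toy content address: the hex SHA-256 of the bytes."""
    return hashlib.sha256(data).hexdigest()


def put(store: dict, data: bytes) -> str:
    """Store bytes under their own hash and return that address."""
    addr = address_of(data)
    store[addr] = data
    return addr


def put_dir(store: dict, entries: dict) -> str:
    """Store a directory node: a JSON map from name to child address (a Merkle link)."""
    encoded = json.dumps(entries, sort_keys=True).encode("utf-8")
    return put(store, encoded)


def resolve(store: dict, root: str, name: str) -> bytes:
    """Follow the link called `name` inside the directory stored at `root`."""
    entries = json.loads(store[root])
    return store[entries[name]]


store = {}

# Version 1 of a site: a directory that links to one file by its hash.
v1 = put_dir(store, {"readme": put(store, b"hello")})

# Changing the file changes its hash, which changes the directory's hash too.
v2 = put_dir(store, {"readme": put(store, b"hello, world")})

# A mutable name is only a pointer that gets moved to the newest root.
names = {"my-site": v1}
names["my-site"] = v2

print(v1 != v2)                                    # True
print(resolve(store, v1, "readme"))                # b'hello'
print(resolve(store, names["my-site"], "readme"))  # b'hello, world'
```

The old root `v1` still resolves to the old content, so existing links don't change, while the name moves on to the new version.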
Oh, I like adding something like “how do I use” or “use cases” between the “what is it” and “how does it work” sections! I was thinking I would try to massage examples/use cases into the parts of section 1, but making an explicit section may be clearer and less work for both the author and reader :)
with working e2e examples and code samples.
I think this might be too much—the intro guide should be working hard to be concise and relatively easy to get through. I feel like this level of depth is better suited to use-case guides like the “running a pinning service” one described in #62.
I’m going to spend some time over the next week working on a draft of this. Thanks!
An e2e use case is a lot of work, and I agree with you that it's maybe not the right spot in the intro guide, but having a really simple example of something useful you can do right now would be great. Like sending someone a big video that is just too large for any free service like WeTransfer... or sending a picture-postcard to someone, then realizing that you spelled their name wrong but it’s ok because they are still sleeping and you update the pin with a new version.
Ok, well, that last one is a bit contrived, but my suggestion is to use easy-to-grasp examples from the real world and show how you can do the same, or even better, with IPFS...
Cool. Excited to see a first draft. Please let me know how I can help!
Re: e2e use case... I second @nothingismagick's comment about using an example of something useful that they could do immediately. Even if it's a link to another page, and not directly included in the Getting Started, I think it would be hugely helpful to get someone off the ground.
Draft of the very first section (a high level overview — “what is IPFS?”) is up in #67 for anyone who wants to give feedback.
I'm picking this up, or what parts of it I can manage at the moment. Work underway in #170.
Closing this issue, since it's superseded by #170.