Add Reframe protocol server
BigLep opened this issue · 20 comments
Done Criteria
- An ipfs/go-delegated-routing client (e.g., Hydra, go-ipfs) can call storetheindex to get results in production.
- storetheindex operators have operational visibility into the reframe endpoint (like they have with their existing provider endpoint)
Why Important
go-ipfs isn't planning to add support for the current version of the storetheindex client. It is planning to add support for delegated-routing though as part of ipfs/kubo#8775 . Having an easy self-service configuration option for go-ipfs to query storetheindex directly will ensure 100% content discovery for any go-ipfs node that has storetheindex as a delegated-routing server for CIDs that exist in storetheindex only. The most immediate usecase is to configure the ipfs.io gateway in this way.
Notes
Before this work can be done:
- The delegated routing interface needs to be defined: ipfs/go-delegated-routing#8
- ipld/edelweiss generated client / servers should be possible. This will be encapsulated by ipfs/go-delegated-routing.
Once this change is deployed to storetheindex, the Hydras can be updated to use delegated routing: libp2p/hydra-booster#162
For point number 2 above, EngRes IPFS team will take on there are metrics available. storetheindex will own creating a dashboard that uses them.
@davidd8 @willscott : wanted to make sure that you saw I created this placeholder issue. It's linked from https://www.notion.so/pl-strflt/migrate-store-the-index-hydra-comms-to-the-delegated-routing-protocol-6c97dbc25a084a35b60fed1c3ef715a4 . I wanted a GitHub issue I could link to from other GitHub issues.
Is there an expected time frame that the protocol will be defined.
I looked at edelweiss just now and am somewhat scared by >10k lines of code that doesn't seem to have been reviewed - are there plans for hardening/load testing the resulting server to gain confidence in the generated code before deployment, or is that something we need to be prepared for?
Is there an expected time frame that the protocol will be defined.
Good question. Lets discuss timeline for the interface/protocol definition in ipfs/go-delegated-routing#8
are there plans for hardening/load testing the resulting server to gain confidence in the generated code before deployment, or is that something we need to be prepared for
A couple of things:
- edelweiss should have done as much of that as reasonably as possible before coming to storetheindex with the proposition of adding edelweiss-generated code. These plans admittedly haven't been made yet. Tracking issue: ipld/edelweiss#12
- the key item to accomplish here is using the delegated routing protocol. That requirement is also satisfied if storetheindex rolls their own server implementation.
updating this to cover: "Add Reframe protocol server", let me know if that reflects the current ask, @BigLep
@willscott : yes, that's accurate. Thanks for updating.
@willscott Delegated Routing is about to be merged for production (ipfs/go-delegated-routing#11). This is a good time to take care of this issue and integrate it in storetheindex. Can I help with this by preparing an integration PR?
sure.
The main point of complexity I foresee is how we handle the metadata fields of records.
Right now, the storetheindex server node does not attempt parsing them at all - it just passes them through as bytes. We can have a limiting parsing based on currently understood fields to allow the current bitswap and graphsync metadata records to be converted into their reframe equivalents, but i'm not sure there's a good open-ended way for storetheindex to fail-open and be future-proofed against other metadata protocols currently since it doesn't try to enforce any specific type of encoding of the bytes.
@willscott I want to make sure I understand the context of your concern with metadata.
I am working under the assumption that we are integrating the Reframe API server in STI purely for the purposes of STI talking to Hydra. In particular, STI would continue to serve its other users using the specialized API that you have developed for them.
If this is the case, the only data that the Reframe API understands and cares about currently is provider records for bitswap-capable nodes. Therefore, the integration would involve parsing out bitswap records from the metadata on the STI side and feeding them into the appropriate structures of the Reframe API. If in the future Reframe API supports more types of records, we will have to upgrade the integration accordingly.
Right, it's that we can't support the Any default
case in the union https://github.com/ipfs/specs/pull/272/files#diff-08fb031656230573a5a243374d2f35a25fc604ca11aa36b96c27b1f96aa3f9beR176
Filtering to bitswap records on storetheindex is fine. The point of coordination that's worth a heads up to @aschmahmann is that if we end up wanting to extend bitswap records to have auth, this parsing will mean we can't just flow the additional record data to clients.
I don't think we need to figure out the longer term metadata format in this iteration, but it's something we're setting ourselves up to want pretty quickly.
@gammazero : I'm new on this project and its deployments. Is it self-service to determine whether this has been deployed?
we are trialing gitops as a preferred deployment strategy. commits merged to the repo will trigger a bot PR that can be merged to trigger dev deployment. tags will trigger PRs that merge to trigger prod deployments.
to determine what's been deployed you can either look at the .deploy directory, which includes the active images, or use the https://cid.contact/health / https://dev.cid.contact/health endpoint to see the current deployed git commit.
I'm reopening because of the need to add an operational dasbhoard like there is for the existing providers endpoint: https://protocollabs.grafana.net/d/pPROWOD7z/providers?orgId=1&var-Source=storetheindex-prod&var-Jobs=storetheindex%2Findexer&var-pod=indexer-0
I also updated the done criteria to make it clear we need this to be working in production.
All: I added this note to the description "For point number 2 above, EngRes IPFS team will take on there are metrics available. storetheindex will own creating a dashboard that uses them."
@willscott : will a promotion to prod deploy occur automatically or does a manual step need to be taken? I want to know when we can move forward on the Hydra production deployment.
productions to prod are happening a couple times a week. they're visible as active PRs in this storetheindex repo.
- the last deploy happened on friday EU time before the latest monitoring PRs got merged. This can also be verified at https://cid.contact/health which lists the running commit
- The next deploy should be early next week as we continue to debug an excessive CPU usage issue we've been working though this past week.
- Once it's deployed we can add the additional grafana plots and update here
@BigLep The current in-production commit for STI includes the latest Reframe code. Therefore, I will initiate deployment on the Hydra end.
@honghaoq : I don't know if it was duplicated, but it looks like the done criteria were met and thus can be closed. Go for it!
Done.