# Niftysave

Scans Ethereum for ERC-721 Non-Fungible Tokens and replicates all assets by saving them on nft.storage.
## Overview

### Ingestion
A cron job named `ERC721` runs periodically (on GitHub CI). It performs the following steps until it runs out of time and then exits:

- Pull a batch of tokens by querying the [EIP721 Subgraph][].
- Import the pulled batch into Fauna DB via the `importERC721` GraphQL endpoint.
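
To make the control flow concrete, here is a minimal sketch of such a loop. It is a sketch under assumptions: the helper names, the `Token` shape, and the default values are illustrative, not the actual implementation.

```ts
// Illustrative shape of a token row pulled from the subgraph.
type Token = { id: string; tokenID: string; tokenURI: string; mintTime: string }

const BATCH_SIZE = Number(process.env.BATCH_SIZE ?? 100)
const TIME_BUDGET_MS = Number(process.env.TIME_BUDGET ?? 300) * 1000

// Hypothetical helpers: query the EIP721 subgraph from the stored cursor
// position, and invoke the `importERC721` GraphQL mutation on Fauna.
declare function pullBatch(size: number): Promise<Token[]>
declare function importBatch(tokens: Token[]): Promise<void>

export async function run(): Promise<void> {
  const deadline = Date.now() + TIME_BUDGET_MS
  // Keep pulling and importing batches until we run out of tokens or time.
  while (Date.now() < deadline) {
    const batch = await pullBatch(BATCH_SIZE)
    if (batch.length === 0) break
    await importBatch(batch)
  }
}
```
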
Fauna does most of the heavy lifting here; specifically, the `importERC721` User Defined Function (UDF) does the following:
- Stores each token owner in an `Owner` collection.
- Stores each token contract in a `TokenContract` collection.
- Stores the actual token details (`tokenID`, `tokenURI`, `mintTime`) in a `Token` collection.
- Cross-links all of the above (so that the graph can be queried in any direction).
- Stores the import result in an `ERC721ImportResult` collection and cross-links it with all the tokens it imported.
- Updates the `Cursor` record (which is used next time around by the job to query the subgraph from that position).
### Analysis

A cron job named `Token Metadata` runs periodically (on GitHub CI). It goes over tokens that were ingested and attempts to analyze the token metadata assumed to be linked through `tokenURI`. It pulls a batch of not-yet-analyzed tokens from the DB via the `findTokenAssets` GraphQL query and performs the following steps concurrently across all tokens:
- Parse `token.tokenURI`; if parsing fails, just report an error.
- Fetch the contents of the URL (if it is an `ipfs://` URL or an IPFS gateway URL, fetch from `https://ipfs.io/ipfs`, otherwise from the actual URL). If fetching fails, report an error.
- Parse the contents as metadata JSON; if parsing fails, report an error.
- Pin the contents of `token.tokenURI` to the IPFS cluster (by CID if it was an IPFS or IPFS gateway URL, otherwise by uploading the content).
- Pull out all the URLs found in the metadata.
- Submit the metadata and all the linked assets to the DB via the `importTokenMetadata` GraphQL mutation.

As a result, some tokens will be marked problematic due to the errors reported, and some will get associated entries in the `Metadata` and `Resource` collections.
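
For illustration, a minimal sketch of the URL normalization these steps imply (the helper name is an assumption, not the project's actual code):

```ts
// Normalize a tokenURI to a URL we can fetch: ipfs:// URLs and IPFS gateway
// URLs are read through https://ipfs.io/ipfs, anything else is fetched as-is.
// Throws on an unparsable URI, which the caller reports as an error.
function toFetchURL(tokenURI: string): URL {
  const url = new URL(tokenURI)
  if (url.protocol === 'ipfs:') {
    // ipfs://<cid>/<path> → https://ipfs.io/ipfs/<cid>/<path>
    return new URL(`/ipfs/${url.hostname}${url.pathname}`, 'https://ipfs.io')
  }
  if (url.pathname.startsWith('/ipfs/')) {
    // <some-gateway>/ipfs/<cid>/<path> → read through the ipfs.io gateway
    return new URL(url.pathname, 'https://ipfs.io')
  }
  return url
}
```
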
### Saving

A cron job named `Token Asset` runs periodically (on GitHub CI). It goes over discovered resources that were linked from token metadata and attempts to replicate them in the IPFS cluster. It pulls a batch of linked resources from the DB via the `findResources` GraphQL query and concurrently saves each one via the following steps:
- Parse `resource.uri`; on failure, mark the resource problematic.
- If the parsed URL is an `ipfs://` or IPFS gateway URL, extract the IPFS path and pin it on the cluster.
- If the parsed URL is not recognized as one of the above, attempt to download the content from it. If that fails, mark the resource problematic.
- Upload the content to the IPFS cluster for pinning.
- Update the `resource` in the DB to include the `cid` and `ipfsURL`.
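
A rough sketch of that per-resource flow, assuming hypothetical stand-ins for the IPFS Cluster client and the DB mutations:

```ts
type Resource = { uri: string }

// Hypothetical stand-ins, not the actual niftysave or IPFS Cluster API.
declare function pinOnCluster(ipfsPath: string): Promise<{ cid: string }>
declare function addToCluster(content: Blob): Promise<{ cid: string }>
declare function updateResource(r: Resource, cid: string): Promise<void>
declare function markProblematic(r: Resource, reason: string): Promise<void>

async function saveResource(resource: Resource): Promise<void> {
  let url: URL
  try {
    url = new URL(resource.uri)
  } catch {
    return markProblematic(resource, 'malformed URI')
  }
  // ipfs:// or gateway URL → derive the IPFS path and pin it, no download needed.
  const ipfsPath =
    url.protocol === 'ipfs:'
      ? `/ipfs/${url.hostname}${url.pathname}`
      : url.pathname.startsWith('/ipfs/')
      ? url.pathname
      : null
  if (ipfsPath != null) {
    const { cid } = await pinOnCluster(ipfsPath)
    return updateResource(resource, cid)
  }
  // Unrecognized URL → download the content, then upload it to the cluster.
  const response = await fetch(url.href)
  if (!response.ok) {
    return markProblematic(resource, `fetch failed with ${response.status}`)
  }
  const { cid } = await addToCluster(await response.blob())
  return updateResource(resource, cid)
}
```
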
### Database

Fauna DB is used as the storage layer, and jobs interact with it through a GraphQL API. Invariants are upheld through User Defined Functions (UDFs), which are exposed as custom mutations over GraphQL.
## Hacking

⚠️ Please do not change the schema or run untested code on a production database, as you may corrupt data. Instead, configure your environment to use a dev DB instance.
### Environment

You will need to set up the environment with the variables listed below. The recommended way is through a `./.env` file.

- `FAUNA_KEY` - Your Fauna DB access token.
- `IPFS_CLUSTER_KEY` - Access token for the IPFS cluster.
- `BATCH_SIZE` - Number of tokens the scanner will pull at a time.
- `TIME_BUDGET` - Time budget in seconds (a task will abort once it is out of time).
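
An illustrative `./.env` (the token values are placeholders and the numbers are arbitrary examples):

```
FAUNA_KEY=<your Fauna DB access token>
IPFS_CLUSTER_KEY=<your IPFS Cluster access token>
BATCH_SIZE=100
TIME_BUDGET=300
```
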
### Setting up a new/test database

- Create a new database at https://dashboard.fauna.com/
- Generate an access token in the database settings (under the security tab) and assign it to the `FAUNA_KEY` env variable (in the `./.env` file).
- Get the DB schema up to date by running `yarn setup`. It will apply all DB migrations to bring it up to date with the schema.
### Schema

The database schema is primarily driven by the GraphQL schema. To make changes to the schema, edit the `./fauna/resources/schema.graphql` file and then run `yarn update-schema`, which will:
- Reflect the schema changes in the DB (⚠️ remember to use a dev DB).
- Download and organize the new database collections/indexes/functions in the `./fauna/resources` directory.

Each collection/index/function is written to a file under the corresponding directory, with the same name and an `.fql` extension. E.g. a function named `boom` would be located at `./fauna/resources/Function/boom.fql`.
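
That is, the resource tree looks roughly like this (the `Token.fql` entry is just an illustration):

```
fauna/resources/
├── Collection/
│   └── Token.fql
├── Function/
│   └── boom.fql
├── Index/
│   └── allTokens.fql
└── schema.graphql
```
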
If you only wanted to change the schema, then the only other thing you would need to do is generate a migration by running the `yarn create-migration` script. That will create a directory under `./fauna/migrations/` containing the changes made. Make sure to include the generated migration with your pull request.
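
For example, a simplified fragment of `./fauna/resources/schema.graphql` might look like this (the field types are assumptions for illustration):

```graphql
type Token {
  tokenID: String!
  tokenURI: String!
  mintTime: String!
}

type Query {
  allTokens: [Token!]
}
```
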
### User Defined Functions (UDFs)

As alluded to in the section above, all the UDFs are organized in the `./fauna/resources/Function` directory, each file containing a single function with the name of the file.
You can modify existing functions or create new ones; once done, you will need to generate a migration by running the `yarn create-migration` script.
### Indexes

Fauna does not really support changing indexes; if you find yourself needing to do that, it is likely that you would be better off creating a new index instead.

Creating a new index just requires creating a corresponding file, e.g. an index named `allTokens` would require creating an `./fauna/resources/Index/allTokens.fql` file with a single `CreateIndex` expression.
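
For instance, such a file might contain something like this (the source collection here is an assumption for illustration):

```
CreateIndex({
  name: "allTokens",
  source: Collection("Token")
})
```
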
Note: Often the `@index` GraphQL directive in the schema will do the trick.
### Collections

You will probably never need to modify or create collections manually, as they are generated from the GraphQL schema.
### Preparing a pull request

Typically you will combine schema changes with function changes, and possibly accompany them with new indexes. The best practice is to do these as follows:

- Start with changing the schema. Anything but a bugfix will require a GraphQL entry point query or mutation, so it's the best place to start.
- Push the schema changes by running `yarn update-schema`. That will also pull all the new functions/collections/indexes into your repo.
- Modify / create functions. The step above will bring in some new functions; here you'd modify them and maybe introduce new functions to facilitate reuse.
- Create the necessary indexes. Your new functions will often need indexes, and you'll likely create them as you write those functions.
- Create a migration by running `yarn create-migration`, which will generate a directory in `./fauna/migrations` that needs to be included in the pull request.