storj-archived/storjshare-daemon

Reputation loss due to file size rejection.

tempestb opened this issue · 8 comments

If your node receives an alloc request for a shard size that your node does not have space for, your node will not respond to the request.

This will lower your reputation.

This could be abused in a couple of ways:

  1. Advertise a shard size that no one can fit, repeatedly. This will drop all reputation to zero over time.

  2. It favors the largest nodes: since they can hold the most data, their reputation will never go down, whereas small nodes will often be unable to fit larger shards. So your largest node, 8 TB, will get the most data and your smallest nodes will get the least. This assumes that, under SIP9, data is distributed preferentially to the nodes with the highest reputation.

In other words, a farmer would be at a disadvantage unless they run 8 TB nodes.
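To make the two abuse cases concrete, here is a rough sketch of the effect. The scoring rule (one point lost per unanswered alloc request), the starting score, and all names are assumptions for illustration only; this is not how SIP9 or storjshare-daemon actually computes reputation. Free space is also held constant for simplicity.

```python
GB = 10**9

def simulate(free_space_by_node, shard_sizes, start_reputation=100, penalty=1):
    """Hypothetical model: a node that cannot fit an offered shard stays
    silent and loses `penalty` reputation points (never below zero)."""
    reputation = {name: start_reputation for name in free_space_by_node}
    for size in shard_sizes:
        for name, free in free_space_by_node.items():
            if size > free:
                # No response to the alloc request, so reputation drops.
                reputation[name] = max(0, reputation[name] - penalty)
    return reputation

nodes = {"small": 2 * GB, "large": 8000 * GB}

# Case 2: ordinary traffic with a mix of shard sizes penalises only the small node.
print(simulate(nodes, [1 * GB, 4 * GB, 3 * GB, 1 * GB]))

# Case 1: repeatedly advertising a shard nobody can fit drains every node.
print(simulate(nodes, [10**15] * 150))
```

Under these assumptions the small node loses points on normal traffic while the large node never does, and a single misbehaving renter can push everyone to zero.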

Until the issue with 1/256 of the allocated size is fixed, we can limit the potential for DoS by checking offers against a maximum shard size.
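A minimal sketch of what such a check could look like. The cap value, the function name, and the return labels are all hypothetical and not part of the storjshare-daemon code; the idea is only that an offer above the cap is treated as invalid rather than as a missed request that costs reputation.

```python
# Hypothetical cap, chosen here only because 4GB shards are mentioned in this thread.
MAX_SHARD_SIZE = 4 * 1024**3

def classify_alloc_request(shard_size, free_space, max_shard_size=MAX_SHARD_SIZE):
    """Classify an incoming alloc request (illustrative labels only)."""
    if shard_size <= 0 or shard_size > max_shard_size:
        # Implausible offer: ignore it without counting it against the node.
        return "invalid"
    if shard_size > free_space:
        # Plausible offer that this node genuinely cannot store.
        return "no_space"
    return "accept"

print(classify_alloc_request(10**15, free_space=10**12))
print(classify_alloc_request(2 * 1024**3, free_space=1024**3))
print(classify_alloc_request(1024**3, free_space=10**12))
```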

What is the point of having huge shards like 4GB? I think even 100MB is too large. With small shards, a client will download a single large file from many nodes, which will be faster than downloading a huge file from one (possibly slow) mirror.

@akostadinov Did you ever think about the resources needed for these small shard transfers, encryption, and decryption? Even now many CPUs (single-thread performance) struggle under decent load.
If I want to upload, let's say, a 400GB VM backup, that's already 100 shards (not counting the parity ones).
Furthermore, smaller shards mean more IO load and more requests on the bridge side.

I would go exactly the opposite route and would appreciate bigger shards.
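For reference, the shard counts being argued about can be worked out directly. This is plain arithmetic using decimal units and the sizes mentioned in this thread; parity shards are not counted, and the helper name is made up.

```python
GB = 10**9
MB = 10**6

def shard_count(file_size, shard_size):
    """Number of data shards needed for a file (ceiling division)."""
    return -(-file_size // shard_size)

backup = 400 * GB
for label, size in [("4GB", 4 * GB), ("100MB", 100 * MB), ("2MB", 2 * MB)]:
    print(label, shard_count(backup, size))
```

Each shard implies at least one transfer and some bookkeeping, so the count is a rough proxy for the per-shard overhead in question.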

How is it not a problem for torrents to split all the files into small chunks? Above some size there shouldn't be much IO overhead for nodes (btw, I mean after switching storage to the filesystem; I saw issues saying that the current db approach is not good anyway). As for node design, I can't speak. Perhaps the node should send out info to clients similar to torrent files/magnet links, and then the client would be able to figure things out without bugging the node.

About tracking transfers, I think the client could sign a confirmation and send it to the farmer. E.g. for each 2MB downloaded, the client sends a signed confirmation to the farmer. The farmer can then use the confirmations to claim transfers to nodes. If the client doesn't send confirmations, the farmer will blacklist the client and not bother with it anymore.
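A rough sketch of that confirmation loop, to show the shape of the idea. It uses an HMAC with a shared key purely as a standard-library stand-in; a real design would presumably need public-key signatures so that the farmer cannot forge receipts. All function names and the message format are hypothetical, and nothing here reflects the actual Storj protocol.

```python
import hashlib
import hmac

CHUNK = 2 * 1024 * 1024  # confirm every 2MB, as suggested above

def sign_receipt(client_key: bytes, shard_id: str, bytes_received: int) -> str:
    """Client side: confirm how many bytes of a shard have been received."""
    message = f"{shard_id}:{bytes_received}".encode()
    return hmac.new(client_key, message, hashlib.sha256).hexdigest()

def verify_receipt(client_key: bytes, shard_id: str, bytes_received: int,
                   signature: str) -> bool:
    """Farmer side: check a receipt (stand-in for signature verification)."""
    expected = sign_receipt(client_key, shard_id, bytes_received)
    return hmac.compare_digest(expected, signature)

def serve_shard(shard_size: int, shard_id: str, client_key: bytes,
                client_signs: bool = True) -> int:
    """Send a shard chunk by chunk and return the number of confirmed bytes.
    Serving stops at the first missing or invalid receipt."""
    confirmed = 0
    while confirmed < shard_size:
        sent = min(confirmed + CHUNK, shard_size)
        signature = sign_receipt(client_key, shard_id, sent) if client_signs else ""
        if not verify_receipt(client_key, shard_id, sent, signature):
            break  # the farmer would blacklist this client here
        confirmed = sent
    return confirmed

print(serve_shard(5 * 1024 * 1024, "shard-1", b"demo-key"))
print(serve_shard(5 * 1024 * 1024, "shard-1", b"demo-key", client_signs=False))
```

With this scheme the most a non-confirming client can take for free is one chunk, which is why a small confirmation interval limits cheating.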

And a limited shard size would help limit the ability of a client or farmer to cheat about transfers.

Also imagine a 400GB backup. It is likely that not the whole backup will change before the next one. Splitting it into shards would allow applying deduplication techniques, which could be a huge saving for clients.

The shard size has no effect on cheaters. A cheater will simply adapt and go on.

Deduplication is not going to happen. Even if the same file gets uploaded twice, it will be encrypted with a different key.
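A small illustration of why different keys defeat content-based deduplication. The cipher below is a toy XOR stream built on SHAKE-256, used only so the example runs with the standard library; it is not secure and is not the encryption Storj uses. The key names are made up.

```python
import hashlib

def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Toy XOR stream cipher (illustration only, NOT secure).
    Applying it twice with the same key returns the plaintext."""
    keystream = hashlib.shake_256(key).digest(len(plaintext))
    return bytes(p ^ k for p, k in zip(plaintext, keystream))

def content_hash(data: bytes) -> str:
    """What a storage node could compare to detect duplicate shards."""
    return hashlib.sha256(data).hexdigest()

shard = b"same backup block" * 1000

first = toy_encrypt(b"key-for-upload-1", shard)
second = toy_encrypt(b"key-for-upload-2", shard)
repeat = toy_encrypt(b"key-for-upload-1", shard)

print(content_hash(first) == content_hash(second))  # different keys
print(content_hash(first) == content_hash(repeat))  # same key
```

The stored bytes match only when the same key is reused, so whether deduplication is possible depends entirely on how keys are chosen per file or per shard.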

How should they cheat?
It's all a matter of pricing. If downloads from a node are more expensive than the revenue that the farmer gets, everything is fine. At the moment the problem is the payout formula and the high revenue for transferred data.

Just FYI, the torrent/magnet links you are referring to aren't encrypted themselves. If you are talking about zip files or similar, that's another layer.
And we are talking about a maximum shard size, not a fixed one.

So the client will encrypt each shard with a different key? I thought clients use the same key for their files, so deduplication should be possible.

wrt cheating, I was talking about the proposed solution of signing download confirmations. I don't know how it works now. But your post made it sound like bridges are too involved in the download process at present.

My main point was that above a certain shard size, there shouldn't be any significant IO penalty for the farmer node. The bridge implementation probably needs to be reconsidered, given that huge numbers of huge files are distributed by torrents with no apparent issue. I don't see why it should matter whether the files are encrypted or not. Neither bridges nor nodes should be able to decrypt the files anyway, should they?