lukechampine/us

Garbage collection errors

MeijeSibbel opened this issue · 5 comments

@jkawamoto, I checked the attached garbage collection patch from Luke (regarding "communication error: download request exceeded maximum batch size"):

diff --git a/renter/proto/session.go b/renter/proto/session.go
index 7947d89..788faec 100644
--- a/renter/proto/session.go
+++ b/renter/proto/session.go
@@ -287,6 +287,7 @@ func (s *Session) SectorRoots(offset, n int) (_ []crypto.Hash, err error) {
                return nil, err
        }
        if err := s.sess.ReadResponse(&resp, uint64(4096+32*n)); err != nil {
+               println("sector roots err:", req.RootOffset, req.NumRoots, s.host.MaxDownloadBatchSize)
                readCtx := fmt.Sprintf("couldn't read %v response", renterhost.RPCSectorRootsID)
                rejectCtx := fmt.Sprintf("host rejected %v request", renterhost.RPCSectorRootsID)
                return nil, wrapResponseErr(err, readCtx, rejectCtx)

(1)

The log output looks the same:

error: SectorRoots: host rejected LoopSectorRoots request: communication error: download request exceeded maximum batch size

errorVerbose:
communication error: download request exceeded maximum batch size
host rejected LoopSectorRoots request
lukechampine.com/us/renter/proto.wrapResponseErr
    /Users/junpei/src/github.com/lukechampine/us/renter/proto/session.go:53
lukechampine.com/us/renter/proto.(*Session).SectorRoots
    /Users/junpei/src/github.com/lukechampine/us/renter/proto/session.go:294
lukechampine.com/us/renter/renterutil.(*PseudoFS).GC.func1
    /Users/junpei/src/github.com/lukechampine/us/renter/renterutil/filesystem.go:320
lukechampine.com/us/renter/renterutil.(*PseudoFS).GC
    /Users/junpei/src/github.com/lukechampine/us/renter/renterutil/filesystem.go:331
github.com/storewise/s3-gateway/pkg/server/contract/standalone.(*contractManager).MigrateFiles
    /Users/junpei/src/github.com/storewise/s3-gateway/pkg/server/contract/standalone/contract.go:371
github.com/storewise/s3-gateway/pkg/server/storage/dynamo.(*StorageClass).migrateFiles
    /Users/junpei/src/github.com/storewise/s3-gateway/pkg/server/storage/dynamo/storage.go:526
github.com/storewise/s3-gateway/pkg/server/storage/dynamo.(*StorageClass).Start.func1
    /Users/junpei/src/github.com/storewise/s3-gateway/pkg/server/storage/dynamo/storage.go:389
github.com/storewise/s3-gateway/pkg/bg.(*taskGroup).Go.func1
    /Users/junpei/src/github.com/storewise/s3-gateway/pkg/bg/bg.go:65
golang.org/x/sync/errgroup.(*Group).Go.func1
    /Users/junpei/pkg/mod/golang.org/x/sync@v0.0.0-20200625203802-6e8e738ad208/errgroup/errgroup.go:57
runtime.goexit
    /usr/local/Cellar/go/1.14.4/libexec/src/runtime/asm_amd64.s:1373
SectorRoots
lukechampine.com/us/renter/proto.wrapErr
    /Users/junpei/src/github.com/lukechampine/us/renter/proto/proto.go:16
lukechampine.com/us/renter/proto.(*Session).SectorRoots
    /Users/junpei/src/github.com/lukechampine/us/renter/proto/session.go:294
lukechampine.com/us/renter/renterutil.(*PseudoFS).GC.func1
    /Users/junpei/src/github.com/lukechampine/us/renter/renterutil/filesystem.go:320
lukechampine.com/us/renter/renterutil.(*PseudoFS).GC
    /Users/junpei/src/github.com/lukechampine/us/renter/renterutil/filesystem.go:331
github.com/storewise/s3-gateway/pkg/server/contract/standalone.(*contractManager).MigrateFiles
    /Users/junpei/src/github.com/storewise/s3-gateway/pkg/server/contract/standalone/contract.go:371
github.com/storewise/s3-gateway/pkg/server/storage/dynamo.(*StorageClass).migrateFiles
    /Users/junpei/src/github.com/storewise/s3-gateway/pkg/server/storage/dynamo/storage.go:526
github.com/storewise/s3-gateway/pkg/server/storage/dynamo.(*StorageClass).Start.func1
    /Users/junpei/src/github.com/storewise/s3-gateway/pkg/server/storage/dynamo/storage.go:389
github.com/storewise/s3-gateway/pkg/bg.(*taskGroup).Go.func1
    /Users/junpei/src/github.com/storewise/s3-gateway/pkg/bg/bg.go:65
golang.org/x/sync/errgroup.(*Group).Go.func1
    /Users/junpei/pkg/mod/golang.org/x/sync@v0.0.0-20200625203802-6e8e738ad208/errgroup/errgroup.go:57
runtime.goexit
    /usr/local/Cellar/go/1.14.4/libexec/src/runtime/asm_amd64.s:1373

I also found some other errors related to garbage collection:

(2)
failed garbage collection | Write: host supplied invalid Merkle proof

error: Write: host supplied invalid Merkle proof

errorVerbose:
host supplied invalid Merkle proof
lukechampine.com/us/renter/proto.init
    /Users/junpei/src/github.com/lukechampine/us/renter/proto/session.go:34
runtime.doInit
    /usr/local/Cellar/go/1.14.4/libexec/src/runtime/proc.go:5420
runtime.doInit
    /usr/local/Cellar/go/1.14.4/libexec/src/runtime/proc.go:5415
runtime.doInit
    /usr/local/Cellar/go/1.14.4/libexec/src/runtime/proc.go:5415
runtime.doInit
    /usr/local/Cellar/go/1.14.4/libexec/src/runtime/proc.go:5415
runtime.doInit
    /usr/local/Cellar/go/1.14.4/libexec/src/runtime/proc.go:5415
runtime.doInit
    /usr/local/Cellar/go/1.14.4/libexec/src/runtime/proc.go:5415
runtime.main
    /usr/local/Cellar/go/1.14.4/libexec/src/runtime/proc.go:190
runtime.goexit
    /usr/local/Cellar/go/1.14.4/libexec/src/runtime/asm_amd64.s:1373
Write
lukechampine.com/us/renter/proto.wrapErr
    /Users/junpei/src/github.com/lukechampine/us/renter/proto/proto.go:16
lukechampine.com/us/renter/proto.(*Session).Write
    /Users/junpei/src/github.com/lukechampine/us/renter/proto/session.go:583
lukechampine.com/us/renter/proto.(*Session).DeleteSectors
    /Users/junpei/src/github.com/lukechampine/us/renter/proto/session.go:687
lukechampine.com/us/renter/renterutil.(*PseudoFS).GC.func3
    /Users/junpei/src/github.com/lukechampine/us/renter/renterutil/filesystem.go:388
lukechampine.com/us/renter/renterutil.(*PseudoFS).GC
    /Users/junpei/src/github.com/lukechampine/us/renter/renterutil/filesystem.go:389
github.com/storewise/s3-gateway/pkg/server/contract/standalone.(*contractManager).MigrateFiles
    /Users/junpei/src/github.com/storewise/s3-gateway/pkg/server/contract/standalone/contract.go:371
github.com/storewise/s3-gateway/pkg/server/storage/dynamo.(*StorageClass).migrateFiles
    /Users/junpei/src/github.com/storewise/s3-gateway/pkg/server/storage/dynamo/storage.go:526
github.com/storewise/s3-gateway/pkg/server/storage/dynamo.(*StorageClass).Start.func1
    /Users/junpei/src/github.com/storewise/s3-gateway/pkg/server/storage/dynamo/storage.go:389
github.com/storewise/s3-gateway/pkg/bg.(*taskGroup).Go.func1
    /Users/junpei/src/github.com/storewise/s3-gateway/pkg/bg/bg.go:65
golang.org/x/sync/errgroup.(*Group).Go.func1
    /Users/junpei/pkg/mod/golang.org/x/sync@v0.0.0-20200625203802-6e8e738ad208/errgroup/errgroup.go:57
runtime.goexit
    /usr/local/Cellar/go/1.14.4/libexec/src/runtime/asm_amd64.s:1373

(3)
SectorRoots: host rejected LoopSectorRoots request: communication error: download request has invalid sector bounds

error: SectorRoots: host rejected LoopSectorRoots request: communication error: download request has invalid sector bounds

errorVerbose:
communication error: download request has invalid sector bounds
host rejected LoopSectorRoots request
lukechampine.com/us/renter/proto.wrapResponseErr
    /Users/junpei/src/github.com/lukechampine/us/renter/proto/session.go:53
lukechampine.com/us/renter/proto.(*Session).SectorRoots
    /Users/junpei/src/github.com/lukechampine/us/renter/proto/session.go:294
lukechampine.com/us/renter/renterutil.(*PseudoFS).GC.func1
    /Users/junpei/src/github.com/lukechampine/us/renter/renterutil/filesystem.go:320
lukechampine.com/us/renter/renterutil.(*PseudoFS).GC
    /Users/junpei/src/github.com/lukechampine/us/renter/renterutil/filesystem.go:331
github.com/storewise/s3-gateway/pkg/server/contract/standalone.(*contractManager).MigrateFiles
    /Users/junpei/src/github.com/storewise/s3-gateway/pkg/server/contract/standalone/contract.go:371
github.com/storewise/s3-gateway/pkg/server/storage/dynamo.(*StorageClass).migrateFiles
    /Users/junpei/src/github.com/storewise/s3-gateway/pkg/server/storage/dynamo/storage.go:526
github.com/storewise/s3-gateway/pkg/server/storage/dynamo.(*StorageClass).Start.func1
    /Users/junpei/src/github.com/storewise/s3-gateway/pkg/server/storage/dynamo/storage.go:389
github.com/storewise/s3-gateway/pkg/bg.(*taskGroup).Go.func1
    /Users/junpei/src/github.com/storewise/s3-gateway/pkg/bg/bg.go:65
golang.org/x/sync/errgroup.(*Group).Go.func1
    /Users/junpei/pkg/mod/golang.org/x/sync@v0.0.0-20200625203802-6e8e738ad208/errgroup/errgroup.go:57
runtime.goexit
    /usr/local/Cellar/go/1.14.4/libexec/src/runtime/asm_amd64.s:1373
SectorRoots
lukechampine.com/us/renter/proto.wrapErr
    /Users/junpei/src/github.com/lukechampine/us/renter/proto/proto.go:16
lukechampine.com/us/renter/proto.(*Session).SectorRoots
    /Users/junpei/src/github.com/lukechampine/us/renter/proto/session.go:294
lukechampine.com/us/renter/renterutil.(*PseudoFS).GC.func1
    /Users/junpei/src/github.com/lukechampine/us/renter/renterutil/filesystem.go:320
lukechampine.com/us/renter/renterutil.(*PseudoFS).GC
    /Users/junpei/src/github.com/lukechampine/us/renter/renterutil/filesystem.go:331
github.com/storewise/s3-gateway/pkg/server/contract/standalone.(*contractManager).MigrateFiles
    /Users/junpei/src/github.com/storewise/s3-gateway/pkg/server/contract/standalone/contract.go:371
github.com/storewise/s3-gateway/pkg/server/storage/dynamo.(*StorageClass).migrateFiles
    /Users/junpei/src/github.com/storewise/s3-gateway/pkg/server/storage/dynamo/storage.go:526
github.com/storewise/s3-gateway/pkg/server/storage/dynamo.(*StorageClass).Start.func1
    /Users/junpei/src/github.com/storewise/s3-gateway/pkg/server/storage/dynamo/storage.go:389
github.com/storewise/s3-gateway/pkg/bg.(*taskGroup).Go.func1
    /Users/junpei/src/github.com/storewise/s3-gateway/pkg/bg/bg.go:65
golang.org/x/sync/errgroup.(*Group).Go.func1
    /Users/junpei/pkg/mod/golang.org/x/sync@v0.0.0-20200625203802-6e8e738ad208/errgroup/errgroup.go:57
runtime.goexit
    /usr/local/Cellar/go/1.14.4/libexec/src/runtime/asm_amd64.s:1373

(4)
failed garbage collection | SectorRoots: contract has insufficient funds to pay for revision


error: SectorRoots: contract has insufficient funds to pay for revision

errorVerbose:
contract has insufficient funds to pay for revision
lukechampine.com/us/renter/proto.init
    /Users/junpei/src/github.com/lukechampine/us/renter/proto/session.go:30
runtime.doInit
    /usr/local/Cellar/go/1.14.4/libexec/src/runtime/proc.go:5420
runtime.doInit
    /usr/local/Cellar/go/1.14.4/libexec/src/runtime/proc.go:5415
runtime.doInit
    /usr/local/Cellar/go/1.14.4/libexec/src/runtime/proc.go:5415
runtime.doInit
    /usr/local/Cellar/go/1.14.4/libexec/src/runtime/proc.go:5415
runtime.doInit
    /usr/local/Cellar/go/1.14.4/libexec/src/runtime/proc.go:5415
runtime.doInit
    /usr/local/Cellar/go/1.14.4/libexec/src/runtime/proc.go:5415
runtime.main
    /usr/local/Cellar/go/1.14.4/libexec/src/runtime/proc.go:190
runtime.goexit
    /usr/local/Cellar/go/1.14.4/libexec/src/runtime/asm_amd64.s:1373
SectorRoots
lukechampine.com/us/renter/proto.wrapErr
    /Users/junpei/src/github.com/lukechampine/us/renter/proto/proto.go:16
lukechampine.com/us/renter/proto.(*Session).SectorRoots
    /Users/junpei/src/github.com/lukechampine/us/renter/proto/session.go:268
lukechampine.com/us/renter/renterutil.(*PseudoFS).GC.func1
    /Users/junpei/src/github.com/lukechampine/us/renter/renterutil/filesystem.go:320
lukechampine.com/us/renter/renterutil.(*PseudoFS).GC
    /Users/junpei/src/github.com/lukechampine/us/renter/renterutil/filesystem.go:331
github.com/storewise/s3-gateway/pkg/server/contract/standalone.(*contractManager).MigrateFiles
    /Users/junpei/src/github.com/storewise/s3-gateway/pkg/server/contract/standalone/contract.go:371
github.com/storewise/s3-gateway/pkg/server/storage/dynamo.(*StorageClass).migrateFiles
    /Users/junpei/src/github.com/storewise/s3-gateway/pkg/server/storage/dynamo/storage.go:526
github.com/storewise/s3-gateway/pkg/server/storage/dynamo.(*StorageClass).Start.func1
    /Users/junpei/src/github.com/storewise/s3-gateway/pkg/server/storage/dynamo/storage.go:389
github.com/storewise/s3-gateway/pkg/bg.(*taskGroup).Go.func1
    /Users/junpei/src/github.com/storewise/s3-gateway/pkg/bg/bg.go:65
golang.org/x/sync/errgroup.(*Group).Go.func1
    /Users/junpei/pkg/mod/golang.org/x/sync@v0.0.0-20200625203802-6e8e738ad208/errgroup/errgroup.go:57
runtime.goexit
    /usr/local/Cellar/go/1.14.4/libexec/src/runtime/asm_amd64.s:1373

I think we resolved this -- there's a weird host that set their batch size really small.
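
To illustrate why a very small batch size trips error (1), here is a rough sketch of splitting a roots request into host-sized batches. It assumes the host caps each response at MaxDownloadBatchSize bytes and that each root is 32 bytes (matching the 32*n in the patch above). splitRootRequests and rootBatch are hypothetical names, not part of us.

```go
package main

import "fmt"

// rootSize is the size in bytes of one sector Merkle root.
const rootSize = 32

// rootBatch describes one request for N roots starting at Offset.
type rootBatch struct {
	Offset, N int
}

// splitRootRequests splits a request for total roots into batches whose
// responses (rootSize bytes per root) each fit within maxBatchSize bytes.
// It fails when the limit is too small to carry even a single root.
func splitRootRequests(total int, maxBatchSize uint64) ([]rootBatch, error) {
	perBatch := maxBatchSize / rootSize
	if perBatch == 0 {
		return nil, fmt.Errorf("max batch size %d is smaller than one root (%d bytes)", maxBatchSize, rootSize)
	}
	// Clamp before converting to int so a huge limit cannot overflow.
	if total > 0 && perBatch > uint64(total) {
		perBatch = uint64(total)
	}
	step := int(perBatch)
	var batches []rootBatch
	for off := 0; off < total; off += step {
		n := step
		if total-off < n {
			n = total - off
		}
		batches = append(batches, rootBatch{Offset: off, N: n})
	}
	return batches, nil
}

func main() {
	// A 128-byte limit allows 4 roots per request.
	batches, err := splitRootRequests(10, 128)
	fmt.Println(batches, err)
}
```

With a limit below 32 bytes no request can succeed at all, which is the kind of host that filtering would have to exclude rather than work around.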

@jkawamoto are we keeping bad hosts with strange batch sizes in circulation for buckets formed before we started filtering them? If so then that might explain why we still see these errors in the logs.

Edit: These are all old buckets, so let's close this, @lukechampine. If it occurs for new buckets, I will ask to re-open.

@MeijeSibbel We don’t check contracts that are already formed. That’s a good point. We should remove such hosts from the existing contract sets.
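
A minimal sketch of what pruning existing contract sets could look like, assuming each contract record carries the host's advertised MaxDownloadBatchSize. The contract type, partitionByBatchSize, and the threshold value are all hypothetical and only illustrate the idea; they are not the s3-gateway or us API.

```go
package main

import "fmt"

// contract is a hypothetical, simplified entry in a contract set.
type contract struct {
	HostKey              string
	MaxDownloadBatchSize uint64 // as advertised in the host's settings
}

// partitionByBatchSize separates contracts whose host advertises a batch size
// below threshold (drop) from the rest (keep). Input order is preserved.
func partitionByBatchSize(set []contract, threshold uint64) (keep, drop []contract) {
	for _, c := range set {
		if c.MaxDownloadBatchSize < threshold {
			drop = append(drop, c)
		} else {
			keep = append(keep, c)
		}
	}
	return keep, drop
}

func main() {
	set := []contract{
		{HostKey: "host-a", MaxDownloadBatchSize: 17 << 20},
		{HostKey: "host-b", MaxDownloadBatchSize: 16}, // too small for a single 32-byte root
	}
	// The 1 MiB threshold is an assumed value chosen for illustration.
	keep, drop := partitionByBatchSize(set, 1<<20)
	fmt.Println(len(keep), len(drop))
}
```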

@jkawamoto I don't know if we should be too worried about this, though, unless it creates other issues. Once we implement bucket delete and remove all these test buckets, the errors should go away.