Crash in AppInit() upon start (EXCEPTION: NSt8ios_base7failureE)

Question

Closed this issue 10 years ago · 9 comments

Hi,

On latest git, one of my namecoin installations is crashing upon start.
I suppose it might be reacting this way to corrupt files.

Is there any way I can help debug this, and hopefully find a way for namecoin to handle it?

solt@dev2:~/namecoin/src$ ./namecoind


************************
EXCEPTION: NSt8ios_base7failureE
CDataStream::read() : end of data
namecoin in AppInit()

terminate called after throwing an instance of 'std::ios_base::failure'
  what():  CDataStream::read() : end of data
Aborted

Answer 1 · 2014-06-17T11:58:17.000Z

This means that one of the unserialisation routines is failing. It is most likely caused by corrupt data files (or by a bug in the recent format upgrades, but since the upgrade works in general, I don't see why it would fail for you). I don't really see what we could do about corrupt data files.

If you still want to find out more, you can run namecoind in a debugger (e.g., gdb with "catch throw") and look at the backtrace to see exactly where the exception is thrown.
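To illustrate the kind of failure described above, here is a minimal sketch of how a truncated record can produce an "end of data" exception during unserialisation. `MiniStream` and `ReadLengthPrefixed` are hypothetical stand-ins invented for this example; they are not namecoind's `CDataStream` or its actual on-disk format.

```cpp
#include <cstddef>
#include <cstring>
#include <ios>
#include <utility>
#include <vector>

// Hypothetical stand-in for a serialisation stream (NOT namecoind's CDataStream).
class MiniStream {
public:
    explicit MiniStream(std::vector<char> data) : buf_(std::move(data)), pos_(0) {}

    // Copy n bytes out of the buffer, or throw if fewer than n bytes remain.
    void read(char* out, std::size_t n) {
        if (n == 0) return;
        if (n > buf_.size() - pos_) {
            throw std::ios_base::failure("MiniStream::read() : end of data");
        }
        std::memcpy(out, buf_.data() + pos_, n);
        pos_ += n;
    }

private:
    std::vector<char> buf_;
    std::size_t pos_;
};

// Read a one-byte length prefix followed by that many payload bytes.
// If the stored length is larger than what is left in the buffer
// (for example because the record was truncated), read() throws.
std::vector<char> ReadLengthPrefixed(MiniStream& s) {
    char len = 0;
    s.read(&len, 1);
    std::vector<char> payload(static_cast<unsigned char>(len));
    s.read(payload.data(), payload.size());
    return payload;
}
```

A record whose length prefix claims 22 bytes but which only has 2 bytes left would throw here, which is the same general pattern as a length-prefixed vector read running past the end of a damaged record.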

Answer 2 · 2014-06-17T12:25:30.000Z

I ran into this once when I had renamed/deleted blkindex.dat, IIRC. I had to rename my Namecoin directory and redownload the blockchain (once that has finished, you can copy wallet.dat to the new data folder).
Edit: It could be that I also mixed versions, which provoked the error. Stick to the new version!

Answer 3 · 2014-06-17T15:53:47.000Z

I'll most probably end up doing what phelixbtc suggests.
In the meantime, I recompiled with -ggdb3 and got a backtrace. It's at:

Using host libthread_db library "/lib/i386-linux-gnu/i686/cmov/libthread_db.so.1".
[New Thread 0xb6d5db70 (LWP 8603)]
Catchpoint 1 (exception thrown), 0xb7eff160 in __cxa_throw ()
   from /usr/lib/i386-linux-gnu/libstdc++.so.6
(gdb) bt
#0  0xb7eff160 in __cxa_throw () from /usr/lib/i386-linux-gnu/libstdc++.so.6
#1  0x0806f891 in CDataStream::setstate (
    this=<error reading variable: Unhandled dwarf expression opcode 0xfa>,
    psz=0x83da2ec "CDataStream::read() : end of data", bits=4,
    this=<error reading variable: Unhandled dwarf expression opcode 0xfa>)
    at serialize.h:1028
#2  0x0807a834 in CDataStream::read (this=0xbffff064, pch=0xc390438 "",
    nSize=22) at serialize.h:1056
#3  0x08081bf7 in Unserialize<CDataStream, std::allocator<bool> > (is=...,
    v=..., nType=nType@entry=2, nVersion=nVersion@entry=37500)
    at serialize.h:590
#4  0x08082b5f in SerReadWrite<CDataStream, std::vector<bool> > (
    nVersion=37500, nType=2, obj=..., s=..., ser_action=...) at serialize.h:810
#5  CTxIndex::Unserialize<CDataStream> (this=0xbffff19c, s=..., nType=2,
    nVersion=37500) at main.h:794
#6  0x08082d71 in Unserialize<CDataStream, CTxIndex> (nVersion=37500, nType=2,
    is=..., a=...) at serialize.h:411
#7  operator>><CTxIndex> (obj=..., this=0xbffff064) at serialize.h:1125
#8  CDB::Read<std::pair<std::string, uint256>, CTxIndex> (
    this=this@entry=0xbffff240, key=..., value=...) at db.h:91
#9  0x08078779 in CTxDB::ReadTxIndex (this=0xbffff240, hash=..., txindex=...)
    at db.cpp:424
#10 0x080e2da0 in CWallet::ReacceptWalletTransactions (this=0xc380df0)
    at wallet.cpp:541
#11 0x0815874a in AppInit2 (argc=-1073744776, argv=0xbffff7a4) at init.cpp:467
#12 0x0815a2a8 in AppInit (argc=argc@entry=1, argv=argv@entry=0xbffff7a4)
    at init.cpp:116
#13 0x080536cb in main (argc=1, argv=0xbffff7a4) at init.cpp:102

Going up the trace, I found that dbset contains references to blkindex.dat and nameindex.dat.
I can't tell which one is having problems.

Answer 4 · 2014-06-17T15:56:40.000Z

I know it's bad to suggest root causes, but could it be that a previous blkindex.dat rewrite didn't complete properly? Namecoind isn't running and this is my ~/.namecoin:

-rw-------  1 user user     630784 Jun 12 16:13 addr.dat
-rw-------  1 user user 1158031186 Jun 12 16:06 blk0001.dat
-rw-------  1 user user  517251072 Jun 17 12:50 blkindex.dat
-rw-------  1 user user  168435712 May 15 15:47 blkindex.dat.rewrite
drwx------  2 user user       4096 Jun 17 11:18 database
-rw-------  1 user user          0 Jun 17 11:21 db.log
-rw-------  1 user user     210693 Jun 17 17:47 debug.log
-rw-------  1 user user          0 Apr 25 13:32 .lock
-rw-r--r--  1 user user         67 Apr 28 09:55 namecoin.conf
-rw-------  1 user user   61640704 Jun 12 16:13 nameindex.dat
-rw-------  1 user user      94208 Jun 17 12:51 wallet.dat

Answer 5 · 2014-06-17T16:16:31.000Z

From your backtrace it is clear that blkindex.dat is the problem. Also, as you observed, the rewrite should have finished. However, I believe that even if you had killed the daemon while it was rewriting, there "shouldn't" be a problem (the database file would just be larger than necessary and contain lots of empty pages). I have not actually tested that, though. So hopefully this doesn't indicate a "real" bug.

Answer 6 · 2014-06-18T09:07:17.000Z

I don't know. Anyway, I removed blkindex.dat and restarted it, and it's synced up now and not crashing.
The incident is resolved, but perhaps the general question remains: how should namecoind handle exceptions when trying to read a corrupted file? Maybe with an error message saying which file could not be read and suggesting its removal?

Pysiak
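The suggestion above (an error message that names the unreadable file) could look roughly like the following sketch. `ReadOrExplain` is a hypothetical helper invented for this example; it is not part of namecoind, and the wording of the message is only an assumption about what such a hint might say.

```cpp
#include <functional>
#include <ios>
#include <string>

// Hypothetical helper (not namecoind code): run a read operation and, if it
// fails with a stream error, produce a message that names the file involved
// instead of letting the exception terminate the program.
bool ReadOrExplain(const std::string& fileName,
                   const std::function<void()>& readOp,
                   std::string& errorOut) {
    try {
        readOp();
        return true;
    } catch (const std::ios_base::failure& e) {
        errorOut = "Error reading " + fileName + ": " + e.what() +
                   ". The file may be corrupt; removing it and resyncing may help.";
        return false;
    }
}
```

The caller passes the name of the file it is about to read, so the catch site can report it; in the backtrace earlier in this thread, that context is only visible several frames above the point where the exception is thrown.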

Answer 7 · 2014-06-18T09:15:43.000Z

Note that, as far as my understanding goes, you can't "recover" from a missing/corrupted blkindex.dat in the same way as you can recreate the nameindex. I believe that if you delete blkindex.dat (or have to delete it), then you should also remove the blk*.dat files, since the whole blockchain will be downloaded again anyway. (Since I do not yet fully understand how the networking code works, this may be wrong, but I believe it is the case. You should be able to see whether or not the blk0001.dat file is double its initial size after you have finished syncing.)

Answer 8 · 2014-06-18T09:19:36.000Z

Well, that's what I thought too. I figured the blk*.dat files needed to go as well, so I removed them together with blkindex.dat yesterday; I forgot to mention that. I'm not sure about the size; I think it's similar now. I was low on disk space and have more now, but I did remove the .rewrite file.
Perhaps if we knew how much disk space the rewrite requires, namecoind could check whether there's enough?

Pysiak
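A free-space check like the one proposed above might be sketched as follows on POSIX systems. `HasFreeSpace` is a hypothetical helper, not something namecoind provides, and `requiredBytes` is left as a parameter because this thread does not establish how much space the rewrite actually needs.

```cpp
#include <cstdint>
#include <string>
#include <sys/statvfs.h>  // POSIX only

// Hypothetical helper (not namecoind code): report whether the filesystem
// holding `dir` has at least `requiredBytes` available to the current user.
// Returns false if the directory cannot be queried.
bool HasFreeSpace(const std::string& dir, std::uint64_t requiredBytes) {
    struct statvfs st;
    if (statvfs(dir.c_str(), &st) != 0) {
        return false;
    }
    // f_bavail counts blocks available to unprivileged users;
    // f_frsize is the size of one such block in bytes.
    const std::uint64_t available =
        static_cast<std::uint64_t>(st.f_bavail) * static_cast<std::uint64_t>(st.f_frsize);
    return available >= requiredBytes;
}
```

One possible use would be to call this with the data directory before starting a rewrite and to skip or postpone the rewrite when it returns false.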

Answer 9 · 2014-09-14T18:15:06.000Z

Closing, as this issue has not come up again.