hrj/abandon

Unexpected error: Too many open files

jaa127 opened this issue · 6 comments

Abandon is leaking file handles at the moment. If you feed it a large set of transactions, it fails in the following way:

 Processing: ledger/perf-1E4/2016/01/01/20160101T005242-1.txn
 Processing: ledger/perf-1E4/2016/01/01/20160101T014524-2.txn
 ...
 Unexpected error
 java.io.FileNotFoundException: ledger/perf-1E4/2016/05/29/20160529T110648-4084.txn (Too many open files)
hrj commented

I was able to specify 3000 input files on my Ubuntu box. Try this test case.

What is your system config?

hrj commented

I tried with 5000 input files just for completeness and it still worked. Updated test case

hrj commented

Even though abandon works with 5000 input files on my system, it is possible that there is still a leak somewhere (abandon, scala or JRE). As far as I can see, abandon is closing the source file after parsing.

A file (source) is opened in Process at L135 and later closed at L148.

The problem is that source is not used at all. Instead, there is a call to parser.scannerFromFile, which opens the file again at the Scala-lib level. It looks like that InputStream is never closed?

scannerFromFile -> (Abandon)
  PagedSeq.fromFile(filePath) -> (PagedSeq (Scala-lib))
        fromFile(new File(source)) ->  (PagedSeq (Scala-lib))
           fromReader(new FileReader(source)) ->   (PagedSeq (Scala-lib))
                      super(new FileInputStream(file)); (java.io)
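
The call chain above amounts to a reader being opened and never closed. One common way to avoid that shape is the loan pattern, sketched below with plain java.io. The names `ReaderLoan`, `withReader`, and `countChars` are hypothetical and not part of Abandon or the Scala library; this is only an illustration under those assumptions, not the actual fix.

```scala
import java.io.{BufferedReader, FileReader, Reader}

object ReaderLoan {
  // Hypothetical helper (not part of Abandon): opens exactly one reader,
  // lends it to `body`, and closes it even if `body` throws.
  def withReader[A](path: String)(body: Reader => A): A = {
    val reader = new BufferedReader(new FileReader(path))
    try body(reader)
    finally reader.close()
  }

  // Example consumer: counts characters by draining the reader.
  def countChars(reader: Reader): Long = {
    var n = 0L
    while (reader.read() != -1) n += 1
    n
  }
}
```

For example, `ReaderLoan.withReader(path)(ReaderLoan.countChars)` drains the file and leaves no descriptor behind. If the consumer reads lazily, as a paged sequence may, all reading has to complete inside `body`, before the reader is closed.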

It is also failing with your test data. It could have something to do with this machine's configuration (16G RAM), Java version, or environment settings (max open files). However, this same machine runs similar tests with one million (1e6) files without problems, so it probably has something to do with scala-lib or how Abandon uses PagedSeq. In any case, please see my comment above.

java -version
java version "1.8.0_131"
Java(TM) SE Runtime Environment (build 1.8.0_131-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode)
Processing:/tmp/abandonTest148/t5/f172.ledger
Processing:/tmp/abandonTest148/t5/f173.ledger
Unexpected error
java.io.FileNotFoundException: /tmp/abandonTest148/t5/f173.ledger (Too many open files)
	at java.io.FileInputStream.open0(Native Method)
	at java.io.FileInputStream.open(FileInputStream.java:195)
hrj commented

I fixed this by ensuring that scanner is created from the source, rather than from another file object.

I verified this using the following code, which returns the number of open FDs:

    // Number of file descriptors currently open in this JVM,
    // or -1 when the Unix-specific MXBean is not available.
    def openFileCount: Long = {
      import java.lang.management.ManagementFactory
      import com.sun.management.UnixOperatingSystemMXBean
      val os = ManagementFactory.getOperatingSystemMXBean()
      os match {
        case uos: UnixOperatingSystemMXBean => uos.getOpenFileDescriptorCount()
        case _ => -1L
      }
    }

Before the fix, the number of FDs was increasing for every input file that was processed.
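
A check along these lines can be wrapped into a small harness that reports descriptor growth across a batch of files. This is a sketch: `FdCheck` and `fdGrowth` are hypothetical names, and the count relies on the JVM-specific `com.sun.management.UnixOperatingSystemMXBean`, so it returns -1 on platforms where that bean is not available.

```scala
import java.lang.management.ManagementFactory
import com.sun.management.UnixOperatingSystemMXBean

object FdCheck {
  // Open descriptors for this JVM, or -1 where the Unix MXBean is unavailable.
  def openFileCount: Long =
    ManagementFactory.getOperatingSystemMXBean() match {
      case uos: UnixOperatingSystemMXBean => uos.getOpenFileDescriptorCount()
      case _ => -1L
    }

  // Runs `process` on every path and returns how many descriptors were
  // gained in total (0 when the count is unavailable on this platform).
  def fdGrowth(paths: Seq[String])(process: String => Unit): Long = {
    val before = openFileCount
    paths.foreach(process)
    openFileCount - before
  }
}
```

With a `process` function that closes what it opens, the result should stay near zero; growth roughly equal to the number of files points to a leak.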