yortus/DBFFile

readRecords will occasionally throw a RangeError

TheArchitect4855 opened this issue · 7 comments

I'm using this library to convert some old DBF data, and I'm running into an issue where readRecords(n) will throw RangeErrors. Currently I've just wrapped readRecords in a try/catch and only read one record at a time, which works, but naturally some data is lost.

Full stack trace:

RangeError [ERR_BUFFER_OUT_OF_BOUNDS]: Attempt to access memory outside buffer bounds
    at boundsError (internal/buffer.js:80:11)
    at Buffer.readInt32LE (internal/buffer.js:386:5)
    at int32At (/home/kurtis/Documents/anthology-converter/node_modules/dbffile/dist/dbf-file.js:264:72)
    at readRecordsFromDBF (/home/kurtis/Documents/anthology-converter/node_modules/dbffile/dist/dbf-file.js:345:35)
    at async Object.module.exports.convert (/home/kurtis/Documents/anthology-converter/src/converters/custconverter.js:31:17) {
  code: 'ERR_BUFFER_OUT_OF_BOUNDS'
}

Unfortunately I can't attach the files in question as they contain confidential/private data. However, I can share some metadata about the file:

  • Version: 48
  • # records: 23k
  • The file contains the Y (money) column type, which is currently unsupported

Thanks in advance for the help.

Unfortunately I don't have the tooling (or knowledge) to really do that. All I'm working with is your JS library and a DBF viewer extension in VS Code.

From inspecting the file header, I can tell that it's a VFP file, so the weird endings may be the issue. Is there a workaround or known fix for that? If not, I have no problem trying to fix it myself; I just don't really know what the problem is.

Ah, yeah, that's a good idea. I'll definitely give that a try! Thanks for the help.

I also had that problem.
I found that the memo field is parsed incorrectly when the DBF file is of type VFP9 (0x30).

I solved the problem as follows:

diff --git a/src/dbf-file.ts b/src/dbf-file.ts
index 5eed28d..b9ddae2 100644
--- a/src/dbf-file.ts
+++ b/src/dbf-file.ts
@@ -376,12 +376,14 @@ async function readRecordsFromDBF(dbf: DBFFile, maxCount: number) {
                             break;
 
                         case 'M': // Memo
-                            while (len > 0 && buffer[offset] === 0x20) ++offset, --len;
-                            if (len === 0) { value = null; break; }
                             let blockIndex = dbf._version === 0x30
                                 ? int32At(offset, len)
                                 : parseInt(substrAt(offset, len, encoding));
                             offset += len;
+                            if(isNaN(blockIndex) || blockIndex===0){
+                                value = null;
+                                break;
+                            }
 
                             // If the memo file is missing and we get this far, we must be in 'loose' read mode.
                             // Skip reading the memo value and continue with the next field.

Explanation:
The memo field in the DBF file holds the block index into the memo file (*.dbt, *.fpt, ...).
In a VFP DBF file (version 0x30), the block index is always 4 bytes long and encoded as a 32-bit integer, so the field offset and length must not be adjusted before reading it.
If the memo field has no content, blockIndex is 0.
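
To illustrate why the offset and length must not be adjusted for a 0x30 file, here is a minimal sketch of how the trimming step can produce the RangeError from the stack trace. It is not the library's code: the body of int32At is my assumption about what such a helper does, trimLeadingSpaces is a hypothetical name for the two removed lines, and DataView stands in for Node's Buffer (both throw a RangeError when asked for 4 bytes from a shorter slice).

```typescript
// Read a little-endian int32 from `len` bytes starting at `offset`.
// Assumed stand-in for the library's int32At helper, not its real source.
function int32At(bytes: Uint8Array, offset: number, len: number): number {
    const view = new DataView(bytes.buffer, bytes.byteOffset + offset, len);
    return view.getInt32(0, true); // throws RangeError when len < 4
}

// Hypothetical name for the trimming step the patch removes:
// skip leading 0x20 bytes, shrinking the field as it goes.
function trimLeadingSpaces(bytes: Uint8Array, offset: number, len: number) {
    while (len > 0 && bytes[offset] === 0x20) {
        ++offset;
        --len;
    }
    return { offset, len };
}

// Block index 32 is stored as 20 00 00 00, so its first byte looks like a space.
const field = new Uint8Array([0x20, 0x00, 0x00, 0x00]);

// Reading the field untouched gives the correct block index.
const untouched = int32At(field, 0, 4);

// Trimming first leaves only 3 bytes, and the 4-byte read fails.
let trimmedError: unknown = null;
try {
    const t = trimLeadingSpaces(field, 0, 4);
    int32At(field, t.offset, t.len);
} catch (err) {
    trimmedError = err;
}
```

Any binary block index whose low byte happens to be 0x20 would hit this, which would explain why the error only shows up occasionally.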

In all other versions (not 0x30), the block index is a 10-byte ASCII-encoded decimal number, right-aligned and left-padded with spaces. parseInt() ignores leading spaces, so there is no need to advance the offset. If the memo field has no content, the field contains only spaces and parseInt() returns NaN.
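
Both cases can be sketched as a single standalone function. This is only an illustration of the rules described above, not the code in dbf-file.ts: parseMemoBlockIndex and its parameters are hypothetical names, and it works on a plain Uint8Array with DataView instead of Node's Buffer. Little-endian byte order is assumed from the readInt32LE call in the stack trace.

```typescript
// Returns the memo block index, or null when the field points at no memo.
// Hypothetical helper; names and signature are for illustration only.
function parseMemoBlockIndex(
    bytes: Uint8Array,
    offset: number,
    len: number,
    version: number,
): number | null {
    let blockIndex: number;
    if (version === 0x30) {
        // VFP: 4-byte little-endian integer, read without any trimming.
        const view = new DataView(bytes.buffer, bytes.byteOffset + offset, len);
        blockIndex = view.getInt32(0, true);
    } else {
        // Other versions: ASCII decimal, right-aligned, left-padded with spaces.
        let text = "";
        for (let i = 0; i < len; ++i) {
            text += String.fromCharCode(bytes[offset + i]);
        }
        blockIndex = parseInt(text, 10); // leading spaces are ignored
    }
    // An all-space field parses to NaN; an empty VFP memo is 0.
    return isNaN(blockIndex) || blockIndex === 0 ? null : blockIndex;
}

// VFP field holding block 32 (first byte is 0x20 but is NOT padding).
const vfpField = new Uint8Array([0x20, 0x00, 0x00, 0x00]);
const vfpIndex = parseMemoBlockIndex(vfpField, 0, 4, 0x30);

// Non-VFP field: "        12" (eight spaces, then the digits 1 and 2).
const asciiField = new Uint8Array([
    0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x31, 0x32,
]);
const asciiIndex = parseMemoBlockIndex(asciiField, 0, 10, 0x03);
```

The version value 0x03 in the second call is just an example of a non-0x30 version byte.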

A PR would be welcome!

Reopening this to possibly notify the people smarter than me who helped here last time, as I ran into the same error as described in #87. From what I read here, the problem I'm encountering might be a superset of this error and will require a fix in a similar fashion, which I am working on now. Has anybody else experienced RangeErrors even after this?