tangjianpku/LINE

SIGABRT when importing graph

Closed this issue · 2 comments

I have tested the Linux implementation in two different machines. Here's the backtrace given by gdb.

line: malloc.c:2395: sysmalloc: Assertion `(old_top == (((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) - __builtin_offsetof (struct malloc_chunk, fd)))) && old_size == 0) || ((unsigned long) (old_size) >= (unsigned long)((((__builtin_offsetof (struct malloc_chunk, fd_nextsize))+((2 *(sizeof(size_t))) - 1)) & ~((2 *(sizeof(size_t))) - 1))) && ((old_top)->size & 0x1) && ((unsigned long) old_end & (pagesize - 1)) == 0)' failed.

Program received signal SIGABRT, Aborted.
0x00007ffff6ec5458 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:55
55  ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  0x00007ffff6ec5458 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:55
#1  0x00007ffff6ec68da in __GI_abort () at abort.c:89
#2  0x00007ffff6f09308 in __malloc_assert (
    assertion=assertion@entry=0x7ffff6ffc048 "(old_top == (((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) - __builtin_offsetof (struct malloc_chunk, fd)))) && old_size == 0) || ((unsigned long) (old_size) >= (unsigned long)((((__builtin_offs"..., file=file@entry=0x7ffff6ff8444 "malloc.c", line=line@entry=2395, 
    function=function@entry=0x7ffff6ffcd50 <__func__.11480> "sysmalloc") at malloc.c:300
#3  0x00007ffff6f0b016 in sysmalloc (nb=nb@entry=32, av=av@entry=0x7ffff7230b20 <main_arena>) at malloc.c:2392
#4  0x00007ffff6f0c066 in _int_malloc (av=av@entry=0x7ffff7230b20 <main_arena>, bytes=bytes@entry=18)
    at malloc.c:3828
#5  0x00007ffff6f0dfda in __libc_calloc (n=<optimized out>, elem_size=<optimized out>) at malloc.c:3237
#6  0x00000000004012ce in AddVertex (name=0x7fffffffde40 "Kempton,_Illinois") at line.cpp:100
#7  0x00000000004015c0 in ReadData () at line.cpp:154
#8  0x00000000004027fe in TrainLINE () at line.cpp:389
#9  0x0000000000402da9 in main (argc=19, argv=0x7fffffffe008) at line.cpp:463

I have run the program with these arguments:

-train graph.tsv -output emb.txt -binary 0 -size 300 -order 1 -negative 5 -samples 100 -rho 0.025 -threads 8

The graph file is 1.4GB.

The program fails at different points in execution for each machine while "Reading edges". However, it stops at exactly the same point when executing it in different nodes of the same cluster at different times, so I don't think that not having enough memory is the issue. Besides, no RAM spikes where observed during execution.

I fixed the issue.

The SIGABRT was caused by an incorrect management of the vertex name strings.

First of all, only MAX_STRING characters are allocated for the names.

char name_v1[MAX_STRING], name_v2[MAX_STRING], str[2 * MAX_STRING + 10000];

And you are using fscanf assuming that the names won't ever be longer than MAX_STRING.

fscanf(fin, "%s %s %lf", name_v1, name_v2, &weight);

So I don't know why there's a length check in AddVertex.

int AddVertex(char *name)
{
    int length = strlen(name) + 1;
    if (length > MAX_STRING) length = MAX_STRING;
    vertex[num_vertices].name = (char *)calloc(length, sizeof(char));
    strcpy(vertex[num_vertices].name, name);
        ....
}

Also, that length check doesn't help in any way, because the strcpy call will copy the whole string into vertex[num_vertices].name even if MAX_STRING bytes where allocated.

So, in short, there's a buffer overflow just because of using vertex names longer than 100 chars. I fixed my problem by setting a bigger MAX_STRING.

I understand the advantages for assuming that no vertex name will be larger than a fixed length, but please, specify it in the README, or at least use a more significative macro name than MAX_STRING.

@oersted Thanks for sharing this! I meet the same problem. Beside this, the implementation may have another problem. Section 4.2 of the paper pointed that $d_v$ is the output degree, but in ReadData method of the implementation it add weight to both vertex[source].degree and vertex[target].degree.