LinkageIO/Camoco

Segmentation fault when building large networks

Closed this issue · 9 comments

I can build networks from smaller gene lists (~25,000 genes) with no problem, but when I try a larger set (~47,000 genes), the build dies with a segmentation fault right after the log line "Storing the coex table".

Thanks

FYI, I also tried installing Camoco on a personal machine, and I still get the segmentation fault.

Can you post the log output? Does it say which line it's occurring on?

In [1]: Gmax_a2_V1 = co.RefGen('Gmax_a2_V1')

In [2]: co.COB.from_table('Stacey_Soy_Expression_V2.csv',
   ...: 'SoyGeneral',
   ...: 'General Soy Network Stacey Data',
   ...: Gmax_a2_V1,
   ...: rawtype='RNASEQ',
   ...: max_gene_missing_data=0.3,
   ...: max_accession_missing_data=0.08,
   ...: min_single_sample_expr=1,
   ...: min_expr=0.001,
   ...: quantile=False,
   ...: max_val=300,
   ...: sep=','
   ...: )
[LOG] Mon Oct  3 14:12:49 2016 - Loading Expr table
[LOG] Mon Oct  3 14:12:49 2016 - Building Expr Index
[LOG] Mon Oct  3 14:12:49 2016 - Loading RefGen
[LOG] Mon Oct  3 14:12:49 2016 - RefGen for SoyGeneral not set!
[LOG] Mon Oct  3 14:12:49 2016 - Loading coex table
[LOG] Mon Oct  3 14:12:49 2016 - SoyGeneral is empty (IO error: Failed to open file: /media/jmichno/MeeshExternal/camoco_feather/databases/Expr.SoyGeneral.coex.ft)
[LOG] Mon Oct  3 14:12:49 2016 - Loading Global Degree
[LOG] Mon Oct  3 14:12:49 2016 - SoyGeneral is empty (IO error: Failed to open file: /media/jmichno/MeeshExternal/camoco_feather/databases/Expr.SoyGeneral.degree.ft)
[LOG] Mon Oct  3 14:12:49 2016 - Loading Clusters
[LOG] Mon Oct  3 14:12:49 2016 - Clusters not loaded for: SoyGeneral ()
[LOG] Mon Oct  3 14:12:50 2016 - Resetting raw expression data
[LOG] Mon Oct  3 14:12:50 2016 - Resetting expression data
[LOG] Mon Oct  3 14:12:50 2016 - Extracting raw expression values
[LOG] Mon Oct  3 14:12:51 2016 - Importing Raw Expression Values
[LOG] Mon Oct  3 14:12:51 2016 - Trans. Log: raw->RawRNASEQ
[LOG] Mon Oct  3 14:12:51 2016 - Resetting expression data
[LOG] Mon Oct  3 14:12:51 2016 - Extracting raw expression values
[LOG] Mon Oct  3 14:12:52 2016 - Performing Quality Control on genes
[LOG] Mon Oct  3 14:12:52 2016 - ------------Quality Control
[LOG] Mon Oct  3 14:12:53 2016 - Raw Starting set: 56044 genes 69 accessions
[LOG] Mon Oct  3 14:12:59 2016 - Found out 0 genes not in Reference Genome: GlycineMax - Wm82 Assembly 2 Version 1 RefGen - Gmax_a2_V1
[LOG] Mon Oct  3 14:12:59 2016 - Filtering expression values lower than 0.001
[LOG] Mon Oct  3 14:13:10 2016 - Found 17683 genes with > 0.3 missing data
[LOG] Mon Oct  3 14:13:17 2016 - Found 9124 genes which do not have one sample above 1
[LOG] Mon Oct  3 14:13:24 2016 - Found 8 accessions with > 0.08 missing data
[LOG] Mon Oct  3 14:13:24 2016 - Genes passing QC:
has_id                 56044
pass_membership        56044
pass_missing_data      38361
pass_min_expression    46920
PASS_ALL               37579
dtype: int64
[LOG] Mon Oct  3 14:13:24 2016 - Accessions passing QC:
has_id               69
pass_missing_data    61
PASS_ALL             61
dtype: int64
[LOG] Mon Oct  3 14:13:41 2016 - Genes passing QC by chromosome:
               has_id  pass_membership  pass_missing_data  pass_min_expression  PASS_ALL
chrom                                                                                   
Chr01            2457             2457               1642                 2010      1611
Chr02            3123             3123               2161                 2677      2132
Chr03            2649             2649               1769                 2186      1727
Chr04            2574             2574               1800                 2142      1764
Chr05            2491             2491               1768                 2134      1741
Chr06            3258             3258               2190                 2740      2156
Chr07            2743             2743               1928                 2311      1895
Chr08            3679             3679               2663                 3220      2618
Chr09            2865             2865               1812                 2328      1772
Chr10            2989             2989               2072                 2538      2024
Chr11            2570             2570               1885                 2256      1846
Chr12            2425             2425               1606                 2019      1578
Chr13            3730             3730               2774                 3352      2740
Chr14            2245             2245               1508                 1809      1463
Chr15            2774             2774               1834                 2246      1792
Chr16            2223             2223               1462                 1783      1406
Chr17            2627             2627               1904                 2250      1870
Chr18            3023             3023               1884                 2360      1831
Chr19            2642             2642               1788                 2188      1749
Chr20            2502             2502               1674                 2059      1633
scaffold_1038       1                1                  0                    1         0
scaffold_105        6                6                  0                    2         0
scaffold_1057       2                2                  2                    2         2
scaffold_1065       1                1                  0                    0         0
scaffold_1078       1                1                  0                    1         0
scaffold_110        2                2                  0                    1         0
scaffold_111        3                3                  0                    0         0
scaffold_1118       1                1                  1                    1         1
scaffold_112        1                1                  0                    0         0
scaffold_1160       1                1                  1                    1         1
...               ...              ...                ...                  ...       ...
scaffold_587        2                2                  1                    1         1
scaffold_608        2                2                  0                    1         0
scaffold_614        1                1                  1                    1         1
scaffold_623        1                1                  0                    0         0
scaffold_633        2                2                  1                    2         1
scaffold_636        3                3                  1                    3         1
scaffold_65         1                1                  1                    1         1
scaffold_660        2                2                  2                    2         2
scaffold_675        1                1                  1                    1         1
scaffold_681        1                1                  1                    1         1
scaffold_691        2                2                  0                    2         0
scaffold_711        1                1                  1                    1         1
scaffold_713        1                1                  1                    1         1
scaffold_72         1                1                  0                    0         0
scaffold_73         2                2                  1                    1         1
scaffold_74         2                2                  0                    0         0
scaffold_744        1                1                  0                    0         0
scaffold_75         1                1                  0                    0         0
scaffold_76         2                2                  0                    1         0
scaffold_78         2                2                  0                    0         0
scaffold_821        1                1                  1                    1         1
scaffold_843        1                1                  1                    1         1
scaffold_846        1                1                  1                    1         1
scaffold_852        1                1                  0                    1         0
scaffold_88         1                1                  0                    0         0
scaffold_896        1                1                  0                    1         0
scaffold_91         2                2                  0                    2         0
scaffold_93         7                7                  1                    1         1
scaffold_97         5                5                  0                    0         0
scaffold_99         3                3                  2                    3         2

[147 rows x 5 columns]
[LOG] Mon Oct  3 14:13:41 2016 - Kept: 37579 genes 61 accessions
[LOG] Mon Oct  3 14:13:42 2016 - Trans. Log: raw->quality_control
[LOG] Mon Oct  3 14:13:42 2016 - Performing Raw Expression Normalization
[LOG] Mon Oct  3 14:13:42 2016 - ------------ Normalizing
[LOG] Mon Oct  3 14:13:43 2016 - Trans. Log: raw->quality_control->arcsinh
[LOG] Mon Oct  3 14:13:43 2016 - Filtering refgen: Gmax_a2_V1
[LOG] Mon Oct  3 14:13:44 2016 - Building Indices
[LOG] Mon Oct  3 14:14:10 2016 - Adding 37579 Genes info to database
[LOG] Mon Oct  3 14:14:12 2016 - Adding Gene attr info to database
[LOG] Mon Oct  3 14:14:17 2016 - Building Indices
[LOG] Mon Oct  3 14:14:17 2016 - Calculating Coexpression
[LOG] Mon Oct  3 14:21:42 2016 - Applying Fisher Transform
[LOG] Mon Oct  3 14:22:00 2016 - Calculating Mean and STD
[LOG] Mon Oct  3 14:22:07 2016 - Finding adjusted scores
[LOG] Mon Oct  3 14:22:09 2016 - Build the dataframe
[LOG] Mon Oct  3 14:22:10 2016 - Calculating Gene Distance
Calculating for 37579 genes
[LOG] Mon Oct  3 14:23:06 2016 - Thresholding Significant Network Interactions
[LOG] Mon Oct  3 14:23:07 2016 - Storing the coex table
Segmentation fault

Can you try something quick? Go into ~/.camoco/databases and delete all
.ft files related to the object you're building. I found a bug that kills
feather when there are lingering files around. Maybe it's affecting you
too...

Otherwise, if you put a --debug flag right after Camoco in your command, it
should drop you into a debugger when it dies.
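
For anyone following along, the cleanup step above could be sketched roughly as follows. This is only an illustration, not part of Camoco: the helper names are hypothetical, and the matching rule assumes the file naming seen in the log (for example Expr.SoyGeneral.coex.ft), where the dataset name is one dot-separated part of the file name. It defaults to a dry run so you can review the list before anything is deleted.

```python
from pathlib import Path


def find_lingering_feather_files(db_dir, dataset_name):
    """Return the sorted .ft files in db_dir that belong to dataset_name.

    Hypothetical helper: assumes names like Expr.<dataset>.coex.ft, so the
    dataset name must appear as one dot-separated part of the file name.
    """
    db_dir = Path(db_dir).expanduser()
    return sorted(
        path for path in db_dir.glob("*.ft")
        if dataset_name in path.name.split(".")
    )


def remove_lingering_feather_files(db_dir, dataset_name, dry_run=True):
    """List the matching .ft files, deleting them only when dry_run is False."""
    targets = find_lingering_feather_files(db_dir, dataset_name)
    for path in targets:
        if not dry_run:
            path.unlink()
    return targets


if __name__ == "__main__":
    # Directory and dataset name taken from the thread; adjust to your setup.
    for path in remove_lingering_feather_files("~/.camoco/databases", "SoyGeneral"):
        print("would remove:", path)
```

Passing dry_run=False performs the actual deletion. If your databases live somewhere else (the log above points at a directory on an external drive), pass that directory instead.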

I removed all of the files and it still fails. I also noticed that the camoco rm CLI command (and co.del_dataset()) no longer works; I have to delete the files manually from the shell, which was not necessary before the .ft implementation.

When I run the commands under pdb, I still only get a segmentation fault; there is no trace output.

Thanks
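
A segmentation fault kills the interpreter before pdb gets a chance to react, which would explain the missing trace. One option that may help here is the standard library's faulthandler module, which dumps the Python-level traceback when the process receives a fatal signal. The sketch below is a suggestion rather than something tried in this thread; the wrapper name is hypothetical, and the dump shows only the Python stack, not the exact location inside the native library.

```python
import faulthandler
import sys


def enable_crash_traceback(log_file=None):
    """Dump Python tracebacks for all threads on fatal signals such as SIGSEGV.

    log_file must be an open file with a real file descriptor and has to stay
    open for as long as the handler is enabled; defaults to sys.stderr.
    Returns True once the handler is active.
    """
    target = log_file if log_file is not None else sys.stderr
    faulthandler.enable(file=target, all_threads=True)
    return faulthandler.is_enabled()


if __name__ == "__main__":
    enable_crash_traceback()
    # ... then run the failing call, e.g. co.COB.from_table(...) as in the log.
```

The same handler can be switched on without editing any code by starting the interpreter with python -X faulthandler.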

Yeah, the rm command was fixed upstream. Hmmmm, Ok, can you email me the
fpkm table it's dying on? And I guess the refgen for soy?

The problem was introduced between feather-format 0.2.0 and 0.3.0. Pinning 0.2.0 as the specific requirement fixes it for now. When I have more time, I will try to track down the real problem and report it to the feather team.
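
To check which feather-format version an environment actually has, something like the following could be used. It is a minimal sketch using only the standard library (importlib.metadata, Python 3.8+); the function name and the status strings are made up for illustration and are not part of Camoco or its install script.

```python
from importlib import metadata

# Version reported in this thread as working.
PINNED_VERSION = "0.2.0"


def feather_pin_status(pinned=PINNED_VERSION, dist_name="feather-format"):
    """Compare the installed distribution version against the pinned one.

    Returns (status, installed_version), where status is "ok", "mismatch",
    or "missing" (in which case installed_version is None).
    """
    try:
        installed = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return "missing", None
    status = "ok" if installed == pinned else "mismatch"
    return status, installed


if __name__ == "__main__":
    status, installed = feather_pin_status()
    print(f"feather-format: {status} (installed: {installed}, wanted: {PINNED_VERSION})")
```

In a pip requirements file, the equivalent pin would be the line feather-format==0.2.0.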

There is also a really weird issue where pyplot aborts when loaded in ipython but works in the normal interpreter, so I have no idea what is going on. I will put that on my list to look at one day as well.

I can confirm that using feather-format 0.2.0 with the install script has fixed my issue. I have not had any problems using ipython with either build, though.