walaj/SeqLib

Could you settle with official bwa and fermi-lite libraries

tillea opened this issue · 18 comments

Hi,
I intend to package freebayes for Debian and the latest version is using SeqLib. So I tried to package SeqLib as well and was stumbling about the problem that BWA and FML have conflicting bseq1_t
declaration.
It turned out that you using personal forks of BWA and FML, not the upstream ones. When looking for bseq1_t in the SeqLib clones of BWA and FML, no definition appears. However, they do appear in the respective upstream repositories. Looks like you did patch the BWA and FML code bases to be able to mix them without a multiple-definition error.
Do you see any chance to solve this to make BWA and FML co-exist nicely?
Kind regards, Andreas.

walaj commented

Hi Andreas,
It would be great to have this packaged for Debian. If I understand you correctly, you are looking to link to the existing FML and BWA libraries, rather than the SeqLib-specific ones from my personal forks. If you have a suggestion for how I could resolve the bseq1_t multiple definition error during linking, without redefining them in either BWA or FML, I'd be happy to do that. I wasn't able to come up with a good solution, and just as you saw, it led me to make my own forks of the repositories so I could fix multiple definition issues.

Another solution (better) would be to have freebayes use a SeqLib-lite version without FML/BWA, since it doesn't actually use the BWA or FML libraries but rather the htslib wrapper functionality. I'd have to work with the freebayes people on this.

Another solution (better) would be to have freebayes use a SeqLib-lite version without FML/BWA

This is bad. We don't need more vendored modified versions of libraries. The solution should come from upstream FML / BWA which are incompatible at the moment.

it led me to make my own forks of the repositories so I could fix multiple definition issues.

Did you consider reporting this issue upstream at some point? If not, you should have. If yes, please point us to the relevant thread so we don't repeat ourselves.

walaj commented

@ghisvail Fair enough re: modified SeqLib. I was thinking that for FreeBayes it doesn't really need to link to BWA/FML, but separating this out is not necessary if we just resolve the issue for the single fully-complete SeqLib.

I have not reported the bseq1_t issue to either bwa or fml. Probably should have, but didn't. I can submit the issue report, but will probably need to make a PR that implements the name-switching.

Hi,
as you might have noticed in issue #3 of fermi-lite I proposed a patch that solved the issue with bseq1_t.
When I build the Debian package of fermi-lite with this patch I'm running in a different error which smells similar but not caused by fermi-lite as far as I can see:

BFC.cpp:351:5: error: ‘ch’ was not declared in this scope
     ch = fml_count(n_seqs, m_seqs, bfc_opt.k, bfc_opt.q, bfc_opt.l_pre, bfc_opt.n_threads);
     ^~
BFC.cpp:351:28: error: ‘m_seqs’ was not declared in this scope
     ch = fml_count(n_seqs, m_seqs, bfc_opt.k, bfc_opt.q, bfc_opt.l_pre, bfc_opt.n_threads);
                            ^~~~~~
BFC.cpp:351:90: error: ‘fml_count’ was not declared in this scope
     ch = fml_count(n_seqs, m_seqs, bfc_opt.k, bfc_opt.q, bfc_opt.l_pre, bfc_opt.n_threads);
                                                                                          ^
BFC.cpp: In member function ‘void SeqLib::BFC::correct_reads()’:
BFC.cpp:369:5: error: ‘es’ was not declared in this scope
     es.ch = ch;
     ^~
BFC.cpp:369:13: error: ‘ch’ was not declared in this scope
     es.ch = ch;
             ^~
BFC.cpp:370:15: error: ‘bfc_opt’ was not declared in this scope
     es.opt = &bfc_opt;
               ^~~~~~~
BFC.cpp:372:15: error: ‘m_seqs’ was not declared in this scope
     es.seqs = m_seqs;
               ^~~~~~
BFC.cpp:377:50: error: ‘bfc_ch_hist’ was not declared in this scope
     int mode = bfc_ch_hist(es.ch, hist, hist_high);
                                                  ^
BFC.cpp:394:29: error: ‘BFC_EC_MIN_COV_COEF’ was not declared in this scope
     bfc_opt.min_cov = (int)(BFC_EC_MIN_COV_COEF * kcov + .499);
                             ^~~~~~~~~~~~~~~~~~~
BFC.cpp:403:31: error: ‘kmer_correct’ was not declared in this scope
     kmer_correct(&es, mode, ch);
                               ^
BFC.cpp: In member function ‘void SeqLib::BFC::FilterUnique()’:
BFC.cpp:416:6: error: ‘m_seqs’ was not declared in this scope
  if (m_seqs[i].seq)
      ^~~~~~

Any hint how to work around this?
Kind regards, Andreas.

walaj commented

Hi Andreas,
It looks like you renamed it fml_seq1_t in your patch, whereas I renamed it fseq1_t in my fork. BFC.h is expecting m_seqs to be an array of fseq1_t. Changing fml_seq1_t to fseq1_t in your patch would probably allow SeqLib to compile without having to change SeqLib.

Hi again,
with the patches for a potential Debian package at
https://anonscm.debian.org/cgit/debian-med/fermi-lite.git/tree/debian/patches
I can build SeqLib. I also removed jsoncpp and ssw code copies. I wonder what header files should be installed into a potential Debian package.
It would be great if you would review these patches that would work with the suggested fermi-lite patches.
Kind regards, Andreas.

walaj commented

Hi Andreas,
I am seeing the fermi-lite repos (https://anonscm.debian.org/git/debian-med/fermi-lite.git), but when I clone I am still seeing the old bseq1_t structure. I know I'm missing something here. Could you guide a bit as to what I should be looking for? From what I understand, I should be trying to see if I can point SeqLib to the patched fermi-lite repos you put together? Are there further modifications to SeqLib that I can review? Any more instructions or a SeqLib PR would be helpful for my understanding.
Thank you for your work on this.
Best,
Jeremiah

walaj commented

Hi Andreas,
I just wanted to let you know that I haven't forgotten about this, and I appreciate your help and patience. I have some big project deadlines that have taken all my free cycles until then, but will look more into this soon after those are cleared out.
Best,
Jeremiah

ekg commented
walaj commented

Thanks so much Eric and Andreas,
What would be most helpful would be to have the changes formatted as a Git pull request. I've been meaning to learn the Debian system, but haven't had a block of time yet to do so (PhD thesis/defense due imminently...). I imagine it's not difficult at all, I'm just being unusually stingy with time this particular month.

Hi Jeremiah,
I admit creating my private clone of SeqLib just to enable you applying a patch does not seem sensible to me. I've done changes in my local clone and exported it via git format-patch. You can easily
unzip this zipfile and do
git am 0001-Avoid-name-space-conflict-with-official-bwa-and-ferm.patch
which imports the patch. Please let me know if you prefer a pull request anyway and I'll try to do my best to make your work as easy as possible.
Hope this helps, Andreas.

walaj commented

Ok getting closer here. I was able to apply your patch to SeqLib (per latest instructions) and see the changes (e.g. fseq1_t --> fml_seq1_t). What I need now is to compile SeqLib, but using the new fermi-lite repos with your patch. That's where I'm not sure where to find. I cloned the repository (git clone git://anonscm.debian.org/debian-med/fermi-lite.git), but wasn't able to figure out how to apply the patch. I have quilt now, but could use some guidance on as to where to obtain the fermi-lite patch file. I assume I then download the patch file to the cloned fermi-lite directory, and then just run quilt push -a? Am I understanding this correctly?

Any news here? Do you need further help?