phasegenomics/FALCON-Phase

Compatibility with FALCON headers

pb-cdunn opened this issue · 8 comments

I was just about to pull Zev's fixes since we cut our release, when I noticed 06f5114:

commit 06f5114ecd107e10b3bdb4059b7f317910eb0692
Author: zeeev <jewbaru@gmail.com>
Date:   Wed Sep 12 14:19:25 2018

    Revert "added compatibility with other FALCON Unzip headers"

    This reverts commit aebec592ae0e24bd063409abf3dc01ea82fdbe94.

diff --git bin/scrub_names.pl bin/scrub_names.pl
index 8dd062a..3782dfc 100755
--- bin/scrub_names.pl
+++ bin/scrub_names.pl
@@ -55,8 +55,8 @@ PRI: while (<$IN>) {
     }
     else{
        chomp;
-       $_ =~ /^>(.*F(?:p\d+)?)/;
-       my $p_name = "$1";
+       $_ =~ /^>(.*)F/;
+       my $p_name = "$1F";
        print $OUT ">$p_name\n";
        $primaries{$p_name} = 1;
     }
@@ -92,7 +92,7 @@ HAP: while (<$INB>) {

 foreach my $k (@haplotigs){
     my @sname = split /_/, $k;
-#    $sname[0] =~ s/p\d+//;
+    $sname[0] =~ s/p\d+//;

@zeeev, why did you revert?

@skingan, were you aware of that revert? Is it ok?
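
For reference, here is a minimal Python sketch of how the two primary-contig patterns in the diff above differ. The regexes are copied from `scrub_names.pl`; the function names and the sample headers (`>000000F`, `>000000Fp01`) are hypothetical, made up only to exercise the patterns. This is a port for illustration, not the script's actual code.

```python
import re

# Pattern removed by the revert: keeps an optional "p<digits>" suffix after "F".
WITH_SUFFIX = re.compile(r"^>(.*F(?:p\d+)?)")
# Pattern restored by the revert: captures everything before the last "F";
# the Perl code then re-appends "F" itself.
PLAIN = re.compile(r"^>(.*)F")


def primary_name_with_suffix(header):
    """Primary name as computed before the revert, or None if no match."""
    m = WITH_SUFFIX.match(header)
    return m.group(1) if m else None


def primary_name_plain(header):
    """Primary name as computed after the revert, or None if no match."""
    m = PLAIN.match(header)
    return m.group(1) + "F" if m else None


def strip_p_suffix(name):
    """Mirror of the Perl `s/p\\d+//`: drop the first "p<digits>" run."""
    return re.sub(r"p\d+", "", name, count=1)


if __name__ == "__main__":
    for header in (">000000F", ">000000Fp01"):  # hypothetical headers
        print(header, primary_name_with_suffix(header), primary_name_plain(header))
```

On these inputs the two versions agree for `>000000F` and differ only for the suffixed header: the reverted pattern keeps `000000Fp01`, while the restored one yields `000000F`.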

Oh? The reversion is not in our "release", so it's not in Bioconda. I thought you added it and reverted something else. Here is the git graph:

*   37fa90d Merge pull request #45 from phasegenomics/development
|\
| * 06f5114 (up/development) Revert "added compatibility with other FALCON Unzip headers"
| | * b25e4c6 (HEAD -> develop, tag: 0.2.0, origin/develop) bioconda expect /usr/bin/env shebangs
| | * 3b026a2 fc_ prefixes
| |/
| * aebec59 added compatibility with other FALCON Unzip headers
| *   513241e Merge pull request #46 from PacificBiosciences/develop
| |\
| | * 5eba263 Re-write emit_haplotigs.pl in Python
| |/
| *   79397e4 (origin/development, development) Merge pull request #44 from phasegenomics/trio_snp_validation
| |\
| | * 2c92cce (up/trio_snp_validation) summary report done
| | * d9d6411 classification script done
| | * d951268 about to build out the validation pipeline
| * |   bf2c5ab Merge pull request #41 from PacificBiosciences/development
| |\ \
| | * | af2b1fc Fix scrub_names.pl
...
% git remote -v
origin  git@github.com:PacificBiosciences/pb-falcon-phase.git (fetch)
up      git@github.com:phasegenomics/FALCON-phase (fetch)
  • up is the phasegenomics repo.
  • origin is our PacBio repo.
  • Note that internally, we merge on develop, per our VP.

I think Sarah's commit fixed my own attempt:

commit af2b1fcdb0571cadbe16d753cd26b7552ee7c6b8
Author: Christopher Dunn <cdunn@pacificbiosciences.com>
Date:   Wed Aug 29 22:09:55 2018

    Fix scrub_names.pl

diff --git bin/scrub_names.pl bin/scrub_names.pl
index b2aa7bb..3782dfc 100755
--- bin/scrub_names.pl
+++ bin/scrub_names.pl
@@ -81,7 +81,7 @@ HAP: while (<$INB>) {
     else{
         chomp;
         # Allow new Unzip naming (which we have reverted).
-        if (not ($_ =~ /^>(.*F(?:p\d+)_[0-9]+)/)) {
+        if (not ($_ =~ /^>(.*F(?:p\d+)?_[0-9]+)/)) {

So by reverting Sarah's commit, we would be back to mine. Is that really correct?

Let's be 100% sure.
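
One way to check is to run both haplotig patterns from af2b1fc against sample headers. The sketch below is a Python port of the two regexes; the helper name and the headers (`>000000F_001`, `>000000Fp01_001`) are hypothetical and were chosen only because they fit the patterns. Note also that both diffs quoted above end at blob `3782dfc`, which suggests the revert leaves `scrub_names.pl` byte-identical to its state after af2b1fc.

```python
import re

# Before af2b1fc: the "p<digits>" suffix is required.
REQUIRED_P = re.compile(r"^>(.*F(?:p\d+)_[0-9]+)")
# After af2b1fc: the "p<digits>" suffix is optional.
OPTIONAL_P = re.compile(r"^>(.*F(?:p\d+)?_[0-9]+)")


def accepts(pattern, header):
    """Return True if the header line matches the given haplotig pattern."""
    return pattern.match(header) is not None


if __name__ == "__main__":
    # Hypothetical headers: one without and one with the "p<digits>" suffix.
    for header in (">000000F_001", ">000000Fp01_001"):
        print(header, accepts(REQUIRED_P, header), accepts(OPTIONAL_P, header))
```

Under these assumptions, the optional-suffix pattern accepts both naming styles, while the required-suffix pattern rejects the header without `p<digits>`.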

zeeev commented

Hi @pb-cdunn,

Sorry for the trouble! Going forward I promise to work on the PB repo. I'm trying to get one last merge into master before we sync the PB codebase.

I think the code is starting to stabilize and should require less attention.

#60

No! Don't apologize! Progress is good.

I'm about to sync these changes. I'm just trying to be sure that we actually want Sarah's "revert". I'm not certain.

Thanks. We are now up-to-date on the develop branch of pb-falcon-phase.

zeeev commented

Cheers! Thanks @pb-cdunn.