Illumina/interop

HiSeq - When parsing error metrics, segmentation fault due to illegal access on phi-x adapter array.

yechenqiao opened this issue · 5 comments

Hi,

Please reference the run folder 150928_SN276_0594_AHCTWGBCXX in basespace cloud for reference if needed.

While parsing the error metric of this folder, along with many other runs, we have seen Segmentation fault and program crashes.

When debugging, I've noticed it has to do with this piece of code where it tries to access metric.phix_adapter_rates()[0] when there is a length 0 array in metric.phix_adapter_rates().

The patch would be, to combine this and the for loop immediately below, to skip the printing of these metrics if there is no phi-x adapter rate vector data.

image

Could you review and check the fix is as simple as:

        /** Write a error metric to the output stream
         *
         * @param out output stream
         * @param metric error metric
         * @param sep column separator
         * @param eol row separator
         * @return number of columns written
         */
        static size_t write_metric(std::ostream& out,
                                   const error_metric& metric,
                                   const header_type& header,
                                   const char sep,
                                   const char eol,
                                   const char)
        {
            out << metric.lane() << sep << metric.tile() << sep << metric.cycle() << sep;
            out << metric.error_rate();

            const std::vector<float>& phi_x_adapter_rates = metric.phix_adapter_rates();
            if (!phi_x_adapter_rates.empty()){
                for(size_t i=0;i<static_cast<size_t>(header.number_adapters());++i)
                    out << sep << phi_x_adapter_rates[i];
            }
            out << eol;
            return 0;
        }
    };

When you index into an std::vector using operator[] and the index is out of bounds, then a segmentation fault is expected.

It is better to use the at, that will raise an exception if you are out of bounds.

That code seems fine, the header only gives you the maximum number of adapters, not the value per metric. A metric may not have the adapter populated if there is not PhiX spiked in.

You could check phi_x_adapter_rates.size() and loop over that, this case will always be safe

Thanks. Just wondering because that code is not written by me. It's in the original source code in this project. Would it be good to have this officially patched in the code? :)

I see I misunderstood the issue.