pjotrp/bioruby-table

Error: invalid byte sequence in UTF-8

Opened this issue · 2 comments

I am reprocessing many files and some get the error in the subject line of this issue.

They have a degree symbol - 0xB0

example: 7.77° Minimum Slope

I am guessing that this may be where my issue is.

 /home/file/cmm/ST-GAGEMAX-OP10/50-6686 -16- Mach Op 10 Firstoff FO 654 2017-4-3 2_44_05 am.txt  ...Processing file...
bio-table 1.0.0 Copyright (C) 2012-2014 Pjotr Prins <pjotr.prins@thebird.nl>

 INFO bio-table: Array: [{:show_help=>false, :write_header=>true, :skip=>0, :pad_fields=>true, :columns=>["0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "19", "11", "12", "13", "14", "15", "16", "17", "18", "19", "20", "21", "22", "23", "19", "19", "19", "19", "19", "19", "19", "19", "19", "19", "19", "19", "19", "19", "19", "19", "19", "19", "19"]}]
 INFO bio-table: Array: ["planid", "partnb", "id", "type", "idsymbol", "actual", "nominal", "uppertol", "lowertol", "deviation", "exceed", "featureid", "featuresigma", "comment", "link", "linkmode", "mmc", "useruppertol", "userlowertol", "fftphi", "fftphiunit", "zoneroundnessangle", "groupname", "groupname2", ""]
Error: invalid byte sequence in UTF-8

Can someone give me thoughts on how to resolve this?

Many Thanks.

It is an encoding thing. I can check. What Ruby are you using?

I finally found a way around it.
Since I didn't need the symbol, I struggled and finally found a way to replace it with sed.
I struggled to find a way to grep the files to see which ones needed to have the symbol replaced.

I am not sure what you are planning, but it would be cool if bio-table could quietly handle the wonky file.

albe@pmdsdata3:/home/file/cmm$ ruby -v
ruby 2.3.0p0 (2015-12-25 revision 53290) [x86_64-linux]
  #  some files have degree symbol in them, 0xB0, , replace them before further processing because ruby bio-table doesn't like them. 
  #           Error: invalid byte sequence in UTF-8
  # if the file contains '\xB0' - degree symbol, then replace it with ~Deg~
  # http://unix.stackexchange.com/questions/6516/filtering-invalid-utf8
  if  grep --quiet -ax -v '.*'  $f  ; then
    sed -i 's/\xB0/~Deg~/g' $f
  fi