anlek/mongify

Problems with spatial datatypes

Closed this issue · 9 comments

I have a MySQL db with spatial datatypes (POINT, POLYGON, LINESTRING) which I need to migrate to GeoJSON format in the destination Mongo db.

When I set column "centre_coord", :integer it returns a 0, and when I set to String it throws a BSON error (probably because MySQL stores it in an 'internal format'):
/home/vagrant/.rbenv/versions/2.2.2/lib/ruby/gems/2.2.0/gems/bson-1.10.2/lib/bson/bson_c.rb:20:in serialize': String not valid UTF-8 (BSON::InvalidStringEncoding)Normally I would use an SQL QueryST_AsText()orAsText()` but since we can't modify the SQL Queries I'm a bit stuck.

Any help is greatly appreciated.

anlek commented

Hello @joshkopecek, you can change the format in a before_save. Read more about it here: http://www.rubydoc.info/gems/mongify/Mongify/Database/Table

Fantastic, I knew there was something missing @anlek. Thanks for your help.
Now, I'm guessing that the data's been returned as binary since AsText() hasn't been called. However, I can't even seem to print the data to the console:

table "zones" do
    before_save do |row|
    puts row.centre_coord # a point coordinate
  end
column "id", :key, :as => :integer
…

Also from what I understand the MySQL data has to be queried using AsText() otherwise it's unusable.

Any thoughts?

anlek commented

Hmm, sorry, that's a bit outside of my knowledge.
Currently, there is no way to force SQL to act differently. However, if you're using MySQL version 5.7 or higher, you can use Views. Build a view with the binary data as text and see if Mongify sees it. (I've never tested this before). Please let me know how it works out for you.

Great suggestions. I've never used views before but I'll investigate and get back to you.

It worked! You're a genius. I can confirm it actually works with MySQL 5.5 as well. Thanks for all the support.

anlek commented

Awesome, thanks for sharing!

For anyone that needs this in the future, I created a view in MySQL:

create view new_table as 
id,name as title,AsText(geometry) as `geometry`
from old_table

then just use a standard MySQL query

select * from new_table

and in Mongify (if you're working with GeoJSON)

table "new_table" do
    before_save do |row|
        coords = /^\s*point\s*\(\s*(\S+)[\s,]+(\S+)\s*\)\s*$/i.match(row.geometry)
        if coords
            row.loc = {}
            row.loc['type'] = 'Point'
            row.loc['coordinates'] = coords[1,2]
                end
                row.delete('geometry')
    end
    column "id", :key, :as => :integer
    column "title", :string
    column "geometry", :string
end

The magic regex just matches the numbers in the returned text from MySQL Point(12.1234,34.3456). I have more regex for LineStrings and Polygons if you need them.

anlek commented

That's awesome!
I do want to point out you're missing an end for the if statement.
Also, (if I may suggest a refactor), you can pull row.delete('geometry') out of both parts of the if statement and put it after it :)

Thanks again for sharing!!!

Thanks @anlek I corrected the previous post.

Here's some regex for polygons:
/^\s*polygon\s*\(\(\s*(.+\S)\s*\)\)\s*$/i
and LineStrings:
/^\s*linestring\s*\(\s*(.+\S)\s*\)\s*$/i
of course you'll end up with just raw text so you'll need to do:

linestring = /^\s*linestring\s*\(\s*(.+\S)\s*\)\s*$/i.match(row.linestring)
coord_pairs = linestring[1].split(',')
row.linestring = {}
row.linestring['type'] = 'LineString'
row.linestring['coordinates'] = []
for coord_pair in coord_pairs
    row.linestring['coordinates'].push(coord_pair.split(' '))
end

or

polygon = /^\s*polygon\s*\(\(\s*(.+\S)\s*\)\)\s*$/i.match(row.polygon)
coord_pairs = polygon[1].split(',')
row.polygon = {}
row.polygon['type'] = 'Polygon'
row.polygon['coordinates'] = []
row.polygon['coordinates'][0] = []
for coord_pair in coord_pairs
    row.polygon['coordinates'][0].push(coord_pair.split(' '))
    puts row.polygon.inspect
end
else
    row.delete('polygon')
end