Add cstrip() and cwrap functions
Closed this issue · 11 comments
It would be great if CoordRefSystems could provide functions cstrip()
and cwrap()
for converting between coordinate types (e.g. x::LatLon
) and coordinate vectors (e.g. SVector(x.lat, xlon)
). Such functions are very useful for interfacing with non-CoordRefSystems-aware software (e.g. plotting libraries) and implementing coordinate arithmetic. For example:
function rhumb_line(x::CRS, y::CRS)
C = Mercator
return t -> cwrap(C, t * cstrip(C, x) + (1-t) * cstrip(C, y))
end
function mean(coords::AbstractVector{<:CRS})
C = Cartesian{WGS84latest}
m = cwrap(C, mean(cstrip(C, x) for x in coords))
# Don't trust means that are too far underground
# (i.e. the `coords` are not local enough)
@assert abs(convert(LatLonAlt, m).alt) < 10
return m
end
translated_latlon(x::CRS, d::SVector{3}) = cstrip(LatLon, cwrap(ENU(x), d))
I believe that our current design with Base.getproperty
is enough for the use cases you have in mind:
julia> using CoordRefSystems
julia> ll = LatLon(0, 0)
GeodeticLatLon{WGS84Latest} coordinates
├─ lat: 0.0°
└─ lon: 0.0°
julia> ll.lat
0.0°
julia> ll.lon
0.0°
julia> m = Mercator(0, 0)
Mercator{WGS84Latest} coordinates
├─ x: 0.0 m
└─ y: 0.0 m
julia> m.x
0.0 m
julia> m.y
0.0 m
We could also think about implementing Base.getindex
for all CRS if an algorithm doesn't need to know the name of the coordinates. In general, most algorithms end up using the names x
and y
for Projected
and lat
and lon
for Geographic
. It is good to get an error if someone tries to pass the incorrect CRS.
What do you think? Would the present design already solve your original issue?
Would the present design already solve your original issue?
Not really. Let's take the above rhumb_line()
as an example. With cstrip()
and cwrap()
, this is effectively a one-liner.
function rhumb_line(x::CRS, y::CRS)
C = Mercator
return t -> cwrap(C, t * cstrip(C, x) + (1-t) * cstrip(C, y))
end
Using the properties, this gets much more cumbersome.
function rhumb_line(x::CRS, y::CRS)
x_merc = convert(Mercator, x)
y_merc = convert(Mercator, y)
return t -> Mercator(
t * x_merc.x + (1-t) * y_merc.x,
t * x_merc.y + (1-t) * y_merc.y,
)
end
Still doable, of course, but in our work computations like the above arise frequently and then the properties-based approach just doesn't scale. So I would argue there is a real gap in the API here, but I'm open to exploring alternative solutions.
I would argue that the second code snippet is much more readable and explicit. It also gives the compiler more chance to avoid allocations. It should be as fast or faster than handling opaque tuples, specially if you preprocess (1-t) once and use the result in both expressions explicitly.
Also notice that you don't need to convert to Mercator
if you dispatch on Mercator or any Projected
type:
rhumb_line(a::Projected, b::Projected) = t -> Mercator(t * a.x + (1-t) * b.x, t * a.y + (1-t) * b.y)
Effectively a one-liner, with optimal performance.
If we add a Base.getindex
for all CRS
, the code doesn't require cstrip
and cwrap
, and will also be performant:
rhumb_line(a::Projected, b::Projected) = t -> Mercator(t .* a .+ (1-t) .* b)
But that is a decision that we need to consider carefully, as we don't want to promote unsafe treatment of coords as if they were mere tuples. This is what happens everywhere else in other languages, and is a source of obscure bugs.
But that is a decision that we need to consider carefully, as we don't want to promote unsafe treatment of coords as if they were mere tuples. This is what happens everywhere else in other languages, and is a source of obscure bugs.
I fully agree, and in fact your implementation of rhumb_line()
is a typical example for why implementing arithmetic directly on the coordinate types is dangerous: for t*a + (1-t)*b
to be a rhumb line you have to switch to Mercator
coordinates before doing the interpolation, so if I call say rhumb_line(LatLon(...), WebMercator(...))
then your code silently does something completely different from what I intended. By contrast, if arithmetic is implemented in terms of cstrip()
/ cwrap()
then you are forced to explicitly specify in which coordinates the arithmetic is to be carried out, and therefore any mistake in choosing a coordinate space would at least be easily spottable.
so if I call say
rhumb_line(LatLon(...), WebMercator(...))
That is not possible with the code I shared above because there is no method dispatching Geographic
with Projected
coordinates. You are misunderstanding the concepts here.
If you really need to convert the inputs to Mercator
, then that is a separate issue. After you have your two inputs in the expected Projected
subtype, you can easily access x
and y
as I did above. It seems that you want to add semantics to cstrip
and cwrap
to make sure that the "x" coordinate is always returned whatever that means, even in the LatLon
case. That is not a good design. There is no such thing as the "x" coordinate of the LatLon
coordinates.
so if I call say
rhumb_line(LatLon(...), WebMercator(...))
That is not possible with the code I shared above because there is no method dispatching
Geographic
withProjected
coordinates.
I don't see how distinguishing between Geographic
and Projected
matters here1. rhumb_line()
should work for any CRS
that you pass into it, since any CRS
defines a unique point on earth and so you can define the rhumb line between the two. (There are probably some subtleties regarding CRS
with altitude information and/or different datums, but these subtleties are mostly orthogonal to the main point here, so let's not get into that.) My argument works out exactly the same when applied to say rhumb_line(Robinson(...), Orthographic(...))
, and in fact it works out even when applied to rhumb_line(Robinson(...), Robinson(...))
since the result is a rhumb line if and only if the interpolation is done in Mercator
coordinates.
It seems that you want to add semantics to
cstrip
andcwrap
to make sure that the "x" coordinate is always returned whatever that means, even in theLatLon
case. [...] There is no such thing as the "x" coordinate of the LatLon coordinates.
I suspect I may not have expressed myself clearly (and I'm terribly sorry if that's the case). I want cstrip()
to be this (or some minor variation of this):
function cstrip(C::Type{<:Projected}, a::CRS)
b = convert(C, a)
return SVector(b.x, b.y)
end
Yes, this removes the CRS meta-information associated with the (x,y)
tuple, but so does a.x
and also a[1]
, so in that respect there is no difference between the two approaches. However, there are important differences between cstrip()
and getproperty()
in other respects:
-
cstrip(C, a)
forces you to be explicit about the coordinate type in which you unpacka
, whilea.x
doesn't. So you could writea.x
in a context wherea
is known to be aMercator
coordinate, then later on you change the context to allow arbitraryProjected
and suddenly your code breaks. This type of mistake is not possible withcstrip()
. -
cstrip(C, a)
gives you the coordinates in a form that's much more convenient for many purposes. Imagine you have an (unstructured) set of coordinatesa
,b
,c
, and you want to plot them all on aWebMercator
map with some custom styling applied to each.cstrip()
allows you to do this:scatter!(axis, cstrip(WebMercator, a); label = "a") scatter!(axis, cstrip(WebMercator, b); color = :green) scatter!(axis, cstrip(WebMercator, c); marker = "x")
By contrast, with
getproperty()
you have to do this:awm = convert(WebMercator, a) scatter!(axis, (awm.x, awm.y); label = "a") bwm = convert(WebMercator, b) scatter!(axis, (bwm.x, bwm.y); color = :green) cwm = convert(WebMercator, c) scatter!(axis, (cwm.x, cwm.y); marker = "x")
Footnotes
-
In fact, I'm not even sure what is the difference between
Geographic
andProjected
coordinate systems. They are both just tuples of numbers describing points on earth, so as far as I can tell they are just "different by convention"? However, unless you think I'm seriously missing an important point here and would be so kind as to lift my ignorance, I suggest we postpone this discussion for another day. ↩
It also gives the compiler more chance to avoid allocations. It should be as fast or faster than handling opaque tuples, specially if you preprocess (1-t) once and use the result in both expressions explicitly.
AFAICT there should be no difference in performance between the two approaches. The tuples in question here are all statically sized and have fully inferable types, so the compiler should be able to optimise the structured code in exactly the same way as the manually unrolled code. Of course, as usual when it comes to performance optimisations the devil is in the detail, but this detail can swing both ways and therefore I don't think performance favours either approach.
cstrip(C, a)
forces you to be explicit about the coordinate type in which you unpacka
, whilea.x
doesn't. So you could writea.x
in a context wherea
is known to be aMercator
coordinate, then later on you change the context to allow arbitraryProjected
and suddenly your code breaks. This type of mistake is not possible withcstrip()
.
This argument is not very strong since the whole idea of the cstrip
helper is toconvert
and then use the fields as a raw tuple. The more explicit approach is
xy = let
m = convert(Mercator, c)
(m.x, m.y)
end
We retain our viewpoint that being explicit is better than adding hidden layers of conversion. Utility functions such as cstrip
and cwrap
can live in other downstream packages if necessary. There is nothing inhibiting the definition of these functions in user code.
cstrip(C, a)
gives you the coordinates in a form that's much more convenient for many purposes. Imagine you have an (unstructured) set of coordinatesa
,b
,c
, and you want to plot them all on aWebMercator
map with some custom styling applied to each.cstrip()
allows you to do this:scatter!(axis, cstrip(WebMercator, a); label = "a") scatter!(axis, cstrip(WebMercator, b); color = :green) scatter!(axis, cstrip(WebMercator, c); marker = "x")
This layer of abstraction is too low. What we have in mind is a Makie recipe that consumes Meshes.Point
with given coordinates, and plots them with the nice options it already provides. End-users should never access these low-level details of coordinates, querying fields, converting to different CRS point-by-point. The big picture will be clearer in the following months after we refactor Meshes.jl accordingly. You will then be able to assess and reproduce the approach elsewhere in a high-level interface if you wish.
- In fact, I'm not even sure what is the difference between
Geographic
andProjected
coordinate systems. They are both just tuples of numbers describing points on earth, so as far as I can tell they are just "different by convention"? However, unless you think I'm seriously missing an important point here and would be so kind as to lift my ignorance, I suggest we postpone this discussion for another day. ↩
For one thing, Geographic
coordinates wrap around the Earth, they are known to cover the entire globe, and are periodic. Projected
coordinates on the other hand have various limitations as you know. Some of them have a limited domain (see indomain
), and all of them are a flat approximation of the globe that does not wrap around. These two abstract types allow advanced users to dispatch different algorithms. If an algorithm doesn't use these properties, it can simply dispatch on the general CRS
supertype.
Utility functions such as
cstrip
andcwrap
can live in other downstream packages if necessary. There is nothing inhibiting the definition of these functions in user code.
That's fair. I initially didn't realise that all Projected
coordinates exhibit their coordinates as x
and y
fields, and without this convention it would have been quite tedious to maintain my own cstrip()
function. But with this convention, that's perfectly doable.
Nice. I think we converged on this one. We can come back to it later after we have a better idea of how these CRS will be used in practice.