Use a different dataset in the gallery example "examples/gallery/lines/roads.py"

Question

seisman opened this issue 6 months ago · 7 comments

The gallery example (https://www.pygmt.org/v0.12.0/gallery/lines/roads.html) uses a dataset from http://www2.census.gov/geo/tiger/TIGER2015/PRISECROADS/tl_2015_15_prisecroads.zip.

I get an "Access Denied" error when I try to download the data from an IP address in China, but the download works when I use a VPN server in the US or Japan.

It would be better if we could find another dataset which is more accessible to users.

Answer 1 · 2024-07-09T07:30:35.000Z

I like the current geopandas line-geometry example. However, I agree that we should use a dataset which is accessible to more [all] users!
Maybe we can find a dataset with line-geometry in the list available at https://geodatasets.readthedocs.io/en/latest/introduction.html#what-is-the-geodatasets-data-object. Sometimes a conversion of the coordinates is needed. I started going through the list (not finished yet) and just picked a dataset with rivers in Europe:

import geopandas as gpd
import pygmt

# -----------------------------------------------------------------------------
# Read the zipped shapefile with large European rivers provided by the EEA
gpd_lines = gpd.read_file(
    "https://www.eea.europa.eu/data-and-maps/data/wise-large-rivers-and-large-lakes/zipped-shapefile-with-wise-large-rivers-vector-line/zipped-shapefile-with-wise-large-rivers-vector-line/at_download/file/"
    "wise_large_rivers.zip"
)

# Inspect the original CRS, then convert to geographic coordinates (lon/lat)
print(gpd_lines.crs)
gpd_lines_new = gpd_lines.to_crs("EPSG:4326")
print(gpd_lines_new.head())

# -----------------------------------------------------------------------------
fig = pygmt.Figure()

fig.coast(
    projection="M10c",
    region=[-10, 30, 35, 57],
    land="gray99",
    shorelines="1/0.1p,gray50",
    borders="1/0.1p,gray30",
    frame=True,
    # rivers="1/1p,lightred",  # Compare with GMT built-in
)

fig.plot(data=gpd_lines_new, pen="0.5p,steelblue")

fig.show()
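
The coordinate conversion mentioned above matters because a Mercator map with a region given in degrees expects longitude/latitude input. A minimal sketch of a guard for that step follows; `ensure_lonlat` is a hypothetical helper name, not part of geopandas or PyGMT, and it assumes the input behaves like a `geopandas.GeoDataFrame` (a `crs` attribute with `to_epsg()` and a `to_crs()` method).

```python
def ensure_lonlat(gdf, epsg=4326):
    """Return `gdf` in the requested EPSG code, reprojecting only if needed.

    Hypothetical helper: `gdf` is assumed to be a geopandas.GeoDataFrame
    (or any object offering the same `crs` / `to_crs` interface).
    """
    if gdf.crs is None:
        # Without a declared CRS there is nothing to convert from
        raise ValueError("Input has no CRS; set one before reprojecting.")
    if gdf.crs.to_epsg() == epsg:
        # Already in the target CRS, so return the input unchanged
        return gdf
    return gdf.to_crs(epsg=epsg)


# Possible usage with the snippet above (not executed here):
#   gpd_lines_new = ensure_lonlat(gpd_lines)
#   fig.plot(data=gpd_lines_new, pen="0.5p,steelblue")
```

This keeps the example independent of whichever CRS a replacement dataset happens to ship with.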

For this dataset, we can [only] filter based on Shape_Leng and use a different pen for each subset, similar to the different road types in the current example:

import geopandas as gpd
import pygmt

# -----------------------------------------------------------------------------
# Read the zipped shapefile with large European rivers provided by the EEA
gpd_rivers_org = gpd.read_file(
    "https://www.eea.europa.eu/data-and-maps/data/wise-large-rivers-and-large-lakes/zipped-shapefile-with-wise-large-rivers-vector-line/zipped-shapefile-with-wise-large-rivers-vector-line/at_download/file/"
    "wise_large_rivers.zip"
)

# Convert to geographic coordinates (lon/lat)
gpd_rivers = gpd_rivers_org.to_crs("EPSG:4326")

# -----------------------------------------------------------------------------
fig = pygmt.Figure()

for i_panel in range(2):
    
    fig.coast(
        projection="M10c", 
        region=[-10, 35, 35, 58],
        land="gray99",
        shorelines="1/0.1p,gray50",
        borders="1/0.01p,gray70",
        frame=True,
    )
    
    # -------------------------------------------------------------------------
    # Left panel: split the rivers into two classes by length
    if i_panel == 0:
        len_limit = 700000
        gpd_rivers_short = gpd_rivers[gpd_rivers["Shape_Leng"] < len_limit]
        # Use >= so that a river exactly at the limit is not dropped
        gpd_rivers_long = gpd_rivers[gpd_rivers["Shape_Leng"] >= len_limit]
        fig.plot(data=gpd_rivers_short, pen="0.5p,orange", label=f"shorter than {len_limit} m")
        fig.plot(data=gpd_rivers_long, pen="0.5p,darkred", label=f"{len_limit} m or longer")
        fig.legend()

    # -------------------------------------------------------------------------
    # Right panel: color-code each river by its length via a colormap
    if i_panel == 1:
        pygmt.makecpt(
            cmap="oslo",
            series=[gpd_rivers.Shape_Leng.min(), 1500000],
            reverse=True,
        )
        for i_river in range(len(gpd_rivers)):
            fig.plot(
                data=gpd_rivers[gpd_rivers.index == i_river],
                zvalue=gpd_rivers.loc[i_river, "Shape_Leng"],
                pen="0.5p",
                cmap=True,
            )
        fig.colorbar(frame=["x+llength", "y+lm"], position="+ef0.2c")

    # -------------------------------------------------------------------------
    # Shift the plot origin to the right for the next panel
    fig.shift_origin(xshift="w+1.5c")
    
fig.show()
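
If more than two length classes were wanted, the split in the first panel could be generalized with a small lookup. The sketch below is only an illustration: the class boundaries, the pens, and the helper name `pen_for_length` are assumptions chosen here, not values taken from the dataset or from PyGMT.

```python
from bisect import bisect_right

# Hypothetical class boundaries in metres (ascending) and one pen per class
LENGTH_BOUNDS_M = [300_000, 700_000, 1_500_000]
PENS = ["0.3p,gold", "0.5p,orange", "0.8p,darkred", "1p,black"]


def pen_for_length(length_m, bounds=LENGTH_BOUNDS_M, pens=PENS):
    """Return the pen of the length class that `length_m` falls into.

    A length equal to a boundary is assigned to the upper class, so every
    value belongs to exactly one class.
    """
    if len(pens) != len(bounds) + 1:
        raise ValueError("Need exactly one more pen than boundaries.")
    return pens[bisect_right(bounds, length_m)]
```

With the data from the snippet above, one way to use it would be to group the rivers by `gpd_rivers["Shape_Leng"].map(pen_for_length)` and call `fig.plot(data=subset, pen=pen)` once per group.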

Answer 2 · 2024-07-09T09:35:16.000Z

ping the author of the gallery example @weiji14

Answer 3 · 2024-07-09T10:01:30.000Z

ping the author of the gallery example @weiji14

Hm. Not sure, but looking at PR #1474 it seems like @michaelgrund wrote the first version of this example.

Answer 4 · 2024-07-10T01:52:04.000Z

You're right. @michaelgrund is the original author. @weiji14 was dealing with the Okina character ʻ #1474 (comment) recently, which gave me the wrong impression that he was the author.

Answer 5 · 2024-07-10T10:56:44.000Z

I'm fine with changing the data resource for this example and really like the rivers dataset @yvonnefroehlich proposed. However:

Is this dataset really available everywhere, at least can you access this @seisman ?
I would only show one figure instead of multiples and ignore filtering or length-related color-coding to keep the example as simple as possible
Not sure if users may be confused because GMT also has built-in rivers to plot

Answer 6 · 2024-07-10T11:26:07.000Z

Is this dataset really available everywhere, at least can you access this @seisman ?

Yes, I can access the data. The current dataset in this example is hosted on a US government site. I guess that's why access from China is blocked.
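
For anyone who wants to repeat this check from their own network, here is a minimal standard-library sketch. The function names `probe_url` and `describe_status` are made up for illustration, and it assumes the server answers `HEAD` requests, which not every host does; a failed probe therefore does not prove that a download would fail.

```python
import urllib.error
import urllib.request


def probe_url(url, timeout=10.0):
    """Return the HTTP status code for `url`, or None if no HTTP response arrived."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 403
        return err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and similar problems
        return None


def describe_status(status):
    """Turn a status code (or None) into a short human-readable verdict."""
    if status is None:
        return "unreachable"
    if status == 403:
        return "access denied"
    if 200 <= status < 400:
        return "accessible"
    return f"HTTP {status}"


if __name__ == "__main__":
    # URL of the dataset currently used by the gallery example
    url = "http://www2.census.gov/geo/tiger/TIGER2015/PRISECROADS/tl_2015_15_prisecroads.zip"
    print(url, "->", describe_status(probe_url(url)))
```

Running the same script against each candidate URL from different networks gives a quick comparison before the example is changed.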

Answer 7 · 2024-07-10T11:30:39.000Z

Is this dataset really available everywhere, at least can you access this @seisman ?

I was hoping so, as this dataset is provided by the European Union / European Environment Agency (EEA).

I would only show one figure instead of multiples and ignore filtering or length-related color-coding to keep the example as simple as possible

Sure, one figure should be enough to show the principle. I just took the opportunity to play around with the data 😄.

Not sure if users may be confused because GMT also has built-in rivers to plot

I don't think that this is a major issue. Users may have their own datasets with more detailed (or newer) data. If people think it is needed, we could include a short comment in the description of the example.