koop-test

This is a test deployment of a Koop.js server as part of the Code for Pittsburgh food access map project.

The high-level idea: we have a bunch of food access location data in a CSV with latitude/longitude columns, and we have an ArcGIS Online (AGOL) web app to display those data. But we can only use critical AGOL features such as filter widgets if we make our data available in a format akin to an AGOL hosted feature layer. Enter Koop.js: an open-source project from Esri that creates a lightweight web server, ingests geospatial data from a variety of formats, and serves it in a variety of formats, including GeoServices, which should meet our needs here.

In this repository, I'm setting up a Koop server and deploying it on Heroku, whose free tier should be sufficient to let us keep the Koop server up 24/7. We will probably need to recreate some or all of this work in an official C4P repository, but this is a start and maybe we can just fork it into the C4P GitHub account.

How I Did It, by Victor Frankenstein

That's a Young Frankenstein reference, for what it's worth.

Goodness knows I'm not as good at documentation as Max, but I figure I'll keep some notes on what I did to get this working, so we can recreate the process later and elsewhere.

Install Koop

Per the Koop quickstart guide, Koop requires Node.js and npm (npm is bundled with the Node.js installer). So:

Install Node.js. I installed version 12.18.3 LTS but I imagine the 14.10.0 Current version would work fine too.
Verify that Node.js installation was successful by opening a terminal and typing node -v and/or npm -v. This latter command printed 6.14.6 for me.
Type npm install -g @koopjs/cli to install the Koop CLI.
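
If you'd rather check the Node.js version from a script than by eyeballing node -v, something like the sketch below should do it. The majorVersion helper and the REQUIRED_MAJOR constant are my own hypothetical names, and the minimum of 12 simply mirrors the version I installed, not an official Koop requirement.

```javascript
// Parse the major version out of a string like "v12.18.3" (the format of process.version).
function majorVersion(versionString) {
  const match = /^v?(\d+)\./.exec(versionString);
  if (!match) {
    throw new Error(`Unrecognized version string: ${versionString}`);
  }
  return Number(match[1]);
}

// ASSUMPTION: 12 is just the major version used in these notes.
const REQUIRED_MAJOR = 12;

const installedMajor = majorVersion(process.version);
if (installedMajor < REQUIRED_MAJOR) {
  console.log(`Node.js ${process.version} is older than v${REQUIRED_MAJOR}; consider upgrading.`);
} else {
  console.log(`Node.js ${process.version} looks fine.`);
}
```

Save it to any file name you like and run it with node.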

Create the Koop app

In the parent directory of where you want the app to live (say, C:\Users\drew\Documents\GitHub), run koop new app koop-test. This will create a directory called koop-test and pre-populate a Git configuration, Node.js dependencies, etc.
I don't really understand how koop new app interacts with existing Git or GitHub repositories. I initially created a repo called koop-test on GitHub and cloned the repo to my local machine, but running koop new app overwrote the contents of this directory, or at least the files with name conflicts.
In the koop-test directory, you can run npm start to start the dev server (a quick server that is available locally for rapid development purposes). You can then visit http://localhost:8080/ to see the dev server in action. (Initially, all it does is display "Welcome to Koop!") You can also run koop serve and get the same result (I think).
We need the CSV provider, so within the koop-test directory, run koop add provider koop-provider-csv.

Configure the Koop CSV provider

The basic idea of Koop is that it ingests data through any of several provider plugins, converts the data to GeoJSON as an intermediate format, then serves the data through any of several output plugins. Koop comes with the GeoServices output plugin already installed (because, as far as I can tell, making geospatial data available via the GeoServices API is the primary point of Koop), but we need to install the CSV provider plugin separately.
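
To make the "convert to GeoJSON" step concrete, here is a rough sketch of what turning CSV rows into GeoJSON could look like. This is not Koop's or the CSV provider's actual code; the rowsToGeoJSON function, the column names, and the sample row are all made up for illustration.

```javascript
// Hypothetical sketch of the provider -> GeoJSON step; NOT Koop's actual code.
// ASSUMPTION: the CSV has columns named "latitude" and "longitude".
function rowsToGeoJSON(rows, latColumn = "latitude", lonColumn = "longitude") {
  const features = rows.map((row) => {
    // Copy every column except the coordinates into the feature's properties.
    const properties = { ...row };
    delete properties[latColumn];
    delete properties[lonColumn];
    return {
      type: "Feature",
      geometry: {
        type: "Point",
        // GeoJSON orders coordinates as [longitude, latitude].
        coordinates: [Number(row[lonColumn]), Number(row[latColumn])],
      },
      properties,
    };
  });
  return { type: "FeatureCollection", features };
}

// Made-up sample row, as a CSV parser might hand it to us (all values are strings).
const sampleRows = [
  { name: "Example Pantry", latitude: "40.4406", longitude: "-79.9959" },
];
const collection = rowsToGeoJSON(sampleRows);
console.log(JSON.stringify(collection, null, 2));
```

An output plugin such as GeoServices would then translate a feature collection like this into whatever response format it serves.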

Helpful documentation here.

Open koop-test/config/default.json in a text editor.
Define one or more CSV sources, following the format in the documentation.
It might be helpful to assign an idField in the config file - Koop chirps at me that no idField was set, but reassures me that it created an OBJECTID field for me instead.
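
For reference, here is roughly what the resulting config could look like, written as a JavaScript object that prints the JSON to paste into config/default.json. The key names follow my recollection of the koop-provider-csv documentation, so double-check them there; the CSV URL and the column names are placeholders, not our real data. The source name food-data is the one that shows up in the routes below.

```javascript
// Sketch of config/default.json, built as a JavaScript object.
// ASSUMPTIONS: the url and all column names are placeholders; verify key names against the provider docs.
const config = {
  port: 8080,
  "koop-provider-csv": {
    sources: {
      // The source name becomes part of the route, e.g. /koop-provider-csv/food-data/...
      "food-data": {
        url: "https://example.com/food-data.csv",
        geometryColumns: {
          latitude: "latitude",
          longitude: "longitude",
        },
        delimiter: ",",
        // Setting idField should quiet the "no idField was set" message.
        metadata: { idField: "id" },
      },
    },
  },
};

console.log(JSON.stringify(config, null, 2));
```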

Receive some feature data!

Start the Koop dev server via koop serve and/or npm start (again, not sure whether there's a distinction).
Some trial and error (and some review of the GeoServices specification) yields the following endpoint as a location where the whole dataset will be returned in JSON format compatible with the GeoServices API: http://localhost:8080/koop-provider-csv/food-data/FeatureServer/0/query
However, this combination of Koop provider (CSV) and output (GeoServices) plugins, by default, returns the first 2000 features only. Our dataset has more than 2000 features (we can tell because exceededTransferLimit is true), so we need to pass a URL parameter overriding this default limit. The Koop FAQ sheds some light on this - we can use resultRecordCount to get more than 2000 results. We could just pass resultRecordCount=999999 or some other very large number, but this is inelegant. The FAQ doesn't mention it but a lucky guess reveals that resultRecordCount=-1 causes the endpoint to return all results: http://localhost:8080/koop-provider-csv/food-data/FeatureServer/0/query?resultRecordCount=-1
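
The URL juggling above can be sketched in a few lines of Node.js. The buildQueryUrl and wasTruncated helpers are hypothetical names of mine, not part of Koop; the route and the resultRecordCount parameter are the ones described above.

```javascript
// Build the GeoServices query URL for a CSV source served by Koop.
function buildQueryUrl(baseUrl, sourceId, params = {}) {
  const url = new URL(`/koop-provider-csv/${sourceId}/FeatureServer/0/query`, baseUrl);
  for (const [key, value] of Object.entries(params)) {
    url.searchParams.set(key, String(value));
  }
  return url.toString();
}

// A response was cut off at the default limit if exceededTransferLimit is true.
function wasTruncated(responseBody) {
  return responseBody.exceededTransferLimit === true;
}

// resultRecordCount=-1 asked for all features in my testing (a lucky guess, not documented behavior).
const allFeaturesUrl = buildQueryUrl("http://localhost:8080", "food-data", {
  resultRecordCount: -1,
});
console.log(allFeaturesUrl);
```

Swapping the base URL for the deployed app's URL should give the equivalent route there.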

Install Heroku

Properly speaking, one doesn't "install Heroku" per se. Heroku is a platform-as-a-service company, not a piece of software. By "install Heroku," rather, I mean "install software on my local machine that lets me interact with Heroku." Specifically, this is the Heroku CLI. The below checklist is basically stolen from that Heroku support article.

Install Git, a prerequisite for the Heroku CLI. (Or, because I already had Git installed, open Git Bash and run git update-git-for-windows just to make sure I'm using the latest version.)
Install Heroku CLI.
Verify that installation was successful by opening a terminal and typing heroku --version - this printed heroku/7.42.13 win32-x64 node-v12.16.2 for me.
Type heroku login to connect Heroku CLI with your (our!) Heroku login credentials.

Configure the Koop app for deployment to Heroku

Another very helpful tutorial, here.

Open package.json in the koop-test directory and specify the version of Node.js needed for our Koop app. This is simply the version of Node.js I installed earlier (v12.x). See the guide for the specific syntax needed.
Again pulling code from the tutorial, edit src/index.js to use the port number that Heroku assigns dynamically to each dyno (virtual machine) instead of the static port number specified in config/default.json.
1. I found, when running heroku local web (step 3 below), that the code provided by the Koop tutorial needed a slight change: process.NODE_ENV.$PORT needed to be replaced by process.env.PORT. This Heroku help page helped me identify the correct syntax.
2. In case you're wondering, like I was, where the process object referenced in the new code comes from, it is a global variable created when a Node.js process starts up.
From the koop-test directory, run heroku local web to test whether the app will run correctly. This will make the app available at http://localhost:5000/ (note: different port number than before).
Now that the Koop app is properly configured, let's commit and push all changes to Git/GitHub. I usually use the Git interface within VS Code or GitHub Desktop to do so, but you geniuses who actually understand Git can use whatever terminal commands you like. :)
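
The two configuration changes above boil down to something like the sketch below. This is not the tutorial's exact code: resolvePort is a hypothetical helper of mine, and 8080 stands in for whatever static port config/default.json specifies. The engines fragment uses the 12.x value mentioned above.

```javascript
// Fragment to merge into package.json so Heroku uses a matching Node.js version.
const enginesFragment = { engines: { node: "12.x" } };

// Use the port Heroku assigns via the PORT environment variable;
// fall back to the static port when PORT is unset (e.g. a plain local run).
function resolvePort(env, fallbackPort) {
  const fromEnv = Number.parseInt(env.PORT, 10);
  return Number.isInteger(fromEnv) ? fromEnv : fallbackPort;
}

// ASSUMPTION: 8080 is the static port from config/default.json.
const port = resolvePort(process.env, 8080);
console.log(JSON.stringify(enginesFragment));
console.log(`Would listen on port ${port}`);
```

In src/index.js, the resolved port would then be passed to the server's listen call in place of the value read from the config file.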

Deploy the Koop app to Heroku

Still working from this tutorial.

From the koop-test directory, run heroku create to create a new app on Heroku. This will give the app a random name, but we can change the name later.
In Git Bash (or maybe also in any other terminal), from the koop-test directory, run git push heroku master. This will push the local master branch to a Git remote on Heroku, which will receive the code and then build/deploy the app accordingly. This worked for me first try and I was blown away... for a little while, anyway.
Near the bottom of the output from that terminal command, you will see a URL to the deployed app (in my case, https://glacial-dawn-93110.herokuapp.com/). Open this up, append the route we identified earlier, and we're in business: https://glacial-dawn-93110.herokuapp.com/koop-provider-csv/food-data/FeatureServer/0/query?resultRecordCount=-1

Get stymied by a mysterious app crash

Actually, no, we're not in business. That route (and other query routes) doesn't work. The app crashes and we may even need to run heroku restart to get it running again. But other routes do work! Just not the query route; you know, the one we actually need.
Try hard to diagnose the problem, but fail. Post a question or two to Stack Overflow.
Document what you've done and push to GitHub. Share with your teammates; maybe they can figure out what is going wrong.
Email the main developer. His name is Rich Gwozdz and he works at Esri. He isn't sure what's going on either, but he suggests enabling debugging on the Heroku deployment and points you toward a very helpful Stack Overflow thread.

Enable debugging on Heroku

Working from the top answer on the Stack Overflow thread, with help from Heroku's documentation on Heroku Exec and remote debugging.

In the koop-test directory, create a new file named Procfile. Arguably we should have had one of these all along, because Heroku uses this file to determine how to deploy the app, but Heroku's defaults work fine so we didn't need to create one until now.
Edit Procfile to tell Heroku to start Node.js with debugging enabled: web: node --inspect=9090 src/index.js. The 9090 tells Node.js which port the debugger should listen on; we could have used almost any other port (except low-numbered ports like 80, which are reserved for other purposes such as general web traffic).
From the koop-test directory, run heroku ps:exec to enable the debugging connection. When you get to a $ prompt, type exit to exit that prompt.
Redeploy the Heroku app: commit the changes, then run git push heroku master.
Run heroku ps:forward 9090 to start port forwarding, which relays connections made to port 9090 on your local machine to port 9090 on the Heroku dyno.
Depending on which IDE you're using (I use VS Code), create or edit launch.json and add the following configuration:

{
    "type": "node",
    "request": "attach",
    "name": "Heroku",
    "port": 9090
}

Launch the "Heroku" debugger in your IDE.

Try other things

The debugger notes a caught exception having to do with identifying the text encoding of the input CSV. Because this exception is caught, I really don't think this is the problem, but let's try to force UTF-8 encoding (rather than ASCII) and see if that helps. ... It does not. Same problem as before, and the caught exception is still present. Now the default encoding is UTF-8, but the crash still happens at the query route.
Try a different implementation of koop-provider-csv that does not use the autodetect-decoder-stream package. First we need to remove the original koop-provider-csv:
1. Remove the koop-provider-csv reference in src/plugins.js.
2. Edit config/default.json to either remove or repurpose the koop-provider-csv configuration (by changing koop-provider-csv to @ntkog/koop-provider-csv). Edit: I later found that koop-provider-csv-ntkog expects the config entry to be under koop-provider-csv so no change to config/default.json was needed after all.
3. Run npm uninstall koop-provider-csv to remove the package.
4. Delete the directory src/koop-provider-csv.
Now we need to install the new koop-provider-csv: npm install ntkog/koop-provider-csv, then koop add provider koop-provider-csv-ntkog.
That one crashes for different reasons, plus Haoliang has recently published an update (version 3.1.0) to the original koop-provider-csv that avoids the text encoding exception. So let's go back to that plugin:
1. Remove the koop-provider-csv-ntkog reference in src/plugins.js.
2. Run npm uninstall koop-provider-csv-ntkog.
3. Delete the directory src/koop-provider-csv-ntkog.

Give up and try a different platform

Now when we run heroku local web, the route http://localhost:5000/koop-provider-csv/food-data/FeatureServer/0/query works great.
But when we redeploy our app via git push heroku master, the route https://glacial-dawn-93110.herokuapp.com/koop-provider-csv/food-data/FeatureServer/0/query still crashes the app. This is a total bummer. Haoliang is looking into why this might be the case, but for now we will need to try a different platform.
I'm going to try Google App Engine, part of the Google Cloud ecosystem. But I'll do that in a different repository than this one.

drewlevitt/koop-test