datasciencecampus/transport-network-performance

html_report KeyError

Originally identified by @SergioRec in #248; context below. To reproduce the error, I used GtfsInstance on the chester test fixture. Calling html_report(extended_validation=True, clean_feed=False) should trigger KeyError: 'multiple_stops_invalid'.

A short-term workaround (kicking the can down the road) could be to change the default value of extended_validation to False.

Original context below:

Here's the code I ran:
# %%
from transport_performance.gtfs.multi_validation import MultiGtfsInstance
from pyprojroot import here

# %%
t = MultiGtfsInstance(here('data/interim/gtfs/itm_leeds_filtered_gtfs.zip'))
s = t.instances[0]
# %%
s.html_report(overwrite=True, clean_feed=False)
# %%

Here's the full traceback:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[3], line 2
      1 # %%
----> 2 s.html_report(overwrite=True, clean_feed=False)

File ~/src/transport_performance/gtfs/validation.py:1502, in GtfsInstance.html_report(self, report_dir, overwrite, summary_type, extended_validation, clean_feed)
   1500 # create extended reports if requested
   1501 if extended_validation:
-> 1502     self._extended_validation(output_path=report_dir)
   1503     info_href = (
   1504         validation_dataframe["message"].apply(
   1505             lambda x: "_".join(x.split(" "))
   (...)
   1509         + ".html"
   1510     )
   1511     validation_dataframe["info"] = [
   1512         f"""<a href="{href}"> Further Info</a>"""
   1513         if len(rows) > 1
   1514         else "Unavailable"
   1515         for href, rows in zip(info_href, validation_dataframe["rows"])
   1516     ]

File ~/src/transport_performance/gtfs/validation.py:1376, in GtfsInstance._extended_validation(self, output_path, scheme)
   1371         duplicate_counts[col] = impacted_rows[
   1372             impacted_rows[f"{col}_original"]
   1373             == impacted_rows[f"{col}_duplicate"]
   1374         ].shape[0]
   1375 else:
-> 1376     impacted_rows = table_map[table].copy().iloc[rows]
   1378 # create the html to display the impacted rows (clean possibly)
   1379 table_html = f"""
   1380 <head>
   1381     <link rel="stylesheet" href="styles.css">
   (...)
   1390             {msg_type}</span>
   1391 </h1>"""

KeyError: 'multiple_stops_invalid'

Originally posted by @SergioRec in #248 (comment)
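
For anyone skimming, the traceback points at the lookup `table_map[table]` with `table` equal to `'multiple_stops_invalid'`, which suggests the mapping has no entry under that name. Below is a minimal, standalone sketch of that failure mode and one possible guard. It does not use the package: the `table_map` contents and the `impacted_rows` helper are hypothetical stand-ins, plain lists replace the DataFrames, and the fix in the open PR may well take a different approach.

```python
# Hypothetical stand-in for the table_map used in _extended_validation;
# the real mapping and its contents are not shown in this issue.
table_map = {
    "stops": [{"stop_id": "A"}, {"stop_id": "B"}],
    "trips": [{"trip_id": "t1"}],
}


def impacted_rows(table_map, table, rows):
    """Return the rows at the given positions, or None if the table is unmapped."""
    source = table_map.get(table)
    if source is None:
        # Unmapped table name: skip instead of raising KeyError
        return None
    return [source[i] for i in rows]


# Direct indexing raises the same error class seen in the traceback
try:
    table_map["multiple_stops_invalid"]
except KeyError as err:
    print(f"KeyError: {err}")

# The guarded lookup degrades gracefully for the unmapped name
print(impacted_rows(table_map, "multiple_stops_invalid", [0]))
print(impacted_rows(table_map, "stops", [1]))
```

Whether skipping the extended report for an unmapped table is acceptable, or whether the mapping should gain an entry for it, is a design choice for the maintainers.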

I'm pretty sure this is fixed in one of the open PRs; the fix just hasn't landed yet because that PR has not been merged.

Thanks Charlie, once that backlog has been integrated with the package, we can close this out.