itsleeds/UK2GTFS

Processing Network Rail CIF daily updates fails


I am attempting to use the UK2GTFS package to generate GTFS files from the Network Rail daily update files. Processing the full weekly file works well. However, if I concatenate the daily update data onto the full CIF file, or try to process the daily update files individually (using the nr2gtfs function), I encounter the following error:

2024-02-06 14:15:41.77603 Building calendar and calendar_dates
2024-02-06 14:15:41.778934 Constructing calendar and calendar_dates
Error in checkForRemoteErrors(val) : one node produced an error: missing value where TRUE/FALSE needed
Calls: gtfs_write ... clusterApplyLB -> dynamicClusterApply -> checkForRemoteErrors
Execution halted

I'm struggling to debug this issue and would welcome any assistance! Alternatively, is there a better way of processing the daily updates by modifying the code, for example being able to pass multiple input files to the nr2gtfs function to be processed sequentially?

e.g.
# Proposed usage: path_in as a vector of files, processed in order
nr2gtfs(
  path_in = c("/var/NetworkRailCIFToGTFS/data/toc-full.CIF.gz",
              "/var/NetworkRailCIFToGTFS/data/toc-update-1.CIF.gz"),
  silent = FALSE,
  ncores = 2,
  full_import = TRUE
)

Attached is an example of the CIF update file I have been trying to process.

toc-full.CIF.gz

Clearly it would be helpful if UK2GTFS could process daily updates, particularly given the current volatility in the timetable caused by the industrial action.

Many thanks

Daniel Chick
Zipabout

Can anybody help with this? My R skills aren't great - so any assistance on how to resolve this issue would be appreciated!

I found the error:

Error in if (all(calendar.sub.day$STP == "C")) { : 
  missing value where TRUE/FALSE needed
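
For anyone less familiar with R, here is a minimal sketch of how this message arises. The vector `stp` is made up for illustration and is not taken from the package: comparing a vector that contains NA gives NA, `all()` can then return NA, and `if` refuses an NA condition.

```r
# Illustrative values only: an STP-like vector containing a missing value
stp <- c("C", NA)

# TRUE for the first element, NA for the second, so all() returns NA
cond <- all(stp == "C")

# if (NA) raises the error; capture its message instead of halting
msg <- tryCatch(
  {
    if (cond) "all cancelled" else "not all cancelled"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# One NA-safe alternative: isTRUE() maps NA to FALSE
safe <- isTRUE(all(stp == "C"))
print(safe)
```

Whether treating NA as FALSE is the right behaviour for the calendar logic is a separate question; the sketch only shows the mechanism behind the error.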

> calendar_split[[14]]
       UID start_date   end_date    Days STP rowID Headcode ATOC Code Retail Train ID Train Status duration
481 H03943 2024-04-24       <NA>    <NA>   O  5717     <NA>      <NA>            <NA>         <NA>  NA days
482 H03943 2024-04-24 2024-04-25 0011000   C  5718     <NA>      <NA>            <NA>         <NA>   2 days
483 H03943 2024-04-25       <NA>    <NA>   O  5719     <NA>      <NA>            <NA>         <NA>  NA days

The problem is that the end_date and Days values are missing (NA) in the rows with STP "O", so the condition in the if statement evaluates to NA and the code crashes.
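
As a possible way to locate such rows before the calendar is built, here is a small sketch. The data frame `calendar` below is rebuilt by hand from the output above (only some columns), and dropping the incomplete rows is an assumption on my part, not something the package is known to do; those rows may need different handling.

```r
# Hand-built copy of the three rows shown above (subset of columns)
calendar <- data.frame(
  UID        = c("H03943", "H03943", "H03943"),
  start_date = as.Date(c("2024-04-24", "2024-04-24", "2024-04-25")),
  end_date   = as.Date(c(NA, "2024-04-25", NA)),
  Days       = c(NA, "0011000", NA),
  STP        = c("O", "C", "O"),
  rowID      = 5717:5719,
  stringsAsFactors = FALSE
)

# Flag rows where either field needed by the calendar logic is missing
incomplete <- is.na(calendar$end_date) | is.na(calendar$Days)

# Inspect the offending rows
print(calendar[incomplete, c("UID", "start_date", "STP", "rowID")])

# Assumption: keep only complete rows for further processing
calendar_clean <- calendar[!incomplete, ]
print(calendar_clean)
```

Running the same check on the real calendar table would show how many records in a daily update are affected.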