vecna/trackmap

Running perform_analysis.py ends with an AssertionError

hellais opened this issue · 0 comments

After applying my patch from #70, running perform_analysis.py fails at the end with the following error (note that the output directory does indeed contain a bunch of stuff):

Unable to read telegraph.co.uk_2 [Errno 2] No such file or directory: 'output/telegraph.co.uk_2/__urls' skipping
Unable to read borehamwoodtimes.co.uk_39 [Errno 2] No such file or directory: 'output/borehamwoodtimes.co.uk_39/__urls' skipping
Unable to read dewsburyreporter.co.uk_79 [Errno 2] No such file or directory: 'output/dewsburyreporter.co.uk_79/__urls' skipping
Unable to read elystandard.co.uk_95 [Errno 2] No such file or directory: 'output/elystandard.co.uk_95/__urls' skipping
Unable to read northamptonchron.co.uk_24 [Errno 2] No such file or directory: 'output/northamptonchron.co.uk_24/__urls' skipping
Unable to read hartlepoolmail.co.uk_125 [Errno 2] No such file or directory: 'output/hartlepoolmail.co.uk_125/__urls' skipping
Unable to read brighouseecho.co.uk_43 [Errno 2] No such file or directory: 'output/brighouseecho.co.uk_43/__urls' skipping
Unable to read coventrytelegraph.net_72 [Errno 2] No such file or directory: 'output/coventrytelegraph.net_72/__urls' skipping
Unable to read expressandstar.com_101 [Errno 2] No such file or directory: 'output/expressandstar.com_101/__urls' skipping
Unable to read gazettelive.co.uk_105 [Errno 2] No such file or directory: 'output/gazettelive.co.uk_105/__urls' skipping
Unable to read eastbourneherald.co.uk_88 [Errno 2] No such file or directory: 'output/eastbourneherald.co.uk_88/__urls' skipping
Unable to read chelmsfordweeklynews.co.uk_61 [Errno 2] No such file or directory: 'output/chelmsfordweeklynews.co.uk_61/__urls' skipping
Unable to read gazetteandherald.co.uk_103 [Errno 2] No such file or directory: 'output/gazetteandherald.co.uk_103/__urls' skipping
Unable to read icnetwork.co.uk_20 [Errno 2] No such file or directory: 'output/icnetwork.co.uk_20/__urls' skipping
Unable to read buchanobserver.co.uk_46 [Errno 2] No such file or directory: 'output/buchanobserver.co.uk_46/__urls' skipping
Unable to read bucksfreepress.co.uk_49 [Errno 2] No such file or directory: 'output/bucksfreepress.co.uk_49/__urls' skipping
Unable to read echo-news.co.uk_92 [Errno 2] No such file or directory: 'output/echo-news.co.uk_92/__urls' skipping
Unable to read liverpoolecho.co.uk_136 [Errno 2] No such file or directory: 'output/liverpoolecho.co.uk_136/__urls' skipping
Unable to read gloucestercitizen.co.uk_111 [Errno 2] No such file or directory: 'output/gloucestercitizen.co.uk_111/__urls' skipping
Unable to read dudleynews.co.uk_84 [Errno 2] No such file or directory: 'output/dudleynews.co.uk_84/__urls' skipping
Unable to read exeterexpressandecho.co.uk_100 [Errno 2] No such file or directory: 'output/exeterexpressandecho.co.uk_100/__urls' skipping
Unable to read halifaxcourier.co.uk_117 [Errno 2] No such file or directory: 'output/halifaxcourier.co.uk_117/__urls' skipping
Unable to read dailymail.co.uk_0 [Errno 2] No such file or directory: 'output/dailymail.co.uk_0/__urls' skipping
Unable to read banburyguardian.co.uk_27 [Errno 2] No such file or directory: 'output/banburyguardian.co.uk_27/__urls' skipping
Unable to read cornwallcommunitynews.co.uk_69 [Errno 2] No such file or directory: 'output/cornwallcommunitynews.co.uk_69/__urls' skipping
Unable to read hertfordshiremercury.co.uk_132 [Errno 2] No such file or directory: 'output/hertfordshiremercury.co.uk_132/__urls' skipping
Unable to read shropshirestar.com_146 [Errno 2] No such file or directory: 'output/shropshirestar.com_146/__urls' skipping
Unable to read hampshirechronicle.co.uk_120 [Errno 2] No such file or directory: 'output/hampshirechronicle.co.uk_120/__urls' skipping
Unable to read cambstimes.co.uk_56 [Errno 2] No such file or directory: 'output/cambstimes.co.uk_56/__urls' skipping
Unable to read businessweekly.co.uk_7 [Errno 2] No such file or directory: 'output/businessweekly.co.uk_7/__urls' skipping
Unable to read examiner.co.uk_99 [Errno 2] No such file or directory: 'output/examiner.co.uk_99/__urls' skipping
Unable to read yorkshireeveningpost.co.uk_15 [Errno 2] No such file or directory: 'output/yorkshireeveningpost.co.uk_15/__urls' skipping
Unable to read eastnorthumberland.com_90 [Errno 2] No such file or directory: 'output/eastnorthumberland.com_90/__urls' skipping
Unable to read buckinghampost.com_47 [Errno 2] No such file or directory: 'output/buckinghampost.com_47/__urls' skipping
Unable to read dunmowbroadcast.co.uk_85 [Errno 2] No such file or directory: 'output/dunmowbroadcast.co.uk_85/__urls' skipping
Unable to read harrogateadvertiser.co.uk_123 [Errno 2] No such file or directory: 'output/harrogateadvertiser.co.uk_123/__urls' skipping
Unable to read express.co.uk_102 [Errno 2] No such file or directory: 'output/express.co.uk_102/__urls' skipping
Unable to read burytimes.co.uk_54 [Errno 2] No such file or directory: 'output/burytimes.co.uk_54/__urls' skipping
Unable to read barnsley-chronicle.co.uk_28 [Errno 2] No such file or directory: 'output/barnsley-chronicle.co.uk_28/__urls' skipping
Unable to read wiltsglosstandard.co.uk_174 [Errno 2] No such file or directory: 'output/wiltsglosstandard.co.uk_174/__urls' skipping
Unable to read efinancialnews.com_94 [Errno 2] No such file or directory: 'output/efinancialnews.com_94/__urls' skipping
Unable to read birminghammail.co.uk_35 [Errno 2] No such file or directory: 'output/birminghammail.co.uk_35/__urls' skipping
Unable to read haringeyindependent.co.uk_122 [Errno 2] No such file or directory: 'output/haringeyindependent.co.uk_122/__urls' skipping
Unable to read halsteadgazette.co.uk_118 [Errno 2] No such file or directory: 'output/halsteadgazette.co.uk_118/__urls' skipping
Unable to read hemeltoday.co.uk_129 [Errno 2] No such file or directory: 'output/hemeltoday.co.uk_129/__urls' skipping
Unable to read eadt.co.uk_86 [Errno 2] No such file or directory: 'output/eadt.co.uk_86/__urls' skipping
Unable to read leamingtoncourier.co.uk_151 [Errno 2] No such file or directory: 'output/leamingtoncourier.co.uk_151/__urls' skipping
Unable to read standard.co.uk_13 [Errno 2] No such file or directory: 'output/standard.co.uk_13/__urls' skipping
Unable to read monmouth-today.co.uk_162 [Errno 2] No such file or directory: 'output/monmouth-today.co.uk_162/__urls' skipping
Unable to read harrowtimes.co.uk_124 [Errno 2] No such file or directory: 'output/harrowtimes.co.uk_124/__urls' skipping
Unable to read creweguardian.co.uk_74 [Errno 2] No such file or directory: 'output/creweguardian.co.uk_74/__urls' skipping
Unable to read belpernews.co.uk_31 [Errno 2] No such file or directory: 'output/belpernews.co.uk_31/__urls' skipping
Unable to read skegnessstandard.co.uk_161 [Errno 2] No such file or directory: 'output/skegnessstandard.co.uk_161/__urls' skipping
Unable to read chorley-guardian.co.uk_65 [Errno 2] No such file or directory: 'output/chorley-guardian.co.uk_65/__urls' skipping
Unable to read bbc.co.uk_6 [Errno 2] No such file or directory: 'output/bbc.co.uk_6/__urls' skipping
Unable to read crawleyobserver.co.uk_73 [Errno 2] No such file or directory: 'output/crawleyobserver.co.uk_73/__urls' skipping
Unable to read thisismoney.co.uk_137 [Errno 2] No such file or directory: 'output/thisismoney.co.uk_137/__urls' skipping
Unable to read themuslimweekly.com_168 [Errno 2] No such file or directory: 'output/themuslimweekly.com_168/__urls' skipping
Unable to read blackburncitizen.co.uk_36 [Errno 2] No such file or directory: 'output/blackburncitizen.co.uk_36/__urls' skipping
Unable to read dissexpress.co.uk_80 [Errno 2] No such file or directory: 'output/dissexpress.co.uk_80/__urls' skipping
Unable to read warringtonguardian.co.uk_144 [Errno 2] No such file or directory: 'output/warringtonguardian.co.uk_144/__urls' skipping
Unable to read anglobalticnews.co.uk_26 [Errno 2] No such file or directory: 'output/anglobalticnews.co.uk_26/__urls' skipping
Unable to read bridlingtonfreepress.co.uk_42 [Errno 2] No such file or directory: 'output/bridlingtonfreepress.co.uk_42/__urls' skipping
Unable to read cityam.com_68 [Errno 2] No such file or directory: 'output/cityam.com_68/__urls' skipping
Unable to read haylingtoday.co.uk_128 [Errno 2] No such file or directory: 'output/haylingtoday.co.uk_128/__urls' skipping
Unable to read ripleyandheanornews.co.uk_166 [Errno 2] No such file or directory: 'output/ripleyandheanornews.co.uk_166/__urls' skipping
Unable to read eveningnews24.co.uk_97 [Errno 2] No such file or directory: 'output/eveningnews24.co.uk_97/__urls' skipping
Unable to read hackneygazette.co.uk_115 [Errno 2] No such file or directory: 'output/hackneygazette.co.uk_115/__urls' skipping
Unable to read gethampshire.co.uk_107 [Errno 2] No such file or directory: 'output/gethampshire.co.uk_107/__urls' skipping
Unable to read watfordobserver.co.uk_175 [Errno 2] No such file or directory: 'output/watfordobserver.co.uk_175/__urls' skipping
Unable to read cambridge-news.co.uk_55 [Errno 2] No such file or directory: 'output/cambridge-news.co.uk_55/__urls' skipping
Unable to read hastingsobserver.co.uk_126 [Errno 2] No such file or directory: 'output/hastingsobserver.co.uk_126/__urls' skipping
Unable to read whtimes.co.uk_158 [Errno 2] No such file or directory: 'output/whtimes.co.uk_158/__urls' skipping
Unable to read stroudnewsandjournal.co.uk_154 [Errno 2] No such file or directory: 'output/stroudnewsandjournal.co.uk_154/__urls' skipping
Unable to read thesundaytimes.co.uk_11 [Errno 2] No such file or directory: 'output/thesundaytimes.co.uk_11/__urls' skipping
Unable to read eveshamjournal.co.uk_98 [Errno 2] No such file or directory: 'output/eveshamjournal.co.uk_98/__urls' skipping
Unable to read kirkintilloch-herald.co.uk_169 [Errno 2] No such file or directory: 'output/kirkintilloch-herald.co.uk_169/__urls' skipping
Unable to read gazetteherald.co.uk_104 [Errno 2] No such file or directory: 'output/gazetteherald.co.uk_104/__urls' skipping
Unable to read navynews.co.uk_23 [Errno 2] No such file or directory: 'output/navynews.co.uk_23/__urls' skipping
Unable to read camdennewjournal.co.uk_57 [Errno 2] No such file or directory: 'output/camdennewjournal.co.uk_57/__urls' skipping
Unable to read blackmorevale.co.uk_37 [Errno 2] No such file or directory: 'output/blackmorevale.co.uk_37/__urls' skipping
Unable to read buckinghamtoday.co.uk_48 [Errno 2] No such file or directory: 'output/buckinghamtoday.co.uk_48/__urls' skipping
Unable to read whitbygazette.co.uk_157 [Errno 2] No such file or directory: 'output/whitbygazette.co.uk_157/__urls' skipping
Unable to read northantstelegraph.co.uk_25 [Errno 2] No such file or directory: 'output/northantstelegraph.co.uk_25/__urls' skipping
Unable to read penarthtimes.co.uk_159 [Errno 2] No such file or directory: 'output/penarthtimes.co.uk_159/__urls' skipping
Unable to read barryanddistrictnews.co.uk_29 [Errno 2] No such file or directory: 'output/barryanddistrictnews.co.uk_29/__urls' skipping
Unable to read derbyshiretimes.co.uk_77 [Errno 2] No such file or directory: 'output/derbyshiretimes.co.uk_77/__urls' skipping
Unable to read macclesfield-express.co.uk_167 [Errno 2] No such file or directory: 'output/macclesfield-express.co.uk_167/__urls' skipping
Unable to read thejc.com_14 [Errno 2] No such file or directory: 'output/thejc.com_14/__urls' skipping
Unable to read hamhigh.co.uk_119 [Errno 2] No such file or directory: 'output/hamhigh.co.uk_119/__urls' skipping
Traceback (most recent call last):
  File "perform_analysis.py", line 1134, in <module>
    main()
  File "perform_analysis.py", line 950, in main
    assert included_url_dict, "No url included after phantom scraping and collection !?"
AssertionError: No url included after phantom scraping and collection !?
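To narrow this down, it may help to check whether any subdirectory of `output/` has a `__urls` file at all. The log above suggests that every site is skipped for lack of one, which would leave `included_url_dict` empty and trip the assertion. The snippet below is only a diagnostic sketch and not part of trackmap: the helper name `split_by_urls_file` is hypothetical, and it assumes one subdirectory per site under `output/` with `__urls` directly inside it, as the paths in the log imply.

```python
import os
import sys


def split_by_urls_file(output_dir, marker="__urls"):
    """Return (present, missing): names of subdirectories of output_dir
    that do and do not contain the marker file.

    Hypothetical helper for debugging; the layout is inferred from the log.
    """
    present, missing = [], []
    for name in sorted(os.listdir(output_dir)):
        path = os.path.join(output_dir, name)
        if not os.path.isdir(path):
            continue  # ignore stray files at the top level
        if os.path.isfile(os.path.join(path, marker)):
            present.append(name)
        else:
            missing.append(name)
    return present, missing


if __name__ == "__main__":
    # Default to "output", the directory named in the log paths.
    target = sys.argv[1] if len(sys.argv) > 1 else "output"
    if os.path.isdir(target):
        present, missing = split_by_urls_file(target)
        print("with __urls: %d, without __urls: %d" % (len(present), len(missing)))
    else:
        print("%s is not a directory" % target)
```

If the count of directories with `__urls` is zero, the assertion is simply the downstream symptom, and the real question is why the scraping step did not write those files.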