Indicia-Team/warehouse

Link to import errors file not shown

Opened this issue · 9 comments

After importing occurrences there is an occasional problem with showing the final page of the import wizard on which there should be a button to download errors.

When it works correctly the web server log looks as follows:

GET /index.php/services/import/upload offset=5000&limit=50&filepos=0&uploaded_csv=1663320770675.csv&model=occurrence&allow_commit_to_db=1 - - 200 0 0 687
POST /index.php/services/security/get_read_write_nonces - 443 - 192.171.199.231 - - 200 0 0 281
POST /index.php/services/import/get_upload_result uploaded_csv=1663320770675.csv - - 200 0 0 312
POST /index.php/occurrence/importer/1 uploaded_csv=1663320770675.csv - - 200 0 0 1000
GET /import/1663320770675-errors.csv - - 200 0 0 156

This shows

  • The ajax request to import the final chunk of the file. When this completes, the html submit request for occurrence/importer/1 is initiated,
  • The web service request for authorization which is sent when showing the importer view,
  • The web service request for results of the import to display,
  • The completion of the request to show the final page of the importer,
  • The user's request to get the error file, linked from the page.

On failure we see the following

GET /index.php/services/import/upload offset=5000&limit=50&filepos=0&uploaded_csv=1663321817618.csv&model=occurrence&allow_commit_to_db=1 - - 200 0 0 890
POST /index.php/services/security/get_read_write_nonces - - 500 0 64 336235
POST /index.php/occurrence/importer/1 uploaded_csv=1663321817618.csv - - 500 0 258 337000
POST /index.php/services/security/get_read_write_nonces - - 500 0 4294967294 203
POST /index.php/occurrence/importer/1 uploaded_csv=1663321817618.csv - - 200 0 0 1640

This shows

  • The request to import the final chunk of the file. When this completes, the request for occurrence/importer/1 is initiated,
  • The web service request for authorization is received which IIS finally kills with a windows error code of 64 (The specified network name is no longer available)
  • The request for the importer page, which had been waiting for the authorization response, is killed by IIS with an error code of 258 (The wait operation timed out.). I suspect this timeout is the trigger which causes the auth request to be terminated but why is the auth request not completing quickly?
  • Another failed attempt to get authorization some seconds later but with a fast response.
  • Some part of the importer wizard but we can't tell which.

In the warehouse error log, at the time of the second auth failure, there is an IIS error message with the following information
Requested URL: https://testwarehouse.indicia.org.uk:443/index.php/services/security/get_read_write_nonces
Module: FastCgiModule
Notification: ExecuteRequestHandler
Handler: php-8.0.22
Error Code: 0xfffffffe

0xfffffffe is the hex representation of 4294967294 seen in the IIS log.

The Windows Event Viewer shows a corresponding error in the Application log from Zend OPcache.

Opcode handlers are unusable due to ASLR. 
Please setup opcache.file_cache and opcache.file_cache_fallback directives for more convenient Opcache usage
Attempt to access invalid address

This second error may just be a consequence of the first.

Unlikely to be relevant but there is a javascript bug in the code to request the final page of the importer which I can fix.

See Indicia-Team/client_helpers@6540061

Experienced in person, rather than reported, this time without the OPcache error.
The IIS log shows

GET /index.php/services/import/upload offset=4000&limit=50&filepos=0&uploaded_csv=1663693302780.csv&model=occurrence&allow_commit_to_db=1- - 200 0 0 3203
POST /index.php/services/security/get_read_write_nonces - - 500 0 64 325271
POST /index.php/occurrence/importer/1 uploaded_csv=1663693302780.csv - - 500 0 258 325666
POST /index.php/services/security/get_read_write_nonces - - 200 0 0 375
POST /index.php/services/import/get_upload_result uploaded_csv=1663693302780.csv - - 200 0 0 309
POST /index.php/occurrence/importer/1 uploaded_csv=1663693302780.csv - - 200 0 0 1240

This shows

  • The final ajax request to import a chunk
  • The timed-out/aborted attempt to get the final page of the importer wizzard.
  • A successful use of the browser refresh button to obtain the page desired.

I enabled the failed request tracing in IIS and captured some more detail, portions of which look as follows

From the request for https://testwarehouse.indicia.org.uk:443/index.php/occurrence/importer/1?uploaded_csv=1663763991820.csv

No. EventName Details Time
67 NOTIFY MODULE START ModuleName="FastCgiModule", Notification="EXECUTE_REQUEST_HANDLER", fIsPostNotification="false" 13:04:49.855
68 FASTCGI ASSIGN PROCESS CommandLine="C:\PHP\v8022\php-cgi.exe", IsNewProcess="false", ProcessId="3340", RequestNumber="83" 13:04:49.855
69 FASTCGI START   13:04:49.855
70 GENERAL READ ENTITY START   13:04:49.855
71 GENERAL READ ENTITY END BytesReceived="1592", ErrorCode="The operation completed successfully. (0x0)" 13:04:49.855
72 GENERAL REQUEST ENTITY Buffer="setting[website_id]=142&setting[survey_id]=354&setting[sample:entered_sref_system]=4326&setting[occurrence:fkFilter:taxa_taxon_list:taxon_list_id]=228&setting[fkFilter:location:location_type_id]=19&mapping[ID]=<Not+imported>&mapping[Location_name]=sample:fk_location&mapping[Plot_ID]=smpAttr:954&mapping[Date_start]=sample:date_start&mapping[Date_end]=sample:date_end&mapping[Date_type]=sample:date_type&mapping[Sample_Technique]=smpAttr:fk_962&mapping[Reported_Sample_Effort]=smpAttr:fk_955&mapping[Sample_Effort_1]=smpAttr:956&mapping[Sample_Effort_Unit_1]=smpAttr:fk_963&mapping[Sample_Effort_2]=smpAttr:965&mapping[Sample_Effort_Unit_2]=smpAttr:fk_964&mapping[Sample_Realm]=smpAttr:fk_957&mapping[Data_Purpose]=smpAttr:fk_958&mapping[Data_Purpose_Information]=smpAttr:959&mapping[Sample_Completeness]=smpAttr:fk_966&mapping[Treatment]=smpAttr:fk_960&mapping[Treatment_Information]=smpAttr:961&mapping[Grid_Ref]=sample:entered_sref&mapping[Count_Type]=occAttr:fk_528&mapping[Count]=occAttr:529&mapping[Taxon_Certainty]=occAttr:fk_531&mapping[Taxon_Name_in_Source]=occAttr:535&mapping[Species_or_Taxon_Name]=occurrence:fk_taxa_taxon_list&mapping[Taxonomy_Notes]=occAttr:536&mapping[Lifestage]=occAttr:fk_532&mapping[Sex]=occAttr:fk_527&mapping[Caste]=occAttr:fk_533&mapping[Sensitive_Data]=occAttr:534&mapping[Sensitivity_Precision]=occurrence:sensitivity_precision&total=4033" 13:04:49.855
73 NOTIFY MODULE COMPLETION ModuleName="FastCgiModule", Notification="EXECUTE_REQUEST_HANDLER", fIsPostNotificationEvent="false", CompletionBytes="1592", ErrorCode="The operation completed successfully. (0x0)" 13:04:49.855
74 GENERAL READ ENTITY START   13:04:49.855
75 GENERAL READ ENTITY END BytesReceived="0", ErrorCode="Reached the end of the file. (0x80070026)" 13:04:49.855
76 FASTCGI WAITING FOR RESPONSE   13:04:49.855
 

From the request for https://testwarehouse.indicia.org.uk:443/index.php/services/security/get_read_write_nonces

No. EventName Details Time
68 NOTIFY MODULE START ModuleName="FastCgiModule", Notification="EXECUTE_REQUEST_HANDLER", fIsPostNotification="false" 13:04:50.182
69 FASTCGI QUEUE REQUEST PositionInQueue="0", MaxInstances="0" 13:04:50.182
70 FASTCGI ASSIGN PROCESS CommandLine="C:\PHP\v8022\php-cgi.exe", IsNewProcess="true", ProcessId="5816", RequestNumber="1" 13:12:15.385
71 FASTCGI START   13:12:15.385
72 NOTIFY MODULE COMPLETION ModuleName="FastCgiModule", Notification="EXECUTE_REQUEST_HANDLER", fIsPostNotificationEvent="false", CompletionBytes="14", ErrorCode="The operation completed successfully. (0x0)" 13:12:15.385
73 GENERAL READ ENTITY START   13:12:15.385
74 GENERAL READ ENTITY END BytesReceived="0", ErrorCode="The specified network name is no longer available. (0x80070040)" 13:12:15.385
75 MODULE SET RESPONSE ERROR STATUS ModuleName="FastCgiModule", Notification="EXECUTE_REQUEST_HANDLER", HttpStatus="500", HttpReason="Internal Server Error", HttpSubStatus="0", ErrorCode="The specified network name is no longer available. (0x80070040)", ConfigExceptionInfo="" 13:12:15.385
76 NOTIFY MODULE END ModuleName="FastCgiModule", Notification="EXECUTE_REQUEST_HANDLER", fIsPostNotificationEvent="false", NotificationStatus="NOTIFICATION_FINISH_REQUEST" 13:12:15.385

Which is then followed by the timeout of the original request.

No. EventName Details Time
77 FASTCGI REQUEST TIMEOUT   13:12:15.386
78 SET RESPONSE ERROR DESCRIPTION ErrorDescription="C:\PHP\v8022\php-cgi.exe - The FastCGI process exceeded configured request timeout" 13:12:15.386
79 MODULE SET RESPONSE ERROR STATUS ModuleName="FastCgiModule", Notification="EXECUTE_REQUEST_HANDLER", HttpStatus="500", HttpReason="Internal Server Error", HttpSubStatus="0", ErrorCode="The wait operation timed out. (0x80070102)", ConfigExceptionInfo="" 13:12:15.386
80 NOTIFY MODULE END ModuleName="FastCgiModule", Notification="EXECUTE_REQUEST_HANDLER", fIsPostNotificationEvent="false", NotificationStatus="NOTIFICATION_FINISH_REQUEST" 13:12:15.386
 

I captured another example with a different sequence of timings from the order above. They were consistent with the following explanation of events.

  1. The request for occurrence/importer/1 is received
  2. It issues a request for services/security/get_read_write_nonces
  3. This request is queued (FASTCGI_QUEUE_REQUEST). I.e. it is blocked
  4. The request for occurrence/importer/1 times out and is aborted
  5. The request for services/security/get_read_write_nonces becomes unblocked, completes successfully, but cannot return a response as its requester has gone away.

At the time the request was blocked, there was only one php-cgi.exe process running. On occasions where the problem is not observed, a second php-cgi.exe process is seen to be started. You can only have as many concurrent requests as you have fast cgi processes. The maximum number of fast cgi processes is configured by the Max Instances setting. This is currently set to 0 meaning that the server will adjust automatically according to demand and resources.

It seems that the conditions in which the fault arises are

  1. The web site is receiving little or no other traffic so the number of php-cgi.exe processes drops to one.
  2. The server is using a lot of resources for something else (scheduled tasks, notably) so that is not inclined to start a second cgi process when it is needed.

I have set Max Instances in the FastCGI settings to 4 (the default value) to see if that alleviates the situation. While no information is available to back this up, my reasoning is that, when not using the special setting of 0, the server will start additional instances of php-cgi.exe more willingly.

has there been any solution to this: I am facing an exactly similar problem.

The proposed solution was to set Max Instances to 4 in the FastCGI settings. Looking at our server, warehouse1.indicia.org.uk, which perhaps you are using, I can see that the setting has reverted to 0. I have put it back to 4.

I've remembered that I previously made the change to Max Instances on testwarehouse.indicia.org.uk which has more limited resources. I'd be grateful to hear of any evidence, however anecdotal, as to whether this

  1. helps upload
  2. reduces performance

On 1st February we reverted warehouse1.indicia.org.uk back to MaxInstances = 0 because the warehouse user interface was responding very slowly despite the server not being under obvious heavy load. This seemed to improve performance.