microsoft/PubSec-Info-Assistant

Upload of Json file hanging

rashodqaim opened this issue · 5 comments

I uploaded a JSON file on Friday, March 15th, and it still has not loaded. The file's status says Queued. Here is the log from Cosmos DB:

{
"id": "dXBsb2FkL0NsaW5pY2FsX3RyYWlscy9jbGluaWNhbF90cmFpbHMuanNvbg==",
"file_path": "upload/Clinical_trails/clinical_trails.json",
"file_name": "clinical_trails.json",
"state": "Queued",
"start_timestamp": "2024-03-15 18:55:59",
"state_description": "",
"state_timestamp": "2024-03-15 19:26:35",
"status_updates": [
{
"status": "File uploaded from browser to Azure Blob Storage",
"status_timestamp": "2024-03-15 18:55:59",
"status_classification": "Info"
},
{
"status": "Pipeline triggered by Blob Upload",
"status_timestamp": "2024-03-15 18:56:09",
"status_classification": "Info"
},
{
"status": "FileUploadedFunc - FileUploadedFunc function started",
"status_timestamp": "2024-03-15 18:56:09",
"status_classification": "Debug"
},
{
"status": "FileUploadedFunc - json file sent to submit queue. Visible in 249 seconds",
"status_timestamp": "2024-03-15 18:56:09",
"status_classification": "Debug"
},
{
"status": "FileLayoutParsingOther - Starting to parse the non-PDF file",
"status_timestamp": "2024-03-15 19:00:43",
"status_classification": "Info"
},
{
"status": "FileLayoutParsingOther - Message received from non-pdf submit queue",
"status_timestamp": "2024-03-15 19:00:43",
"status_classification": "Debug"
},
{
"status": "FileLayoutParsingOther - SAS token generated to access the file",
"status_timestamp": "2024-03-15 19:00:43",
"status_classification": "Debug"
},
{
"status": "FileLayoutParsingOther - partitioning complete",
"status_timestamp": "2024-03-15 19:03:49",
"status_classification": "Debug"
},
{
"status": "FileLayoutParsingOther - chunking complete. 19021 chunks created",
"status_timestamp": "2024-03-15 19:03:52",
"status_classification": "Debug"
},
{
"status": "FileLayoutParsingOther - chunking stored.",
"status_timestamp": "2024-03-15 19:26:35",
"status_classification": "Debug"
},
{
"status": "FileLayoutParsingOther - message sent to enrichment queue",
"status_timestamp": "2024-03-15 19:26:35",
"status_classification": "Debug"
}
],
"_rid": "5hNMAOg52NbbAAAAAAAAAA==",
"_self": "dbs/5hNMAA==/colls/5hNMAOg52NY=/docs/5hNMAOg52NbbAAAAAAAAAA==/",
"_etag": "\"13006aeb-0000-0100-0000-65f4a0eb0000\"",
"_attachments": "attachments/",
"_ts": 1710530795
}
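To see where the time went in a status document like the one above, the gaps between consecutive `status_updates` entries can be computed. The following is a minimal sketch, not part of Information Assistant; the function name `status_gaps` is hypothetical, and the sample holds only the last three entries from the log above. It assumes the timestamp format shown in the log.

```python
from datetime import datetime

# Timestamp format as it appears in the status log above.
TS_FORMAT = "%Y-%m-%d %H:%M:%S"


def status_gaps(doc):
    """Return (seconds since previous update, status text) pairs in time order."""
    updates = sorted(
        doc["status_updates"],
        key=lambda u: datetime.strptime(u["status_timestamp"], TS_FORMAT),
    )
    gaps = []
    previous = None
    for update in updates:
        ts = datetime.strptime(update["status_timestamp"], TS_FORMAT)
        # The first entry has no predecessor, so its gap is reported as 0.
        gap = 0.0 if previous is None else (ts - previous).total_seconds()
        gaps.append((gap, update["status"]))
        previous = ts
    return gaps


# Last three entries of the log above, used here as sample input.
sample = {
    "status_updates": [
        {"status": "FileLayoutParsingOther - chunking complete. 19021 chunks created",
         "status_timestamp": "2024-03-15 19:03:52"},
        {"status": "FileLayoutParsingOther - chunking stored.",
         "status_timestamp": "2024-03-15 19:26:35"},
        {"status": "FileLayoutParsingOther - message sent to enrichment queue",
         "status_timestamp": "2024-03-15 19:26:35"},
    ]
}

for gap, status in status_gaps(sample):
    print(f"{gap:8.0f}s  {status}")
```

On this log, the largest gap is the roughly 23 minutes spent storing the 19,021 chunks, and the last entry is the message being sent to the enrichment queue, with nothing logged afterwards.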

Can you please check whether the enrichment app has started?
Is it possible for you to share the JSON file via email rather than posting it here?
What is the size of the file?

Thank you

How do I check if the enrichment app has started? Yes, I can email the file if you would like. Please send me your email address and I will send you the file. Thank you so much!

You can send the sample file to isat-support@microsoft.com. Please cite this ticket number in the email.

It appears that the messages are not getting picked up from the "enrichment_queue". You can investigate errors with the Azure Functions or the Embeddings WebApp using the Azure Workbook provided in the deployed resource group. Look for a resource of type Azure Workbook and open it to view the Application Logs for many of the services running in Information Assistant. Most likely, a more detailed error or call stack will be available there.

You may need to upload the file again to reproduce the error to get a recent log entry.
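In addition to the Workbook, one way to confirm that messages are sitting unprocessed is to look at the storage queue directly. The sketch below is an assumption-laden illustration, not a documented Information Assistant procedure: it uses the `azure-storage-queue` Python package, and the connection string and queue name are parameters you would need to look up in your own deployment. The helper names `fetch_snapshot` and `oldest_age_seconds` are hypothetical.

```python
from datetime import datetime, timezone


def fetch_snapshot(conn_str, queue_name, peek=5):
    """Return (approximate message count, insertion times of a few visible messages)."""
    # Imported lazily so the pure helper below works without the SDK installed.
    from azure.storage.queue import QueueClient

    client = QueueClient.from_connection_string(conn_str, queue_name)
    count = client.get_queue_properties().approximate_message_count
    # Peeking does not dequeue messages or hide them from the real consumer.
    inserted = [m.inserted_on for m in client.peek_messages(max_messages=peek)]
    return count, inserted


def oldest_age_seconds(inserted, now):
    """Age in seconds of the oldest peeked message, or None if nothing was peeked."""
    if not inserted:
        return None
    return (now - min(inserted)).total_seconds()


# Example usage (both arguments are placeholders for your deployment's values):
# count, inserted = fetch_snapshot(conn_str, queue_name)
# print(count, oldest_age_seconds(inserted, datetime.now(timezone.utc)))
```

A non-zero message count whose oldest message keeps aging would be consistent with the consumer not picking messages up; an empty queue would point the investigation elsewhere.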

I have sent the email with the files that are not uploading. Please let me know if there is anything else that you need from me.

Closing this as it seems the issue is related to large CSV processing, which was fixed in a recent push to main. If you encounter this again, pull from main, rebuild the functions app, and resubmit the files.