Version 2_4_0 workbook.load
Closed this issue · 11 comments
Bug description
function workbook.load(filename.text) in version 2-4-0 goes to infinite loop
previous version 2_3_3 load same file properly
Environment (please complete the following information):
- OS: [e.g. Windows 10, latest version]
- .NET framework [e.g. .NET Standard 4.8]
- Your system locale en-US
- Version of NanoXLSX [e.g. 2.4.0]
Hi,
Thanks for the report.
Can you give me some further details:
What is in filename.text
? Is this a renamed XLSX file, an old XLS file (also renamed) or a plain text?
What happens during the infinit loop? Do you get a stack overflow, is the process just not ending, or are there other events that happens (e.g. a crash, memory consumption is going up, 100% CPU usage)?
Thank you in advance for clarification.
Thank for the answer.
Really it seems that will be a challenge for You !
this is the piece of code in my large program that recall the load function :
//
// Verifico se posso aprire il file xlsx
//
workbook=new Workbook();
try
{
workbook = Workbook.Load(SelectedFileXLSX.Text);
}
catch (Exception ex)
{
MessageBox.Show("errore apertura file xls - Aperto in Excel?");
return;
}
My application was tested since the first release of NanoXlsx and this piece of code works with the release 2_3_3
What is strange is that if use the demo program , just patching the Basic demo function 👍
private static void BasicDemo()
{
Workbook workbook;
workbook = Workbook.Load("E:\DownLoad\F_2024_07_18.xlsx");
it works.
When in my program the load stops it remain just in some timing loops , no disc activity,
no memory activity.
About the file I'm sending the file xlsx.
this is a file that time to time is updated , new lines eventually inserted, other updates and is the same
since many months from now.
In this file the only field that is particular is the field DATA FINE that when 'blank' l for the
application mean 'undefined date'.
I dont have unfortunally any other info.
Best regards
--> Attached file removed
Hi,
Thank you for the update.
I testes the reader with the attached XLSX file and your code.
The file could be read here without any issues. I even renamed the file to the extension .Text (no problem to read too).
It should also be no general problem, otherwise several dozens to hundred of unit tests would possibly fail.
But here are some things you could test:
- Try to read the file first as Stream and then read this Stream with NanoXLSX, instead of the file directly (use
public static Workbook Load(Stream stream, ImportOptions options = null)
in the Workbook class) - Extremely important: Make sure the file is not open in Excel or even in a preview in Windows Explorer. Excel is notorious for locking files. And it's very hard to get even read permission, once the file was locked
- Try to read an empty or very short version of your Excel file. Could it be a problem because of a huge data amount?
- Ensure that your file does not contain invalid/corrupting characters (like NUL, ESC in texts etc.). NanoXLSX is validating a lot of things but there may be something missing. This can be tested with a new Excel file or if you gradually remove content of the file that is blocking, until it suddenly can be read
- Roll back the Package in the NuGet manager to the last version ant try to execute your program without any changes. Is really just 2.4.0 blocking or suddenly the older version too?
- Try to set a different .NET version. I tested with .NET 4.5, .NET 5.0 and .NET Core 2.0
- Check the location where your file is stored. Ensure that no other stream is trying to access it during the read attempt (e.g. a Virus Scanner, a deployment process or script)
Two other additions:
- I have deleted the test file in your last comment since it looked like it should not be in the public
- If you could post the program logic you are using to read, maybe we could figure out the problem too. As you mentioned, if read with the demo code, it seems to work
I tried all Yr suggestion , and I debug my application using version 2_4_0.
The problem is in the function
private async Task ReadInternal() in module XlsxReader.cs
that was implemented starting from 2_4_0 in lines
await fs.CopyToAsync(memoryStream);
await inputStream.CopyToAsync(memoryStream);
since the CopyToAsyn stay in loop without exiting.
Just changing the above 2 lines avoiding async copying ,
ie
fs.CopyTo(memoryStream);
inputStream.CopyTo(memoryStream);
the application now run properly.
The CopyToAsync although nice programming pratice I'snt (for my opinion) unnecessary as included
in another async Task ReadInternal and does not produce noticeable speed effects.
Waiting for comments
Regards
There is probably a deadlock issue with the stream you are using.
It seems that Task.GetAwaiter().GetResult()
can lead to deadlocks under certain circumstances. Using this approach is BTW not a matter of speed but to have a unified platform. This attempt works in the majority of use cases. However, I have to investigate the deadlock cases of course.
For that, it is important to know whether you are already in a asynchronous state or in a tread, when calling the Workbook.Read
method.
Can you somehow provide the code, how you read files? Otherwise, it is very hard to analyze the issue to find a solution that fixes the problem and do not affect the setup of other users.
Thanks in advance.
I'm sending small application I was able to reproduce the problem
c# Visual studio community 2022
Hope this will help
regards
Test_For_DeadLock.zip
Thank you for the code.
I think I have found a solution for you. Since you are with WinForms already in a threaded context, you can (an probably should) indeed used the async methods. This should also be beneficial to keep the UI responsive, if I'm not wrong (sorry, my last WinForms project is some years back).
So, this code should help:
if (SelectedFileXLSX.Text.Trim() != "")
{
try
{
workbook = await Task.Run(() => Workbook.LoadAsync(SelectedFileXLSX.Text));
}
catch (Exception ex)
{
MessageBox.Show("errore apertura file xls - Aperto in Excel?");
return;
}
MessageBox.Show("DONE");
}
Can you try this out?
Thanks for Yr effort to explain the problem.
Yr solution works on my example if inserted in the dependent Form, but does not give a complete answer
to question I opened.
In fact I would like now an explanation why using the Demo program the Load method works !
Thanks
Best regards
Because the demo program (assumed executed in this class) runs in a single thread. The demo program is started, it runs in the main thread and ends. The main Thread is blocking for a short time during the process.
A WinForms project (same for WPF) uses a ThreadPool and has normally at least two threads. One of them is the UI thread. This is also the reason why cross-form actions must be invoked. The used ThreadPool constellation seems to have an issue with GetAwaiter().GetResult()
. I still have to figure out how to handle this. Even it is highly encouraged to use the async methods in a multi-thread environment, it should at least throw an exception or run into a timeout if used in a blocking manner. I try to fix this in a future release. Maybe I also add a note in the API-doc / readme to avoid using non-async methods in UI projects
This problem should now be solved with the new version v2.5.2.
The Nuget package should be available in some minutes.
Thanks again for the report