eiceblue/Spire.Doc-for-.NET

Add a way to set a buffer limit when reading

I am trying to use the library to convert a file stream to text. Here is my current code:

using System.IO;
using System.Threading.Tasks;
using Spire.Doc;

public async Task<string> GetTextAsync(string path, Stream fileStream)
{
    try
    {
        using var doc = new Document();
        doc.LoadFromStream(fileStream, FileFormat.Auto);

        // The full plain-text output is buffered in memory here.
        using var memoryStream = new MemoryStream();
        doc.SaveToStream(memoryStream, FileFormat.Txt);

        // Rewind so the reader starts from the first byte.
        memoryStream.Seek(0, SeekOrigin.Begin);

        using var reader = new StreamReader(memoryStream);
        return await reader.ReadToEndAsync();
    }
    catch
    {
        // Any load or conversion failure yields an empty result.
        return string.Empty;
    }
}

The problem with the above code is that fileStream may come from a small compressed file. Once it is converted and saved with doc.SaveToStream(memoryStream, FileFormat.Txt), the resulting text stream can be very large, potentially larger than the available memory, which will crash the system.

I suggest adding a configuration option that sets a limit on the buffer size, so that the generated stream never exceeds the maximum allowed size. The default can be null so that existing users are not affected. If the limit is set, the new stream should not exceed it. This could cut the conversion short, but it is safer to stop processing the file than to exhaust memory. If the conversion can proceed, it would return all the text that fits in the defined buffer. This could result in a partial conversion, so maybe two options:

  1. Max Memory Allowed
  2. ConversionBehavior (Complete, the default, or Allow partials), so the user can define what happens when the max limit is reached.
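
To make the two options concrete, here is a minimal sketch of how they could behave, written as a wrapper around the output stream. LimitedWriteStream and LimitBehavior are hypothetical names that I made up for illustration; they are not part of Spire.Doc. The sketch only caps what is written to the target stream. It assumes the library writes its output incrementally and accepts a non-seekable stream, which I have not verified, and it does not limit whatever memory the library uses internally during conversion.

```csharp
using System;
using System.IO;

// Hypothetical enum mirroring the proposed "ConversionBehavior" option.
public enum LimitBehavior { Complete, AllowPartial }

// Hypothetical wrapper: forwards writes to an inner stream until maxBytes is reached.
public sealed class LimitedWriteStream : Stream
{
    private readonly Stream _inner;
    private readonly long _maxBytes;
    private readonly LimitBehavior _behavior;
    private long _written;

    public LimitedWriteStream(Stream inner, long maxBytes, LimitBehavior behavior)
    {
        _inner = inner ?? throw new ArgumentNullException(nameof(inner));
        if (maxBytes < 0) throw new ArgumentOutOfRangeException(nameof(maxBytes));
        _maxBytes = maxBytes;
        _behavior = behavior;
    }

    // True once any bytes have been dropped in AllowPartial mode.
    public bool Truncated { get; private set; }

    public override void Write(byte[] buffer, int offset, int count)
    {
        long remaining = _maxBytes - _written;
        if (count > remaining)
        {
            // Complete: fail instead of returning partial output.
            if (_behavior == LimitBehavior.Complete)
                throw new IOException($"Output exceeded the {_maxBytes}-byte limit.");

            // AllowPartial: keep only what still fits and drop the rest.
            Truncated = true;
            count = (int)remaining;
        }

        if (count > 0)
        {
            _inner.Write(buffer, offset, count);
            _written += count;
        }
    }

    public override bool CanRead => false;
    public override bool CanSeek => false;
    public override bool CanWrite => true;
    public override long Length => _written;
    public override long Position
    {
        get => _written;
        set => throw new NotSupportedException();
    }
    public override void Flush() => _inner.Flush();
    public override int Read(byte[] buffer, int offset, int count) => throw new NotSupportedException();
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
}
```

In my code above, the call would then look like doc.SaveToStream(new LimitedWriteStream(memoryStream, maxBytes, LimitBehavior.AllowPartial), FileFormat.Txt), where maxBytes is whatever limit the caller chooses. Note that the cut happens at a byte boundary, so in AllowPartial mode the last character of the text may be incomplete if the encoding is multi-byte.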

Hello,

For handling larger documents, we still recommend increasing your system memory. By default, we convert the entire file data as a whole. We currently have no plans to set constraints during the conversion process to limit the size of the file stream.

If you need us to conduct a deeper investigation, please send your input file to support@e-iceblue.com.