pypa/build

Python 3 -m Build Unable to Parse PowerShell Generated pyproject.toml Using Out-File on Windows

hollowaykeanho opened this issue · 2 comments

Bug report

Bug description:

This ticket is for tracking purposes in case anyone bumps into the same issue. Please resolve it accordingly (whether by confirming it as a bug or closing it immediately).

For an unknown reason, a pyproject.toml generated by PowerShell with the UTF8 encoder ($__content | Out-File -FilePath "pyproject.toml" -Encoding UTF8) cannot be parsed by the Python build command (python -m build --sdist --wheel .). On Windows, it consistently throws this error even though the file looks healthy (see pyproject.windows in the uploaded artifact):

* Creating venv isolated environment...
ERROR Failed to parse D:\a\AutomataCI\AutomataCI\tmp\pypi_automataci-src_1.5.0_windows-amd64\.\pyproject.toml: Invalid statement (at line 1, column 1) 

This problem is not seen on UNIX-like systems such as Ubuntu and macOS (which use their native printf "..." > "pyproject.toml" method).

The problem also does not occur when the file is generated in PowerShell with Set-Content -Path "pyproject.toml" -Value $__content. Hence, please use Set-Content to work around the issue.

How to reproduce

  1. Unpack this CI workspace artifact. sample.tar.gz
  2. Rename or copy one of the pyproject files to pyproject.toml.
  3. Change into the directory and run python -m build --sdist --wheel . on each of the 3 OSes.
  4. Observe. Only the Windows version throws an error.

References

GitHub Actions OS matrices run (Observe PACKAGE step)

  1. https://github.com/corygalyna/AutomataCI/actions/runs/6241976144
  2. https://github.com/corygalyna/AutomataCI/actions/runs/6242157791
  3. https://github.com/corygalyna/AutomataCI/actions/runs/6241903843/job/16944903291
  4. https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.utility/out-file
  5. https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.management/set-content
  6. python/cpython#109588

CPython versions tested on:

3.12

Operating systems tested on:

Linux, macOS, Windows

layday commented

The issue here is that the UTF-8 encoding in PowerShell prepends a BOM; the equivalent encoding in Python is utf-8-sig, which will strip the BOM on read. UTF-8 with BOM is very rarely used in Python, so even if we were to support it (this would mean decoding the TOML file in build rather than tomli/tomllib), it's unlikely other Python packaging tools will follow suit.

@layday, this actually happens even if I switch between UTF-8 with and without a BOM. Either way, I believe this is a PowerShell problem related to the Out-File cmdlet.

Currently, I have changed from:

$__content | Out-File -FilePath "pyproject.toml" -Encoding UTF8

to:

$null = Add-Content -Path "pyproject.toml" -Value $__content
# OR
$null = Set-Content -Path "pyproject.toml" -Value $__content

and the problem is gone. As the title says, this is only for the record, for Windows users who encounter this issue. IMO, neither Python 3 nor PyPA is to blame here. Cheers, and thanks for closing the matter.
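
Since the thread leaves open which bytes each cmdlet actually wrote, one way to check a generated file is to look at its first few bytes. The following is only a rough sketch: the function name detect_bom and the returned labels are made up for illustration, and it reports a byte-order mark only, not the encoding of the rest of the file.

```python
import codecs

# Byte-order marks to look for. The UTF-32 marks are listed first because the
# UTF-32 LE mark begins with the same two bytes as the UTF-16 LE mark.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(path):
    """Return a label for the BOM at the start of the file, or None if there is none."""
    with open(path, "rb") as fh:
        head = fh.read(4)  # the longest mark checked here is 4 bytes
    for bom, label in _BOMS:
        if head.startswith(bom):
            return label
    return None


if __name__ == "__main__":
    import sys

    # Usage: python detect_bom.py pyproject.toml
    for name in sys.argv[1:]:
        print(name, detect_bom(name))
```

Running this against the files produced by Out-File and Set-Content would show whether the two differ in their leading bytes.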