pypa/setuptools

[FR] Preserve the use of spaces/tabs in setup.cfg/setup.py in sdist tarball

Opened this issue · 8 comments

dvzrv commented

setuptools version

65.1.1

Python version

3.10.8

OS

Arch Linux

Additional environment information

I have noticed this issue several times already when trying to apply patches for Arch Linux packages, that I supplied to upstream myself (e.g. for the mailman ecosystem). I had to basically write the patches all over again or switch to git sources to be able to apply them, which means a lot of unnecessary overhead (or may not be possible if we are relying on PGP validated source tarballs).

Description

Using python setup.py sdist, setuptools unconditionally adds tabs to setup.cfg and setup.py in the sdist tarball.

From a downstream perspective, this is extremely tedious behavior, as it requires to adapt/backport all patches that touch those files (e.g. version updates or other data in those files).

Expected behavior

Setuptools does not touch the indentation of setup.cfg and setup.py (or any other file for that matter) when creating sdist tarballs.

How to Reproduce

  1. git clone https://github.com/pycontribs/selinux
  2. cd selinux
  3. python setup.py sdist
  4. tar -zxvf dist/selinux-*.tar.gz selinux-*/setup.cfg (you'll have to choose a more specific dir in the 2nd argument!)
  5. diff -ruN selinux-*/setup.cfg setup.cfg

Output

/usr/lib/python3.10/site-packages/setuptools/config/setupcfg.py:508: SetuptoolsDeprecationWarning: The license_file parameter is deprecated, use license_files instead.
  warnings.warn(msg, warning_class)
/usr/lib/python3.10/site-packages/setuptools/installer.py:27: SetuptoolsDeprecationWarning: setuptools.installer is deprecated. Requirements should be satisfied by a PEP 517 installer.
  warnings.warn(
/usr/lib/python3.10/site-packages/setuptools/config/setupcfg.py:508: SetuptoolsDeprecationWarning: The license_file parameter is deprecated, use license_files instead.
  warnings.warn(msg, warning_class)
running sdist
running egg_info
writing selinux.egg-info/PKG-INFO
writing dependency_links to selinux.egg-info/dependency_links.txt
writing requirements to selinux.egg-info/requires.txt
writing top-level names to selinux.egg-info/top_level.txt
adding license file 'LICENSE'
writing manifest file 'selinux.egg-info/SOURCES.txt'
running check
creating selinux-0.1.dev73+ge90f38e
creating selinux-0.1.dev73+ge90f38e/.github
creating selinux-0.1.dev73+ge90f38e/.github/workflows
creating selinux-0.1.dev73+ge90f38e/selinux
creating selinux-0.1.dev73+ge90f38e/selinux.egg-info
creating selinux-0.1.dev73+ge90f38e/tests
creating selinux-0.1.dev73+ge90f38e/tests/roles
creating selinux-0.1.dev73+ge90f38e/tests/roles/ensure_ansible
creating selinux-0.1.dev73+ge90f38e/tests/roles/ensure_ansible/defaults
creating selinux-0.1.dev73+ge90f38e/tests/roles/ensure_ansible/meta
creating selinux-0.1.dev73+ge90f38e/tests/roles/ensure_ansible/molecule
creating selinux-0.1.dev73+ge90f38e/tests/roles/ensure_ansible/molecule/default
creating selinux-0.1.dev73+ge90f38e/tests/roles/ensure_ansible/tasks
creating selinux-0.1.dev73+ge90f38e/tests/roles/ensure_ansible/vars
creating selinux-0.1.dev73+ge90f38e/tools
creating selinux-0.1.dev73+ge90f38e/zuul.d
copying files to selinux-0.1.dev73+ge90f38e...
copying .ansible-lint -> selinux-0.1.dev73+ge90f38e
copying .flake8 -> selinux-0.1.dev73+ge90f38e
copying .gitignore -> selinux-0.1.dev73+ge90f38e
copying .pre-commit-config.yaml -> selinux-0.1.dev73+ge90f38e
copying LICENSE -> selinux-0.1.dev73+ge90f38e
copying README.rst -> selinux-0.1.dev73+ge90f38e
copying ansible.cfg -> selinux-0.1.dev73+ge90f38e
copying pyproject.toml -> selinux-0.1.dev73+ge90f38e
copying setup.cfg -> selinux-0.1.dev73+ge90f38e
copying setup.py -> selinux-0.1.dev73+ge90f38e
copying tox.ini -> selinux-0.1.dev73+ge90f38e
copying .github/FUNDING.yml -> selinux-0.1.dev73+ge90f38e/.github
copying .github/release-drafter.yml -> selinux-0.1.dev73+ge90f38e/.github
copying .github/workflows/release-drafter.yml -> selinux-0.1.dev73+ge90f38e/.github/workflows
copying selinux/__init__.py -> selinux-0.1.dev73+ge90f38e/selinux
copying selinux.egg-info/PKG-INFO -> selinux-0.1.dev73+ge90f38e/selinux.egg-info
copying selinux.egg-info/SOURCES.txt -> selinux-0.1.dev73+ge90f38e/selinux.egg-info
copying selinux.egg-info/dependency_links.txt -> selinux-0.1.dev73+ge90f38e/selinux.egg-info
copying selinux.egg-info/requires.txt -> selinux-0.1.dev73+ge90f38e/selinux.egg-info
copying selinux.egg-info/top_level.txt -> selinux-0.1.dev73+ge90f38e/selinux.egg-info
copying selinux.egg-info/zip-safe -> selinux-0.1.dev73+ge90f38e/selinux.egg-info
copying tests/roles/ensure_ansible/defaults/main.yml -> selinux-0.1.dev73+ge90f38e/tests/roles/ensure_ansible/defaults
copying tests/roles/ensure_ansible/meta/main.yml -> selinux-0.1.dev73+ge90f38e/tests/roles/ensure_ansible/meta
copying tests/roles/ensure_ansible/molecule/Dockerfile.j2 -> selinux-0.1.dev73+ge90f38e/tests/roles/ensure_ansible/molecule
copying tests/roles/ensure_ansible/molecule/default/molecule.yml -> selinux-0.1.dev73+ge90f38e/tests/roles/ensure_ansible/molecule/default
copying tests/roles/ensure_ansible/molecule/default/playbook.yml -> selinux-0.1.dev73+ge90f38e/tests/roles/ensure_ansible/molecule/default
copying tests/roles/ensure_ansible/tasks/main.yml -> selinux-0.1.dev73+ge90f38e/tests/roles/ensure_ansible/tasks
copying tests/roles/ensure_ansible/vars/centos-7.yml -> selinux-0.1.dev73+ge90f38e/tests/roles/ensure_ansible/vars
copying tests/roles/ensure_ansible/vars/centos-8.yml -> selinux-0.1.dev73+ge90f38e/tests/roles/ensure_ansible/vars
copying tests/roles/ensure_ansible/vars/debian.yml -> selinux-0.1.dev73+ge90f38e/tests/roles/ensure_ansible/vars
copying tests/roles/ensure_ansible/vars/redhat-8.yml -> selinux-0.1.dev73+ge90f38e/tests/roles/ensure_ansible/vars
copying tests/roles/ensure_ansible/vars/redhat.yml -> selinux-0.1.dev73+ge90f38e/tests/roles/ensure_ansible/vars
copying tools/test-setup.sh -> selinux-0.1.dev73+ge90f38e/tools
copying zuul.d/layout.yaml -> selinux-0.1.dev73+ge90f38e/zuul.d
Writing selinux-0.1.dev73+ge90f38e/setup.cfg
creating dist
Creating tar archive
removing 'selinux-0.1.dev73+ge90f38e' (and everything under it)

selinux-0.1.dev73+ge90f38e/setup.cfg

--- selinux-0.1.dev73+ge90f38e/setup.cfg	2022-11-12 11:30:49.269458000 +0100
+++ setup.cfg	2022-11-12 11:30:34.955041392 +0100
@@ -7,49 +7,50 @@
 [metadata]
 name = selinux
 url = https://github.com/pycontribs/selinux
-project_urls = 
-	Bug Tracker = https://github.com/pycontribs/selinux/issues
-	Release Management = https://github.com/pycontribs/selinux/releases
-	CI = https://dashboard.zuul.ansible.com/t/ansible/builds?project=pycontribs/selinux
-	Source Code = https://github.com/pycontribs/selinux
+project_urls =
+    Bug Tracker = https://github.com/pycontribs/selinux/issues
+    Release Management = https://github.com/pycontribs/selinux/releases
+    CI = https://dashboard.zuul.ansible.com/t/ansible/builds?project=pycontribs/selinux
+    Source Code = https://github.com/pycontribs/selinux
 description = shim selinux module
 long_description = file: README.rst
 long_description_content_type = text/x-rst; charset=UTF-8
-history = file: HISTORY.rst
+
+history =  file: HISTORY.rst
 author = Sorin Sbarnea
 author_email = sorin.sbarnea@gmail.com
 maintainer = Sorin Sbarnea
 maintainer_email = sorin.sbarnea@gmail.com
 license = MIT license
 license_file = LICENSE
-classifiers = 
-	Development Status :: 5 - Production/Stable
-	
-	Environment :: Console
-	
-	Intended Audience :: Developers
-	Intended Audience :: Information Technology
-	Intended Audience :: System Administrators
-	
-	License :: OSI Approved :: MIT License
-	
-	Natural Language :: English
-	
-	Operating System :: OS Independent
-	
-	Programming Language :: Python :: 2
-	Programming Language :: Python :: 2.7
-	Programming Language :: Python :: 3
-	Programming Language :: Python :: 3.5
-	Programming Language :: Python :: 3.6
-	Programming Language :: Python :: 3.7
-	Programming Language :: Python :: 3.8
-	
-	Topic :: System :: Systems Administration
-	Topic :: Utilities
-keywords = 
-	selinux
-	virtualenv
+classifiers =
+    Development Status :: 5 - Production/Stable
+
+    Environment :: Console
+
+    Intended Audience :: Developers
+    Intended Audience :: Information Technology
+    Intended Audience :: System Administrators
+
+    License :: OSI Approved :: MIT License
+
+    Natural Language :: English
+
+    Operating System :: OS Independent
+
+    Programming Language :: Python :: 2
+    Programming Language :: Python :: 2.7
+    Programming Language :: Python :: 3
+    Programming Language :: Python :: 3.5
+    Programming Language :: Python :: 3.6
+    Programming Language :: Python :: 3.7
+    Programming Language :: Python :: 3.8
+
+    Topic :: System :: Systems Administration
+    Topic :: Utilities
+keywords =
+    selinux
+    virtualenv
 
 [options]
 use_scm_version = True
@@ -57,17 +58,13 @@
 packages = find:
 include_package_data = True
 zip_safe = True
-install_requires = 
-	distro>=1.3.0
-	setuptools>=39.0
-setup_requires = 
-	setuptools_scm >= 1.15.0
-	setuptools_scm_git_archive >= 1.0
+install_requires =
+    distro>=1.3.0
+    setuptools>=39.0
 
+# These are required during `setup.py` run:
+setup_requires =
+    setuptools_scm >= 1.15.0
+    setuptools_scm_git_archive >= 1.0
 [options.packages.find]
 where = .
-
-[egg_info]
-tag_build = 
-tag_date = 0
-

Hi @dvzrv, thank you very much for reporting this.

I believe the following is happening:

  • Setuptools internally uses its egg_info command as a step to create a sdist.
  • egg_info uses configparser.ConfigParser.write from Python's standard library, to store build-time information in the setup.cfg file (e.g. version parts like tag_build and tag_date).
  • configparser.ConfigParser.write does not seem to preserve indentation spaces and instead replace them with tabs.

I went through configparser's doc and I could not find a configuration that enforces configparser to preserve the indentation.

I believe that before we implement any change in the setuptools, that needs to be implemented first1. Please let me know if you have a different solution (also please feel free to submit a PR).

Meanwhile, I will reclassify this issue as a Feature Request and tag it as dependent of the configparser behaviour.

Footnotes

  1. Please note that it is not in the scope of the setuptools project to implement a custom version of configparser. We also want to minimize the number of dependencies as those have to be vendorized to avoid cycles during dependence resolution, so a solution from the stdlib would be ideal here.

dvzrv commented

@abravalheri hmm, so altering/extending the configparser behavior would be the ideal solution for this?

Hi @dvzrv , I would word it differently: without support for preserving indentation implemented in configparser it is not viable for setuptools to support this feature.

when trying to apply patches

same here

patching file setup.cfg
Hunk #1 FAILED at 44.
1 out of 1 hunk FAILED -- saving rejects to file setup.cfg.rej

quickfix: convert tab-indent back to space-indent, and then apply the patch

here: indent with 4 spaces

sed -i 's/\t/    /g' setup.cfg
# or
expand -i -t4 setup.cfg | sponge setup.cfg

egg_info uses configparser.ConfigParser.write from Python's standard library, to store build-time information in the setup.cfg file (e.g. version parts like tag_build and tag_date).

so setup.cfg is modified... otherwise we could just copy the original setup.cfg

ideally, such dynamic fields should be spaced by 3 empty lines from other fields
to avoid "hunk FAILED" errors when patching

without support for preserving indentation implemented in configparser it is not viable for setuptools to support this feature.

i guess we will wait 1000 years until configparser can do that : P

this would require a concrete syntax tree (CST) parser like tree-sitter
to produce a minimal diff when editing the file

tree-sitter-ini fails to parse setup.cfg files
because multi-line values are not supported

git clone --depth=1 https://github.com/justinmk/tree-sitter-ini
cd tree-sitter-ini
tree-sitter generate
wget https://github.com/django/django/raw/main/setup.cfg
tree-sitter parse setup.cfg | grep ERROR | wc -l
# 2

tree-sitter-toml (online) fails to parse setup.cfg files
because in toml, string values must be quoted

tree-sitter-yaml (online) fails to parse setup.cfg files
because yaml has no sections like [metadata]
because yaml has : separator instead of =
etc...

I wasted time with this problem when trying to create a Conda patch, although I'm using the absolutely uniquely named and searchable "build" Python module. I tried to open an issue there but was redirected here.

The whitespace handling is documented behavior and therefore unlikely to change, imho. Although I hope that configparser (or setuptools) at least might be convinced to not produce trailing whitespaces.

I think the easiest workaround that doe not require waiting for ConfigParser to change behavior would be this workflow:

  1. Generate new setup.cfg with ConfigParser
  2. Generate a diff ignoring whitespace changes with diff -w or an equivalent Python library.
  3. Apply the minimal non-whitespace-changing diff to the original setup.cfg.

There is a similar issue in bumpversion

You might wish to look into migrating to pyproject.toml-based configuration, which wasn't available at the time when this issue was opened and which setuptools won't rewrite.

speaking of workarounds

Generate new setup.cfg with ConfigParser

use ConfigParser to patch setup.cfg
aka syntax-aware patching

import configparser
cfg = configparser.ConfigParser()
assert cfg.read("setup.cfg") == ["setup.cfg"]

# set value
try:
    cfg.add_section("metadata")
except configparser.DuplicateSectionError:
    pass
cfg.set("metadata", "name", "some_name")

# remove value
try:
    cfg.remove_option("some_section", "some_key")
except configparser.NoSectionError:
    pass

# remove section
cfg.remove_section("some_section")

with open("setup.cfg", "w") as f:
    cfg.write(f)

this could be shorter with a command line tool

pycfg-patch setup.cfg --set metadata.name=some_name

You might wish to look into migrating to pyproject.toml-based configuration, which wasn't available at the time when this issue was opened and which setuptools won't rewrite.

Thanks for the pointer. That might work for me as a workaround.

It seems that support was added in setuptools 61, which was released 2022-03-24. This makes me a bit hesistant regarding backwards compatibility, but I'm already using pyproject.toml build, which seems to be supported since pip 10.0 released 2018-04-14, so I should still have quite a bit of compatibility after simply increasing the required setuptools version inside the pyproject.toml.