Make it possible to disable build of r, rpy2 using ./configure --disable-r
mkoeppe opened this issue · 111 comments
Prompted by a build failure of r on cygwin-standard in https://github.com/mkoeppe/sage/runs/1915093157
[r-3.6.3] blas.o: in function `dsymm_':
[r-3.6.3] /opt/sage-1e693c9ae51a53dad0fe099efc41439dc898f02f/var/tmp/sage/build/r-3.6.3/src/src/extra/blas/blas.f:2986:(.text+0x4352): relocation truncated to fit: R_X86_64_PC32 against undefined symbol `xerbla_'
we make it possible to disable R using a configure options so that one can get a working Sage installation in this way.
(The original build failure has hopefully resolved been resolved by #29537.)
Previous tickets and discussions:
- #29486
- #29441
- #29170
- https://groups.google.com/g/sage-devel/c/IX3niEdmCEM/m/LovmF6gmIwAJ
- https://wiki.sagemath.org/ReleaseTours/sage-9.2#rpy2_and_R
- https://groups.google.com/g/sage-devel/c/alkG4veIsBk/m/432cxzpYAQAJ
Depends on #29537
Depends on #30383
CC: @embray @dimpase @orlitzky @jhpalmieri @kiwifb @novoselt @EmmanuelCharpentier @videlec @kliem
Component: porting: Cygwin
Author: Matthias Koeppe
Branch/Commit: f04c134
Reviewer: François Bissey, John Palmieri
Issue created by migration from https://trac.sagemath.org/ticket/31409
Again we have a problem with r. As cygwin is one of our supported platforms and r is a standard package, this is a blocker.
(More problems with R are around the corner, for wheel builds of Sage - #31396.)
This justifies another round of discussion for the proposal to downgrade the r and rpy2 to "optional" packages - after #29486, #29170, https://groups.google.com/g/sage-devel/c/IX3niEdmCEM/m/LovmF6gmIwAJ
In #29170, @novoselt pointed out that supporting R is important for SageMathCell; and @kiwifb mentioned that "Suddenly dropping R/rpy2 feels rough. However, we don't have usage data to know how rough it would be."
I think that we do need a bit of substantiation here given to make the case of keeping R supported as a standard package - and its cost of maintenance. Note that our R is also nontrivially patched and an update to the 4.x series is overdue, and nobody has stepped up to do this upgrade.
Maybe this is asking for a new type of package: standard on some platforms, optional on others.
I have been "using" the system's R for a while now ("using" mean that Sage doesn't build it, not that I actually use it for anything). Can we distribute binaries that will use a system R installation? (Maybe this is a bad question, because at least on OS X, can we distribute functioning binaries in the first place?)
rpy2 3.4.x has wheels for macOS - https://pypi.org/project/rpy2/#files
We should find out with what installation of R these wheels are supposed to work - hopefully R's official binary packages that are on CRAN - see https://mirror.las.iastate.edu/CRAN/
New commits:
b2d4ab8 | build/pkgs/{r,rpy2}: Downgrade to optional |
Replying to @jhpalmieri:
(Maybe this is a bad question, because at least on OS X, can we distribute functioning binaries in the first place?)
Binaries have been broken for a long time. I would be in favor of just following Dima's current practice of directing people to just use conda.
Author: Matthias Koeppe
I am fine with that. But at the very least most of the doctests in sage/interface/r.py have to be marked optional. I haven't checked for import of rpy at this stage.
sage/stats/r.py relies on R being present. All doctests in there should be optional.
There are two doctests in sage/repl/ipython_tests.py that run R.
Tests for _r_init_ in sage/structure/sage_object.pyx, I note interfaces for other optional packages (octave, mathematica, polymake...) do not have tests.
That's all that I can see and think of right now.
Instead of adding an optional tag to all doctests in these files, perhaps we should go through #30778.
Yes, that would be the right way to go. In the spirit of splitting, r interface stuff should be splitted in its own package eventually. I guess it is now tagged "early candidate".
I have reduced #30778 to a more modest version that just handles "optional" tags as file directives instead of using "sage_setup: distribution" tags.
Branch pushed to git repo; I updated commit sha1. New commits:
8732076 | src/sage/repl/ipython_tests.py: Mark R interface tests # optional - rpy2 |
Branch pushed to git repo; I updated commit sha1. New commits:
dbdf084 | src/sage/structure/sage_object.pyx: Mark R interface test # optional - rpy2 |
Branch pushed to git repo; I updated commit sha1. New commits:
737b21c | src/sage/stats/r.py: Mark all 2 doctests in this file as # optional - rpy2 |
Branch pushed to git repo; I updated commit sha1. New commits:
4403924 | src/sage/interfaces/r.py: Mark all tests # optional - rpy2 |
Because #30778 appears to be controversial, for the present ticket I have just added the line by line optional tags. Ready for review.
Replying to @mkoeppe:
Because #30778 appears to be controversial, for the present ticket I have just added the line by line optional tags. Ready for review.
I feel your pain. Having to do that is kind of terrible. Further checking will have to be with the bots. Let's go with the positive review now.
Reviewer: François Bissey
Thank you!
Looks like there are a few more files with R tests (https://github.com/mkoeppe/sage/runs/1942659778):
- src/doc/en/prep/Quickstarts/Statistics-and-Distributions.rst
- src/sage/interfaces/interface.py
- src/sage/misc/sageinspect.py
Branch pushed to git repo; I updated commit sha1. New commits:
405ebb9 | More # optional - rpy2 |
Right, those were more difficult to spot since they rely on the fact that the R interface is automatically imported in sage/interfaces/all.py. That run should have got all of them.
Thanks! I also just ran make ptestlong-nodoc and there are no more errors.
If addition of standard packages requires a mention on sage-devel then IMHO removal should, too...
Also, doesn't pass the testsuite:
**********************************************************************
File "src/sage/repl/interpreter.py", line 242, in sage.repl.interpreter.SageShellOverride.system_raw
Failed example:
shell.system_raw('R --version')
Expected:
R version ...
Got:
/bin/sh: 1: R: not found
**********************************************************************
File "src/sage/repl/interpreter.py", line 244, in sage.repl.interpreter.SageShellOverride.system_raw
Failed example:
shell.user_ns['_exit_code']
Expected:
0
Got:
127
**********************************************************************
1 item had failures:
2 of 10 in sage.repl.interpreter.SageShellOverride.system_raw
[141 tests, 2 failures, 3.57 s]
**********************************************************************
File "src/sage/tests/cmdline.py", line 578, in sage.tests.cmdline.test_executable
Failed example:
out.find("R version ") >= 0
Expected:
True
Got:
False
**********************************************************************
File "src/sage/tests/cmdline.py", line 580, in sage.tests.cmdline.test_executable
Failed example:
err
Expected:
''
Got:
'/home/buildbot/slave/sage_git/build/src/bin/sage: line 627: exec: R: not found\n'
**********************************************************************
File "src/sage/tests/cmdline.py", line 582, in sage.tests.cmdline.test_executable
Failed example:
ret
Expected:
0
Got:
127
**********************************************************************
1 item had failures:
3 of 231 in sage.tests.cmdline.test_executable
[230 tests, 3 failures, 29.96 s]
----------------------------------------------------------------------
sage -t --long --random-seed=0 src/sage/repl/interpreter.py # 2 doctests failed
sage -t --long --random-seed=0 src/sage/tests/cmdline.py # 3 doctests failed
----------------------------------------------------------------------
Replying to @vbraun:
If addition of standard packages requires a mention on sage-devel then IMHO removal should, too...
I think this is a good point.
There's nothing to discuss; it's trivial to make the packages standard again if someone steps up and fixes the packages.
The purpose of asking for a sage-devel vote when making a package standard, to my understanding, is that the additional future maintenance burden (and bloat) coming from a standard package should receive more scrutiny than one developer's "positive review". None of this applies to the present situation.
You really don't think that people should be made aware that R is being downgraded?
There will be a line in the list of merged tickets for the beta version that merges the ticket. I don't think this has greater urgency.
Description changed:
---
+++
@@ -5,3 +5,8 @@
[r-3.6.3] /opt/sage-1e693c9ae51a53dad0fe099efc41439dc898f02f/var/tmp/sage/build/r-3.6.3/src/src/extra/blas/blas.f:2986:(.text+0x4352): relocation truncated to fit: R_X86_64_PC32 against undefined symbol `xerbla_'
```
+Discussions:
+- #29486
+- #29170
+- https://groups.google.com/g/sage-devel/c/IX3niEdmCEM/m/LovmF6gmIwAJ
+- https://groups.google.com/g/sage-devel/c/alkG4veIsBk/m/432cxzpYAQAJI agree with the ticket, but I also think its common courtesy to give other developers a heads-up here.
Blocking 9.3, not 9.4
There will be a line in the list of merged tickets for the beta version that merges the ticket. I don't think this has greater urgency.
Yeah, that will definitely catch everyone's attention.
R is one of the oldest wheels of the car (not the current Python-style wheels, figurative wheels from days of yore). We all know that once it's not supported, it will be more than broken. At the very least, before this is merged there should be some kind of automated message when R is invoked via Sage (via the innumerable examples out in the wild now, including in various books about "how to use Sage") that says "here is a Sage website that will help you install an R that works with our interface, and here is what to do to get Sage to recognize it". For non-POSIX geeks, but for an ordinary user. And then these now-optional doctests should be tested.
In particular, this is another piece of mission creep from "viable competitor to M, M, M..." I do get the desire to decouple and ease upgrade processes for external software. But for such an important piece of software (not least because of the innumerable packages) there should at the very least be an easy on-ramp to nudging people to install it when needed, and then some kind of testing. Kind of like how OS X "reminds" one if you try to use git and don't yet have command line tools.
Replying to @kcrisman:
In particular, this is another piece of mission creep from "viable competitor to M, M, M..." I do get the desire to decouple and ease upgrade processes for external software.
This ticket is about the very concrete problem that R does not build on one of our standard platforms, Cygwin, and we are not able to use the system R either.
Because failing standard packages block the entire build, we either need to fix it, or downgrade it from standard before the 9.3 release.
Replying to @kcrisman:
... when R is invoked via Sage (via the innumerable examples out in the wild now, including in various books about "how to use Sage")
Do you have any pointers to such examples?
Description changed:
---
+++
@@ -7,6 +7,8 @@
Discussions:
- #29486
+- #29441
- #29170
- https://groups.google.com/g/sage-devel/c/IX3niEdmCEM/m/LovmF6gmIwAJ
+- https://wiki.sagemath.org/ReleaseTours/sage-9.2#rpy2_and_R
- https://groups.google.com/g/sage-devel/c/alkG4veIsBk/m/432cxzpYAQAJIIRC correctly, R is not mentioned in http://sagebook.gforge.inria.fr/english.html
As a point of reference, I've twice been in workshops with R experts where both Sage and R were being taught to students. When I told the R people that since students have already installed Sage, they can use R through Sage. But this did not interest them as they wanted to have students install RStudio anyways.
Do we know about more than one person who uses R primarily through Sage?
That said, I haven't really kept up with this issue. What's the problem with R on cygwin? Maybe I can just fix it.
When did the problem start? Did we upgrade R at some point?
I'm not against the current proposal to make R optional, though I'm inclined to agree with William's point about having a deprecation period first. I'll look into the build issue with R.
Alternatively, if the R version was upgraded, resulting in this problem, could we not just revert the upgrade until a later version / when this is able to be fixed?
I'm looking at the Sage build on my Windows machine, and it seems I haven't updated it in some time (I think not since 9.3b2). I already have r-3.6.3 built successfully. But since I last updated it looks like openblas has been updated, so right now openblas is recompiling, then we'll see about r...
As of now, this issue is causing the CI tests of Cygwin to fail so we cannot even see other issues that affect Cygwin.
So let's please get this in to the next beta --- it is trivial to revert in a follow-up ticket if you succeed to fix the issue.
Let me at least see if I can reproduce it locally first...
I wonder if there's some way we could make some packages optional on a platform-specific basis (maybe when running bootstrap?)
Replying to @embray:
Let me at least see if I can reproduce it locally first...
I wonder if there's some way we could make some packages optional on a platform-specific basis (maybe when running bootstrap?)
Aaaaarghhh...
This would be a documentation nightmare (explaining why a ticket is "optionally optional" is awkward at the very minimum...). It would also "froze" the idea of Windows 10 being a "second-class citizen" as far as Sage is concerned.
I'm starting to consider promoting a WSL2 port as the preffered windows platform,and preparing a deprecation of the Cygwin port, which remains necessary as long as Windows
versions <10 remain relevant only as rthe "sanest" way out of this predicament ("sane" being an hyperbole, of course...).
But this also should be discussed on sage-devel. Hence this post...
online statistics says that among Windows installations, version 10 is approaching 80%. High time to think about WSL2 as a primary platform.
Moreover, the 2nd biggest, qua # of installs, is Windows version 7, on 16%,
which is past its EOL, so it's either on very old machines which won't really be capable of running resouce-hungry Sage well, anyway, or on machines having some old, unupgradeable, soft installed. Windows 8 (EOLed since 2016) is at 1% and 8.1, still supported, on 3.7%.
https://gs.statcounter.com/windows-version-market-share/desktop/worldwide/
I think we should really put more effort in WSL2, and much less effort in Cygwin.
WSL2 is an installation nightmare for typical users. But this is off-topic.
I'm not even so sure the problem here has to do much with Cygwin. I haven't been able to reproduce it on my local Cygwin install.
Is this build also installing the system R? I have reported problems before with installing the system R resulting in problems linking to the right BLAS.
WSL will just give you a new and different set of problems.
Looking again at the build log where R failed, it looks like it is installing the system openblas and the system R. I wonder if that could somehow be part of the problem.
Have you tried modifying the build script to not install them and see if it works? I know we also want to try getting it working with the system packages, but since they don't work yet anyways maybe we can forego their installation for now.
In the meantime I'll try a fresh cygwin install and see if I can reproduce the environment being used in the CI.
Replying to @embray:
In the meantime I'll try a fresh cygwin install and see if I can reproduce the environment being used in the CI.
Easy to do by using tox -e local-cygwin-choco-standard in a minimal install of Cygwin + tox
Let's please not continue discussing WSL etc. here on this technical ticket. I have opened #31485 (Meta-ticket: Sage on WSL (Windows Subsystem for Linux)) for it. It would be helpful if someone could organize the discussion and the existing tickets in this direction there.
Replying to @mkoeppe:
Replying to @embray:
In the meantime I'll try a fresh cygwin install and see if I can reproduce the environment being used in the CI.
Easy to do by using
tox -e local-cygwin-choco-standardin a minimal install of Cygwin + tox
Cool, thanks for the pointer. I don't have chocolatey installed, but I did the equivalent using apt-cyg and the (very nice) sage-get-system-packages command.
Of course if there's something very specific to the way chocolatey is installing the packages then that's a problem and I'll have to dig deeper. Seems unlikely?
Fresh Cygwin install. I installed all the :standard: packages recommended by sage-get-system-packages.
Ran ./bootstrap then ./configure with no additional arguments. Then ran make r.
First it builds and installs OpenBLAS. Several hours later it builds and installs R with no problem.
I am downloading the build artifacts from the GitHub build to see if there are any more detailed logs about the R build there...
Here's one thing that stands out already. On GitHub:
checking for dgemm_ in -lopenblas ... yes
checking whether double complex BLAS can be used... no
On my local build it has:
checking for dgemm_ in -lopenblas ... yes
checking whether double complex BLAS can be used... yes
checking whether the BLAS is complete... yes
The GitHub build says no support for "double complex BLAS" and does not even mention the third check "BLAS is complete"
So the end result is on GitHub R is not linking with openblas:
External libraries: readline, curl
Additional capabilities: PNG, JPEG, TIFF, NLS, ICU
Options enabled: shared R library, shared BLAS, R profiling
Capabilities skipped: cairo
Options not enabled: memory profiling
Whereas on my local build it has:
External libraries: readline, BLAS(OpenBLAS), LAPACK(generic), curl
Additional capabilities: PNG, JPEG, TIFF, NLS, ICU
Options enabled: shared R library, R profiling
Capabilities skipped: cairo
Options not enabled: shared BLAS, memory profiling
And the build error is occurring somewhere related to building R's baseline BLAS which AFAIK I've never tested on Cygwin since it always successfully links with OpenBLAS.
Now the question is just why is the double complex check failing?
Replying to @embray:
Of course if there's something very specific to the way chocolatey is installing the packages then that's a problem and I'll have to dig deeper. Seems unlikely?
Unlikely. Chocolatey really just calls cygwin's setup_....exe many times.
The openblas logs on GitHub do show successful build an installation, though the make check fails with a bunch of things like:
gfortran -O2 -Wall -frecursive -fno-optimize-sibling-calls -m64 -msse3 -mssse3 -msse4.1 -mavx -mavx2 -mfma -march=skylake-avx512 -fno-asynchronous-unwind-tables -mavx2 -L/opt/sage-1e693c9ae51a53dad0fe099efc41439dc898f02f/lib -Wl,-rpath,/opt/sage-1e693c9ae51a53dad0fe099efc41439dc898f02f/lib -o sblat1 sblat1.o ../cygopenblas_skylakexp-r0.3.13.a -lgfortran -lgfortran -L/opt/sage-1e693c9ae51a53dad0fe099efc41439dc898f02f/lib/../lib -L/usr/lib/gcc/x86_64-pc-cygwin/10 -L/usr/lib/gcc/x86_64-pc-cygwin/10/../../../../x86_64-pc-cygwin/lib/../lib -L/usr/lib/gcc/x86_64-pc-cygwin/10/../../../../lib -L/lib/../lib -L/usr/lib/../lib -L/opt/sage-1e693c9ae51a53dad0fe099efc41439dc898f02f/lib -L/usr/lib/gcc/x86_64-pc-cygwin/10/../../../../x86_64-pc-cygwin/lib -L/usr/lib/gcc/x86_64-pc-cygwin/10/../../.. -lcygwin
/usr/lib/gcc/x86_64-pc-cygwin/10/../../../../x86_64-pc-cygwin/bin/ld: sblat1.o:sblat1.f:(.text+0x67d): undefined reference to `srot_'
sblat1.o:sblat1.f:(.text+0x67d): relocation truncated to fit: R_X86_64_PC32 against undefined symbol `srot_'
/usr/lib/gcc/x86_64-pc-cygwin/10/../../../../x86_64-pc-cygwin/bin/ld: sblat1.o:sblat1.f:(.text+0xbae): undefined reference to `srot_'
sblat1.o:sblat1.f:(.text+0xbae): relocation truncated to fit: R_X86_64_PC32 against undefined symbol `srot_'
/usr/lib/gcc/x86_64-pc-cygwin/10/../../../../x86_64-pc-cygwin/bin/ld: sblat1.o:sblat1.f:(.text+0xe99): undefined reference to `sdsdot_'
sblat1.o:sblat1.f:(.text+0xe99): relocation truncated to fit: R_X86_64_PC32 against undefined symbol `sdsdot_'
/usr/lib/gcc/x86_64-pc-cygwin/10/../../../../x86_64-pc-cygwin/bin/ld: sblat1.o:sblat1.f:(.text+0xfde): undefined reference to `sdot_'
...
But this looks more like a problem in the make check. Why is it trying to link to some ../cygopenblas_skylakexp-r0.3.13.a when the build produced cygopenblas_haswellp-r0.3.13.a not skylake?
Probably not related though...
It looks like it produced cygopenblas.dll just fine:
gcc -O2 -g -march=native -O2 -DMAX_STACK_ALLOC=2048 -Wall -m64 -DF_INTERFACE_GFORT -DNO_AVX512 -DSMP_SERVER -DNO_WARMUP -DMAX_CPU_NUMBER=2 -DMAX_PARALLEL_NUMBER=1 -DUSE_TLS -DBUILD_SINGLE=1 -DBUILD_DOUBLE=1 -DBUILD_COMPLEX=1 -DBUILD_COMPLEX16=1 -DVERSION=\"0.3.13\" -msse3 -mssse3 -msse4.1 -mavx -mavx2 -mfma -mavx2 -UASMNAME -UASMFNAME -UNAME -UCNAME -UCHAR_NAME -UCHAR_CNAME -DASMNAME= -DASMFNAME=_ -DNAME=_ -DCNAME= -DCHAR_NAME=\"_\" -DCHAR_CNAME=\"\" -DNO_AFFINITY -I. -O2 -DMAX_STACK_ALLOC=2048 -Wall -m64 -DF_INTERFACE_GFORT -DNO_AVX512 -DSMP_SERVER -DNO_WARMUP -DMAX_CPU_NUMBER=2 -DMAX_PARALLEL_NUMBER=1 -DUSE_TLS -DBUILD_SINGLE=1 -DBUILD_DOUBLE=1 -DBUILD_COMPLEX=1 -DBUILD_COMPLEX16=1 -DVERSION=\"0.3.13\" -msse3 -mssse3 -msse4.1 -mavx -mavx2 -mfma -mavx2 -UASMNAME -UASMFNAME -UNAME -UCNAME -UCHAR_NAME -UCHAR_CNAME -DASMNAME= -DASMFNAME=_ -DNAME=_ -DCNAME= -DCHAR_NAME=\"_\" -DCHAR_CNAME=\"\" -DNO_AFFINITY -I.. -L/opt/sage-1e693c9ae51a53dad0fe099efc41439dc898f02f/lib -Wl,-rpath,/opt/sage-1e693c9ae51a53dad0fe099efc41439dc898f02f/lib cygopenblas.def dllinit.o \
-shared -o ../cygopenblas.dll -Wl,--out-implib,../libopenblas.dll.a \
-Wl,--whole-archive ../cygopenblas_haswellp-r0.3.13.a -Wl,--no-whole-archive -L/opt/sage-1e693c9ae51a53dad0fe099efc41439dc898f02f/lib/../lib -L/usr/lib/gcc/x86_64-pc-cygwin/10 -L/usr/lib/gcc/x86_64-pc-cygwin/10/../../../../x86_64-pc-cygwin/lib/../lib -L/usr/lib/gcc/x86_64-pc-cygwin/10/../../../../lib -L/lib/../lib -L/usr/lib/../lib -L/opt/sage-1e693c9ae51a53dad0fe099efc41439dc898f02f/lib -L/usr/lib/gcc/x86_64-pc-cygwin/10/../../../../x86_64-pc-cygwin/lib -L/usr/lib/gcc/x86_64-pc-cygwin/10/../../.. -lgfortran -lquadmath -lm -lcygwin -lgfortran -lgfortran
Here is the relevant part in the R configure that checks for double complex support in the BLAS lib: https://github.com/wch/r-source/blob/78f3b069d6269ddde40d76515a1eef67df674435/m4/R.m4#L2642
There is a note in there:
dnl Now check if zdotu works (fails on AMD64 with the wrong compiler;
dnl also fails on macOS with Accelerate/vecLib and gfortran;
dnl but in that case we have a work-around using USE_VECLIB_G95FIX)
I don't know what they mean by that. What is "the wrong compiler" in this case?
Perhaps we could just toss in a patch to disable this check and see what happens?
I meant to add, I downloaded the build artifacts from GH and confirmed that the cygopenblas.dll is in the correct place and also exports the relevant symbols.
Also in the previous check it did successfully link against -lopenblas and find the dgemm_ symbol, so it's probably not a linking issue, but some problem with compiling and running the test for double complex support which is apparently known to fail sometimes?
Replying to @embray:
But this looks more like a problem in the make check. Why is it trying to link to some
../cygopenblas_skylakexp-r0.3.13.awhen the build producedcygopenblas_haswellp-r0.3.13.anot skylake?Probably not related though...
Looks like we forgot to update openblas/spkg-check.in -- it duplicates an old copy of settings made in openblas/spkg-install.in!!
This should really go through a common script
I tried manually building the conftest.f from R's configure check and got this warning:
$ gfortran -c conftest.f
conftest.f:15:72:
15 | 10 ztemp = ztemp + zx(i)*zx(i)
| 1
Warning: Fortran 2018 deleted feature: DO termination statement which is not END DO or CONTINUE with label 10 at (1)
And in fact this was fixed upstream just a few months ago: wch/r-source@b2e1bf7#diff-519bfce994ea1571b861581c2c200f396eb96ad94ba9bc74f20103dc1bda38dc
I applied the upstream fix and the warning went away. Maybe somewhere -Werror is getting set in FFLAGS.
I figured out that it's possible to download the failed R build artifacts from GH (very nice, thank you). Unfortunately the config.log does not say anything very helpful here:
configure:38593: checking whether double complex BLAS can be used
conftestf.f:15:72:
15 | 10 ztemp = ztemp + zx(i)*zx(i)
| 1
Warning: Fortran 2018 deleted feature: DO termination statement which is not END DO or CONTINUE with label 10 at (1)
configure:38674: result: no
Something the way the test is written it's not producing useful output. It looks like the fortran compilation succeeds, with the warning, but only as a warning. Probably the C compilation (or linking?) fails. But it's not producing any output to the log. I still can't reproduce a failure in this locally.
Replying to @mkoeppe:
Replying to @embray:
But this looks more like a problem in the make check. Why is it trying to link to some
../cygopenblas_skylakexp-r0.3.13.awhen the build producedcygopenblas_haswellp-r0.3.13.anot skylake?Probably not related though...
Looks like we forgot to update
openblas/spkg-check.in-- it duplicates an old copy of settings made inopenblas/spkg-install.in!!
This should really go through a common script
I rebuilt openblas locally with SAGE_CHECK=yes and the checks passed.
Replying to @embray:
configure:38593: checking whether double complex BLAS can be used conftestf.f:15:72: 15 | 10 ztemp = ztemp + zx(i)*zx(i) | 1 Warning: Fortran 2018 deleted feature: DO termination statement which is not END DO or CONTINUE with label 10 at (1) configure:38674: result: noSomething the way the test is written it's not producing useful output. It looks like the fortran compilation succeeds, with the warning, but only as a warning. Probably the C compilation (or linking?) fails. But it's not producing any output to the log. I
Probably should run this with CONFIG_SHELL="bash -x"
Looking at the test code, I think it's likely the run test in https://github.com/wch/r-source/blob/78f3b069d6269ddde40d76515a1eef67df674435/m4/R.m4#L2704 that's failing.
Replying to @mkoeppe:
Replying to @embray:
But this looks more like a problem in the make check. Why is it trying to link to some
../cygopenblas_skylakexp-r0.3.13.awhen the build producedcygopenblas_haswellp-r0.3.13.anot skylake?Probably not related though...
Looks like we forgot to update
openblas/spkg-check.in-- it duplicates an old copy of settings made inopenblas/spkg-install.in!!
This should really go through a common script
I'll do this in #31490
Replying to @mkoeppe:
Looking at the test code, I think it's likely the run test in https://github.com/wch/r-source/blob/78f3b069d6269ddde40d76515a1eef67df674435/m4/R.m4#L2704 that's failing.
Nice call! After trying to manually compile it with what I think are all the right flags I get:
$ ./conftest.exe
Illegal instruction (core dumped)
Replying to @embray:
Replying to @mkoeppe:
Looking at the test code, I think it's likely the run test in https://github.com/wch/r-source/blob/78f3b069d6269ddde40d76515a1eef67df674435/m4/R.m4#L2704 that's failing.
Nice call! After trying to manually compile it with what I think are all the right flags I get:
$ ./conftest.exe Illegal instruction (core dumped)
This was linking against the OpenBLAS that was included in the build artifacts from GitHub. If I link against my locally built OpenBLAS it works.
So the problem is again OpenBLAS, and apparently the fact that the OpenBLAS from the stage-i-a build artifacts is not portable, and possibly running on different CPUs between jobs?
That's quite possible... Would be great if all of our problems could be fixed by making OpenBLAS properly portable!
Looking at the openblas build logs, I see it is not building with DYNAMIC_ARCH=1. Does this mean the GH builds are not using SAGE_FAT_BINARY? Perhaps that would fix all this.
They do use SAGE_FAT_BINARY - see top of ci-cygwin-standard.yml
Replying to @mkoeppe:
They do use SAGE_FAT_BINARY - see top of
ci-cygwin-standard.yml
I see that now.
But looking again at the openblas build log from GH one of the first lines output, from the echo in spkg-install.in, is:
Building OpenBLAS: make
This should say:
Building OpenBLAS: make DYNAMIC_ARCH=1
So somewhere along the way the setting is being lost...
Can you check what its value in build/bin/sage-build-env-config is?
Replying to @embray:
But looking again at the openblas build log [...]
This should say:Building OpenBLAS: make DYNAMIC_ARCH=1So somewhere along the way the setting is being lost...
You are right, thanks for catching this. I may have a fix for this in https://github.com/mkoeppe/sage/runs/2098475692?check_suite_focus=true
A fix is now in #29537, running in https://github.com/mkoeppe/sage/actions/runs/647547421
I've reduced it from "blocker" as I am hopeful that #29537 will fix it. And instead of making the packages "optional", I now propose to keep them as "standard" but disablable.
Branch pushed to git repo; I updated commit sha1. New commits:
49c10c1 | Revert "build/pkgs/{r,rpy2}: Downgrade to optional" |
Branch pushed to git repo; I updated commit sha1. Last 10 new commits:
5141bc3 | build/pkgs/widgetsnbextension: Patch out dependency on notebook (backport from widgetsnbextension-4) |
fee8379 | Merge #31278 |
82d8030 | m4/sage_spkg_enable.m4: Cosmetic change to help strings |
3e92998 | configure --disable-notebook: Fix up what packages are disabled |
74a83fc | build/pkgs/matplotlib/dependencies: Undo removal of tornado |
93d31b9 | Remove use of unicode superscripts in configure --help |
18fbb85 | m4/sage_spkg_collect.m4: Fix description of SAGE_OPTIONAL_CLEANED_PACKAGES |
c3e4093 | Rename SAGE_OPTIONAL_CLEANED_PACKAGES to SAGE_OPTIONAL_UNINSTALLED_PACKAGES |
a67300b | Merge #30383 |
f0a5fb5 | configure.ac: Add option --disable-r |
Description changed:
---
+++
@@ -1,11 +1,14 @@
-https://github.com/mkoeppe/sage/runs/1915093157
+Prompted by a build failure of `r` on `cygwin-standard` in https://github.com/mkoeppe/sage/runs/1915093157
```
[r-3.6.3] blas.o: in function `dsymm_':
[r-3.6.3] /opt/sage-1e693c9ae51a53dad0fe099efc41439dc898f02f/var/tmp/sage/build/r-3.6.3/src/src/extra/blas/blas.f:2986:(.text+0x4352): relocation truncated to fit: R_X86_64_PC32 against undefined symbol `xerbla_'
```
+we make it possible to disable R using a configure options so that one can get a working Sage installation in this way.
-Discussions:
+(The original build failure has hopefully resolved been resolved by #29537.)
+
+Previous tickets and discussions:
- #29486
- #29441
- #29170There seems to be a simpler bug here that I don't think is addressed yet by any of your fixes. I didn't know there was now an --enable-fat-binary flag to configure. I'm glad there is, and it makes sense. But perhaps because of that the SAGE_FAT_BINARY environment variable is being ignored. I added set -x to the splg-install for openblas and can see that its value is getting unset, which is why the CI configuration wasn't picking it up either despite having it in env. I think, if set, SAGE_FAT_BINARY should still override whatever was passed to configure.
Replying to @embray:
There seems to be a simpler bug here that I don't think is addressed yet by any of your fixes.
The simple bug was -- as you correctly suspected -- in my CI script. The testing is done through tox, which does not pass through environment variables (except for those explicitly configured to be passed through).
Replying to @embray:
I didn't know there was now an
--enable-fat-binaryflag to configure. I'm glad there is, and it makes sense. But perhaps because of that theSAGE_FAT_BINARYenvironment variable is being ignored.
This was done in #30375. Both SAGE_FAT_BINARY and the option --enable-fat-binary are respected at configure time and are made persistent for the build through build/bin/sage-build-env-config. Mixing in the environment variable set at make time is not done; I don't think this is a problem.
Branch pushed to git repo; I updated commit sha1. New commits:
4916415 | Merge tag '9.3.beta9' into t/30383/new_package_type__optional_enabled_by_default |
f04c134 | Merge branch 't/30383/new_package_type__optional_enabled_by_default' into t/31409/cygwin_standard__r_build_fails_____downgrade_r__rpy2_to_optional |
Replying to @mkoeppe:
Replying to @embray:
I didn't know there was now an
--enable-fat-binaryflag to configure. I'm glad there is, and it makes sense. But perhaps because of that theSAGE_FAT_BINARYenvironment variable is being ignored.This was done in #30375. Both
SAGE_FAT_BINARYand the option--enable-fat-binaryare respected atconfiguretime and are made persistent for the build throughbuild/bin/sage-build-env-config. Mixing in the environment variable set atmaketime is not done; I don't think this is a problem.
Still, it used to be possible to set it when running make as well. I guess it's not so important, but it feels like a regression, and was confusing to me.