Performance degradation after strawberry perl 5.32.
Closed this issue · 14 comments
Hi,
In a simple performance test, I found signs of performance degradation in versions after strawberry perl 5.32.
For example, the results of code from https://www.tek-tips.com/viewthread.cfm?qid=1119203
* strawberry perl 5.32.1.1
Benchmark: timing 50000 iterations of Fish, Kevin...
Fish: 0 wallclock secs ( 0.27 usr + 0.00 sys = 0.27 CPU) @ 187969.92/s (n=50000)
(warning: too few iterations for a reliable count)
Kevin: 0 wallclock secs ( 0.20 usr + 0.00 sys = 0.20 CPU) @ 246305.42/s (n=50000)
(warning: too few iterations for a reliable count)
Rate Fish Kevin
Fish 187970/s -- -24%
Kevin 246305/s 31% --
* strawberry perl 5.38.2.1
Benchmark: timing 50000 iterations of Fish, Kevin...
Fish: 0 wallclock secs ( 0.34 usr + 0.00 sys = 0.34 CPU) @ 145348.84/s (n=50000)
(warning: too few iterations for a reliable count)
Kevin: 0 wallclock secs ( 0.25 usr + 0.00 sys = 0.25 CPU) @ 200000.00/s (n=50000)
(warning: too few iterations for a reliable count)
Rate Fish Kevin
Fish 145349/s -- -27%
Kevin 200000/s 38% --
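To put a number on the gap between the two runs above, here is a minimal sketch that computes the percentage change in rate for each sub. The rates are copied from the Benchmark output; the helper name pct_change is made up for illustration. With so few iterations per run the figures are only indicative, but they come out at roughly a fifth slower for both subs.

```perl
use strict;
use warnings;

# Percentage change from an old rate to a new rate (negative means slower).
# pct_change is a hypothetical helper name, not part of Benchmark.
sub pct_change {
    my ($old, $new) = @_;
    return 100 * ($new - $old) / $old;
}

# Rates (iterations per second) copied from the two Benchmark runs above.
my %rate = (
    Fish  => { '5.32.1.1' => 187969.92, '5.38.2.1' => 145348.84 },
    Kevin => { '5.32.1.1' => 246305.42, '5.38.2.1' => 200000.00 },
);

for my $name (sort keys %rate) {
    printf "%-5s %+.1f%%\n", $name,
        pct_change($rate{$name}{'5.32.1.1'}, $rate{$name}{'5.38.2.1'});
}
```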
[ gcc optimize option comparison from "perl -V"]
* strawberry perl 5.32.1.1
Compiler:
cc='gcc'
ccflags =' -DWIN32 -DWIN64 -D__USE_MINGW_ANSI_STDIO -DPERL_TEXTMODE_SCRIPTS -DPERL_IMPLICIT_CONTEXT -DPERL_IMPLICIT_SYS -DUSE_PERLIO -fwrapv -fno-strict-aliasing -mms-bitfields'
optimize='-s -O2'
...
* strawberry perl 5.38.2.1
Compiler:
cc='gcc'
ccflags =' -DWIN32 -DWIN64 -DPERL_TEXTMODE_SCRIPTS -DMULTIPLICITY -DPERL_IMPLICIT_SYS -DUSE_PERLIO -D__USE_MINGW_ANSI_STDIO -fwrapv -fno-strict-aliasing -mms-bitfields'
optimize='-Os'
...
Different gcc optimization options affect this?
Different gcc optimization options affect this?
I don't actually know, but I think there will be little difference between -Os and -O2 optimizations.
With the newer mingw-w64 compilers, there are problems with -O2 optimization on x64 builds, and the GNUmakefile was therefore altered to specify -Os.
However, on 32-bit (x86) builds, there are no problems with -O2 optimization.
I haven't looked at what the SP 32-bit builds specify, but they could (and therefore probably should) optimize to level -O2 if they aren't already doing that.
That's something I could (and therefore probably should) have mentioned earlier ;-)
...... and I certainly would have mentioned it if I had not forgotten all about it.
UPDATE: Comments in the GNUmakefile about this refer to Perl/perl5#20081
I suspect this is due to a change in perl itself. There looks to be a substantial speedup late in the 5.35 series when running on Ubuntu via WSL2 (see results and modified code below). This roughly corresponds with the slowdown on Windows, noting we don't have an SP 5.34 available.
Perhaps an optimisation has been applied that works well under unices but not well under Windows.
@sisyphus - do you have some VS compiled perls spanning these versions? That would help determine if it is mingw related.
Some other examples of windows performance issues are Perl/perl5#21654 and Perl/perl5#21360. These affect different version spans but maybe there are similar root causes.
perlbrew exec perl gh160.pl
perl-5.38.0
==========
Rate kevin fish
kevin 455106/s -- -9%
fish 500262/s 10% --
perl-5.36.0
==========
Rate kevin fish
kevin 465891/s -- -8%
fish 504119/s 8% --
perl-5.35.11
==========
Rate fish kevin
fish 139910/s -- -20%
kevin 174831/s 25% --
perl-5.34.1
==========
Rate fish kevin
fish 127064/s -- -33%
kevin 190860/s 50% --
use strict;
use warnings;
use Benchmark qw /cmpthese/;
my @x = ("in the", "skipping along");
my @y = ("we walk in the park", "we walk in the dark", "we walk holding hands", "we walk skipping along");
my @regexen = map { $_ = qr/$_/ } @x; # precompile regexes
#print kevin();
#print fish();
cmpthese -2, {
    kevin => \&kevin,
    fish  => \&fish,
};
sub kevin {
    my @arr;
    for (@x) {
        foreach my $line (@y) {
            next if ($line !~ /$_/);
            push @arr, "$line\n";
        }
    }
    return @arr;
}
sub fish {
    my @arr;
    Y: foreach my $line (@y) {
        foreach my $re (@regexen) {
            if ( $line =~ $re ) {
                push @arr, "$line\n";
                next Y;
            }
        }
    }
    return @arr;
}
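One detail of the script above worth knowing when reading the results: inside map, $_ is an alias for each element of @x, so the assignment in the precompile line also replaces the strings in @x with compiled regex objects. That means kevin() is iterating over qr// objects too, not plain strings. Whether this changes the relative timings is not something the thread establishes; the sketch below only demonstrates the aliasing, using its own small arrays.

```perl
use strict;
use warnings;

my @x = ("in the", "skipping along");

# Inside map, $_ is an alias for each element of @x, so assigning to it
# overwrites the original strings with compiled regex objects.
my @regexen = map { $_ = qr/$_/ } @x;
print ref($x[0]) || 'plain string', "\n";   # prints "Regexp"

# Returning the qr// object without assigning leaves the source array alone.
my @y = ("in the", "skipping along");
my @compiled = map { qr/$_/ } @y;
print ref($y[0]) || 'plain string', "\n";   # prints "plain string"
```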
Using the script posted (above) by @shawnlaffan.
For perl-5.38.0:
D:\pscrpt\msvc>perl -MConfig -le "print $]; print $Config{archname};print $Config{ccversion};"
5.038000
MSWin32-x64-multi-thread
19.33.31630
D:\pscrpt\msvc>perl bench.pl
Rate kevin fish
kevin 286903/s -- -16%
fish 340481/s 19% --
For perl-5.36.1:
D:\pscrpt\msvc>perl -MConfig -le "print $]; print $Config{archname};print $Config{ccversion};"
5.036001
MSWin32-x64-multi-thread
19.33.31630
D:\pscrpt\msvc>perl bench.pl
Rate kevin fish
kevin 305177/s -- -13%
fish 351348/s 15% --
For perl-5.36.0:
D:\pscrpt\msvc>perl -MConfig -le "print $]; print $Config{archname};print $Config{ccversion};"
5.036000
MSWin32-x64-multi-thread
19.33.31630
D:\pscrpt\msvc>perl bench.pl
Rate kevin fish
kevin 304766/s -- -14%
fish 352462/s 16% --
For perl-5.32.1:
D:\pscrpt\msvc>perl -MConfig -le "print $]; print $Config{archname};print $Config{ccversion};"
5.032001
MSWin32-x64-multi-thread
19.33.31630
D:\pscrpt\msvc>perl bench.pl
Rate kevin fish
kevin 300280/s -- -8%
fish 325420/s 8% --
Using the script linked to by @aero
For perl-5.32.1:
D:\pscrpt\msvc>perl -MConfig -le "print $]; print $Config{archname}; print $Config{ccversion};"
5.032001
MSWin32-x64-multi-thread
19.33.31630
D:\pscrpt\msvc>perl bench0.pl
Benchmark: timing 50000 iterations of Fish, Kevin...
Fish: 0 wallclock secs ( 0.20 usr + 0.00 sys = 0.20 CPU) @ 246305.42/s (n=50000)
(warning: too few iterations for a reliable count)
Kevin: 0 wallclock secs ( 0.14 usr + 0.00 sys = 0.14 CPU) @ 354609.93/s (n=50000)
(warning: too few iterations for a reliable count)
Rate Fish Kevin
Fish 246305/s -- -31%
Kevin 354610/s 44% --
For perl-5.38.0:
D:\pscrpt\msvc>perl -MConfig -le "print $]; print $Config{archname}; print $Config{ccversion};"
5.038000
MSWin32-x64-multi-thread
19.33.31630
D:\pscrpt\msvc>perl bench0.pl
Benchmark: timing 50000 iterations of Fish, Kevin...
Fish: 1 wallclock secs ( 0.20 usr + 0.00 sys = 0.20 CPU) @ 245098.04/s (n=50000)
(warning: too few iterations for a reliable count)
Kevin: 0 wallclock secs ( 0.14 usr + 0.00 sys = 0.14 CPU) @ 357142.86/s (n=50000)
(warning: too few iterations for a reliable count)
Rate Fish Kevin
Fish 245098/s -- -31%
Kevin 357143/s 46% --
Not quite sure what that demonstrates.
@shawnlaffan, @aero, if there are any more (msvc-built) perl versions you'd like results for, please specify.
I'm about to build such a perl-5.34.3 anyway, because I don't have a perl-5.34 built with this (VS 2022) toolset.
Thanks @sisyphus. That suggests there is no meaningful difference between versions when using MSVC so perhaps it is a gcc issue.
You don't happen to have any 5.36 or 5.38 perls built using the gcc-8 that comes with SP 5.32?
I have my own build of perl-5.38.0, compiled with the '-s -O2' gcc optimization option (same gcc version, 13.1.0).
(But perl-5.38.2 could not be compiled using the '-s -O2' option; I got a GNUmakefile error.)
[perl 5.38.0 performance comparison according to different options]
* Original strawberry perl 5.38.0 binary (-Os)
Benchmark: timing 50000 iterations of Fish, Kevin...
Fish: 1 wallclock secs ( 0.34 usr + 0.00 sys = 0.34 CPU) @ 145348.84/s (n=50000)
(warning: too few iterations for a reliable count)
Kevin: 0 wallclock secs ( 0.25 usr + 0.00 sys = 0.25 CPU) @ 200000.00/s (n=50000)
(warning: too few iterations for a reliable count)
Rate Fish Kevin
Fish 145349/s -- -27%
Kevin 200000/s 38% --
* My own build perl 5.38.0 (-s -O2)
Benchmark: timing 50000 iterations of Fish, Kevin...
Fish: 1 wallclock secs ( 0.27 usr + 0.00 sys = 0.27 CPU) @ 187969.92/s (n=50000)
(warning: too few iterations for a reliable count)
Kevin: 0 wallclock secs ( 0.19 usr + 0.00 sys = 0.19 CPU) @ 267379.68/s (n=50000)
(warning: too few iterations for a reliable count)
Rate Fish Kevin
Fish 187970/s -- -30%
Kevin 267380/s 42% --
It appears that differences in optimization options make a difference in performance.
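The "too few iterations for a reliable count" warnings in these runs come from using a fixed count of 50000. A minimal sketch of the alternative, assuming stand-in workloads rather than the real Fish and Kevin subs: passing a negative count to cmpthese asks Benchmark to run each sub for at least that many CPU seconds, which should give steadier rates when comparing builds.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Stand-in workloads (hypothetical); swap in the real Fish/Kevin subs.
my @lines = ("we walk in the park", "we walk holding hands");
my %subs = (
    regex => sub { my $n = grep { /in the/ } @lines; $n },
    index => sub { my $n = grep { index($_, "in the") >= 0 } @lines; $n },
);

# A negative count tells Benchmark to run each sub for at least that many
# CPU seconds instead of a fixed number of iterations.
my $rows = cmpthese(-1, \%subs, 'none');

# With style 'none' nothing is printed; the comparison table comes back as rows.
print join("\t", @$_), "\n" for @$rows;
```

Running the same script under each perl build and comparing the Rate column keeps the measurement method identical across builds.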
Thanks @aero.
Re-reading the issue @sisyphus linked to shows this comment in which compilation works with nearly all of the -O2 flags:
OPTIMIZE = -Os -falign-functions -falign-jumps -falign-labels -falign-loops -freorder-blocks -freorder-blocks-algorithm=stc -freorder-blocks-and-partition
Does that compile for you? And if so, what is the performance like?
The optimisation flags might be simplified to:
OPTIMIZE = -O2 -finline-functions -fno-prefetch-loop-arrays
Or maybe even this, since inline-functions seems to be a -Os flag:
OPTIMIZE = -O2 -fno-prefetch-loop-arrays
I have my own build of perl-5.38.0, compiled with the '-s -O2' gcc optimization option (same gcc version, 13.1.0).
Could you provide the perl -V output of that build?
Does that compile for you? And if so, what is the performance like?
I ended up compiling a 5.38.2 with the extra flags. Results below are from a Windows 10 desktop.
Script has been modified to output more info.
The more optimised 5.38.2 is slower than 5.38.0 here and on average but there is not much difference across multiple runs.
v5.32.1
optimize: -s -O2
Rate kevin fish
kevin 237889/s -- -16%
fish 282288/s 19% --
v5.38.0
optimize: -Os
Rate kevin fish
kevin 218270/s -- -12%
fish 248032/s 14% --
v5.38.2
optimize: -Os -falign-functions -falign-jumps -falign-labels -falign-loops -freorder-blocks -freorder-blocks-algorithm=stc -freorder-blocks-and-partition
Rate kevin fish
kevin 209478/s -- -3%
fish 215027/s 3% --
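The modified script itself was not posted. A minimal sketch of how the version and optimize lines shown above could be produced, assuming they come from $^V and the Config module (build_header is a made-up name):

```perl
use strict;
use warnings;
use Config;

# Build details worth printing ahead of the benchmark table.
# build_header is a hypothetical helper, not taken from the thread.
sub build_header {
    return sprintf "%s\noptimize: %s\n", $^V, $Config{optimize};
}

print build_header();
```

Printing this before the cmpthese call makes each pasted result self-describing, so readers do not have to match tables to builds by hand.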
I have my own build of perl-5.38.0, compiled with the '-s -O2' gcc optimization option (same gcc version, 13.1.0).
Could you provide the perl -V output of that build?
mybuild.bat
set IO_COMPRESS_SKIP_STDIN_TESTS=1
set IPC_CMD_SKIP_TESTS=1
gmake -j4 INST_TOP=c:\perl-5.38.0-64bit\perl CCHOME=C:\strawberry-perl-5.38-64bit\c USE_MINGW_ANSI_STDIO=define USE_64_BIT_INT=define OPTIMIZE="-s -O2" man1dir=none man3dir=none html1dir=none html3dir=none INSTALLSITESCRIPT=c:\perl-5.38.0-64bit\perl\site\bin
gmake install
output
C:\perl-5.38.0-64bit\perl\bin>perl -V
Summary of my perl5 (revision 5 version 38 subversion 0) configuration:
Platform:
osname=MSWin32
osvers=10.0.19045.3758
archname=MSWin32-x64-multi-thread
uname=''
config_args='undef'
hint=recommended
useposix=true
d_sigaction=undef
useithreads=define
usemultiplicity=define
use64bitint=define
use64bitall=undef
uselongdouble=undef
usemymalloc=n
default_inc_excludes_dot=define
Compiler:
cc='gcc'
ccflags =' -DWIN32 -DWIN64 -DPERL_TEXTMODE_SCRIPTS -DMULTIPLICITY -DPERL_IMPLICIT_SYS -DUSE_PERLIO -D__USE_MINGW_ANSI_STDIO -fwrapv -fno-strict-aliasing -mms-bitfields'
optimize='-s -O2'
cppflags='-DWIN32'
ccversion=''
gccversion='13.1.0'
gccosandvers=''
intsize=4
longsize=4
ptrsize=8
doublesize=8
byteorder=12345678
doublekind=3
d_longlong=define
longlongsize=8
d_longdbl=define
longdblsize=16
longdblkind=3
ivtype='long long'
ivsize=8
nvtype='double'
nvsize=8
Off_t='long long'
lseeksize=8
alignbytes=8
prototype=define
Linker and Libraries:
ld='g++'
ldflags ='-s -L"c:\perl-5.38.0-64bit\perl\lib\CORE" -L"C:\strawberry-perl-5.38-64bit\c\lib" -L"C:\strawberry-perl-5.38-64bit\c\x86_64-w64-mingw32\lib" -L"C:\strawberry-perl-5.38-64bit\c\lib\gcc\x86_64-w64-mingw32\13.1.0"'
libpth=C:\strawberry-perl-5.38-64bit\c\lib C:\strawberry-perl-5.38-64bit\c\x86_64-w64-mingw32\lib C:\strawberry-perl-5.38-64bit\c\lib\gcc\x86_64-w64-mingw32\13.1.0
libs= -lmoldname -lkernel32 -luser32 -lgdi32 -lwinspool -lcomdlg32 -ladvapi32 -lshell32 -lole32 -loleaut32 -lnetapi32 -luuid -lws2_32 -lmpr -lwinmm -lversion -lodbc32 -lodbccp32 -lcomctl32
perllibs= -lmoldname -lkernel32 -luser32 -lgdi32 -lwinspool -lcomdlg32 -ladvapi32 -lshell32 -lole32 -loleaut32 -lnetapi32 -luuid -lws2_32 -lmpr -lwinmm -lversion -lodbc32 -lodbccp32 -lcomctl32
libc=
so=dll
useshrplib=true
libperl=libperl538.a
gnulibc_version=''
Dynamic Linking:
dlsrc=dl_win32.xs
dlext=dll
d_dlsymun=undef
ccdlflags=' '
cccdlflags=' '
lddlflags='-shared -s -L"c:\perl-5.38.0-64bit\perl\lib\CORE" -L"C:\strawberry-perl-5.38-64bit\c\lib" -L"C:\strawberry-perl-5.38-64bit\c\x86_64-w64-mingw32\lib" -L"C:\strawberry-perl-5.38-64bit\c\lib\gcc\x86_64-w64-mingw32\13.1.0"'
Characteristics of this binary (from libperl):
Compile-time options:
HAS_LONG_DOUBLE
HAS_TIMES
HAVE_INTERP_INTERN
MULTIPLICITY
PERLIO_LAYERS
PERL_COPY_ON_WRITE
PERL_DONT_CREATE_GVSV
PERL_HASH_FUNC_SIPHASH13
PERL_HASH_USE_SBOX32
PERL_IMPLICIT_SYS
PERL_MALLOC_WRAP
PERL_OP_PARENT
PERL_PRESERVE_IVUV
PERL_USE_SAFE_PUTENV
USE_64_BIT_INT
USE_ITHREADS
USE_LARGE_FILES
USE_LOCALE
USE_LOCALE_COLLATE
USE_LOCALE_CTYPE
USE_LOCALE_NUMERIC
USE_LOCALE_TIME
USE_PERLIO
USE_PERL_ATOF
Built under MSWin32
Compiled at Dec 9 2023 19:51:21
@INC:
C:/perl-5.38.0-64bit/perl/site/lib
C:/perl-5.38.0-64bit/perl/lib
C:\perl-5.38.0-64bit\perl\bin>perl \temp\bench.pl
Benchmark: timing 50000 iterations of Fish, Kevin...
Fish: 0 wallclock secs ( 0.25 usr + 0.00 sys = 0.25 CPU) @ 200000.00/s (n=50000)
(warning: too few iterations for a reliable count)
Kevin: 0 wallclock secs ( 0.17 usr + 0.00 sys = 0.17 CPU) @ 290697.67/s (n=50000)
(warning: too few iterations for a reliable count)
Rate Fish Kevin
Fish 200000/s -- -31%
Kevin 290698/s 45% --
I don't know why it didn't compile before.
I tried compiling perl-5.38.2 again and it worked fine.
Performance improved.
mybuild.bat
set IO_COMPRESS_SKIP_STDIN_TESTS=1
set IPC_CMD_SKIP_TESTS=1
gmake -j4 INST_TOP=c:\perl-5.38.2-64bit\perl CCHOME=C:\strawberry-perl-5.38-64bit\c USE_MINGW_ANSI_STDIO=define USE_64_BIT_INT=define OPTIMIZE="-s -O2" man1dir=none man3dir=none html1dir=none html3dir=none INSTALLSITESCRIPT=c:\perl-5.38.2-64bit\perl\site\bin
gmake install
output
>perl -V
Summary of my perl5 (revision 5 version 38 subversion 2) configuration:
Platform:
osname=MSWin32
osvers=10.0.19045.3758
archname=MSWin32-x64-multi-thread
uname=''
config_args='undef'
hint=recommended
useposix=true
d_sigaction=undef
useithreads=define
usemultiplicity=define
use64bitint=define
use64bitall=undef
uselongdouble=undef
usemymalloc=n
default_inc_excludes_dot=define
Compiler:
cc='gcc'
ccflags =' -DWIN32 -DWIN64 -DPERL_TEXTMODE_SCRIPTS -DMULTIPLICITY -DPERL_IMPLICIT_SYS -DUSE_PERLIO -D__USE_MINGW_ANSI_STDIO -fwrapv -fno-strict-aliasing -mms-bitfields'
optimize='-s -O2'
cppflags='-DWIN32'
ccversion=''
gccversion='13.1.0'
gccosandvers=''
intsize=4
longsize=4
ptrsize=8
doublesize=8
byteorder=12345678
doublekind=3
d_longlong=define
longlongsize=8
d_longdbl=define
longdblsize=16
longdblkind=3
ivtype='long long'
ivsize=8
nvtype='double'
nvsize=8
Off_t='long long'
lseeksize=8
alignbytes=8
prototype=define
Linker and Libraries:
ld='g++'
ldflags ='-s -L"c:\perl-5.38.2-64bit\perl\lib\CORE" -L"C:\strawberry-perl-5.38-64bit\c\lib" -L"C:\strawberry-perl-5.38-64bit\c\x86_64-w64-mingw32\lib" -L"C:\strawberry-perl-5.38-64bit\c\lib\gcc\x86_64-w64-mingw32\13.1.0"'
libpth=C:\strawberry-perl-5.38-64bit\c\lib C:\strawberry-perl-5.38-64bit\c\x86_64-w64-mingw32\lib C:\strawberry-perl-5.38-64bit\c\lib\gcc\x86_64-w64-mingw32\13.1.0
libs= -lmoldname -lkernel32 -luser32 -lgdi32 -lwinspool -lcomdlg32 -ladvapi32 -lshell32 -lole32 -loleaut32 -lnetapi32 -luuid -lws2_32 -lmpr -lwinmm -lversion -lodbc32 -lodbccp32 -lcomctl32
perllibs= -lmoldname -lkernel32 -luser32 -lgdi32 -lwinspool -lcomdlg32 -ladvapi32 -lshell32 -lole32 -loleaut32 -lnetapi32 -luuid -lws2_32 -lmpr -lwinmm -lversion -lodbc32 -lodbccp32 -lcomctl32
libc=
so=dll
useshrplib=true
libperl=libperl538.a
gnulibc_version=''
Dynamic Linking:
dlsrc=dl_win32.xs
dlext=dll
d_dlsymun=undef
ccdlflags=' '
cccdlflags=' '
lddlflags='-shared -s -L"c:\perl-5.38.2-64bit\perl\lib\CORE" -L"C:\strawberry-perl-5.38-64bit\c\lib" -L"C:\strawberry-perl-5.38-64bit\c\x86_64-w64-mingw32\lib" -L"C:\strawberry-perl-5.38-64bit\c\lib\gcc\x86_64-w64-mingw32\13.1.0"'
Characteristics of this binary (from libperl):
Compile-time options:
HAS_LONG_DOUBLE
HAS_TIMES
HAVE_INTERP_INTERN
MULTIPLICITY
PERLIO_LAYERS
PERL_COPY_ON_WRITE
PERL_DONT_CREATE_GVSV
PERL_HASH_FUNC_SIPHASH13
PERL_HASH_USE_SBOX32
PERL_IMPLICIT_SYS
PERL_MALLOC_WRAP
PERL_OP_PARENT
PERL_PRESERVE_IVUV
PERL_USE_SAFE_PUTENV
USE_64_BIT_INT
USE_ITHREADS
USE_LARGE_FILES
USE_LOCALE
USE_LOCALE_COLLATE
USE_LOCALE_CTYPE
USE_LOCALE_NUMERIC
USE_LOCALE_TIME
USE_PERLIO
USE_PERL_ATOF
Built under MSWin32
Compiled at Dec 9 2023 20:23:18
@INC:
C:/perl-5.38.2-64bit/perl/site/lib
C:/perl-5.38.2-64bit/perl/lib
[Benchmark]-----------------------------------------------------------------------------
* This build
Benchmark: timing 50000 iterations of Fish, Kevin...
Fish: 0 wallclock secs ( 0.27 usr + 0.00 sys = 0.27 CPU) @ 187969.92/s (n=50000)
(warning: too few iterations for a reliable count)
Kevin: 1 wallclock secs ( 0.17 usr + 0.00 sys = 0.17 CPU) @ 290697.67/s (n=50000)
(warning: too few iterations for a reliable count)
Rate Fish Kevin
Fish 187970/s -- -35%
Kevin 290698/s 55% --
* Original strawberry perl 5.38.2
Benchmark: timing 50000 iterations of Fish, Kevin...
Fish: 0 wallclock secs ( 0.34 usr + 0.00 sys = 0.34 CPU) @ 145348.84/s (n=50000)
(warning: too few iterations for a reliable count)
Kevin: 0 wallclock secs ( 0.23 usr + 0.00 sys = 0.23 CPU) @ 213675.21/s (n=50000)
(warning: too few iterations for a reliable count)
Rate Fish Kevin
Fish 145349/s -- -32%
Kevin 213675/s 47% --
Thanks, @aero.
I have built perl-5.38.0 with OPTIMIZE="-s -O2" and found that to "work" - though a number of test scripts reported failures during gmake test.
However, I also found things to be very much the same with perl-5.38.2. (What was the error you got?)
The "-s -O2" optimization does (despite the failing tests) provide the best performance for both of the benchmarking scripts that have been presented here.
However, of course, we cannot advocate that "-s -O2" should become the default until we can solve the issue of those failing tests.
IMO, the performance loss with "-Os" is not excessive - though I admit that it's a little more significant than I anticipated.
@aero, I encourage you to instead concentrate on looking at the current perl-5.39.x devel releases as they are released - with a view to investigating how they might be modified to improve the upcoming perl-5.40.0 release.
Sure, there's something to be learnt from looking back, and seeing problems with what was done in the past - but the real advances will be made by looking at what's coming next.
You could do that using the 5.38.0 toolchain that ships with SP-5.38.0.
However, I think it would be better to switch to a gcc-13.2.0 UCRT toolchain provided by https://winlibs.com.
(There's also a gcc-14.0.0 pre-release that threw up no new issues when I tried it recently.)
UCRT is what Visual Studio toolchains use; it's less troublesome than MSVCRT (especially wrt locales); and I'd be surprised if SP-5.40 is not built using a "UCRT" toolchain.
If you really need a perl that is best optimized to run those benchmarking tests, then you should probably use a static perl (ie built without threads - USE_MULTI=undef, USE_ITHREADS=undef and USE_IMP_SYS=undef).
Such builds disable threads and don't provide the fork() function. They are therefore considered unfit for general usage - so don't expect StrawberryPerl to ever provide them.
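To check which kind of build a given perl is, the Config module can be queried. The sketch below is illustrative only: thread_report is a made-up name, and it assumes that emulated fork() on Windows is reported through the d_pseudofork key.

```perl
use strict;
use warnings;
use Config;

# Report whether this perl has the features that an unthreaded build drops.
# thread_report is a hypothetical helper name.
sub thread_report {
    my %r;
    for my $key (qw(useithreads usemultiplicity d_fork d_pseudofork)) {
        my $val = $Config{$key};
        $r{$key} = defined $val && $val eq 'define' ? 'yes' : 'no';
    }
    return \%r;
}

my $r = thread_report();
printf "%-16s %s\n", $_, $r->{$_} for sort keys %$r;
```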
As regards these benchmarking scripts on perl-5.38.0 (MSWin32-x64-multi-thread), I found that the best performing configuration was mingw-w64-built with "-s -O2", followed by mingw-w64-built with "-Os", followed by msvc143-built with "-O1 -Zi -GL -fp:precise".
But I didn't think that any of the differences were outstandingly bad or good.
I've started experimenting with UCRT builds. I have a set of external libs but have yet to try building perl (the recent perl releases took priority).
Issues are being tracked under #152
Performance improved.
@aero, I've checked to see how much further improvement you'll see with an unthreaded build.
I have an unthreaded build of 5.38.0 ('-s -O2') and a threaded build of 5.38.0 ('-s -O2')
Threaded:
Benchmark: timing 50000 iterations of Fish, Kevin...
Fish: 0 wallclock secs ( 0.17 usr + 0.00 sys = 0.17 CPU) @ 290697.67/s (n=50000)
(warning: too few iterations for a reliable count)
Kevin: 0 wallclock secs ( 0.11 usr + 0.00 sys = 0.11 CPU) @ 458715.60/s (n=50000)
(warning: too few iterations for a reliable count)
Rate Fish Kevin
Fish 290698/s -- -37%
Kevin 458716/s 58% --
Unthreaded:
Benchmark: timing 50000 iterations of Fish, Kevin...
Fish: 0 wallclock secs ( 0.14 usr + 0.00 sys = 0.14 CPU) @ 357142.86/s (n=50000)
(warning: too few iterations for a reliable count)
Kevin: 1 wallclock secs ( 0.09 usr + 0.00 sys = 0.09 CPU) @ 537634.41/s (n=50000)
(warning: too few iterations for a reliable count)
Rate Fish Kevin
Fish 357143/s -- -34%
Kevin 537634/s 51% --
The other advantage with that unthreaded 5.38.0 build (over that threaded 5.38.0 build) is that it passes all tests.
You should run gmake test on your 5.38.x ('-s -O2') builds, just so you know how horribly broken they are.
For me, repeated runs of gmake test can throw up different failures - ie there are some test scripts that don't fail every time.
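One way to separate consistent failures from intermittent ones is to record the failed scripts from each run and compare the lists. A minimal sketch, assuming the failed script names have already been collected by hand; the names below are placeholders, not actual failures from these builds, and classify_failures is a made-up helper.

```perl
use strict;
use warnings;

# Given one list of failed test scripts per run, split them into scripts that
# failed in every run and scripts that failed only in some runs.
sub classify_failures {
    my @runs = @_;
    my %count;
    for my $run (@runs) {
        my %seen = map { $_ => 1 } @$run;   # count each script once per run
        $count{$_}++ for keys %seen;
    }
    my @always = grep { $count{$_} == @runs } sort keys %count;
    my @flaky  = grep { $count{$_} <  @runs } sort keys %count;
    return (\@always, \@flaky);
}

# Placeholder script names, not actual failures from the thread.
my @runs = (
    [qw(t/a.t t/b.t)],
    [qw(t/a.t t/c.t)],
    [qw(t/a.t t/b.t)],
);
my ($always, $flaky) = classify_failures(@runs);
print "always: @$always\n";
print "sometimes: @$flaky\n";
```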
If you're interested in trying the unthreaded build, just add USE_MULTI=undef USE_ITHREADS=undef USE_IMP_SYS=undef to your existing args.
For the record, the same script on the same machine, using StrawberryPerl-5.38.0 ('-Os') produced:
Benchmark: timing 50000 iterations of Fish, Kevin...
Fish: 0 wallclock secs ( 0.19 usr + 0.00 sys = 0.19 CPU) @ 265957.45/s (n=50000)
(warning: too few iterations for a reliable count)
Kevin: 0 wallclock secs ( 0.12 usr + 0.00 sys = 0.12 CPU) @ 400000.00/s (n=50000)
(warning: too few iterations for a reliable count)
Rate Fish Kevin
Fish 265957/s -- -34%
Kevin 400000/s 50% --
AFAIK, the only difference between 5.38.0 and 5.38.2 is that 5.38.2 includes 2 security fixes.
You should find very little (if any) difference in performance between 5.38.0 and 5.38.2 that were built with the same optimization level.
(Same goes for 5.34.3 and 5.36.3 - all that has changed is the inclusion of the 2 security fixes.)