Change the assignment paradigm of the allocatable array in `test-drive` demos
Opened this issue · 4 comments
Currently, `test-drive` uses this paradigm:
testsuites = [ &
    new_testsuite("suite1", collect_suite1), &
    new_testsuite("suite2", collect_suite2) &
    ]
Running `fpm test -V` with default flags results in the following warning:
test\check.f90:17:54:
17 | type(testsuite_type), allocatable :: testsuites(:)
| ^
note: 'testsuites' declared here
test\check.f90:28:18:
28 | ]
| ^
Warning: 'testsuites.dim[0].lbound' may be used uninitialized [-Wmaybe-uninitialized]
Should we pursue zero warnings here and use the following allocation style to update the `test-drive` demos?
allocate (testsuites, source=[ &
    new_testsuite("suite1", collect_suite1), &
    new_testsuite("suite2", collect_suite2) &
    ])
This is indeed a false-positive warning. We use automatic LHS allocation here. An alternative would be to allocate the array with a dummy size of zero and then assign.
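The zero-size alternative mentioned above could look roughly like the sketch below. It is only an illustration: `suite_t` is a hypothetical stand-in for `testsuite_type`, so that the program compiles without `test-drive`. Whether it actually silences `-Wmaybe-uninitialized` depends on the compiler version and was not verified here.

```fortran
program zero_size_then_assign
    implicit none

    !> Hypothetical stand-in for test-drive's testsuite_type (assumption)
    type :: suite_t
        character(len=16) :: name
    end type suite_t

    type(suite_t), allocatable :: suites(:)

    ! Give the array descriptor defined bounds before the assignment
    allocate (suites(0))

    ! The shapes differ (0 vs. 2), so the assignment reallocates the LHS
    suites = [suite_t("suite1"), suite_t("suite2")]

    print '(a,i0)', "number of suites: ", size(suites)  ! prints "number of suites: 2"
end program zero_size_then_assign
```

The explicit `allocate` costs one extra allocation, which is the overhead measured as scheme 3 in the benchmarks in this thread.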
Actually, I think LHS allocation is the most efficient, since it involves the fewest operations, but I also expect `allocate(.., source=..)` to be optimized by the compiler. I wrote the performance test below, and it seems that we are worrying too much about the performance impact of a small number of allocations. I believe the number of unit tests for ordinary users will not reach the order of N=10000000. Even if it did, the efficiency of these schemes would be very close:
Bench Code with N=10000000
!> Time functions
module time_m
    implicit none
    private
    public :: tic, toc
    integer, save :: time_save  !! save the time
contains
    !> Start timer
    impure subroutine tic()
        call system_clock(time_save)
    end subroutine tic

    !> Stop timer and return the time
    impure subroutine toc(t)
        class(*), optional :: t  !! time in seconds
        integer :: time_now, time_rate
        call system_clock(time_now, time_rate)
        associate (dt => real(time_now - time_save)/time_rate)
            if (present(t)) then
                select type (t)
                type is (real)
                    t = dt
                type is (double precision)
                    t = real(dt, 8)
                type is (integer)
                    t = nint(dt)
                type is (character(*))
                    write (*, "(2a,g0.3,a)") t, ', time elapsed: ', dt, " s"
                class default
                    write (*, '(a)') 'Error: unknown type of t in toc()'
                end select
            else
                write (*, "(a,g0.3,a)") 'Time elapsed: ', dt, " s"
            end if
        end associate
    end subroutine toc
end module time_m
program main
    use time_m, only: tic, toc
    implicit none
    type node_t
        real(8), allocatable :: item(:)
    end type node_t
    integer, parameter :: N = 10000000
    type(node_t) :: x(N), y(N), z(N)
    integer :: i

    ! Scheme 1: automatic LHS allocation
    call tic()
    do i = 1, N
        x(i)%item = [real(8) :: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    end do
    call toc("1. LHS")

    ! Scheme 2: sourced allocation
    call tic()
    do i = 1, N
        allocate (z(i)%item, source=[real(8) :: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
    end do
    call toc("2. allocate(z, source=..)")

    ! Scheme 3: zero-size allocation followed by LHS (re)allocation
    call tic()
    do i = 1, N
        allocate (y(i)%item(0))
        y(i)%item = [real(8) :: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    end do
    call toc("3. allocate(y(0)), LHS")
end program main
>> fpm run --profile release # On Windows-MSYS2 gfortran, N=10000000 (on my laptop)
1. LHS, time elapsed: 1.73 s
2. allocate(z, source=..), time elapsed: 1.86 s
3. allocate(y(0)), LHS, time elapsed: 3.30 s
On godbolt.org with N=300000:
# ifort 2021.6.0
1. LHS, time elapsed: .593E-01 s
2. allocate(z, source=..), time elapsed: .636E-01 s
3. allocate(y(0)), LHS, time elapsed: .893E-01 s
# gfortran 12.2
1. LHS, time elapsed: 0.410E-1 s
2. allocate(z, source=..), time elapsed: 0.420E-1 s
3. allocate(y(0)), LHS, time elapsed: 0.570E-1 s
The benchmark results show that schemes 1 and 2 perform similarly, and the overall efficiency ranking is 1 > 2 > 3.
Using `allocate(.., source=..)` does make the code look a little unintuitive (complicated), but in fact `allocate(.., source=..)` is the form we may end up using most often.
A Different Bench Code with N=10000: 1 > 3 > 2
!> Time functions
module time_m
    implicit none
    private
    public :: tic, toc
    integer, save :: time_save  !! save the time
contains
    !> Start timer
    impure subroutine tic()
        call system_clock(time_save)
    end subroutine tic

    !> Stop timer and return the time
    impure subroutine toc(t)
        class(*), optional :: t  !! time in seconds
        integer :: time_now, time_rate
        call system_clock(time_now, time_rate)
        associate (dt => real(time_now - time_save)/time_rate)
            if (present(t)) then
                select type (t)
                type is (real)
                    t = dt
                type is (double precision)
                    t = real(dt, 8)
                type is (integer)
                    t = nint(dt)
                type is (character(*))
                    write (*, "(2a,g0.3,a)") t, ', time elapsed: ', dt, " s"
                class default
                    write (*, '(a)') 'Error: unknown type of t in toc()'
                end select
            else
                write (*, "(a,g0.3,a)") 'Time elapsed: ', dt, " s"
            end if
        end associate
    end subroutine toc
end module time_m
program main
    use time_m, only: tic, toc
    implicit none
    type node_t
        real(8), allocatable :: item(:)
    end type node_t
    integer, parameter :: N = 10000
    type(node_t) :: x(N), y(N), z(N)
    integer :: i
    real(8), allocatable :: items(:)

    ! Source array of length N that is copied in every scheme
    allocate (items, source=[real(8) :: (i, i=1, N)])

    ! Scheme 1: automatic LHS allocation
    call tic()
    do i = 1, N
        x(i)%item = items
    end do
    call toc("1. LHS")

    ! Scheme 2: sourced allocation
    call tic()
    do i = 1, N
        allocate (z(i)%item, source=items)
    end do
    call toc("2. allocate(z, source=..)")

    ! Scheme 3: zero-size allocation followed by LHS (re)allocation
    call tic()
    do i = 1, N
        allocate (y(i)%item(0))
        y(i)%item = items
    end do
    call toc("3. allocate(y(0)), LHS")
end program main
>> fpm run --profile release # On Windows-MSYS2 gfortran, N=10000 (on my laptop)
1. LHS, time elapsed: 0.672 s
2. allocate(z, source=..), time elapsed: 0.828 s
3. allocate(y(0)), LHS, time elapsed: 0.688 s
The collect interface is only invoked once; the actual performance-critical part is running the bodies of the tests, which are referenced via procedure pointers. I wouldn't worry too much about the performance difference between those solutions.
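To illustrate the point about a one-time collection followed by calls through procedure pointers, here is a minimal self-contained sketch. It does not use the real `test-drive` API: `case_t`, `body_iface`, `collect_cases`, and the two test bodies are hypothetical names introduced only for this example.

```fortran
module collect_demo_m
    implicit none
    private
    public :: case_t, collect_cases

    abstract interface
        subroutine body_iface(passed)
            logical, intent(out) :: passed
        end subroutine body_iface
    end interface

    !> Hypothetical stand-in for a test case (not test-drive's own type)
    type :: case_t
        character(len=16) :: name = ""
        procedure(body_iface), pointer, nopass :: body => null()
    end type case_t
contains
    !> Called once: the only place where the array is allocated
    subroutine collect_cases(cases)
        type(case_t), allocatable, intent(out) :: cases(:)
        allocate (cases(2))
        cases(1)%name = "always_true"
        cases(1)%body => always_true
        cases(2)%name = "one_plus_one"
        cases(2)%body => one_plus_one
    end subroutine collect_cases

    subroutine always_true(passed)
        logical, intent(out) :: passed
        passed = .true.
    end subroutine always_true

    subroutine one_plus_one(passed)
        logical, intent(out) :: passed
        passed = (1 + 1 == 2)
    end subroutine one_plus_one
end module collect_demo_m

program main
    use collect_demo_m, only: case_t, collect_cases
    implicit none
    type(case_t), allocatable :: cases(:)
    logical :: passed
    integer :: i

    call collect_cases(cases)  ! the allocation cost is paid once here

    ! The run time is spent in the test bodies, called via procedure pointers
    do i = 1, size(cases)
        call cases(i)%body(passed)
        print '(a,1x,l1)', trim(cases(i)%name), passed
    end do
end program main
```

In a layout like this, the allocation style used inside the collect routine affects a single call per suite, while the loop over the test bodies dominates the total run time.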