fortran-lang/test-drive

Change the assignment paradigm of the allocatable array in `test-drive` demos

Opened this issue · 4 comments

Currently, test-drive uses this paradigm:

testsuites = [ &
    new_testsuite("suite1", collect_suite1), &
    new_testsuite("suite2", collect_suite2) &
    ]

In fpm test -V with default flags will result in the following warning:

test\check.f90:17:54:

   17 |     type(testsuite_type), allocatable :: testsuites(:)
      |                                                      ^
note: 'testsuites' declared here
test\check.f90:28:18:

   28 |                  ]
      |                  ^
Warning: 'testsuites.dim[0].lbound' may be used uninitialized [-Wmaybe-uninitialized]

Should we pursue zero warnings here and use the following allocation style to update the test-drive demos?

allocate (testsuites, source=[ &
          new_testsuite("suite1", collect_suite1), &
          new_testsuite("suite2", collect_suite2) &
          ])

This is indeed a false positive warning. We use automatic LHS allocation here. An alternative would be to allocate the array with a dummy size of zero and than assign.

Actually, I think LHS is the most efficient, it only involves fewer operations; but I also feel that allocate(.., source=..) will be optimized by the compiler, I tried to write performance test code below, it seems that we are too worrying about the impact of a small amount of allocation on performance, I believe that the number of unit tests for ordinary users will not reach the order of N=10000000. Even if it is achieved, the efficiency difference of these schemes will be very close:

Bench Code with N=10000000
!> Time functions
module time_m

    implicit none
    private

    public :: tic, toc

    integer, save :: time_save !! save the time

contains

    !> Start timer
    impure subroutine tic()
        call system_clock(time_save)
    end subroutine tic

    !> Stop timer and return the time
    impure subroutine toc(t)
        class(*), optional :: t    !! time in seconds
        integer :: time_now, time_rate

        call system_clock(time_now, time_rate)

        associate (dt => real(time_now - time_save)/time_rate)

            if (present(t)) then
                select type (t)
                type is (real)
                    t = dt
                type is (double precision)
                    t = real(dt, 8)
                type is (integer)
                    t = nint(dt)
                type is (character(*))
                    write (*, "(2a,g0.3,a)") t, ', time elapsed: ', dt, " s"
                class default
                    write (*, '(a)') 'Error: unknown type of t in toc()'
                end select
            else
                write (*, "(a,g0.3,a)") 'Time elapsed: ', dt, " s"
            end if

        end associate

    end subroutine toc

end module time_m

program main

    use time_m, only: tic, toc
    implicit none

    type node_t
        real(8), allocatable :: item(:)
    end type node_t

    integer, parameter :: N = 10000000
    type(node_t) :: x(N), y(N), z(N)
    integer :: i

    call tic()
    do i = 1, N
        x(i)%item = [real(8) :: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    end do
    call toc("1. LHS")

    call tic()
    do i = 1, N
        allocate (z(i)%item, source=[real(8) :: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
    end do
    call toc("2. allocate(z, source=..)")

    call tic()
    do i = 1, N
        allocate (y(i)%item(0))
        y(i)%item = [real(8) :: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    end do
    call toc("3. allocate(y(0)), LHS")

end program main
>> fpm run --profile release   # On Windows-MSYS2 gfortran, N=10000000 (on my laptop)
1. LHS, time elapsed: 1.73 s
2. allocate(z, source=..), time elapsed: 1.86 s
3. allocate(y(0)), LHS, time elapsed: 3.30 s

On godbolt.org with N=300000:

# ifort 2021.6.0
1. LHS, time elapsed: .593E-01 s
2. allocate(z, source=..), time elapsed: .636E-01 s
3. allocate(y(0)), LHS, time elapsed: .893E-01 s
# gfortran 12.2
1. LHS, time elapsed: 0.410E-1 s
2. allocate(z, source=..), time elapsed: 0.420E-1 s
3. allocate(y(0)), LHS, time elapsed: 0.570E-1 s

From the bench results, we can see that the performances of 1 and 2 are close, but the efficiency ranking is 1 > 2 > 3.

Using allocate(.., source=..) does make the code look a little unintuitive (complicated), but in fact allocate(.., source=..) is the way we may use most.

A Different Bench Code with N=10000: 1 > 3 > 2
!> Time functions
module time_m

    implicit none
    private

    public :: tic, toc

    integer, save :: time_save !! save the time

contains

    !> Start timer
    impure subroutine tic()
        call system_clock(time_save)
    end subroutine tic

    !> Stop timer and return the time
    impure subroutine toc(t)
        class(*), optional :: t    !! time in seconds
        integer :: time_now, time_rate

        call system_clock(time_now, time_rate)

        associate (dt => real(time_now - time_save)/time_rate)

            if (present(t)) then
                select type (t)
                type is (real)
                    t = dt
                type is (double precision)
                    t = real(dt, 8)
                type is (integer)
                    t = nint(dt)
                type is (character(*))
                    write (*, "(2a,g0.3,a)") t, ', time elapsed: ', dt, " s"
                class default
                    write (*, '(a)') 'Error: unknown type of t in toc()'
                end select
            else
                write (*, "(a,g0.3,a)") 'Time elapsed: ', dt, " s"
            end if

        end associate

    end subroutine toc

end module time_m

program main

    use time_m, only: tic, toc
    implicit none

    type node_t
        real(8), allocatable :: item(:)
    end type node_t

    integer, parameter :: N = 20000
    type(node_t) :: x(N), y(N), z(N)
    integer :: i
    real(8), allocatable :: items(:)

    allocate (items, source=[real(8) :: (i, i=1, N)])

    call tic()
    do i = 1, N
        x(i)%item = items
    end do
    call toc("1. LHS")

    call tic()
    do i = 1, N
        allocate (z(i)%item, source=items)
    end do
    call toc("2. allocate(z, source=..)")

    call tic()
    do i = 1, N
        allocate (y(i)%item(0))
        y(i)%item = items
    end do
    call toc("3. allocate(y(0)), LHS")

end program main
>> fpm run --profile release   # On Windows-MSYS2 gfortran, N=10000 (on my laptop)
1. LHS, time elapsed: 0.672 s
2. allocate(z, source=..), time elapsed: 0.828 s
3. allocate(y(0)), LHS, time elapsed: 0.688 s

The collect interface is only invoked once, the actual performance critical part is running the body of the tests which are referenced via a procedure pointer. I wouldn't worry too much about the performance difference between those solutions.