Change the assignment paradigm of the allocatable array in `test-drive` demos

Question

Change the assignment paradigm of the allocatable array in `test-drive` demos

Opened this issue 2 years ago · 4 comments

Currently, test-drive uses this paradigm:

testsuites = [ &
    new_testsuite("suite1", collect_suite1), &
    new_testsuite("suite2", collect_suite2) &
    ]

In fpm test -V with default flags will result in the following warning:

test\check.f90:17:54:

   17 |     type(testsuite_type), allocatable :: testsuites(:)
      |                                                      ^
note: 'testsuites' declared here
test\check.f90:28:18:

   28 |                  ]
      |                  ^
Warning: 'testsuites.dim[0].lbound' may be used uninitialized [-Wmaybe-uninitialized]

Should we pursue zero warnings here and use the following allocation style to update the test-drive demos?

allocate (testsuites, source=[ &
          new_testsuite("suite1", collect_suite1), &
          new_testsuite("suite2", collect_suite2) &
          ])

Answer 1 · 2022-09-02T05:39:27.000Z

This looks like a false-positive message from the compiler to me. There is no such use. If this change pacifies the compiler, then perhaps such a change is desirable, but I must say the code looks more complicated and it seems less efficient to me (that does not really matter in this case, but it constructs the combined array, then allocates space to store that amount of memory, copies the contents and then cleans up the temporary). How about: allocate( testsuites(0) ) testsuites = ... with a proper comment as to why? Op vr 2 sep. 2022 om 04:06 schreef zoziha ***@***.***>:

…

Currently, test-drive uses this paradigm: testsuites = [ & new_testsuite("suite1", collect_suite1), & new_testsuite("suite2", collect_suite2) & ] In fpm test -V with default flags will result in the following *warning*: test\check.f90:17:54: 17 | type(testsuite_type), allocatable :: testsuites(:) | ^ note: 'testsuites' declared here test\check.f90:28:18: 28 | ] | ^ Warning: 'testsuites.dim[0].lbound' may be used uninitialized [-Wmaybe-uninitialized] Should we pursue *zero warnings* here and use the following allocation style to update the test-drive demos? allocate (testsuites, source=[ & new_testsuite("suite1", collect_suite1), & new_testsuite("suite2", collect_suite2) & ]) — Reply to this email directly, view it on GitHub <#22>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAN6YRYW4ETYD54XPNZXJVDV4FOITANCNFSM6AAAAAAQC335YY> . You are receiving this because you are subscribed to this thread.Message ID: ***@***.***>

Answer 2 · 2022-09-02T07:07:57.000Z

This is indeed a false positive warning. We use automatic LHS allocation here. An alternative would be to allocate the array with a dummy size of zero and than assign.

Answer 3 · 2022-09-02T09:13:19.000Z

Actually, I think LHS is the most efficient, it only involves fewer operations; but I also feel that allocate(.., source=..) will be optimized by the compiler, I tried to write performance test code below, it seems that we are too worrying about the impact of a small amount of allocation on performance, I believe that the number of unit tests for ordinary users will not reach the order of N=10000000. Even if it is achieved, the efficiency difference of these schemes will be very close:

Bench Code with N=10000000

!> Time functions
module time_m

    implicit none
    private

    public :: tic, toc

    integer, save :: time_save !! save the time

contains

    !> Start timer
    impure subroutine tic()
        call system_clock(time_save)
    end subroutine tic

    !> Stop timer and return the time
    impure subroutine toc(t)
        class(*), optional :: t    !! time in seconds
        integer :: time_now, time_rate

        call system_clock(time_now, time_rate)

        associate (dt => real(time_now - time_save)/time_rate)

            if (present(t)) then
                select type (t)
                type is (real)
                    t = dt
                type is (double precision)
                    t = real(dt, 8)
                type is (integer)
                    t = nint(dt)
                type is (character(*))
                    write (*, "(2a,g0.3,a)") t, ', time elapsed: ', dt, " s"
                class default
                    write (*, '(a)') 'Error: unknown type of t in toc()'
                end select
            else
                write (*, "(a,g0.3,a)") 'Time elapsed: ', dt, " s"
            end if

        end associate

    end subroutine toc

end module time_m

program main

    use time_m, only: tic, toc
    implicit none

    type node_t
        real(8), allocatable :: item(:)
    end type node_t

    integer, parameter :: N = 10000000
    type(node_t) :: x(N), y(N), z(N)
    integer :: i

    call tic()
    do i = 1, N
        x(i)%item = [real(8) :: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    end do
    call toc("1. LHS")

    call tic()
    do i = 1, N
        allocate (z(i)%item, source=[real(8) :: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
    end do
    call toc("2. allocate(z, source=..)")

    call tic()
    do i = 1, N
        allocate (y(i)%item(0))
        y(i)%item = [real(8) :: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    end do
    call toc("3. allocate(y(0)), LHS")

end program main

>> fpm run --profile release   # On Windows-MSYS2 gfortran, N=10000000 (on my laptop)
1. LHS, time elapsed: 1.73 s
2. allocate(z, source=..), time elapsed: 1.86 s
3. allocate(y(0)), LHS, time elapsed: 3.30 s

On godbolt.org with N=300000:

# ifort 2021.6.0
1. LHS, time elapsed: .593E-01 s
2. allocate(z, source=..), time elapsed: .636E-01 s
3. allocate(y(0)), LHS, time elapsed: .893E-01 s
# gfortran 12.2
1. LHS, time elapsed: 0.410E-1 s
2. allocate(z, source=..), time elapsed: 0.420E-1 s
3. allocate(y(0)), LHS, time elapsed: 0.570E-1 s

From the bench results, we can see that the performances of 1 and 2 are close, but the efficiency ranking is 1 > 2 > 3.

Using allocate(.., source=..) does make the code look a little unintuitive (complicated), but in fact allocate(.., source=..) is the way we may use most.

A Different Bench Code with N=10000: 1 > 3 > 2

!> Time functions
module time_m

    implicit none
    private

    public :: tic, toc

    integer, save :: time_save !! save the time

contains

    !> Start timer
    impure subroutine tic()
        call system_clock(time_save)
    end subroutine tic

    !> Stop timer and return the time
    impure subroutine toc(t)
        class(*), optional :: t    !! time in seconds
        integer :: time_now, time_rate

        call system_clock(time_now, time_rate)

        associate (dt => real(time_now - time_save)/time_rate)

            if (present(t)) then
                select type (t)
                type is (real)
                    t = dt
                type is (double precision)
                    t = real(dt, 8)
                type is (integer)
                    t = nint(dt)
                type is (character(*))
                    write (*, "(2a,g0.3,a)") t, ', time elapsed: ', dt, " s"
                class default
                    write (*, '(a)') 'Error: unknown type of t in toc()'
                end select
            else
                write (*, "(a,g0.3,a)") 'Time elapsed: ', dt, " s"
            end if

        end associate

    end subroutine toc

end module time_m

program main

    use time_m, only: tic, toc
    implicit none

    type node_t
        real(8), allocatable :: item(:)
    end type node_t

    integer, parameter :: N = 20000
    type(node_t) :: x(N), y(N), z(N)
    integer :: i
    real(8), allocatable :: items(:)

    allocate (items, source=[real(8) :: (i, i=1, N)])

    call tic()
    do i = 1, N
        x(i)%item = items
    end do
    call toc("1. LHS")

    call tic()
    do i = 1, N
        allocate (z(i)%item, source=items)
    end do
    call toc("2. allocate(z, source=..)")

    call tic()
    do i = 1, N
        allocate (y(i)%item(0))
        y(i)%item = items
    end do
    call toc("3. allocate(y(0)), LHS")

end program main

>> fpm run --profile release   # On Windows-MSYS2 gfortran, N=10000 (on my laptop)
1. LHS, time elapsed: 0.672 s
2. allocate(z, source=..), time elapsed: 0.828 s
3. allocate(y(0)), LHS, time elapsed: 0.688 s

Answer 4 · 2022-09-02T11:07:15.000Z

The collect interface is only invoked once, the actual performance critical part is running the body of the tests which are referenced via a procedure pointer. I wouldn't worry too much about the performance difference between those solutions.