WojciechMula/simd-sort

bug

bbqz007 opened this issue · 1 comments

speed.cpp, PerformanceTest ::run line 61

        int k = iterations;
        while (k--) {
            memcpy(tmp, input.pointer(), input.size());

            uint64_t t1, t2;

#ifdef USE_RDTSC
            RDTSC_START(t1);
#else
            t1 = get_time();
#endif

/**BUG**/            sort(input.pointer(), 0, input.count() - 1);

#ifdef USE_RDTSC
            RDTSC_START(t2);
#else
            t2 = get_time();
#endif

            const uint64_t dt = t2 - t1;

            if (time == 0) {
                time = dt;
            } else if (dt < time) {
                time = dt;
            }
        }

you only sort the sorted result input again and again.

in your bug version, the quicksort.cpp is even more than 1.3x faster than the std::sort. the men working on the std::aglorithm should be fired according your bug version performance tests.

it wastes my so much time to find out the bug is not the about the std::sort , it is your test program.

second, not the average time. this test program only compares the max times.

            if (time == 0) {
                time = dt;
            } else if (dt < time) {
                time = dt;
            }

fix the bugs, and performs on a tigerlake machine with os centos8.

rdtsc_overhead set to 61
items count: 10000 (40000 bytes), input ascending
                     std::sort ...     293845 cycles
                    std::qsort ...    1419950 cycles (0.21)
              std::stable_sort ...     298183 cycles (0.99)
                    quick sort ...     336029 cycles (0.87)
               AVX2 quick sort ...     266871 cycles (1.10)
             AVX2 nate nodutch ...     903878 cycles (0.33)
            AVX2 alt quicksort ...     179822 cycles (1.63)
           AVX2 Nate's variant ... tee: results.txt: Permission denied
rdtsc_overhead set to 47
items count: 100000 (400000 bytes), input ascending
                     std::sort ...    2547071 cycles
                    std::qsort ...    8688265 cycles (0.29)
              std::stable_sort ...    2136310 cycles (1.19)
                    quick sort ...    2326408 cycles (1.09)
               AVX2 quick sort ...    2651341 cycles (0.96)
             AVX2 nate nodutch ...   10011040 cycles (0.25)
            AVX2 alt quicksort ...    2007549 cycles (1.27)
           AVX2 Nate's variant ... tee: results.txt: Permission denied
rdtsc_overhead set to 33
items count: 1000000 (4000000 bytes), input ascending
                     std::sort ...   26875915 cycles
                    std::qsort ...   94237159 cycles (0.29)
              std::stable_sort ...   25566262 cycles (1.05)
                    quick sort ...   24926151 cycles (1.08)
               AVX2 quick sort ...   27061265 cycles (0.99)
             AVX2 nate nodutch ...   90338075 cycles (0.30)
            AVX2 alt quicksort ...   15205451 cycles (1.77)
           AVX2 Nate's variant ... tee: results.txt: Permission denied
rdtsc_overhead set to 36
items count: 2000000 (8000000 bytes), input ascending
                     std::sort ...   54740927 cycles
                    std::qsort ...  191259449 cycles (0.29)
              std::stable_sort ...   63663366 cycles (0.86)
                    quick sort ...   57873498 cycles (0.95)
               AVX2 quick sort ...   50341740 cycles (1.09)
             AVX2 nate nodutch ...  183899881 cycles (0.30)
            AVX2 alt quicksort ...   34638685 cycles (1.58)
           AVX2 Nate's variant ... tee: results.txt: Permission denied
rdtsc_overhead set to 33
items count: 5000000 (20000000 bytes), input ascending
                     std::sort ...  160784679 cycles
                    std::qsort ...  516988610 cycles (0.31)
              std::stable_sort ...  170147452 cycles (0.94)
                    quick sort ...  148666860 cycles (1.08)
               AVX2 quick sort ...  144404060 cycles (1.11)
             AVX2 nate nodutch ...  461961441 cycles (0.35)
            AVX2 alt quicksort ...  103343265 cycles (1.56)
           AVX2 Nate's variant ... tee: results.txt: Permission denied
rdtsc_overhead set to 34
items count: 10000000 (40000000 bytes), input ascending
                     std::sort ...  297299485 cycles
                    std::qsort ... 1020188718 cycles (0.29)
              std::stable_sort ...  368241326 cycles (0.81)
                    quick sort ...  307971587 cycles (0.97)
               AVX2 quick sort ...  297500223 cycles (1.00)
             AVX2 nate nodutch ...  937188146 cycles (0.32)
            AVX2 alt quicksort ...  223754460 cycles (1.33)
           AVX2 Nate's variant ... tee: results.txt: Permission denied
rdtsc_overhead set to 39
items count: 20000000 (80000000 bytes), input ascending
                     std::sort ...  624617407 cycles
                    std::qsort ... 2092604659 cycles (0.30)
              std::stable_sort ...  802141593 cycles (0.78)
                    quick sort ...  637360602 cycles (0.98)
               AVX2 quick sort ...  602131585 cycles (1.04)
             AVX2 nate nodutch ... 1885045568 cycles (0.33)
            AVX2 alt quicksort ...  452525617 cycles (1.38)
           AVX2 Nate's variant ... tee: results.txt: Permission denied
rdtsc_overhead set to 39
items count: 10000 (40000 bytes), input descending
                     std::sort ...     146284 cycles
                    std::qsort ...    1451871 cycles (0.10)
              std::stable_sort ...    1646797 cycles (0.09)
                    quick sort ...     356479 cycles (0.41)
               AVX2 quick sort ...     515933 cycles (0.28)
             AVX2 nate nodutch ...    1327611 cycles (0.11)
            AVX2 alt quicksort ...     399184 cycles (0.37)
           AVX2 Nate's variant ... tee: results.txt: Permission denied
rdtsc_overhead set to 37
items count: 100000 (400000 bytes), input descending
                     std::sort ...    1504597 cycles
                    std::qsort ...   13542669 cycles (0.11)
              std::stable_sort ...   14153273 cycles (0.11)
                    quick sort ...    2742896 cycles (0.55)
               AVX2 quick sort ...    5019962 cycles (0.30)
             AVX2 nate nodutch ...    9546738 cycles (0.16)
            AVX2 alt quicksort ...    3954357 cycles (0.38)
           AVX2 Nate's variant ... tee: results.txt: Permission denied
rdtsc_overhead set to 47
items count: 1000000 (4000000 bytes), input descending
                     std::sort ...   19098915 cycles
                    std::qsort ...  138738652 cycles (0.14)
              std::stable_sort ...  133652920 cycles (0.14)
                    quick sort ...   27803720 cycles (0.69)
               AVX2 quick sort ...   38790951 cycles (0.49)
             AVX2 nate nodutch ...   97976825 cycles (0.19)
            AVX2 alt quicksort ...   40924990 cycles (0.47)
           AVX2 Nate's variant ... tee: results.txt: Permission denied
rdtsc_overhead set to 37
items count: 2000000 (8000000 bytes), input descending
                     std::sort ...   43333322 cycles
                    std::qsort ...  272768213 cycles (0.16)
              std::stable_sort ...  278923696 cycles (0.16)
                    quick sort ...   59057807 cycles (0.73)
               AVX2 quick sort ...   77366248 cycles (0.56)
             AVX2 nate nodutch ...  199392592 cycles (0.22)
            AVX2 alt quicksort ...   84424114 cycles (0.51)
           AVX2 Nate's variant ... tee: results.txt: Permission denied
rdtsc_overhead set to 47
items count: 5000000 (20000000 bytes), input descending
                     std::sort ...  112433737 cycles
                    std::qsort ...  715098593 cycles (0.16)
              std::stable_sort ...  696716979 cycles (0.16)
                    quick sort ...  147709739 cycles (0.76)
               AVX2 quick sort ...  195062178 cycles (0.58)
             AVX2 nate nodutch ...  479599873 cycles (0.23)
            AVX2 alt quicksort ...  213483925 cycles (0.53)
           AVX2 Nate's variant ... tee: results.txt: Permission denied
rdtsc_overhead set to 38
items count: 10000000 (40000000 bytes), input descending
                     std::sort ...  219947984 cycles
                    std::qsort ... 1458019409 cycles (0.15)
              std::stable_sort ... 1424532879 cycles (0.15)
                    quick sort ...  310495396 cycles (0.71)
               AVX2 quick sort ...  398211681 cycles (0.55)
             AVX2 nate nodutch ...  986584455 cycles (0.22)
            AVX2 alt quicksort ...  430096849 cycles (0.51)
           AVX2 Nate's variant ... tee: results.txt: Permission denied
rdtsc_overhead set to 37
items count: 20000000 (80000000 bytes), input descending
                     std::sort ...  468977025 cycles
                    std::qsort ... 2994025140 cycles (0.16)
              std::stable_sort ... 2897070547 cycles (0.16)
                    quick sort ...  659121499 cycles (0.71)
               AVX2 quick sort ...  808867356 cycles (0.58)
             AVX2 nate nodutch ... 1998957573 cycles (0.23)
            AVX2 alt quicksort ...  888757937 cycles (0.53)
           AVX2 Nate's variant ... tee: results.txt: Permission denied
rdtsc_overhead set to 39
items count: 10000 (40000 bytes), input random
                     std::sort ...    1452628 cycles
                    std::qsort ...    3051441 cycles (0.48)
              std::stable_sort ...    1736144 cycles (0.84)
                    quick sort ...    1928882 cycles (0.75)
               AVX2 quick sort ...    2169444 cycles (0.67)
             AVX2 nate nodutch ...    1478561 cycles (0.98)
            AVX2 alt quicksort ...    1075248 cycles (1.35)
           AVX2 Nate's variant ... tee: results.txt: Permission denied
rdtsc_overhead set to 48
items count: 100000 (400000 bytes), input random
                     std::sort ...   14358738 cycles
                    std::qsort ...   31180390 cycles (0.46)
              std::stable_sort ...   17035081 cycles (0.84)
                    quick sort ...   20386753 cycles (0.70)
               AVX2 quick sort ...   16262278 cycles (0.88)
             AVX2 nate nodutch ...   12068852 cycles (1.19)
            AVX2 alt quicksort ...    8064445 cycles (1.78)
           AVX2 Nate's variant ... tee: results.txt: Permission denied
rdtsc_overhead set to 48
items count: 1000000 (4000000 bytes), input random
                     std::sort ...  148083425 cycles
                    std::qsort ...  327957155 cycles (0.45)
              std::stable_sort ...  206943778 cycles (0.72)
                    quick sort ...  218183058 cycles (0.68)
               AVX2 quick sort ...  175630175 cycles (0.84)
             AVX2 nate nodutch ...  120765430 cycles (1.23)
            AVX2 alt quicksort ...   92078289 cycles (1.61)
           AVX2 Nate's variant ... tee: results.txt: Permission denied
rdtsc_overhead set to 33
items count: 2000000 (8000000 bytes), input random
                     std::sort ...  312881918 cycles
                    std::qsort ...  678655830 cycles (0.46)
              std::stable_sort ...  424823931 cycles (0.74)
                    quick sort ...  445518637 cycles (0.70)
               AVX2 quick sort ...  356472596 cycles (0.88)
             AVX2 nate nodutch ...  247205879 cycles (1.27)
            AVX2 alt quicksort ...  176727991 cycles (1.77)
           AVX2 Nate's variant ... tee: results.txt: Permission denied
rdtsc_overhead set to 38
items count: 5000000 (20000000 bytes), input random
                     std::sort ...  831042516 cycles
                    std::qsort ... 1775855783 cycles (0.47)
              std::stable_sort ... 1138433431 cycles (0.73)
                    quick sort ... 1167561776 cycles (0.71)
               AVX2 quick sort ...  900630689 cycles (0.92)
             AVX2 nate nodutch ...  619907874 cycles (1.34)
            AVX2 alt quicksort ...  457956725 cycles (1.81)
           AVX2 Nate's variant ... tee: results.txt: Permission denied
rdtsc_overhead set to 37
items count: 10000000 (40000000 bytes), input random
                     std::sort ... 1707817457 cycles
                    std::qsort ... 3770678107 cycles (0.45)
              std::stable_sort ... 2403872136 cycles (0.71)
                    quick sort ... 2455152568 cycles (0.70)
               AVX2 quick sort ... 1883393399 cycles (0.91)
             AVX2 nate nodutch ... 1258242784 cycles (1.36)
            AVX2 alt quicksort ...  940110168 cycles (1.82)
           AVX2 Nate's variant ... tee: results.txt: Permission denied
rdtsc_overhead set to 38
items count: 20000000 (80000000 bytes), input random
                     std::sort ... 3574529158 cycles
                    std::qsort ... 3485499836 cycles (0.46)
              std::stable_sort ...  736216223 cycles (0.71)
                    quick sort ...  785188885 cycles (0.70)
               AVX2 quick sort ... 3866388924 cycles (0.92)
             AVX2 nate nodutch ... 2598757997 cycles (1.38)
            AVX2 alt quicksort ... 1927607322 cycles (1.85)
           AVX2 Nate's variant ... results.txt created