google/autofdo

Unrecognized sample profile encoding format

joshua-arch1 opened this issue · 13 comments

After using create_llvm_prof to read perf.data and the binary file ./code, I get the profile data in code.prof. However, when I build the code again using the collected profile, I get an error:

./code.prof: Could not open profile: Unrecognized sample profile encoding format

I think there is something wrong with my code.prof. What may cause this failure?

Also, the generation of code.prof emits some errors and warnings.

E20221114 17:06:22.664913 92878 sample_reader.cc:280] No buildid found in binary
W20221114 17:06:22.676327 92878 profile.cc:102] use_lbr was enabled but range_count_map was empty!
W20221115 11:12:29.006178 69856 llvm_profile_writer.cc:46] Got an empty profile map. The output file might still be not empty (e.g., containing symbol list in binary format) but might be not helpful as a profile

Are there any connections between these errors?

Hi, could you share the command that generated "code.prof"?

As for E20221114 17:06:22.664913 92878 sample_reader.cc:280] No buildid found in binary, it means the binary was not built with "-Wl,--build-id". Could you try rebuilding the binary with the link-time flag "-Wl,--build-id" and see?
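
For anyone who wants to confirm this before re-running create_llvm_prof, here is a minimal sketch of a check, assuming binutils' readelf is installed. The helper name has_build_id is made up for illustration; it only looks for the "Build ID:" line that readelf -n prints for the GNU build-id note.

```shell
# Hypothetical helper: exit status 0 if the ELF file carries a GNU build-id
# note, non-zero otherwise (including when the file cannot be read).
has_build_id() {
  readelf -n "$1" 2>/dev/null | grep -q 'Build ID:'
}

# Example use with the binary from this thread:
#   has_build_id ./code.exe && echo "build ID present" || echo "no build ID"
```

If the check fails, relinking with -Wl,--build-id as suggested above should add the note.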

I have rebuilt the binary with "-Wl,--build-id" and the 'No buildid found in binary' error no longer appears. Thanks! The other error still exists.

The commands that build the binary, collect perf.data, generate code.prof, and rebuild the code with the collected profile are as follows.

clang -O3  -gline-tables-only -DLINUX  -Wl,--build-id -Wno-error=int-conversion -Wno-reserved-user-defined-literal -w  -Wno-return-type -Wno-c++11-narrowing -Wno-reserved-user-defined-literal -Ilinux64 -I. -DFLAGS_STR=\""   -lrt"\" code.c -o ./code.exe -lrt

perf record -b ./code.exe

../autofdo/build/create_llvm_prof --binary=code.exe --out=code.prof

clang -O3  -gline-tables-only -fprofile-sample-use=code.prof -DLINUX  -Wl,--build-id -Wno-error=int-conversion -Wno-reserved-user-defined-literal -w  -Wno-return-type -Wno-c++11-narrowing -Wno-reserved-user-defined-literal -Ilinux64 -I. -DFLAGS_STR=\""   -lrt"\" code.c -o ./code.exe -lrt

Hi, if I generate "code.prof" with --use_lbr=false, I can use the collected profile to rebuild the code without any errors. However, I cannot see any performance improvement in the rebuilt binary. Is that because LBR is disabled? I'm wondering what role LBR is playing in AutoFDO.

Hi, LBR is required for autofdo (in short, we compute edge counters from LBR and save block counters deduced from edge counters in the autofdo profile).

After you added -Wl,--build-id, the error message went away, but the warning messages below

W20221114 17:06:22.676327 92878 profile.cc:102] use_lbr was enabled but range_count_map was empty!
W20221115 11:12:29.006178 69856 llvm_profile_writer.cc:46] Got an empty profile map. The output file might still be not empty (e.g., containing symbol list in binary format) but might be not helpful as a profile

still persist, right? Do you get a "code.prof" file, or nothing, from the following command?
../autofdo/build/create_llvm_prof --binary=code.exe --out=code.prof
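
To make the "edge counters from LBR" idea a bit more concrete, here is a rough sketch. It is not what create_llvm_prof does internally; it only tallies how often each source -> target branch pair shows up in the recorded branch stacks. It assumes the stacks are dumped with perf script -F brstack and that each record starts with 0xFROM/0xTO/ (the exact field layout can differ between perf versions). The function name count_edges is hypothetical.

```shell
# Hypothetical helper: read `perf script -F brstack` output on stdin and
# print "<count> <from> -> <to>" for each branch edge, most frequent first.
count_edges() {
  tr -s ' ' '\n' |
    awk -F/ 'NF >= 2 && $1 ~ /^0x/ { n[$1 " -> " $2]++ }
             END { for (e in n) print n[e], e }' |
    sort -rn
}

# Example use with the perf.data from this thread:
#   perf script -F brstack -i perf.data | count_edges | head
```

Block counters can then be deduced from such edge counts, which is why a profile collected without LBR carries much less information.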

I can get "code.prof", but it cannot be used to rebuild the code, with the error "Unrecognized sample profile encoding format".
However, if I generate "code.prof" with --use_lbr=false, I can use it to rebuild the code without any errors. Under this circumstance, there is no performance improvement in the rebuilt binary.

Since LBR is required for AutoFDO, does that mean AutoFDO can only be applied on Intel CPUs?

Regarding the "Unrecognized sample profile encoding format" error: I see. This is not expected. Even if code.prof contains no useful data, it should still be consumed correctly. Could you share one of your compiler commands with all the flags?

As for whether AutoFDO can only be applied on Intel CPUs: I would say AutoFDO profiles can only be collected on Intel CPUs, but binaries optimized with such profiles see similar performance improvements on both Intel and AMD platforms. In other words, such profiles are applicable to both Intel and AMD CPUs.

So AutoFDO profiles are only applicable to CPUs with X86 architecture, right? There will be no performance improvement on CPUs with RISCV architecture?

On "So AutoFDO profiles are only applicable to CPUs with X86 architecture, right?": that is only partially correct. The profiles (which, for now, can only be collected on Intel CPUs) can be applied to other architectures, provided the binary on the RISCV architecture follows the same execution paths as on X86. Usually, source code that is meant for several architectures uses macros to control which execution path is taken on each architecture (for example, "#ifdef ARCH_X86 then do A; #ifdef ARCH_MIPS then do B; ..."), and such paths are usually hot. In those cases, the compiler will optimize in favor of the X86 paths, which means degrading the performance of the other execution paths.

jwbee commented

Just to chime in on the above, it might not necessarily be counterproductive to apply an AutoFDO profile taken from x86 on an ARM build. I have satisfactory success with an ARM production service where the AutoFDO information is taken from a canary environment that runs on x86.

Thanks @jwbee for providing more context.

Back to the warning message, is there any connection between empty range_count_map and disabled LBR? What may cause empty range_count_map in AutoFDO?

Yes. range_count_map gets its values from LBR, so if the LBR data is empty, range_count_map will be empty too.
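
A quick way to see whether perf.data holds any LBR records at all is to count the lines of perf script -F brstack output that contain a branch record. This is only a sketch: count_brstack_lines is a hypothetical helper, and the 0xFROM/0xTO/ pattern is an assumption about perf's output format that may need adjusting for your perf version.

```shell
# Hypothetical helper: read `perf script -F brstack` output on stdin and
# print the number of lines that contain at least one branch record.
count_brstack_lines() {
  # grep -c exits non-zero when the count is 0; `|| true` keeps the exit status at 0.
  grep -c '0x[0-9a-f]*/0x[0-9a-f]*/' || true
}

# Example use with the perf.data from this thread:
#   perf script -F brstack -i perf.data | count_brstack_lines
```

A result of 0 would be consistent with the "range_count_map was empty" warning, and would suggest checking whether perf record -b actually captured branch stacks on that machine.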