Enable `f16` and `f128` in assembly on platforms that support it
tgross35 opened this issue · 5 comments
The code below should work, but it fails with an error saying that `f16` is not usable for registers:

    #![feature(f16, f128)]

    use core::arch::asm;

    #[inline(never)]
    pub fn f32_to_f16(a: f32) -> f16 {
        a as f16
    }

    #[inline(never)]
    pub fn f32_to_f16_asm(a: f32) -> f16 {
        let ret: f16;
        unsafe {
            asm!(
                "fcvt {ret:h}, {a:s}",
                a = in(vreg) a,
                ret = lateout(vreg) ret,
                options(nomem, nostack),
            );
        }
        ret
    }

On aarch64 the first function generates:

    example::f32_to_f16::hc897184dfb47f3d6:
            fcvt    h0, s0
            ret

`f16` types should be supported as a `vreg` on aarch64 in order to reproduce that code.
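
Until `f16` is accepted as an operand type, one possible workaround is to move the half-precision result out of the register as raw bits. The sketch below is not part of the original report. The names `f32_to_f16_bits` and `f32_to_f16_bits_soft` are hypothetical, and it assumes that 16-bit integers are accepted in `vreg` on aarch64, as the Rust Reference's inline assembly table lists. On other targets it falls back to a software conversion that rounds to nearest, ties to even, so the block builds on stable Rust anywhere.

```rust
/// Portable fallback (hypothetical helper): convert an `f32` to IEEE 754
/// binary16 bits, rounding to nearest with ties to even.
pub fn f32_to_f16_bits_soft(a: f32) -> u16 {
    let x = a.to_bits();
    let sign = ((x >> 16) & 0x8000) as u16;
    let exp = ((x >> 23) & 0xff) as i32;
    let man = x & 0x007f_ffff;

    if exp == 0xff {
        // Infinity, or NaN (kept quiet, top payload bits preserved).
        if man == 0 {
            return sign | 0x7c00;
        }
        return sign | 0x7e00 | (man >> 13) as u16;
    }

    let e = exp - 127 + 15; // re-bias the exponent for binary16
    if e >= 0x1f {
        return sign | 0x7c00; // too large: overflow to infinity
    }
    if e <= 0 {
        // Result is subnormal or zero in binary16.
        if e < -10 {
            return sign; // below half of the smallest subnormal
        }
        let m = man | 0x0080_0000; // restore the implicit leading 1
        let shift = (14 - e) as u32; // 14..=24
        let half = m >> shift;
        let rem = m & ((1u32 << shift) - 1);
        let halfway = 1u32 << (shift - 1);
        let round_up = rem > halfway || (rem == halfway && (half & 1) == 1);
        return sign | (half + round_up as u32) as u16;
    }

    // Normal number: a rounding carry may bump the exponent, which is fine.
    let half = ((e as u32) << 10) | (man >> 13);
    let rem = man & 0x1fff;
    let round_up = rem > 0x1000 || (rem == 0x1000 && (half & 1) == 1);
    sign | (half + round_up as u32) as u16
}

/// On aarch64, use `fcvt` and read the result back as a `u16`
/// (assumption: 16-bit integers are valid `vreg` operands).
#[cfg(target_arch = "aarch64")]
pub fn f32_to_f16_bits(a: f32) -> u16 {
    use core::arch::asm;
    let ret: u16;
    unsafe {
        asm!(
            "fcvt {ret:h}, {a:s}",
            a = in(vreg) a,
            ret = lateout(vreg) ret,
            options(nomem, nostack),
        );
    }
    ret
}

/// Everywhere else, defer to the software conversion.
#[cfg(not(target_arch = "aarch64"))]
pub fn f32_to_f16_bits(a: f32) -> u16 {
    f32_to_f16_bits_soft(a)
}
```

With the feature gate enabled, the bits could presumably be turned back into an `f16` with `f16::from_bits`, but that only sidesteps the problem: the type information is lost at the `asm!` boundary, which is what this issue asks to fix.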
The following other platforms also apparently have some level of instruction support, but are less well documented:
- `arm-*`, `armv7-*`, `aarch64-*`: https://developer.arm.com/documentation/den0024/a/Porting-to-A64/Data-types
- PowerPC: PowerISA apparently has a half-precision format according to section 7.3.2.1 of https://files.openpower.foundation/s/dAYSdGzTfW4j2r2, but I can't get LLVM to emit any instructions for it. Per others, the VSX feature on PowerISA 3.1+ has conversion support for `f16`, and the SVP64 feature (which I can't find documented anywhere) adds full hardware support
- MIPS with the MSA extension...? https://s3-eu-west-1.amazonaws.com/downloads-mips/documents/MD00868-1D-MSA64-AFP-01.12.pdf section 3.1 says "16-bit floating-point storage format is supported through conversion instructions to/from 32-bit floating-point data." I am unsure whether its vector registers have any special support
- riscv64gc with the Q extension: https://github.com/riscv/riscv-isa-manual/blob/riscv-isa-release-2023-05-23/src/q-st-ext.adoc
- `x86` specifies an ABI for these types, and AVX512fp16 can use them
Additionally, for `f128`:
- s390x supports `f128`, referred to as "BFP Extended Format" in https://publibfp.dhe.ibm.com/epubs/pdf/a227832c.pdf. I am not sure if this comes with any special instructions.
- PowerPC with `-Ctarget-cpu=pwr9` seems to have `f128` support via instructions like `xsaddqp`
Tracking issue: #116909
I'm adding E-Easy because a PR that just enables support for aarch64 should be pretty easy. Start around `rust/compiler/rustc_hir_analysis/src/check/intrinsicck.rs`, lines 65 to 66 in b54dd08.
Sample for reference: https://rust.godbolt.org/z/zK4qha1qo
@rustbot label +T-compiler +E-Easy +F-f16_and_f128 +A-inline-assembly -needs-triage
@tgross35 I can try to submit a PR, can you give me some guidance?
Hi @lengrongfu, thanks for the interest!
This should be pretty easy, I think. Start by making a test in `tests/ui/asm/` that contains the assembly function from my original post. Make sure this fails when you run `./x t --stage 1 path/to/your/new/test.rs`.
Then just find where the error is emitted (search the codebase for "cannot use value of type") and work backwards from that until the test passes. This will probably mean adding `F16` to `InlineAsmType` and then chasing down errors.
We will need to make sure that this works on platforms with support (e.g. aarch64) but still fails on those without it (e.g. x86). Just focus on getting aarch64 to build first.
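
To make the "accept on aarch64, still reject on x86" requirement concrete, here is a toy model of a per-architecture operand type check. It is not rustc's implementation: `Arch`, `AsmType`, `supported_vector_reg_types`, and `check_operand` are hypothetical names, the type lists are illustrative rather than the real tables, and the error text only mimics the message quoted above.

```rust
/// Hypothetical, simplified stand-in for an inline asm operand type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum AsmType {
    I16,
    I32,
    I64,
    F16,
    F32,
    F64,
    F128,
}

/// Hypothetical target architecture selector.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Arch {
    AArch64,
    X86_64,
}

/// Illustrative lists of types accepted in a float/vector register class.
/// The point is that `F16` appears for aarch64 but not for x86.
pub fn supported_vector_reg_types(arch: Arch) -> &'static [AsmType] {
    match arch {
        Arch::AArch64 => &[
            AsmType::I16,
            AsmType::I32,
            AsmType::I64,
            AsmType::F16,
            AsmType::F32,
            AsmType::F64,
        ],
        Arch::X86_64 => &[AsmType::I32, AsmType::I64, AsmType::F32, AsmType::F64],
    }
}

/// Accept the operand only if the architecture lists its type.
pub fn check_operand(arch: Arch, ty: AsmType) -> Result<(), String> {
    if supported_vector_reg_types(arch).contains(&ty) {
        Ok(())
    } else {
        Err(format!(
            "cannot use value of type `{:?}` for inline assembly on {:?}",
            ty, arch
        ))
    }
}
```

In this model, enabling `f16` for one target is a one-line change to that target's list, and a UI test for an unsupported target keeps expecting the error.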
There is a compiler help stream on Zulip at https://rust-lang.zulipchat.com/#narrow/stream/182449-t-compiler.2Fhelp, so feel free to ask if you get stuck! It is also not a bad idea to post a draft PR as soon as you have some basic work done, even if it is not yet passing.
I think E-easy label should be removed from this issue.
Fair enough - it is still pretty easy for a compiler change, but does require some background knowledge.