riscv-non-isa/riscv-elf-psabi-doc

Support big-endian/vendor RISC-V target triplets.

Nelson1225 opened this issue · 8 comments

Hi Guys,

Recently, we received GNU binutils patches, implemented by Marcus Comstedt, that add support for RISC-V big-endian targets [1]. These patches need new target names for big endian, for example riscv[32|64]be-unknown-elf. They also add a new file format name, elf[32|64]-bigriscv, which is what objdump -h reports.

Besides, I have heard that Jeremy Bennett from Embecosm also has a vendor-specific toolchain proposal [2]. He would like to use the vendor field of the target triplet for vendor-specific tools, so that the vendor name replaces the unknown in the original riscv[32|64]-unknown-elf. For example, riscv[32|64]-sifive-elf.

Therefore, there are four questions here:

  1. Is the target name riscv[32|64]be-unknown-elf for big endian acceptable to you?
  2. Do we need to add riscv[32|64]le-unknown-elf for the original little-endian toolchain?
  3. How should we deal with the original riscv[32|64]-unknown-elf? Just keep it as little endian?
  4. How about riscv[32|64]-sifive-elf?

If the answers to the above four questions are all yes, then we will have the following target triplets in the future:
riscv32-unknown-elf, riscv32le-unknown-elf, riscv32be-unknown-elf,
riscv32-vendor-elf, riscv32le-vendor-elf, riscv32be-vendor-elf,
riscv64-unknown-elf, riscv64le-unknown-elf, riscv64be-unknown-elf,
riscv64-vendor-elf, riscv64le-vendor-elf, riscv64be-vendor-elf,...
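To make the size of that name space concrete, here is a minimal Python sketch that enumerates the triplets listed above. The helper name `candidate_triples` and the placeholder vendor string `vendor` are hypothetical and used only for illustration; nothing here is taken from binutils or config.sub.

```python
from itertools import product

def candidate_triples(vendors=("unknown", "vendor")):
    """Enumerate riscv{32,64}{,le,be}-<vendor>-elf names (illustrative only)."""
    widths = ("32", "64")
    endian_suffixes = ("", "le", "be")  # "" is the original, unsuffixed name
    return [
        f"riscv{width}{suffix}-{vendor}-elf"
        for width, vendor, suffix in product(widths, vendors, endian_suffixes)
    ]

triples = candidate_triples()
for name in triples:
    print(name)
```

With one real vendor name plus `unknown`, this already gives twelve names, which is part of why dropping the `le` variants is attractive.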

However, I cannot find any spec that mentions the target triplets, so we probably need to document them somewhere in the future.

Thanks
Nelson

[1] https://sourceware.org/pipermail/binutils/2020-December/114632.html
[2] https://docs.google.com/document/d/1CgjH2xyC7jmJh-lNAyozbv_OSeK0mBH5D3oJrgGdI7Y/edit

I don't think that we need riscv32le- tuples, and should just assume that riscv32 is always little endian.

Note that riscv- is allowed, but not recommended and not supported.

Presumably -vendor- is a variable where vendor can be any organization name. I don't think companies should need to request permission to use it, or that we need to track the names being used in the psABI.

The binutils deadline for the next release is roughly Jan 8, so we need a decision soon if this will be included in the next binutils release, which in turn is necessary if we want this in the next gcc release. Otherwise, the gcc support will have to wait a year.

I don't know if there is LLVM big-endian support or patches.

There is one interesting issue with relocations here. Big-endian RISC-V has little-endian code. So data relocations will be big-endian or little-endian depending on the object file format, but code relocations will always be little-endian. So this is effectively a change to how relocations are defined, and we may need to start making a distinction between which relocs are data relocs and which ones are code relocs. Currently there is no overlap. But Huawei has proposed a 48-bit l.li instruction that has 16 bits of opcode and a 32-bit immediate. It would make sense to use R_RISCV_32 for the immediate, but that won't work for a big-endian target. So we would need a new reloc which is like R_RISCV_32 but is always little-endian for use in code.

Otherwise, I don't see any issues with adding big-endian support.
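The relocation concern above can be illustrated with a small Python sketch. The function names `apply_data_reloc32` and `apply_code_reloc32` are hypothetical, and the 6-byte instruction layout (two opcode bytes followed by the immediate) is an assumption made only for this example; it is not the psABI definition of any relocation or the actual encoding of the proposed instruction.

```python
import struct

def apply_data_reloc32(section: bytearray, offset: int, value: int,
                       big_endian: bool) -> None:
    """Patch a 32-bit data word using the object file's byte order."""
    fmt = ">I" if big_endian else "<I"
    struct.pack_into(fmt, section, offset, value & 0xFFFFFFFF)

def apply_code_reloc32(section: bytearray, offset: int, value: int) -> None:
    """Patch a 32-bit immediate embedded in code: always little-endian."""
    struct.pack_into("<I", section, offset, value & 0xFFFFFFFF)

# Data word in a big-endian object file.
data = bytearray(4)
apply_data_reloc32(data, 0, 0x11223344, big_endian=True)

# Assumed layout for illustration only: 2 opcode bytes, then the immediate.
insn = bytearray(6)
apply_code_reloc32(insn, 2, 0x11223344)

print(data.hex(), insn.hex())
```

On a big-endian target the same 32-bit value ends up with opposite byte orders in the data word and in the instruction, which is why a single R_RISCV_32-style relocation cannot serve both uses.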

As far as I know, there's no Big-Endian LLVM support, or pending patches.

I also don't feel the need to track the list of vendors. The clang driver will parse unhandled vendors into -unknown- anyway, iirc. This is simpler in the driver, but if you use a custom vendor to enable specific incompatible functionality, you will run into problems as mainline clang won't know about that.

I do think that riscv32- and riscv64- should default to little-endian, and riscv32be- and riscv64be- should denote big endian.
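As a rough sketch of that naming convention, the following Python snippet maps the architecture field of a triple to a register width and endianness. The function `parse_riscv_triple` is hypothetical and is not how clang or config.sub parse triples. It assumes, following the preference expressed in this thread, that `riscv32le`/`riscv64le` are not accepted.

```python
import re

# Architecture field: riscv32 or riscv64, optionally followed by "be".
_ARCH_RE = re.compile(r"^riscv(32|64)(be)?$")

def parse_riscv_triple(triple: str) -> dict:
    """Split arch-vendor-os and derive XLEN and endianness (illustrative)."""
    parts = triple.split("-", 2)
    if len(parts) != 3:
        raise ValueError(f"expected arch-vendor-os, got {triple!r}")
    arch, vendor, os_part = parts
    match = _ARCH_RE.match(arch)
    if match is None:
        raise ValueError(f"unrecognized RISC-V architecture field: {arch!r}")
    return {
        "xlen": int(match.group(1)),
        "endian": "big" if match.group(2) else "little",
        "vendor": vendor,  # any organization name; not tracked anywhere
        "os": os_part,
    }

print(parse_riscv_triple("riscv32be-unknown-elf"))
print(parse_riscv_triple("riscv64-sifive-elf"))
```

The vendor field is passed through untouched, which matches the view that it should not select any incompatible behaviour.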

Separating code and data relocations makes sense to me if it's needed for big-endian platforms.

I would prefer decoupling the vendor field in the target triple from vendor-specific extensions. Extensions should always be controlled by the -march string; otherwise we would end up with a bunch of options and it would be hard to track versions.

@kito-cheng Yeah, I'm just saying where someone might run into issues if they do decide to do vendor-specific behaviour.

@lenary Oh, sorry for the confusion. My reply was not meant to debate or oppose your comment; your reply just reminded me that I should post my opinion here, since I had only talked about it privately with Nelson :P

Let's fiercely agree with each other :)

Kito sent a patch to config-patches for the riscv*be tuples, and it was accepted and merged yesterday. So it is now in upstream config.sub and config.guess, which are used by the GNU toolchain.

I guess we should document the triple naming issue at https://github.com/riscv/riscv-toolchain-conventions rather than here, so I'm going to close this issue.

I also created an issue for the relocation behavior on big-endian targets:
#176