[Cargo] let build scripts pass (arbitrary) linker arguments to `rustc`

Question

japaric opened this issue 8 years ago · 17 comments

Use cases

Bare metal embedded development

Linking Rust programs for bare metal targets always (unless there is some C in the mix) requires passing a linker script to the linker. This linker script specifies the memory layout of the target device and, if omitted, the resulting binary will not work on the target device (the device will not boot, or it will crash during the boot process).

The current way to pass these flags to the linker (using rustc as a proxy) is via the build.rustflags or target.$triple.rustflags key in .cargo/config:

# .cargo/config
[target.thumbv7em-none-eabihf]
rustflags = [
  "-C", "link-arg=-nostartfiles",
  "-C", "link-arg=-Tlayout.ld",
]

This is troublesome because:

A library can't pass a .cargo/config to its dependent crates. Every user of the library will have to manually copy the library's .cargo/config into their binary project.

Other use cases

There are probably other use cases that I don't know about

Straw man proposal

Cargo will learn about a new "build script key": rustc-link-arg. Cargo will collect all the values under that key and pass those to all its rustc invocations via the -C link-arg flag. For example, this build script output:

cargo:rustc-link-arg=-Tlayout.ld
cargo:rustc-link-arg=-nostartfiles

makes Cargo pass -C link-arg=-Tlayout.ld -C link-arg=-nostartfiles to all its rustc invocations.

As with other cargo:$keys, Cargo will collect these values from all the dependencies.
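A minimal sketch of a build script that would produce the output above. It assumes the `rustc-link-arg` key exactly as proposed in this issue; the helper name `link_arg_directives` is made up for illustration.

```rust
// build.rs
// Sketch only: `rustc-link-arg` is the build-script key proposed in this issue.

/// Format linker arguments as the lines Cargo would read from the
/// build script's stdout. (Hypothetical helper, for illustration.)
fn link_arg_directives(args: &[&str]) -> Vec<String> {
    args.iter()
        .map(|arg| format!("cargo:rustc-link-arg={}", arg))
        .collect()
}

fn main() {
    // The two flags from the bare metal use case above.
    for line in link_arg_directives(&["-Tlayout.ld", "-nostartfiles"]) {
        println!("{}", line);
    }
}
```

A library crate could ship this build script next to its layout.ld, so downstream binaries would not have to copy a .cargo/config.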

Potential problems

Reduces composability of crates. For instance, what happens if two crates want to inject their own linker scripts? iirc, @nagisa mentioned that this form of "non-composability" is already present today in some other feature hmm ... was it symbol names (#[export_name], #[no_mangle])?
Linker arguments are ... linker dependent. So if a crate hard codes some linker arguments for a specific linker then that crate can't be used with any other linker. This doesn't seem like a problem for ELF files because rustc only supports gcc-style linkers. And if a crate needs to support two or more targets that use different linkers then the build.rs can simply pick different linker arguments based on the value of the $TARGET env variable.
Linker arguments are order sensitive. Depending on A and B might not be the same as depending on B and A because one combination will generate linker arguments in a different order than the other. We can't do much in this regard other than encourage users to only use order independent flags (the flags needed in the use case presented above are order independent). FWIW, today, build scripts can pass -l-style arguments to the linker and those are order sensitive.
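The second point could be handled roughly like this: a build script that chooses linker arguments from the TARGET environment variable. This is a sketch assuming the proposed `rustc-link-arg` key; the function name and the prefix match on the target triple are illustrative choices, not an established convention.

```rust
// build.rs
use std::env;

/// Pick linker arguments for a target triple. (Hypothetical helper.)
/// Only the bare metal target from the example gets the gcc-style flags;
/// every other target gets none.
fn link_args_for_target(target: &str) -> Vec<&'static str> {
    if target.starts_with("thumbv7em-none-") {
        vec!["-Tlayout.ld", "-nostartfiles"]
    } else {
        Vec::new()
    }
}

fn main() {
    // Cargo sets TARGET when it runs a build script; fall back to an
    // empty string so the sketch also runs outside of Cargo.
    let target = env::var("TARGET").unwrap_or_default();
    for arg in link_args_for_target(&target) {
        println!("cargo:rustc-link-arg={}", arg);
    }
}
```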

cc @nagisa @alexcrichton
@jamesmunns and @cbiffle may be interested as well

Answer 1 · 2016-10-06T14:31:14.000Z

Reduces composability of crates. For instance, what happens if two crates want to inject their own linker scripts? iirc, @nagisa mentioned that this form of "non-composability" is already present today in some other feature hmm ... was it symbol names (#[export_name], #[no_mangle])?

I remember mentioning that, but I do not remember the exact context. There’s the #[link_args] attribute which certainly suffers from the issue.

Answer 2 · 2016-10-06T17:36:16.000Z

I'm of two minds about this. On one hand this is a feature we're not exposing, and makes certain flavors of development very painful. On the other hand though, having less surface area to the compiler allows us to provide a uniform and solid interface that works across many platforms and situations.

One example that comes to mind is the ability to select a subsystem on Windows. The actual way to do this is by passing a linker argument, but we're likely going to settle on #1665 which I personally view as a better alternative. If we allowed custom linker arguments, we may even break crates as a part of that change. I also personally feel that many linker arguments are best expressed via this style of crate attribute where possible. This doesn't cover all use cases, of course, but having a layer of abstraction between an intention and the actual linker argument allows us to easily change how it's implemented.

The point about having less surface area also dovetails into the ability to change how we call the linker at will without worrying about the impact. We relatively frequently tweak how we work with symbols and/or possibly link order, and with custom link arguments it means you could silently be relying on a previous compiler and we could break you as we update.

Finally I've also often felt that if you truly want a robust stability guarantee here you should never use -C link-args. We'd sure love to change to lld at some point, which would break almost everyone using that argument. If you really want to have super precise control about linking that's why we have the staticlib crate type (it's explicitly designed to be consumed by linkers).

So all that's basically just a fancy way of saying that I can see a lot of downsides from allowing such easy propagation of custom linker arguments, but I'm not sure if they outweigh the benefits.

Answer 3 · 2016-10-06T17:49:32.000Z

@alexcrichton Doesn't lld handle arguments for various linkers? Which would give us more freedom.

Answer 4 · 2016-10-06T17:52:27.000Z

@eddyb for now, maybe, but perhaps not always. There's certainly a possible future where we use no command line api of LLD but rather use it entirely as a library where CLI arguments make no sense. In any case though I think the point about being brittle would still stand regardless, and in general rustc is a pretty robust compiler across platforms and such.

Answer 5 · 2016-10-06T17:53:55.000Z

@alexcrichton I meant library + argument parsing. We do the same for LLVM, but yeah, no guarantees.

Answer 6 · 2016-10-07T00:56:29.000Z

I also personally feel that many linker arguments are best expressed via this style of crate attribute where possible.

OK, so I can compromise with a more constrained form of injection of linker arguments. Allowing crates and/or build scripts to inject linker arguments of the form -T$linkerScript would solve half (*) of my use case (pure Rust bare metal programs). But people using Rust within C embedded frameworks seem to require a different level of custom linker flags. For example, this is what the teensy3-rs crate is using at the moment:

{
    // ..
    "pre-link-args": [
        "-mcpu=cortex-m4",
        "-mthumb",
        "-Tteensy3-sys/teensy3-core/mk20dx256.ld",
        "-Os",
        "-Wl,--gc-sections,--defsym=__rtc_localtime=0",
        "--specs=nano.specs"
    ],
    "post-link-args": [
        "-lm", "-Wl,--start-group", "-lnosys", "-lc", "-lgcc", "-Wl,--end-group"
    ]
    // ..
}

(Although, I think some of those flags, like -mcpu, -mthumb, etc., are not really needed)

So, there's that to consider.

(*) The other half of my problem is that I want to pass -nostartfiles to the linker because I don't use newlib startup objects (I've implemented crt0 in Rust), but creating a crate attribute just for that flag seems overkill. And there's also the fact that lld doesn't implicitly link startup objects, so the -nostartfiles flag doesn't exist over there.

Answer 7 · 2016-10-07T02:38:44.000Z

(Although, I think some of those flags, like -mcpu, -mthumb, etc., are not really needed)

They probably are needed because of multilib support in the toolchain. Gotta choose the right libc etc.

Answer 8 · 2016-10-07T04:06:34.000Z

@japaric oh so to clarify I definitely don't believe all use cases for linker arguments can be moved to crate attributes (like linker scripts). Additionally, I also feel like Cargo is too strict today for what ends up amounting to unnecessary reasons. I just personally struggle to reconcile that with also knowing that Cargo is pretty user-friendly today and linker errors are about the most user-unfriendly thing, and I'd be quite sad if Rust newbies hit those kinds of errors early on in working with Rust.

Answer 9 · 2016-10-07T04:24:15.000Z

And there's also the fact that lld doesn't implicitly link startup object so the -nostartfiles flag doesn't exist over there.

But that's because lld behaves like ld or gold, right? i.e. it's a linker, whereas we use gcc to link.

Answer 10 · 2016-10-07T12:51:31.000Z

I think the obvious solution to these sorts of problems is perma-unstable features. The escape hatch is there because the workaround is worse, but none of it is really condoned.

Answer 11 · 2016-10-07T16:12:05.000Z

But that's because lld behaves like ld or gold, right? i.e. it's a linker, whereas we use gcc to link.

Yes, that's true. I may be misremembering, but I think lld at some point was implicitly adding library search paths, which is something ld doesn't do AFAIK, but the feature was removed/disabled because it caused problems in some distributions. (cf. the --nostdlib flag)

I think the obvious solution to these sorts of problems is perma-unstable features.

I'm in favor of this with the additional condition that we should be looking for ways to stabilize some type of linker arguments (like the -T ones for linker scripts) if it blocks some use cases from using the stable channel.

Answer 12 · 2017-05-13T14:11:52.000Z

Today we are using, with success, .cargo/config and/or custom targets to pass extra arguments to the linker. I think this is good enough for now and no longer see a need to add more linker customization to Cargo / build scripts so I'm going to close this issue.

I have opened a discussion on the rust-embedded/rfcs repo (see rust-embedded/wg#24) to compare the two existing options for linker customization and to hopefully settle on one of them through convention. Please comment over there if you have an opinion on this topic.

Answer 13 · 2017-05-16T12:55:08.000Z

I'm building the static version of libui, with bindings to it, on Windows. The only way I have found to build it with the GNU toolchain (I cannot use MSVC for reasons) is to provide the linker params for the Windows libraries manually via .cargo/config, since they are required by the C library and cargo does not know about them. As the config is global, I cannot say it's a solution I'm happy with. This has to be configurable per-project, per-platform.

Answer 14 · 2017-05-16T18:25:24.000Z

@japaric Even if we can submit all the linker arguments we need via .cargo/config, we cannot emit them programmatically from the build script which sometimes is quite necessary.

Answer 15 · 2018-12-30T22:00:38.000Z

Using .cargo/config's [build] rustflags in a root-level binary project seems to apply the flag to all crates in the dependency graph? At least that's what I observe. In particular, cargo build blows up because the linker flag I need (-pagezero_size) only works when linking a binary, but somewhere in my dependency graph there's a proc-macro crate building as a dylib/staticlib.

(This flag is necessary to get a Darwin binary that can mmap below 4GB as required by LuaJIT).

Answer 16 · 2021-02-01T15:31:36.000Z

Today we are using, with success, .cargo/config and/or custom targets to pass extra arguments to the linker. I think this is good enough for now and no longer see a need to add more linker customization to Cargo / build scripts so I'm going to close this issue.

The problem with .cargo/config is that you can only specify hard-coded paths and this is limiting.

For example, I've just encountered a scenario where I need to add -Wl,-rpath=<path> to the linker invocation, but <path> is dependent on where the user has installed a shared object on their system.

Ideally, I'd read the path from the environment, but it's not possible. (And I'm not alone)

So one compromise might be to allow ${VAR} for env lookups in paths in .cargo/config.

I managed to work around my problem by using cargo -Z extra-link-arg (see here). But it's not ideal.
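For illustration, the environment lookup could be done in a build script along these lines. This is a sketch, assuming the `rustc-link-arg` build-script key proposed at the top of this issue (which, as far as I know, is what -Z extra-link-arg gates); MYLIB_DIR and `rpath_directive` are hypothetical names.

```rust
// build.rs
use std::env;

/// Build the directive that adds `dir` to the binary's rpath, using the
/// same -Wl,-rpath=<path> form as above. (Hypothetical helper.)
fn rpath_directive(dir: &str) -> String {
    format!("cargo:rustc-link-arg=-Wl,-rpath={}", dir)
}

fn main() {
    // MYLIB_DIR is a made-up variable naming where the user installed
    // the shared object. Ask Cargo to re-run the script when it changes.
    println!("cargo:rerun-if-env-changed=MYLIB_DIR");
    if let Ok(dir) = env::var("MYLIB_DIR") {
        println!("{}", rpath_directive(&dir));
    }
}
```

If the variable is unset the script emits no linker argument, so a default system install would still link as before.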

Answer 17 · 2023-01-24T16:41:41.000Z

@japaric oh so to clarify I definitely don't believe all use cases for linker arguments can be moved to crate attributes (like linker scripts). Additionally, I also feel like Cargo is too strict today for what ends up amounting to unnecessary reasons. I just personally struggle to reconcile that with also knowing that Cargo is pretty user-friendly today and linker errors are about the most user-unfriendly thing, and I'd be quite sad if Rust newbies hit those kinds of errors early on in working with Rust.

I believe cross-compilation is one of the most important features a compiler should have. I don't understand why you said Cargo is user-friendly, since it does not really work out of the box, and it is even worse with tier 3 targets. Besides, the documentation for such cases is not very precise. My impression is that supporting only standard targets locks users into some specific architectures.

I hope that with some experience I will better understand what is really needed to make Rust user-friendly, especially during cross-compilation.