rust-lang/project-error-handling

Move `Error` trait into core

yaahc opened this issue ยท 20 comments

yaahc commented

EDIT: Current Status Update


Building upon these prior issues:

From this comment Backtrace stabilization is currently blocked on prototyping one or more of the suggested solutions for moving the std::error::Error trait into core

That comment suggests three possible solutions.

1. Remove `backtrace` from `Error` and instead provide it via some generic context mechanism (RFC 2895).

2. Define a `Backtrace` trait in core and have an implementation of that trait in libstd. `Error::backtrace` would then return a `&dyn Backtrace`.

3. Define `Backtrace` in core but have it dispatch via an internal vtable. This is effectively the same as 2 but the trait is hidden. The `capture` method will have to be moved to a free function in libstd.

Option one is already known to work, and so it does not need to be prototyped. In addition to these backtrace related issues a prototype of the error trait in core will need to find a way to work around pub fn downcast<T: Error + 'static>(self: Box<Self>) -> Result<Box<T>, Box<dyn Error>> since Box won't be available in core. The top level comment in @withoutboats's stabilization PR goes into more detail on this second issue and outlines how a new core::error::Error implementation could use hooks that are defined in std, similar to panic hooks, to work solve the issues around Box.

Boats has already raised issues about option two in this comment so I think the best way forward is to start working on a core::error::Error trait and core::backtrace::Backtrace type that both rely on hooks / manual vtables to maintain backwards compatibility with the existing error trait.

I attempted to prototype the move to libcore with backtrace removed in this branch.

The mentioned downcast is indeed not at all an issue (I applied the suggestion I made in this comment), but

impl<'a> From<&str> for Box<dyn Error + Send + Sync + 'a> {

comes up as an issue. In particular it will trigger coherence's wrath with

error[E0119]: conflicting implementations of trait `core::convert::From<&str>` for type `boxed::Box<dyn core::error::Error + core::marker::Send + core::marker::Sync>`:
    --> library/alloc/src/errors.rs:66:1
     |
66   | impl<'a, E: Error + Send + Sync + 'a> From<E> for Box<dyn Error + Send + Sync + 'a> {
     | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ conflicting implementation for `boxed::Box<dyn core::error::Error + core::marker::Send + core::marker::Sync>`
     |
    ::: library/alloc/src/boxed.rs:1155:1
     |
1155 | impl<'a> From<&str> for Box<dyn Error + Send + Sync + 'a> {
     | --------------------------------------------------------- first implementation here
     |
     = note: upstream crates may add a new impl of trait `core::error::Error` for type `&str` in future versions
yaahc commented

Here's a link to a related zulip discussion thread: https://rust-lang.zulipchat.com/#narrow/stream/257204-project-error-handling/topic/Stabilizing.20Backtrace.20.2F.20Core.20Error.20Proof.20of.20Concept/near/211291775

Once we've talked through potential solutions to the orphan rule issue more I'll come back and summarize the potential steps forward in the top level issue.

yaahc commented

I'm now trying to prototype the actual backtrace hook in core implementation to prove that all works and is something we could merge. I'm not really sure what I'm doing so I'm going to write out what I understand as the necessary steps so that more people more experienced with working on the language can give feedback and guidance. I'm basing this largely on the way rust structures it's panic handler.

Expected Steps

  • Create a Backtrace type in core with the same interface as the current Backtrace type but without the implementation.
    • a function to conditionally construct a backtrace
    • a function to force construction of a backtrace
    • a function to access a backtrace status
    • Display and Debug implementations
  • Add hooks that are called in these core implementations to access the implementations defined in std
    • fn backtrace_enabled_impl() -> bool
    • fn backtrace_capture_impl() -> *mut ()
    • fn backtrace_status_impl(*mut ()) -> BacktraceStatus
    • fn backtrace_fmt_impl(*mut (), use_display: bool, &mut core::fmt::Formatter) -> fmt::Result
  • add these to the weak_lang_items definition in src/librustc_hir/weak_lang_items.rs
  • implement these lang items in libstd
  • move the content of the current implementations of the backtrace interface to these hooks
  • remove the existing std backtrace types and reexport those defined in core

I do not love the idea of defining a bunch of lang items for this, so I'm assuming this is only vaguely correct.

Unanswered Questions

  • what hooks end up getting used in a no_std context when a Backtrace is constructed? Does the compiler automatically insert some or link some dummy impl in when it sees std isn't available?
  • Do we want it to be possible for users to provide implementations for these hooks?
    • If so, how do we guarantee we can add an iterator API to backtraces in the future backwards compatibly?
yaahc commented

I'm still unconvinced that hooks are the best solution as opposed to defining a Backtrace trait in core. Here is my analysis of the trade offs.

Comparison

  • Both traits and hooks are essentially ways to define a vtable
  • Traits carry the vtable as part of the object via a wide pointer
  • hooks store the vtable in a single static location
  • traits are dynamically dispatched via virtual calls
  • hooks are statically dispatched like any other non virtual function
  • With traits you can have many implementations of the vtable in the same binary
  • with hooks you can must have exactly one implementation, unless you never attempt to call the hooks
  • with traits have to pay the cost of virtual function calls
  • traits are likely simpler implementation wise
  • A trait based solution would require either a wide pointer on the stack or an extra layer of indirection

To my pattern matching brain traits feel like a universally quantified form of dynamic dispatch whereas these ad-hoc hooks feel like an existentially quantified form of const dynamic dispatch.

Previously @withoutboats raised concerns about stability when using traits. I don't think this is a fundamental issue though. We could twrap the trait in a newtype and make the trait itself unstable, which is essentially what we're doing with the hooks. Either way if the user is ever allowed to provide the vtable, whether via traits or hooks, adding new items to that vtable will require handling the default case.

The question is, which is more appropriate for backtraces?

  • What use cases are there for being able to work with multiple types of backtraces via a common interface?
    • SpanTrace - doesn't seem useful here unless the generic member access RFC never merges
    • Backtraces of frames of other languages via FFI? - Could this be handled by defining your own backtrace handler via the static hooks that is capable of understanding frames in of all the languages it will interact with?
eddyb commented

This is a full example of what @yaahc, @mystor and I have discussed (I don't plan to be actively involved, but felt like I needed to exemplify my suggestions): https://play.rust-lang.org/?version=nightly&mode=release&edition=2018&gist=63b4720d0347d05391643df9da08c72f

Also here's a more pseudocode version of it (and without most implementation details), to facilitate discussion:

mod core::panic {
    // perma(?)-unstable
    pub trait RawBacktraceImpl: fmt::Debug + fmt::Display + 'static {
        unsafe fn drop_and_free(self: *mut Self);
    }

    #[stable]
    pub struct Backtrace(*mut dyn RawBacktraceImpl);

    impl Backtrace {
        // perma(?)-unstable
        pub unsafe fn new(p: *mut dyn RawBacktraceImpl) -> Self {
            Self(p)
        }

        #[stable]
        pub fn capture() -> Option<Self> {
            extern "Rust" {
                #[lang_item = "backtrace_capture"]
                fn backtrace_capture() -> Option<Backtrace>;
            }
            unsafe { backtrace_capture() }
        }
    }

    impl !Send for Backtrace {}
    impl !Sync for Backtrace {}
    impl Drop for Backtrace {...}
    impl fmt::Debug for Backtrace {...}
    impl fmt::Display for Backtrace {...}
}
mod alloc::panic {
    #[stable]
    pub trait BacktraceImpl: fmt::Debug + fmt::Display + 'static {}

    impl<T: BacktraceImpl> From<Box<T>> for Backtrace {
        fn from(x: Box<T>) -> Self {
            struct Wrapper<T>(T);

            impl<T: BacktraceImpl> RawBacktraceImpl for Wrapper<T> {...}
            impl<T: fmt::Debug> fmt::Debug for Wrapper<T> {...}
            impl<T: fmt::Display> fmt::Display for Wrapper<T> {...}

            unsafe {
                Backtrace::new(Box::into_raw(x) as *mut Wrapper<T> as *mut dyn RawBacktraceImpl)
            }
        }
    }
}
mod std::panic {
    #[lang_item = "backtrace_capture"]
    fn backtrace_capture() -> Option<Backtrace> {
        struct StdBacktrace;

        impl fmt::Debug for StdBacktrace {...}
        impl fmt::Display for StdBacktrace {...}
        impl BacktraceImpl for StdBacktrace {}

        Some(Box::new(StdBacktrace).into())
    }
}
yaahc commented

Status Update

We're getting a clearer and clearer picture of what's needed to move the error trait into core so here is a list of the specific blockers that need to be resolved still:

Integrate negative trait impls with Coherence checks

Currently tracked in:

This feature is necessary to handle the following from impl on box.

impl From<&str> for Box<dyn Error> { ... }

Without having negative traits affect coherence moving the error trait into core and moving that From impl to alloc will cause the from impl to no longer compile because of a potential future incompatibility. The compiler indicates that &str could introduce an Error impl in the future, and thus prevents the From impl in alloc that would cause an overlap with From<E: Error> for Box<dyn Error>. Adding impl !Error for &str {} with the negative trait coherence feature will disable this error by encoding a stability guarantee that &str will never implement Error, making the From impl compile.

Resolve fn backtrace in core issue

After fixing the previous issue it will still not be possible to move Error straight into core because it has a method that includes the Backtrace type in it's signature, which is also confined to std rather than core. There are two potential solutions to this we're currently considering.

  • Move the Backtrace type to core using lang items to define it's functionality in std or provide a dummy impl / custom impl when in no_std contexts. Draft PR
  • Remove fn backtrace from the error trait in favor of generic member access. RFC

We are currently favoring the second option, removing fn backtrace entirely. However this approach is blocked by landing generic member access.

Generic Member Access

In order to remove the backtrace function we need to add an alternative API that maintains equivalent functionality. The Generic member access RFC proposes such a mechanism. This RFC is more or less ready to be implemented but is currently blocked on splitting out the type erasure mechanisms it introduces into its own RFC and landing those first, due to their complexity, and because they're partially justified by usecases unrelated to error handling.

@nrc has written a draft of this RFC: nrc/rfcs#1

Downcast

The last issue is the presence of downcast APIs associated with dyn Errors, some of these APIs reference the Box type, which cannot be moved to core. We currently believe that these methods can be moved to alloc via an impl Box<dyn Error> { ... } block, but this has not been tested yet. Once we resolve the previous three issues we will attempt to move the error trait to core and see if any additional unforeseen issues pop up.

Implement coherence checks for negative trait impls was just merged rust-lang/rust#90104

yaahc commented

Now trialing moving Error into core by just ripping out fn backtrace

WIP PR: rust-lang/rust#90328

yaahc commented

Regarding this comment, rust-lang/rust#72981 (comment), the trial from rust-lang/rust#90328 seems to have shown that moving the impl to be on Box<dyn Error> is still a breaking change, since before you could call downcast as <dyn Error>::downcast(err).

error[E0599]: no function or associated item named `downcast` found for trait object `dyn core::error::Error` in the current scope
    --> library/alloc/src/boxed.rs:2101:22
     |
2101 |         <dyn Error>::downcast(err).map_err(|s| unsafe {
     |                      ^^^^^^^^ function or associated item not found in `dyn core::error::Error`

error[E0599]: no function or associated item named `downcast` found for trait object `dyn core::error::Error` in the current scope
    --> library/alloc/src/boxed.rs:2115:22
     |
2115 |         <dyn Error>::downcast(err).map_err(|s| unsafe {
     |                      ^^^^^^^^ function or associated item not found in `dyn core::error::Error`

For more information about this error, try `rustc --explain E0599`.

The original plan for this was to leverage a lang-item or something to allow the incoherent impl on dyn Error from the alloc crate despite Error being in core. Going to try that now to see if that can easily resolve the issue.

yaahc commented

Update

I was able to get rust-lang/rust#90328 compiling, now testing everything but hopefully that should all be sorted, which means the negative trait impls in coherence and downcast issues have both been resolved ๐ŸŽ‰.

The only remaining blocker to moving the Error trait into core is fn backtrace and code review on the trial PR which will likely turn into the actual PR for moving it once I integrate the generic member access changes (assuming those are how we resolve the fn backtrace issue). The top priority now is getting rust-lang/rfcs#3192 and rust-lang/rfcs#2895 finalized, approved, merged, and implemented. And hopefully we can get the initial implementations done immediately and available on nightly while we hammer out the API details in the RFCs.

Ohh I thought negative trait in coherence wasn't going to work for your use case because it doesn't work in some cases. Cool if the current solution does work for you :). Anyway, we are going to be making some modifications to negative traits in coherence. More info https://rust-lang.github.io/negative-impls-initiative/explainer/coherence-check.html

yaahc commented

Update

There's been a lot of progress in the last few months. Since the last update we've

At this point there are no remaining blockers per-se, though there are due diligence steps that still need to be done.

  • Updating the RFC for generic member access: rust-lang/rfcs#2895 to reflect changes in the API since it's original proposal
  • Setting up a tracking issue for the provider api integration with the error trait
  • Actually write the PR moving the error trait to core

With any luck and assuming I don't have too many distractions pop up I should be able to have this done asap, hopefully within the next few days.

It looks like a relevant change has landed in Rust - does this mean that Rust 1.65 will have the Error trait in core? If so woohoo!

Amazing work @yaahc, thank you very much for your effort!

Amazing work @yaahc, thank you very much for your effort!
Waiting for it for a long time

runiq commented

Wow, how did this fly under the radar? Amazing work, Jane! Thank you so much! <3

It looks like the error can be used in core but is marked as unstable - can someone confirm this is what people meant by

does this mean that Rust 1.65 will have the Error trait in core?

I read that and thought it was stable - happy to use it with unstable features if necessary :)
Sorry if I missed some context, I just found out about this issue from the unstable warning

Beta has it behind the error_in_core feature gate.

For completeness, the tracking issue for error_in_core is rust-lang/rust#103765

Any update on this? Since rust-lang/rust#103765 was done.