gimli-rs/ddbug

Expose more APIs from the crate?

RReverser opened this issue · 10 comments

Generally dealing with different types of debug information, in particular DWARF vs PDB, and then parsing different types of attributes in different formats into meaningful structures is quite painful.

It looks like ddbug solved most of these issues by providing a consistent higher-level interface on top of the lower-level gimli, object and pdb APIs, but it doesn't expose that interface in its API (https://docs.rs/ddbug/0.2.0/ddbug/struct.File.html is opaque and only allows printing it out).

I wonder if it would make sense to either extract these APIs into a separate crate or expose them in the API of the library part of this one?

I understand these are subject to breaking changes over time, but having such a starting point would still be much nicer than trying to reimplement all of it myself.
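To make the request concrete, here is a minimal sketch of what a format-independent interface over DWARF and PDB could look like. Every name in it (`DebugInfo`, `FunctionInfo`, `DwarfInfo`, `PdbInfo`, `function_names`) is hypothetical and made up for illustration; none of it is ddbug's actual API, and the two backends hold pre-built data instead of parsing real files.

```rust
/// Hypothetical format-independent description of a function.
#[derive(Debug, Clone, PartialEq)]
pub struct FunctionInfo {
    pub name: String,
    /// Size in bytes, if the debug format records it.
    pub size: Option<u64>,
}

/// Hypothetical trait that hides the debug format from the caller.
pub trait DebugInfo {
    /// Name of the underlying format, for diagnostics only.
    fn format(&self) -> &'static str;
    /// All functions described by the debug information.
    fn functions(&self) -> Vec<FunctionInfo>;
}

/// Stand-in for a DWARF backend (a real one would be built on gimli).
pub struct DwarfInfo {
    pub functions: Vec<FunctionInfo>,
}

/// Stand-in for a PDB backend (a real one would be built on the pdb crate).
pub struct PdbInfo {
    pub functions: Vec<FunctionInfo>,
}

impl DebugInfo for DwarfInfo {
    fn format(&self) -> &'static str {
        "DWARF"
    }
    fn functions(&self) -> Vec<FunctionInfo> {
        self.functions.clone()
    }
}

impl DebugInfo for PdbInfo {
    fn format(&self) -> &'static str {
        "PDB"
    }
    fn functions(&self) -> Vec<FunctionInfo> {
        self.functions.clone()
    }
}

/// Consumer code that works the same way for either format.
pub fn function_names(info: &dyn DebugInfo) -> Vec<String> {
    info.functions().into_iter().map(|f| f.name).collect()
}
```

With something along these lines, a tool could call `function_names` on whichever backend matches the input file and never touch gimli or pdb types directly.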

The current API exposed by this crate was only provided for the purpose of testing ddbug itself.

I think it makes a lot of sense to expose more APIs. Part of my motivation in writing ddbug was to get some experience in what is useful for those APIs. The current APIs are probably a bit too focused on the needs of ddbug, but they can evolve. In the short term it's probably easier if they remain in this crate (or at least in this git repo as part of a workspace), but long term I'd be fine with moving them to a separate crate.

The DWARF support suffers from needing to load too much into memory, and my recent focus has been on reducing this, but there's still more to go. There are probably a few other issues that I don't recall right now.

The PDB support is currently quite poor, and I've actually disabled it for now while I focus on the DWARF. Part of the problem is that the pdb crate is incomplete.

Is this something you want to work on doing? Or do you want me to make a start on it, and then we can fix issues as they arise? I can't promise when I'll get time to do it though.

The PDB support is currently quite poor, and I've actually disabled it for now while I focus on the DWARF.

I didn't know about this though, but still, I guess it's okay for a start?

The DWARF support suffers from needing to load too much into memory, and my recent focus has been on reducing this, but there's still more to go.

Same - I suppose this can be split into functions instead of fields so that it could be retrieved on-demand (or maybe even lazily stored inside of the structure), but having a nice ground helps.

Or do you want me to make a start on it, and then we can fix issues as they arise?

As mentioned above, I think just exposing what is available now would be a good start. I had actually started reimplementing something very similar for my own purposes, but then realised that ddbug must have solved most of these issues already, and indeed, found a rich hidden API :)

I don't know how much I would be able to help as this is for yet another side-project I have in the back of my head, and I'm already feeling guilty for all the OSS projects I don't have much time to contribute to anymore, so don't want to make promises, but would be certainly happy to help whenever possible :)

I didn't know about this though, but still, I guess it's okay for a start?

The PDB support currently only has type information, not symbol information. Depends what you need.

Same - I suppose this can be split into functions instead of fields so that it could be retrieved on-demand (or maybe even lazily stored inside of the structure), but having a nice ground helps.

The idea is that we parse as little as possible initially, and instead refer to functions/types etc by their DIE offset, and then parse the rest of the fields when needed. I think this is mostly done, but it is still using too much memory for the file I was testing on (libxul.so). Not sure if there is more that can be done while keeping all the functionality. It'll be a decent starting point for normal sized files anyway.
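As a rough illustration of that approach (not the actual ddbug code), the sketch below keeps only a name and an offset per function up front, and parses the remaining fields on first access. `DieOffset`, `Function`, `FunctionDetails` and `DebugSource` are hypothetical names, and the "parsing" is a lookup in an in-memory map standing in for reading the DIE from the DWARF data.

```rust
use std::cell::OnceCell;
use std::collections::HashMap;

/// Hypothetical stand-in for a DIE offset within the debug info.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct DieOffset(pub usize);

/// Fields that are costly to parse and only needed on demand.
#[derive(Debug, Clone, PartialEq)]
pub struct FunctionDetails {
    pub parameters: Vec<String>,
}

/// A function entry that stores only its name and offset initially.
pub struct Function {
    pub name: String,
    pub offset: DieOffset,
    /// Filled in the first time `details` is called.
    details: OnceCell<FunctionDetails>,
}

impl Function {
    pub fn new(name: &str, offset: DieOffset) -> Self {
        Function {
            name: name.to_string(),
            offset,
            details: OnceCell::new(),
        }
    }

    /// Parse the remaining fields on first access and cache the result.
    pub fn details(&self, source: &DebugSource) -> &FunctionDetails {
        self.details.get_or_init(|| source.parse_details(self.offset))
    }

    /// Whether the details have been parsed yet.
    pub fn is_parsed(&self) -> bool {
        self.details.get().is_some()
    }
}

/// Hypothetical stand-in for the raw debug data, keyed by offset.
pub struct DebugSource {
    raw: HashMap<DieOffset, Vec<String>>,
}

impl DebugSource {
    pub fn new(raw: HashMap<DieOffset, Vec<String>>) -> Self {
        DebugSource { raw }
    }

    /// Pretend to parse the DIE at `offset`; unknown offsets yield no parameters.
    fn parse_details(&self, offset: DieOffset) -> FunctionDetails {
        FunctionDetails {
            parameters: self.raw.get(&offset).cloned().unwrap_or_default(),
        }
    }
}
```

Caching the parsed details trades memory for speed; dropping the `OnceCell` and re-parsing on every call would keep memory lower, which may matter for very large files.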

The PDB support currently only has type information, not symbol information. Depends what you need.

For a start, I would want function and type information; you're saying that the former is not available, right?

and then parse the rest of the fields when needed. I think this is mostly done, but it is still using too much memory

Ah. The libs I'm interested in are usually much smaller, so initially this shouldn't be an issue, but it's still interesting to tackle in the future.

I've split out a ddbug_parser crate (still in this repo). See if that fits your needs, and let me know any issues you find. I'll probably do a few more cleanups in the coming days. I've avoided the panopticon dependency in ddbug_parser, so I'll be able to publish it to crates.io once things settle.

For a start, I would want function and type information; you're saying that the former is not available, right?

I think it only had public symbols for functions. It's been a while since I've checked and it's currently disabled. I'll have a look at enabling it again sometime.

Sorry, I'll need a few more days to get back to this and try that crate out. Thanks for extracting the APIs though, much appreciated!

Going to close this as done. Feel free to reopen or create new issues if needed.

It looks like ddbug_parser hasn't been updated in a while, and trying to use the version from GitHub results in various errors due to relative crate paths in Cargo.toml.

Could you please publish a new version?

That's going to require a moria update first. I've hacked it locally (hence the relative crate paths). Looks like moria was never published at all, and someone else has squatted the name on crates.io. I'm not really interested in maintaining moria. I'm happy to merge any patches that get this working for you.

I deleted the moria support and published to crates.io.