ufrisk/MemProcFS

Rust API: Yara Match Rules are truncated

kaarposoft opened this issue · 7 comments

The struct VmmYaraMatch includes several vectors:

  • tags: Vec
  • meta: Vec<(String, String)>
  • match_strings: Vec

However, it seems that not all items from the Yara rule or result are included.

impl_yara_cb seems to truncate those vectors to only 8 values:

let ctags = std::cmp::min((*yrm).cTags as usize, 8);
let cmeta = std::cmp::min((*yrm).cMeta as usize, 8);
let cmatch_strings = std::cmp::min((*yrm).cStrings as usize, 8);

May I suggest to increase this to at least 32, or even better: make it configurable when calling the API.

As you noted I put the limit to 8. I'll look into increasing it.

But currently there is a fixed struct buffer for that and I had another guy that wanted me to raise the limit of the number of yara findings to more than 65k. I think he wanted to scan potentially millions of hits. And all this eats memory in the worst case scenario. I'll see what I can do though.

But since there is an API change involved (even if minor) it will have to wait until the 5.10 release when that happens. I don't have a strict plan for it, but maybe in a few months time; or at the very least when I've finished support for Win11 24H2 / Server2025.

I'll put this up as an enhancement request for now. I'll definitely look into raising the limit.

I do not see where the "fixed struct buffer" comes into play.
As far as I can see, the C interface returns a list, and the Rust API truncates the list at an arbitrary limit (8).
My particular issue is, that I need some fields from the yara rule meta, which are missing since there are more than 8 key/values in my yara source meta for most rules.
I appreciate your concern for API stability.
However, I have not seen anything in the Rust API description saying that those lists will be truncated.
So, I would suggest that you increase the limit to 32 or 64 in the current 5.9 release.
And an API-specified limit could then be included in 5.10.

It's not an issue with the Rust API, rather the underlying C/C++ API. It's hard coded to max 8 values currently.

typedef struct tdVMMYARA_RULE_MATCH {

Thanks for the info, I did not realize that there was a limit in the C interface as well.
However, there are different limits in C and Rust:
C:
#define VMMYARA_RULE_MATCH_META_MAX 16
Rust:
let cmeta = std::cmp::min((*yrm).cMeta as usize, 8);
For my usecase I just increased:
let cmeta = std::cmp::min((*yrm).cMeta as usize, 32);
and it worked for me (-:

Since there is already a limit in C, why not remove (or significantly increase) the limits in Rust, where we now have

let ctags = std::cmp::min((*yrm).cTags as usize, 8);
let cmeta = std::cmp::min((*yrm).cMeta as usize, 8);
let cmatch_strings = std::cmp::min((*yrm).cStrings as usize, 8);

Thanks. I've messed up the number of meta tags in Rust. I'll increase it to 16 to match the C API.

Even though you increased that value to 32 I suspect you'll max see 16 meta tags anyway due to the # of meta tags being in the cMeta DWORD anyway.

Cool, thanks!
My value of 32 in Rust was just arbitrary, since I did not realize there was a limit in C.

I just published the 5.10 release. The yara matches are now seriously increased. There is still a limit though, but I hope it will be sufficient now.

Please let me know if you should still run into issues around it.