dgollahon/rspectre

Can rspectre be run in parallel?

Opened this issue · 10 comments

I am working on a large project where running the full test suite takes around 30 minutes. So we split into 15 tasks that are run in parallel.

It would be really neat to run rspectre as part of the test suite, but unless it can be parallelised, I would be adding a 30 minute job to an otherwise 2 minute CI run…

Interesting consideration. None of the projects I work on have test suites that take longer than ~5 minutes to run so this hasn't been a priority for me so far. There's nothing fundamentally impossible about doing this, but it would require moving to a client-server kind of architecture or at least writing everything to a file. Right now I'm just mutating a global variable to track rspec invocations.
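To sketch the "write everything to a file" direction (this is not rspectre's actual internals — the constant name and file layout here are made up), each process could dump its own results at exit so parallel chunks don't clobber each other:

```ruby
require "set"
require "json"
require "fileutils"

# Sketch only: each parallel rspec process keeps its own record of the
# definitions it saw used and writes it to a PID-keyed file at exit.
USED_DEFINITIONS = Set.new # entries like ["spec/models/user_spec.rb", 12]

at_exit do
  FileUtils.mkdir_p("tmp/rspectre")
  File.write(
    "tmp/rspectre/results-#{Process.pid}.json",
    JSON.generate(USED_DEFINITIONS.to_a)
  )
end
```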

Definitely something I'd consider, but it just may be a little more effort than I'm up for in the near future for a side-project. I do want to add some more features and improve some stuff soon, but I don't know if parallel runs will be a priority.

@bquorning I know this is super stale but in your use case do you run 15 tasks on the same machine, and so something like a --parallel flag that auto-splits the tests would work? Or is it the kind of thing where you need to run on 15 machines, each as a separate chunk, and then join all the results together? Because the former seems easier and nicer for smaller test suites, but maybe not great for E2E tests or very large test suites split across machines. WDYT?

In my case, I use the parallel_tests gem to split a large spec suite into 45 separate jobs. They then run on 15 machines with 3 processes on each (saving the 4th CPU for mysql). Currently, everything is single threaded.

It sounds quite complex to make rspectre work across multiple processes, let alone across multiple machines. I know that e.g. SimpleCov can do a similar thing, by outputting each process's result to a JSON file and providing a tool for “merging” all the resulting files. But I totally understand if this problem is out of scope.

Yeah, I use parallel_tests too but I just run 16 chunks on a mid-sized box and it's pretty fast on the codebase I work on. I think basically doing the equivalent of that is not actually that hard, so I was considering it. It's not super technically complicated to output things in chunks and then have some kind of merge tool, but it isn't something I would personally use and it definitely feels more complex vs. single-threaded or same-machine parallel.
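The merge side could be about as simple as this (sketch only — no such rspectre command exists yet; it assumes each chunk wrote out a JSON array of used definition locations like in the earlier sketch):

```ruby
require "set"
require "json"

# Sketch of a merge step: union the "used definition" locations reported by
# every chunk, since a definition is only a removal candidate if *no* chunk
# used it.
used = Set.new
Dir.glob("tmp/rspectre/results-*.json").each do |path|
  JSON.parse(File.read(path)).each { |location| used << location }
end

# `all_definitions` would come from analyzing the spec source; it's just a
# placeholder here.
all_definitions = [] # e.g. [["spec/models/user_spec.rb", 12], ...]
all_definitions.reject { |loc| used.include?(loc) }.each do |file, line|
  puts "possibly unused: #{file}:#{line}"
end
```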

I would like to make this tool more accessible/usable for others but I'm not sure if I will invest in that or not. If you'd said single-machine parallelism would help you a lot, I might have tried the narrower/simpler version, but I think when a test suite is big enough for this to matter, people probably want the produce-and-merge-artifacts workflow.

The other way I see this working is just a very long cron job since I don't think it's worth running all your tests twice for this and I also don't think it'd be a good idea to use rspectre as a "real" test runner.

The other angles I have thought about for approaching this:

  • add rspectre:disable comments so that you can comment out places in slow tests that you don't want to run/consider (sketched after this list)
  • have a --filter-shared or a --filter-something mode where it basically only looks at files it runs. It wouldn't get everything but it would work fine on small chunks and probably get most of what you care about (but it would not be able to find test setup used across tests that is unused).
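For the first bullet, I'm picturing something like this (hypothetical syntax — no rspectre:disable directive exists today):

```ruby
# Hypothetical syntax, just to show where a disable comment might sit.
RSpec.describe "slow integration flow" do
  # rspectre:disable -- skip analysis of this setup; the examples that use it
  # only run in the nightly suite
  let(:huge_fixture) { (1..1_000_000).to_a }

  it "sums the fixture" do
    expect(huge_fixture.sum).to be > 0
  end
end
```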

have a --filter-shared or a --filter-something mode where it basically only looks at files it runs. It wouldn't get everything but it would work fine on small chunks and probably get most of what you care about (but it would not be able to find test setup used across tests that is unused).

I also wonder if I could do something (maybe slightly janky) where I just find all the specs which could likely include or are known to include a shared example and then only report when it seems like those have happened. That would still have false positives and one of the goals of this tool is to be very low on false positives... but it seems like there's not a great way to use it with large/slow test suites right now.
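The detection part could start out as blunt as this (purely illustrative; a real version would presumably work from the parsed source rather than a regex):

```ruby
# Rough pass at "which spec files could pull in shared examples": grep-style
# matching over the spec sources, so a partial run can at least flag which
# chunks matter for shared-example analysis.
SHARED_USAGE =
  /\b(?:it_behaves_like|it_should_behave_like|include_examples|include_context)\b/

candidates = Dir.glob("spec/**/*_spec.rb").select do |path|
  File.read(path).match?(SHARED_USAGE)
end

puts candidates
```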

I think it would be only worth the effort to parallelize if it could be integrated into the normal spec run in a convenient way - not as a "spec runner", but more as an rspec plugin of sorts.

I agree that right now, as a standalone utility, approximately doubling CI usage for the average project seems rather wasteful to find the occasional obsoleted let.

If it could run alongside the regular spec run and there were ready-made GitHub Actions to build/collect/merge/validate the results, people could get its benefit on every build without much drag on the build time.

It might be challenging to achieve the same ease of use as the standalone version has, though.

I myself am quite happy to just run it once a quarter or so. In this case it is not affecting any normal pipeline, so IMHO it doesn't matter so much if it takes 5 minutes or 3 hours.

I also wonder if I could do something (maybe slightly janky) where I just find all the specs which could likely include or are known to include a shared example and then only report when it seems like those have happened. That would still have false positives and one of the goals of this tool is to be very low on false positives... but it seems like there's not a great way to use it with large/slow test suites right now.

Another idea, or rather an additional idea: the execution of non-shared examples in a given spec file could be stopped once all let/subject setups in this file have been used at least once. Shared examples would also no longer need to be executed in this case - if they have already been executed at least once, from any file.
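In pseudo-RSpec terms, something along these lines (very hand-wavy; `RspectreTracking.setups_remaining_for?` is imaginary and would have to come from the usage tracking itself — and "stop executing" is approximated here by just skipping the remaining examples):

```ruby
# Hand-wavy sketch of the early-stop idea: once no let/subject defined in a
# given spec file is still unseen, skip that file's remaining examples.
RSpec.configure do |config|
  config.before(:each) do |example|
    file = example.metadata[:file_path]
    unless RspectreTracking.setups_remaining_for?(file)
      skip("all setups in #{file} already exercised")
    end
  end
end
```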

If it could run alongside the regular spec run and there were ready-made GitHub Actions to build/collect/merge/validate the results, people could get its benefit on every build without much drag on the build time.

Yeah. I have thought about this as well but I think I lean towards "I don't think I want to do this" on the grounds that this tool does some relatively invasive rewriting of rspec methods. It has slight performance overhead (which I would be unhappy about as a user but isn't a huge deal) and, more importantly, if I subtly break some rspec API I might invalidate your tests. So I think even if I did add that capability, my recommendation would always be to run it out of band, likely on a schedule (unless your project is very small/fast).

Another idea, or rather an additional idea: the execution of non-shared examples in a given spec file could be stopped once all let/subject setups in this file have been used at least once. Shared examples would also no longer need to be executed in this case - if they have already been executed at least once, from any file.

Huh, that's a really interesting idea. I would have to think about that in more detail because I could see it somehow leading to subtle false positives if I prune in the wrong cases (mainly thinking about cross-file shared example support right now) but in theory I should be able to detect if that's relevant. It might be too big of a project for an unknown gain... but I could also see it making a big difference on heavy-weight tests. Tracking in #82 😄

this tool does some relatively invasive rewriting of rspec methods [...] if I subtly break some rspec API I might invalidate your tests

This could probably be avoided by using the TracePoint API instead of overrides, though perhaps at a greater performance cost.
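Roughly this mechanism, just to show the shape of it (not a drop-in replacement for how rspectre works today):

```ruby
# Minimal illustration of the TracePoint approach: record calls to methods
# defined under spec/ without redefining any RSpec internals.
calls = Hash.new(0)

trace = TracePoint.new(:call) do |tp|
  calls[[tp.path, tp.method_id]] += 1 if tp.path.include?("/spec/")
end

trace.enable do
  # Run the suite inside the trace, e.g.:
  # RSpec::Core::Runner.run(["spec"])
end

calls.each { |(path, name), count| puts "#{path}##{name} called #{count}x" }
```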

Yeah, I considered that when I created rspectre. A friend of mine actually created an initial prototype of this concept using TracePoint but there are a number of small gotchas and I think that overall I risk just having a buggier/more awkward tool if I rewrote it to use TracePoint. There is also the performance overhead which might be interesting to benchmark. Another thing to generally reflect on.

There is also the performance overhead which might be interesting to benchmark

I did some basic benchmarking because I was interested in this for my own project. I guess the performance overhead in code that does any kind of IO will be barely noticeable. I agree about the awkwardness, though 😁
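Concretely, the kind of quick check I did was roughly this shape (nothing rigorous; an artificial call-heavy worst case, so real suites with IO and setup should see much less relative overhead):

```ruby
require "benchmark"

# Toy comparison: a Ruby-method-call-heavy loop with and without a :call
# TracePoint attached.
def double(n)
  n * 2
end

def work
  200_000.times { (1..10).each { |n| double(n) } }
end

plain = Benchmark.realtime { work }

trace = TracePoint.new(:call) { |tp| tp.method_id }
traced = Benchmark.realtime { trace.enable { work } }

puts "plain:  #{plain.round(3)}s"
puts "traced: #{traced.round(3)}s"
```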