danmayer/coverband

Track a (shared) gem instead of the app itself

JamesFerguson opened this issue · 9 comments

Is your feature request related to a problem? Please describe.
We have multiple apps that share an in-house gem. Over time, it becomes very difficult to know what classes and methods in the shared gem are still used by one of the apps.

I'm trying to configure Coverband to report coverage just on the shared gem, and then ideally I'd run it on all of the apps at the same time to build a picture of all the parts of the shared gem that are used.

Describe the solution you'd like
I tried just changing config.root with this config/initializers/coverband.rb:

if Rails.env.development? || Rails.env.staging?
  Coverband.configure do |config|
    config.logger = Rails.logger

    config.verbose = !Rails.env.production?

    config.track_views = true

    config.root = Dir.glob(Rails.root.join("vendor/bundle/ruby/*/bundler/gems/<shared gem name>*")).first

    config.service_dev_mode = true
  end
end

<shared gem name> is something like mycompany-shared. I'm doing Dir.glob(...).first because bundler tacks a sha or something on the end of the gem name when creating the directory, so I want this to work if that changes. Likewise I do ruby/*/bundler in case someone on the team bumps our ruby version.

Looking at the Configuration tab this seems to set the all_root_paths correctly, but no coverage of any kind is reported. It's like the root is changed but all the files are still filtered out. I can't figure out how, since the ignore paths are all relative. And there are app, lib and config directories in the gem, so they should be picked up.

Describe alternatives you've considered
I was looking at track_gems, but I can see that's deprecated. Otherwise, I'm just spelunking the coverband code trying to track down where files get excluded.

Additional context
Can't think of anything else.

Ah, I see:

next unless @@ignore_patterns.none? { |pattern| file.match(pattern) } &&

I didn't realise the ignore patterns would match anywhere in the path. I thought you'd only ignore /#{Coverband.configuration.root}\/#{ignore_pattern}/. I guess I should've realised from the .erb$

Isn't that brittle? If I created a namespace called Vendor (not that unlikely) then I'd have a path like e.g. <rails root>/app/controllers/vendor/... that'd inexplicably get excluded. Likewise, any class named Tmp in a tmp.rb file would also get excluded, as well as any tmp namespace like e.g. <rails root>/app/models/tmp/....
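To make the concern concrete, here is a minimal sketch of how unanchored patterns behave when matched against an absolute path. The pattern list and the paths are made up for illustration and are not Coverband's actual defaults; only the `match` check mirrors the line quoted above, and the real behaviour may differ between Coverband versions.

```ruby
# Hypothetical ignore patterns, for illustration only.
IGNORE_PATTERNS = ["vendor/", "tmp/", ".erb$"].freeze

# A file is ignored as soon as any pattern matches anywhere in its path.
def ignored?(file, patterns = IGNORE_PATTERNS)
  patterns.any? { |pattern| file.match(pattern) }
end

# Hypothetical absolute paths.
paths = [
  "/srv/my_app/app/models/user.rb",
  "/srv/my_app/app/controllers/vendor/orders_controller.rb",
  "/srv/my_app/app/models/tmp/upload.rb",
  "/srv/my_app/vendor/bundle/ruby/3.2.0/gems/rack-3.0.0/lib/rack.rb"
]

paths.each do |path|
  puts format("%-7s %s", ignored?(path) ? "ignored" : "kept", path)
end
```

Under these assumed patterns, the namespaced controller and the `tmp` model directory are ignored along with the real `vendor/` directory, because nothing ties the pattern to the start of the path.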

Now I need to figure out how to override @ignore, given it's protected in config with an append-only writer. And what to override it with...

I totally hear you on the difficulty of shared gems... I think for those some sort of custom integration sending stats to Datadog, New Relic, StatsD, or your log collector is better.

So Coverband intentionally doesn't try to track gems mostly to avoid a lot of overhead of the ruby VM instrumenting all the gem code for tracking... It is much less about just the ignore. To track all the gems for example one needs to ensure Coverband is the first loaded Gem, or at least loaded before any gem you want to track... Then the issue with a shared gem is you would only see the gem usage in the app you are tracking with Coverband, with no way to aggregate across all the different Coverband apps...

At the moment the current gem is missing various things that made gem tracking possible (the views aren't set up to display collected data, and the ignore needs a bunch of more complex rules or you end up tracking way too much). And again, because we could never run gem data without a noticeable impact to latency, we dropped the feature. I don't anticipate adding support back because of the performance impact.

I would recommend doing some custom analytics code in the gem; that is what we end up doing at my job for various shared gems. They look for Datadog to be configured, and if it is there the gem stats many method calls and actions with a tagged app name, which it detects from Rails.

      def app_name 
        # detect across rails / ruby versions
        defined?(Rails.application.class.module_parent) ? Rails.application.class.module_parent.to_s : Rails.application.class.parent.to_s
      end
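As a rough sketch of what such custom analytics could look like, the example below counts calls to selected methods and tags each count with an app name. Everything here is hypothetical: `InMemoryStats` stands in for whatever metrics client is available (it is not the Datadog API), and `UsageTracking`, `track_usage`, and `Invoice` are illustrative names. It assumes Ruby 2.7 or later for the keyword-argument forwarding.

```ruby
# Hypothetical stand-in for a real metrics client (Datadog, StatsD, a logger...).
class InMemoryStats
  attr_reader :counts

  def initialize
    @counts = Hash.new(0)
  end

  def increment(metric, tags: [])
    @counts[[metric, tags]] += 1
  end
end

# Hypothetical mixin a shared gem could use to count calls to chosen methods.
module UsageTracking
  class << self
    attr_accessor :client, :app_name
  end

  def track_usage(*method_names)
    wrapper = Module.new do
      method_names.each do |name|
        define_method(name) do |*args, **kwargs, &block|
          # Safe navigation makes this a no-op when no client is configured.
          UsageTracking.client&.increment(
            "shared_gem.#{self.class.name}.#{name}",
            tags: ["app:#{UsageTracking.app_name}"]
          )
          super(*args, **kwargs, &block)
        end
      end
    end
    prepend wrapper
  end
end

# Example class inside the shared gem.
class Invoice
  extend UsageTracking

  def total
    42
  end

  track_usage :total
end

# In the host app: configure the client and the app name once at boot.
UsageTracking.client = InMemoryStats.new
UsageTracking.app_name = "billing"

Invoice.new.total
p UsageTracking.client.counts
```

Prepending a wrapper module keeps the original method bodies untouched, and each app reports under its own tag, so the aggregation across apps happens in the metrics backend rather than in the gem.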

On the ignore, brittle question...

The ignore doesn't end up being that brittle as it is off the root, so naming something vendor nested in models or controllers wouldn't be ignored. I haven't seen the ignore ending up overly aggressive because of that. I have seen it in the past with erb but we changed the regex to make it less likely to accidentally match anything.

For what it is worth I have totally wanted exactly what you are trying to achieve, but just decided it added too much complexity to this gem, and I didn't have good ways to handle things like aggregating data across many apps without a more centralized system... Many apps for example have restricted access to only their Redis, so even a central Redis is out of the question for many deployments.

Thanks for that, that gives me some ideas.

I'll have a talk with devops about what we could do with our datadog as an alternative route.

I get not wanting to bother with allowing coverband to be configured to track all gems, because that'd be a whooole lot. But I thought I could get away with tracking just one. I don't even care about the app, just that shared gem. So I figured if I made that the root_path I might be alright.

We do have a shared redis for all the apps that share that gem, so I (maybe naively) was hoping that if all the apps ran coverband together then they'd end up with one shared log of all the lines hit within the gem. The search paths and ignores all seemed pretty reasonable (so far) -- it might help that the gem is an engine that was extracted (badly) from a monorail, so it's most of an app anyway.

FWIW, I started logging that line I mentioned, delta.rb:57. I'd set the root_path to something like ~/my_company/my_app/vendor/bundle/ruby/2.x.x/bundler/gems/shared_gem_name/ and every file matched the ignore pattern because it had vendor/ in the root path. It was matching against the absolute path and any match against vendor/ anywhere in the path seemed like it would be ignored.

I added this to the coverband config:

    config.ignore = ["vendor\/(.(?!shared_gem_name))*$"] # vendor/.* with negative lookahead regex for the gem name

and also monkeypatched Coverband::Configuration with:

  module Coverband
    class Configuration
      ###
      # Don't allow the ignore to override things like gem tracking
      # (monkey patched to expect a different pattern for 'vendor/' and therefore drop the default)
      ###
      def ignore=(ignored_array)
        @ignore = (@ignore + ignored_array).uniq
        @ignore.shift # drop 'vendor/'
      end
    end
  end

and my debug logging no longer reported any of the files for my gem getting ignored in delta.rb.
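For anyone wanting to sanity-check the negative-lookahead pattern outside of Coverband, a small sketch follows. The paths are hypothetical, and it assumes the ignore string is used as a regular expression against the absolute file path, as the `file.match(pattern)` line suggests.

```ruby
# The pattern from the config above; inside a double-quoted string "\/" is just "/".
pattern = Regexp.new("vendor\/(.(?!shared_gem_name))*$")

# Hypothetical absolute paths for illustration.
shared = "/home/me/my_app/vendor/bundle/ruby/2.7.0/bundler/gems/shared_gem_name-abc123/lib/shared.rb"
other  = "/home/me/my_app/vendor/bundle/ruby/2.7.0/gems/rack-2.2.3/lib/rack.rb"
direct = "/home/me/my_app/vendor/shared_gem_name/lib/shared.rb"

puts "shared gem ignored?      #{shared.match?(pattern)}"
puts "other gem ignored?       #{other.match?(pattern)}"
puts "gem directly in vendor/? #{direct.match?(pattern)}"
```

One caveat with this style of pattern: the lookahead is only evaluated after a character has been consumed, so a path where the gem name follows `vendor/` immediately (the third case) is still matched and would be ignored.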

Unfortunately, I'm still not getting any data back. I think it might be down to load order as you said, even after I moved coverband to the top of my gemfile and the shared gem to the bottom (if that makes a difference). There was one time starting the server when all the files from the shared gem showed up in the web interface, but for some reason only 10 lines registered coverage, no matter what I did in the app after that.

I'll give a bit more time and then look at the datadog option.

Thanks for your help and all your work on the gem.

ha, yeah coverband root is really really always expected to be the Rails root, so when you are setting it to a gem path, I could see the ignore of vendor causing more issues... I mean if you have a shared redis that all apps can report to, and you hack the gem up, it in theory will do what you want...

WARNING: again, the issue is the performance impact. The ignore in Coverband only runs in the Ruby world, after the Ruby VM has instrumented all the code, so while it filters out some of the slowest bits (metric collection, summing, and saving), once you enable Ruby's Coverage before loading gems it will instrument ALL gem code and Rails code. The ignore function then gets called on basically every line of all your dependencies, which gets slow. That is why we avoid supporting this or making it easy to do.
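The load-order effect can be seen with Ruby's standard `Coverage` module on its own, without Coverband. This is a minimal sketch using two throwaway files (the file names and helper methods are made up): code loaded before `Coverage.start` is never instrumented, and everything loaded afterwards is.

```ruby
require "coverage"
require "tmpdir"

tracked_files = Dir.mktmpdir do |dir|
  early = File.join(dir, "loaded_early.rb")
  late  = File.join(dir, "loaded_late.rb")
  File.write(early, "def early_helper; :early; end\n")
  File.write(late,  "def late_helper; :late; end\n")

  load early       # loaded before Coverage starts: never instrumented
  Coverage.start   # from here on, every newly loaded file is instrumented
  load late

  early_helper
  late_helper

  # Coverage.result stops measurement and returns a hash keyed by file path.
  Coverage.result.keys.map { |path| File.basename(path) }
end

puts "tracked files: #{tracked_files.inspect}"
```

Filtering that result afterwards, which is roughly what an ignore list does, cannot undo the instrumentation cost for files that were already loaded under coverage.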


All that said you are close to technically hacking it in, you probably need a couple things.

In application.rb, instead of relying on Bundler load order and the order of various initializers, you can force Coverband to load first.

require File.expand_path("../boot", __FILE__)

# any gems you require above coverband won't get instrumented
require "rails"

# this will ensure coverband is required before most gems
require "coverband"

# Require all the remaining gems listed in Gemfile
Bundler.require(*Rails.groups)
...

The railtie won't work if you don't require Rails before coverband. I believe that will show your gem and ensure it is collecting... If not something must still be filtering it out...

This diff shows some of the old readme, but most of those config options have been removed: main...track_gems_readme_update...

Anyways, good luck, but again, if you have access to something like DDstats.count("method_name") adding that in your gem is likely a lot easier than hacking this up and ensuring it is all working as expected.

Ahhh. I had not computed that Coverage would still instrument all gems even if I'm only looking at one.

Thanks for the snippet though, I kinda stubbornly just want to see it work even if it may not be viable. :)

I wonder, could I put the gem I care about in a made up bundler group, like :focused_gem and do something like:

require File.expand_path("../boot", __FILE__)

# any gems you require above coverband won't get instrumented
require "rails"

# Require all standard gems listed in Gemfile
Bundler.require(*Rails.groups)

# this will ensure coverband is required before focused gems
require "coverband"

# Require the focused gems group from Gemfile
Bundler.require(:focused_gem)

I'll give that a shot. Thanks.

Seems bundler doesn't work quite how I hoped. It's going to be a bit fiddlier.

I tried slapping require: false on the coverband and shared gem and then requiring them manually at the bottom of application.rb.

This resulted in all the shared gem file paths showing up in coverband results, but the only line that had coverage was the top level file of the gem (e.g. lib/shared.rb) and a random monkey patch.

As I said this shared gem is a rails engine with the shared guts of a former monorail, models, controllers, helpers etc. So I think, at a guess, rails autoloading is slurping up pretty much everything even if the gem isn't formally required and there's nothing left by the time coverband and the shared gem are explicitly required.

I might tinker with a couple more things, but I don't think this is going to work. Thanks for your help though.

I am closing this as there are a number of reasons tracking gem usage is a different sort of problem than app tracking.