BioJulia/BioJuliaRegistry

[RFC] Deregister and re-register Microbiome packages

I have three packages in the BioJulia registry (Microbiome.jl, MicrobiomePlots.jl, and BiobakeryUtils.jl), and ever since I started adding compat entries for their dependencies I've been running into problems. Older versions that lack compat entries keep getting installed without my noticing, which leads to a bunch of headaches.

I'm wondering if it would be worth removing all current entries for these packages, and replacing them with new versions that start with the right compatibility entries (actually, I'm thinking of deprecating MicrobiomePlots entirely). This would solve a lot of my issues I think.

Potential problems

  1. Anyone who has recent versions in a Manifest may end up with weird errors when trying to instantiate it. I think there are unlikely to be many users (other than me) with this problem, but it's still worth mentioning.
  2. The Julia General registry needs a PR to cap the dependencies of the older versions registered there, to prevent fallbacks to them. I don't think this will be hard, and Fredrik said on Slack that it would likely be accepted.
  3. Anything else?

Alternative Approach

I could instead work on fixing the compat entries in the BioJulia registry. I could keep it super simple and conservative (basically doing the same thing as in General, but for Julia 1.1+). I'm not sure whether this actually improves anything for existing Manifests, but it might be better.

I think it would be useful to have the required compatibility entries in the registry so that running update provides a functional set of expected packages. However, manually modifying the registry is risky. Perhaps, as a strategy, a PR with compatibility entries written in expanded form would be less error-prone?

@CiaranOMara Can you expand a bit on what you mean by "expanded form"? I'm not quite sure I follow. And do you mean for these packages in particular, or for the BioJulia registry as a whole?

I've taken care of (2) above - General now has better bounds for these three packages.

I'm only responding to your case. But, if the strategy has merit, it can certainly be used elsewhere.

For example, below is the current Compat.toml for Microbiome in what I'd call condensed form.

[0]
julia = "1.1.0-1"

["0.5"]
DataFrames = "0.19"

["0.5-0"]
MultivariateStats = "0.7"

["0.5.1-0"]
Reexport = "0.2"
StatsBase = "0.32"

["0.5.1-0.6.1"]
SpatialEcology = "0.7"

["0.6-0"]
DataFrames = "0.19-0.20"

["0.6.2-0"]
SpatialEcology = "0.9"

Below is an example of what I'd call the expanded form of the above, where everything is listed explicitly for each release/version (I didn't check for missing dep entries). In this form, I think it is easy to see and fill in any gaps. Also, each entry only affects the version under which it is listed, so there are no far-reaching consequences.

["0.4.1"]
DataFrames = "0.19"
julia = "1.1"

["0.5.0"]
DataFrames = "0.19"
MultivariateStats = "0.7"
SpatialEcology = "0.7"
julia = "1.1"

["0.5.1"]
DataFrames = "0.19"
MultivariateStats = "0.7"
Reexport = "0.2"
SpatialEcology = "0.7"
StatsBase = "0.32"
julia = "1.1"

["0.6.0"]
DataFrames = ["0.19", "0.20"]
MultivariateStats = "0.7"
StatsBase = "0.32"
SpatialEcology = "0.7"
Reexport = "0.2"
julia = "1.1"

["0.6.1"]
DataFrames = ["0.19", "0.20"]
MultivariateStats = "0.7"
StatsBase = "0.32"
SpatialEcology = "0.7"
Reexport = "0.2"
julia = "1.1"

["0.6.2"]
DataFrames = ["0.19", "0.20"]
MultivariateStats = "0.7"
StatsBase = "0.32"
SpatialEcology = "0.9"
Reexport = "0.2"
julia = "1.1"
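
For anyone who wants to see how the condensed keys map onto per-version entries, here is a minimal sketch in plain Julia. It deliberately avoids Pkg internals: `inrange` and `expand` are hypothetical helpers, the key parsing only covers the forms that appear above ("0", "0.5", "0.5-0", "0.5.1-0.6.1"), and raising an error when two matching keys set the same dependency is a choice made for this sketch, not a statement about what Pkg does. The list of versions is taken from the expanded example.

```julia
# Condensed Compat.toml for Microbiome, transcribed from the thread.
compressed = Dict(
    "0"           => Dict("julia" => "1.1.0-1"),
    "0.5"         => Dict("DataFrames" => "0.19"),
    "0.5-0"       => Dict("MultivariateStats" => "0.7"),
    "0.5.1-0"     => Dict("Reexport" => "0.2", "StatsBase" => "0.32"),
    "0.5.1-0.6.1" => Dict("SpatialEcology" => "0.7"),
    "0.6-0"       => Dict("DataFrames" => "0.19-0.20"),
    "0.6.2-0"     => Dict("SpatialEcology" => "0.9"),
)

# Versions taken from the expanded example (assumed to be the registered ones).
versions = [v"0.4.1", v"0.5.0", v"0.5.1", v"0.6.0", v"0.6.1", v"0.6.2"]

# "0.5.1" -> (0, 5, 1); "0.5" -> (0, 5)
parts(s) = Tuple(parse.(Int, split(s, ".")))

# Hypothetical helper: does version `v` fall under the range key `range`?
# The lower bound is padded with zeros; the upper bound is matched as a prefix,
# so "0.5" covers every 0.5.x and "0.5-0" covers 0.5.0 up to any 0.x.y.
function inrange(v::VersionNumber, range::AbstractString)
    lo, hi = occursin("-", range) ? split(range, "-") : (range, range)
    vt = (Int(v.major), Int(v.minor), Int(v.patch))
    lot, hit = parts(lo), parts(hi)
    lower_ok = vt >= (lot..., zeros(Int, 3 - length(lot))...)
    upper_ok = vt[1:length(hit)] <= hit
    return lower_ok && upper_ok
end

# Hypothetical helper: build one explicit entry per version.
function expand(compressed, versions)
    expanded = Dict{VersionNumber,Dict{String,String}}()
    for v in versions
        entry = Dict{String,String}()
        for (range, deps) in compressed
            inrange(v, range) || continue
            for (dep, spec) in deps
                # Sketch-only policy: treat overlapping entries as an error.
                haskey(entry, dep) && error("overlapping entries for $dep at $v")
                entry[dep] = spec
            end
        end
        expanded[v] = entry
    end
    return expanded
end

expanded = expand(compressed, versions)
for v in sort(collect(keys(expanded)))
    println(v, " => ", sort(collect(expanded[v])))
end
```

Printing the result version by version makes gaps visible: any dependency that a release actually needs but that is missing from its line has no compat bound in the condensed file.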

I have to admit, I'm not sure what happens when there is a compatibility overlap: whether something will take precedence or whether an overlap would throw an error. Also, and perhaps most importantly, I don't know whether compatibility bounds set in General also apply/carry over to here. If the entries in General also apply here, then all of the entries under [0] in General overlap with something here.

As an aside, I'd be curious to know whether an expanded compat would be compressed on the next successful automated/bot register?

Ahh, that makes sense, thanks for spelling it out.

As an aside, I'd be curious to know whether an expanded compat would be compressed on the next successful automated/bot register?

Very good question - I'd guess yes, but really have no idea. Seems like having the long-form as an option for the bot might be accepted as a PR.

It looks like this code within Pkg compresses when saving.

The code below goes towards saving a long-form Compat.toml. I'm putting it here as it may be useful for other debugging. I think you solved/will solve your issue with https://github.com/JuliaRegistries/General/pull/10324/files.

using Pkg

file_compat = joinpath(@__DIR__, "M", "Microbiome", "Compat.toml")

dict_uncompressed = Pkg.Compress.load(file_compat)

# Convert dict to writable form (https://github.com/JuliaLang/Pkg.jl/blob/8c415b01924eb4452cb7be8438c2bbc0978d1152/bin/Compress.jl#L65-L79).
function writable(path::String, uncompressed::Dict,
    versions::Vector{VersionNumber} = Pkg.Compress.load_versions(path))
    inverted = Dict()
    for (ver, data) in uncompressed, (key, val) in data
        val isa Pkg.TOML.TYPE || (val = string(val))
        push!(get!(inverted, key => val, VersionNumber[]), ver)
    end
    dict = Dict()
    for ((k, v), vers) in inverted
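        # The commented-out loop below would merge versions into compressed ranges:
        #   for r in compress_versions(versions, sort!(vers)).ranges
        # Instead, write one entry per explicit version.
        for r in sort!(vers)
            get!(dict, string(r), Dict{String,Any}())[k] = v #TODO: lookup explicit versions in range.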
        end
    end
    return dict
end

file_uncompressed = joinpath(@__DIR__,"M", "Microbiome", "uncompressed.toml")

dict_writable = writable(file_uncompressed, dict_uncompressed)

# Write uncompressed.
open(file_uncompressed, write=true) do io
    Pkg.TOML.print(io, dict_writable, sorted=true)
end

# Check.
Pkg.Compress.load(file_compat) == Pkg.Compress.load(file_uncompressed)

I'm happy for you to de-register and re-register the Microbiome verse if you think you need to. I think the damage to users will be minimal.

Is #55 good for merge? Let me know if it is. If it does not fix things I can always roll the merge back.

I've just encountered the same sort of issue as @kescobo when developing XAM and GenomicFeatures ahead of the BioJuliaRegistry. Loosely capped entries in the BioJuliaRegistry interfere with my private registry in much the same way that General has interfered with the BioJuliaRegistry.

Sorry for leaving this languishing - unexpectedly had to do a re-write of my paper that's been taking all my time.

@CiaranOMara Have you come up with a better solution than mine?

@kescobo, nope. Ideally, the compression function would be rewritten so that registries do not provide authoritative answers for package versions that are not under their jurisdiction. Even if #60 were merged for my case, it would become undone with the next automated registration of XAM.

One could get around the issue by incrementing the major version in the more recent registry, but that may not accurately reflect code development.

@kescobo Are you ok to release the versions you released on this registry onto General? I don't think Microbiome and MicrobiomePlots depend on other BioJulia packages so it's a pretty quick migration.

Microbiome has been moved to General

Yep. 👍