loic-sharma/BaGet

Caching question

rong0312 opened this issue · 8 comments

A question regarding caching.

I've noticed this feature:

(screenshot of the caching setting)

I have enabled it.

However, when I install a package from nuget.org I can't see it in my BaGet volume.
How can I tell whether this feature works?

Thanks a lot!

However, when I install a package from nuget.org I can't see it in my BaGet volume.

This is odd - you should see the cached nupkg in your BaGet volume. Is the volume empty?

I'll do some investigation later to make sure mirroring works as expected while running on Docker.

This is odd - you should see the cached nupkg in your BaGet volume. Is the volume empty?

Unfortunately the package didn't make it.
I am installing packages in VS from nuget.org.

What NuGet sources are you using in Visual Studio? You should have just http://localhost:5555/v3/index.json to force Visual Studio to use BaGet. If you have both nuget.org and BaGet, Visual Studio may use nuget.org to download the package.

Also, have you already installed this package? NuGet won't contact upstream feeds if it has already downloaded the package. Check your machine's local caches (like C:\Users\your-username\.nuget\packages) to see if the package has already been downloaded.
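
If it helps, the NuGet CLI can show exactly which sources and caches are in play (assuming nuget.exe is on your PATH):

nuget sources list
nuget locals all -list
nuget locals all -clear

The first command confirms which feeds are enabled, the second prints the paths of the local caches, and the third clears them so the next install has to go through your configured source.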

I removed the nuget.org source and tried to install a package via
the Package Manager Console:

(screenshot of the Package Manager Console output)

Did you come across something like this?

tomzo commented

I took a look at what happens here.
I started BaGet under the debugger, then ran:

nuget install log4net -DisableParallelProcessing -NoCache -Source http://localhost:50561/v3/index.json

This hits RegistrationIndexController, which calls IPackageService and returns NotFound:

var packages = await _packages.FindAsync(id);
var versions = packages.Select(p => p.Version).ToList();

if (!packages.Any())
{
    return NotFound();
}

@loic-sharma, how was this supposed to work? Isn't the package service responsible only for indexing local packages?
Would you accept a PR in which cache is moved to separate endpoint? E.g. v3/cache/index.json?

Apologies, I haven't had a chance to look into this bug yet. I appreciate the help @tomzo!

isn't the package service responsible only for indexing local packages

You're correct, the IPackageService does not cache packages from an upstream source. The MirrorService is what downloads and then caches packages from an upstream source.

Not all endpoints cache packages from an upstream source.

From what you're saying, it sounds like the Install-Package command somehow uses the registration index to restore packages. If that's true, I'll add support for mirroring packages from the registration index.
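
For illustration, here is a rough sketch of what that fallback could look like inside RegistrationIndexController; the _mirror field and its MirrorAsync(id) method are assumptions for the sketch, not BaGet's actual API:

// Sketch only: try the mirror service before giving up.
var packages = await _packages.FindAsync(id);

if (!packages.Any())
{
    // Hypothetical call: download and index the package from the upstream feed.
    await _mirror.MirrorAsync(id);

    // Re-query the local index now that the package may have been mirrored.
    packages = await _packages.FindAsync(id);
}

if (!packages.Any())
{
    return NotFound();
}

var versions = packages.Select(p => p.Version).ToList();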

Would you accept a PR in which cache is moved to separate endpoint? E.g. v3/cache/index.json?

What's the benefit of adding a separate endpoint? Are the current endpoints not enough?

tomzo commented

From what you're saying, it sounds like the Install-Package command somehow uses the registration index to restore packages.

That's because it does not just "restore": it first queries which versions are available, and then uses other endpoints to get the particular package.
Paket does that too.
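
For reference, the rough sequence of requests the client makes (the URL shapes follow the NuGet v3 protocol; the exact BaGet routes come from the service index, so they may differ slightly):

GET http://localhost:50561/v3/registration/log4net/index.json
GET http://localhost:50561/v3/package/log4net/2.0.8/log4net.2.0.8.nupkg

The first call asks which versions exist; the second downloads a specific version.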

If that's true, I'll add support for mirroring packages from the registration index.

Something like what I started in #102?

What's the benefit of adding a separate endpoint? Are the current endpoints not enough?

Currently, every endpoint follows a workflow like this:

  • query the DB to see if we have package x
    • if yes, just return it
    • if not, fall back to the mirror service, which triggers the cache hit/miss procedure

In short, I think the DB query is unnecessary for cached/public packages.
If we had a separate endpoint for cached packages, we would know from the start that the query is for a public package, and we wouldn't need to search the DB for it.
I would rather focus on creating a cache based on something like https://github.com/MichaCo/CacheManager, which could use memory, Redis, or another database for serving public packages.
In the long run I would not want to store private packages in the same place as public ones, and would not want to use the same DB.
My typical load on a server would involve 200 public packages and a few private ones. SQLite can easily handle a few packages, but it would be a major slowdown on hundreds.
Also note that caching can usually be implemented much more efficiently with a key-value database; with a KV-based cache we don't have to worry about locks at all.
We could also start caching the responses of other endpoints, such as v3/registration/test/1.0.0.json. It would be nice to have an already pre-generated response in a KV store; SQL is unnecessary for that.
Lastly, from a user's perspective, it makes sense to specify a source which is known to include only private or only public packages.
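
To make the idea concrete, here is a rough sketch of a cache-only registration endpoint that serves a pre-generated JSON response from a KV store and only contacts the upstream feed on a miss. The controller name, route, and the IUpstreamClient interface are all illustrative, not existing BaGet APIs:

using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Extensions.Caching.Distributed;

// Hypothetical upstream client: returns the registration index JSON for a
// package id, or null if the package does not exist upstream.
public interface IUpstreamClient
{
    Task<string> GetRegistrationIndexJsonAsync(string id);
}

// Sketch only: a registration endpoint for cached/public packages that never
// touches the package database.
public class CacheRegistrationIndexController : Controller
{
    private readonly IDistributedCache _cache;   // memory, Redis, or any other KV backend
    private readonly IUpstreamClient _upstream;

    public CacheRegistrationIndexController(IDistributedCache cache, IUpstreamClient upstream)
    {
        _cache = cache;
        _upstream = upstream;
    }

    [HttpGet("v3/cache/registration/{id}/index.json")]
    public async Task<IActionResult> Get(string id)
    {
        var key = "registration:" + id.ToLowerInvariant();

        // Cache hit: return the pre-generated response as-is, no SQL involved.
        var cached = await _cache.GetStringAsync(key);
        if (cached != null)
        {
            return Content(cached, "application/json");
        }

        // Cache miss: fetch the response from upstream and store it for next time.
        var json = await _upstream.GetRegistrationIndexJsonAsync(id);
        if (json == null)
        {
            return NotFound();
        }

        await _cache.SetStringAsync(key, json);
        return Content(json, "application/json");
    }
}

Whether the backend is an in-memory cache or Redis then becomes a DI registration detail rather than a schema concern.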

PS: I wrote you an email with some issues too.

This should have been fixed by #124. Feel free to open a new issue if you have any additional questions or problems!