Caching question
rong0312 opened this issue · 8 comments
However, when I install a package from nuget.org, I can't see it in my BaGet volume.
This is odd - you should see the cached nupkg in your BaGet volume. Is the volume empty?
I'll do some investigation later to make sure mirroring works as expected while running on Docker.
This is odd - you should see the cached nupkg in your BaGet volume. Is the volume empty?
Unfortunately, the package didn't make it.
I am installing packages in VS from nuget.org.
What NuGet sources are you using in Visual Studio? You should have just http://localhost:5555/v3/index.json to force Visual Studio to use BaGet. If you have both nuget.org and BaGet, Visual Studio may use nuget.org to download the package.
Also, have you already installed this package? NuGet won't contact upstream feeds if it has already downloaded the package. Check your machine's offline mirrors (like C:\Users\your-username\.nuget\packages) to see if the package has already been downloaded.
I took a look at what happens here.
I started BaGet under the debugger, then ran:
nuget install log4net -DisableParallelProcessing -NoCache -Source http://localhost:50561/v3/index.json
This hits RegistrationIndexController, which calls IPackageService and returns NotFound:
// Only the local package index is queried here; nothing is fetched from upstream.
var packages = await _packages.FindAsync(id);
var versions = packages.Select(p => p.Version).ToList();

if (!packages.Any())
{
    return NotFound();
}
@loic-sharma how was this supposed to work? Isn't the package service responsible only for indexing local packages?
Would you accept a PR in which the cache is moved to a separate endpoint, e.g. v3/cache/index.json?
Apologies, I haven't had a chance to look into this bug yet. I appreciate the help @tomzo!
isn't the package service responsible only for indexing local packages
You're correct, the IPackageService does not cache packages from an upstream source. The MirrorService is what downloads and then caches packages from an upstream source.
Not all endpoints cache packages from an upstream source:
v3/registration/test/index.json (aka the "registration index")
- ❌ This doesn't mirror any versions of package Test
- This endpoint is handled by RegistrationIndexController
v3/registration/test/1.0.0.json (aka the "registration leaf")
- ✔️ This does mirror the package Test v1.0.0
- This endpoint is handled by RegistrationLeafController
v3/package/test/1.0.0/test.1.0.0.nupkg
- ✔️ This does mirror the package Test v1.0.0
- This endpoint is handled by PackageController.DownloadPackage
v3/package/test/1.0.0/test.1.0.0.nuspec
- ✔️ This does mirror the package Test v1.0.0
- This endpoint is handled by PackageController.DownloadNuspec
From what you're saying, it sounds like the Install-Package command somehow uses the registration index to restore packages. If that's true, I'll add support for mirroring packages from the registration index.
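A minimal sketch of what mirroring from the registration index could look like, assuming the controller can ask an upstream client for versions when nothing is indexed locally. The ILocalPackages and IUpstreamClient interfaces, their method names, and the InMemoryFeed stand-in are hypothetical and used for illustration only; they are not BaGet's actual API, and the real fix may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Demo data: one private package stored locally, one public package upstream.
var local = new InMemoryFeed(new Dictionary<string, string[]> { ["private.lib"] = new[] { "1.0.0" } });
var upstream = new InMemoryFeed(new Dictionary<string, string[]> { ["log4net"] = new[] { "2.0.8" } });
var index = new RegistrationIndex(local, upstream);

Console.WriteLine((await index.GetVersionsAsync("private.lib")).Count); // answered from the local index
Console.WriteLine((await index.GetVersionsAsync("log4net")).Count);     // answered by the upstream fallback
Console.WriteLine((await index.GetVersionsAsync("missing")).Count);     // unknown everywhere, would map to 404

// Hypothetical abstractions standing in for the local package index and the upstream feed.
interface ILocalPackages { Task<IReadOnlyList<string>> FindVersionsAsync(string id); }
interface IUpstreamClient { Task<IReadOnlyList<string>> ListVersionsAsync(string id); }

class RegistrationIndex
{
    private readonly ILocalPackages _local;
    private readonly IUpstreamClient _upstream;

    public RegistrationIndex(ILocalPackages local, IUpstreamClient upstream)
    {
        _local = local;
        _upstream = upstream;
    }

    // Returns the known versions of a package; an empty list corresponds to NotFound.
    public async Task<IReadOnlyList<string>> GetVersionsAsync(string id)
    {
        var versions = await _local.FindVersionsAsync(id);
        if (versions.Count > 0)
        {
            return versions;
        }

        // Nothing indexed locally: ask upstream instead of returning NotFound right away.
        return await _upstream.ListVersionsAsync(id);
    }
}

// In-memory stand-in used for both sides of the demo.
class InMemoryFeed : ILocalPackages, IUpstreamClient
{
    private readonly Dictionary<string, string[]> _versions;

    public InMemoryFeed(Dictionary<string, string[]> versions) => _versions = versions;

    private Task<IReadOnlyList<string>> Lookup(string id) =>
        Task.FromResult<IReadOnlyList<string>>(
            _versions.TryGetValue(id, out var found) ? found : Array.Empty<string>());

    public Task<IReadOnlyList<string>> FindVersionsAsync(string id) => Lookup(id);
    public Task<IReadOnlyList<string>> ListVersionsAsync(string id) => Lookup(id);
}
```

With this shape, a client that first asks which versions exist gets an answer for public packages too, and the leaf and download endpoints can then mirror the specific version it picks.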
Would you accept a PR in which cache is moved to separate endpoint? E.g. v3/cache/index.json?
What's the benefit of adding a separate endpoint? Are the current endpoints not enough?
From what you're saying, it sounds like the Install-Package command somehow uses the registration index to restore packages.
That's because it does not just "restore"; it first queries which versions are available, and then uses other endpoints to get the particular package.
Paket does that too.
If that's true, I'll add support for mirroring packages from the registration index.
Something like what I started in #102?
What's the benefit of adding a separate endpoint? Are the current endpoints not enough?
Currently in any of the endpoints we have a workflow like this:
- query the db to see if we have package x
- if yes, then just return it
- if not, then fall back to the mirror service, which triggers the cache hit/miss procedure
In short, I think the db query is unnecessary for cached/public packages.
If we have a different endpoint for cached packages, then we know from the start that the query is for a public package and we don't need to search for it in the db.
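The difference between the two routes can be sketched roughly as follows. CountingDb, MirrorCache, and Feed are hypothetical stand-ins invented for this illustration, not BaGet types; the sketch only counts how often the db is consulted on each route.

```csharp
using System;
using System.Collections.Generic;

var db = new CountingDb(new HashSet<string> { "private.lib" });
var mirror = new MirrorCache();

// Combined feed: a request for a public package pays for a db miss first.
Console.WriteLine(Feed.Combined(db, mirror, "log4net"));
// Dedicated cache feed: the db is never consulted.
Console.WriteLine(Feed.CacheOnly(mirror, "log4net"));
Console.WriteLine($"db queries: {db.Queries}");

// Stand-in for the package database that records how often it is queried.
class CountingDb
{
    private readonly HashSet<string> _ids;
    public int Queries { get; private set; }

    public CountingDb(HashSet<string> ids) => _ids = ids;

    public bool Contains(string id)
    {
        Queries++;
        return _ids.Contains(id);
    }
}

// Stand-in for the mirror service: the first request is a miss, later ones are hits.
class MirrorCache
{
    private readonly HashSet<string> _cached = new();

    public string Get(string id) => _cached.Add(id) ? "miss" : "hit";
}

static class Feed
{
    // Current workflow: look in the db, then fall back to the mirror.
    public static string Combined(CountingDb db, MirrorCache mirror, string id) =>
        db.Contains(id) ? "local" : "mirror " + mirror.Get(id);

    // Proposed workflow for a cache-only endpoint: go straight to the mirror.
    public static string CacheOnly(MirrorCache mirror, string id) =>
        "mirror " + mirror.Get(id);
}
```

Only the combined route increments the query counter, which is the cost the separate endpoint is meant to avoid.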
I would rather focus on creating a cache based on something like https://github.com/MichaCo/CacheManager, which could use memory, Redis, or another database for serving public packages.
In the long run, I would not want to store private packages in the same place as public ones, and I would not want to use the same db.
My typical load on a server would involve 200 public packages and a few private ones. SQLite can easily handle a few packages, but it would slow down considerably with hundreds.
Also note that caching can usually be implemented much more efficiently with a KV database. For caching based on a KV db, we don't have to worry about locks at all.
We could also start to cache responses of other endpoints, such as v3/registration/test/1.0.0.json. It would be nice to have an already pre-generated response in a KV store. SQL is unnecessary for it.
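As a rough illustration of caching pre-generated responses by key, here is a minimal sketch that uses an in-memory ConcurrentDictionary as a stand-in for a KV store. The ResponseCache class and the Fetch function are hypothetical; a real setup would use Redis or a similar store and would need expiry, which this sketch omits.

```csharp
using System;
using System.Collections.Concurrent;

var cache = new ResponseCache();
int upstreamCalls = 0;

// Hypothetical generator: pretends to build the JSON response for a path.
string Fetch(string path)
{
    upstreamCalls++;
    return "{\"path\":\"" + path + "\"}";
}

var first = cache.GetOrAdd("v3/registration/test/1.0.0.json", Fetch);
var second = cache.GetOrAdd("v3/registration/test/1.0.0.json", Fetch);

Console.WriteLine(first == second);  // same stored response both times
Console.WriteLine(upstreamCalls);    // the generator ran once in this sequential demo

class ResponseCache
{
    // Key: request path. Value: the pre-generated response body.
    private readonly ConcurrentDictionary<string, string> _responses = new();

    // Returns the stored response, generating and storing it on first use.
    // Under contention the generator may run more than once, but only one result is kept.
    public string GetOrAdd(string path, Func<string, string> generate) =>
        _responses.GetOrAdd(path, generate);
}
```

The lookup is a single keyed read with no SQL query and no explicit locking in the calling code.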
Lastly, from a user's perspective, specifying a source that is known to include only private or only public packages makes sense.
PS: I wrote you an email with some issues too.
This should have been fixed by #124. Feel free to open a new issue if you have any additional questions or problems!