Package Cache Synchronization Feature

Developer
Apr 1, 2012 at 4:41 PM
Edited Apr 1, 2012 at 8:53 PM

Hi guys

When the NuGet Gallery was down, a discussion was raised on whether a mirror could help people who do not commit their packages into their VCS. I'm in favor of such a fallback solution, as it would enable people to download packages that are not yet available in their local user cache.

However, a potentially interesting alternative fallback came to mind for those scenarios, for example (but not limited to) Continuous Integration. A centralized (e.g. company-wide) cache could mitigate build failures in case any external package source is unavailable. This is achievable under one condition: the central cache must be in sync with the developers' local package caches. Hence the idea of a local package cache synchronization feature, which decouples CI environments from external, non-controlled dependencies. This enables people (and build servers) to continue to work, probably without even noticing the downtime if set up properly (central cache == fallback package source for all).

Enough chatter, here's what I had in mind (and in my fork :)).

Cache Sync Toolbar Option

Cache Sync Options Page

Implementation details:

- Package Sources from the Package Sources options tab are displayed (except the official NuGet source, the Local User Cache and Aggregates)

- Shared PackageSourceOptionsContext between Package Cache Sync and Package Sources options pages, keeping their state in sync as long as the dialog is open

- New CacheSynchronizationConsent (similar to the new PackageRestoreConsent)

- New bool PackageSource.IsCacheSynchronizationTarget property

- New NuGet.Config section "cacheSyncPackageSources", whose elements refer (by name) to a package source defined in "packageSources"

 <cacheSyncPackageSources>
    <add key="My Remote Cache" value="true" />
 </cacheSyncPackageSources>

- Manual synchronization using the Invoke-PackageCacheSync cmdlet in the console (supports cancellation and currently doesn't require a solution to be open)

Invoke-PackageCacheSync cmdlet

- Option to auto-sync upon new package installation

<cacheSync>
    <add key="enabled" value="true" />
</cacheSync>

- Auto-sync is logged to the ProgressReportDialog and the Visual Studio Output window, as shown below

------- Installing...AutoMapper 2.1.265 -------
Added file 'AutoMapper.dll' to folder 'AutoMapper.2.1.265\lib\net40'.
Added file 'AutoMapper.xml' to folder 'AutoMapper.2.1.265\lib\net40'.
Added file 'AutoMapper.dll' to folder 'AutoMapper.2.1.265\lib\sl4'.
Added file 'AutoMapper.XML' to folder 'AutoMapper.2.1.265\lib\sl4'.
Added file 'AutoMapper.2.1.265.nupkg' to folder 'AutoMapper.2.1.265'.
Successfully installed 'AutoMapper 2.1.265'.
Syncing package 'AutoMapper 2.1.265' to 'http://www.myget.org/F/myremotecache'...
Synced package 'AutoMapper 2.1.265' to 'http://www.myget.org/F/myremotecache'.
Added reference 'AutoMapper' to project 'ClassLibrary3'
Added file 'packages.config'.
Successfully added 'AutoMapper 2.1.265' to ClassLibrary3.

Progress reporting for Package Cache Synchronization

- Ability to quickly create an online remote feed using the "create a remote feed" link (which navigates to myget.org)

- Created a RemoteCacheTargetPackageRepository, which is a composite of DataServicePackageRepository and supports pushing to remote feeds using an API key

- All existing tests are green (and adjusted where necessary)

- PackageSource now supports ApiKey (and so does the Package Sources options page)

Package Sources Options Page
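
To make the two proposed NuGet.Config sections more concrete, here is a minimal Python sketch of how the sync targets could be resolved from them. It is not the code from the fork: the function names, the sample feed "Team Feed" and its URL are made up for illustration, and it assumes that value="true" marks a source as a sync target and that the sections sit under the usual "configuration" root.

```python
import xml.etree.ElementTree as ET

# Sample config combining the sections from this post.
# "Team Feed" and its URL are hypothetical, added only for contrast.
SAMPLE_CONFIG = """<configuration>
  <packageSources>
    <add key="My Remote Cache" value="http://www.myget.org/F/myremotecache" />
    <add key="Team Feed" value="http://example.invalid/feed" />
  </packageSources>
  <cacheSyncPackageSources>
    <add key="My Remote Cache" value="true" />
  </cacheSyncPackageSources>
  <cacheSync>
    <add key="enabled" value="true" />
  </cacheSync>
</configuration>"""


def read_section(root, name):
    """Return the <add key="..." value="..."/> pairs of one section as a dict."""
    section = root.find(name)
    if section is None:
        return {}
    return {e.get("key"): e.get("value") for e in section.findall("add")}


def is_true(value):
    """Treat a missing attribute as false; compare case-insensitively."""
    return (value or "false").lower() == "true"


def cache_sync_targets(config_xml):
    """Return (auto_sync_enabled, {source name: source URL}) for the sync targets."""
    root = ET.fromstring(config_xml)
    sources = read_section(root, "packageSources")
    flags = read_section(root, "cacheSyncPackageSources")
    settings = read_section(root, "cacheSync")
    # A target only counts if it is flagged AND still defined in packageSources.
    targets = {
        name: sources[name]
        for name, flag in flags.items()
        if is_true(flag) and name in sources
    }
    return is_true(settings.get("enabled")), targets


if __name__ == "__main__":
    print(cache_sync_targets(SAMPLE_CONFIG))
```

Referring to sources by name keeps the URL (and the API key) in one place, at the cost of having to ignore entries whose source has been removed or renamed.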

Potential improvements/optimizations

- Rename the Package Cache Sync options page to Package Cache and move the cache-related stuff from the General options page to this one.

- Extend the Invoke-PackageCacheSync cmdlet, e.g. to scope by solution or project level pkgs

- Improve test coverage

- Add a context menu entry in Solution Explorer to sync packages from sln/proj (reusing the underlying scope logic of the extended Invoke-PackageCacheSync cmdlet, for instance, and only enabled if PackageCacheSyncConsent.IsGranted = true)

- Probably there's still room for some performance improvements

Please discuss

I prefer triggering a discussion before sending a pull request of this size :) Do you think this would be a good addition to the NuGet vsix?

Note that building this as a separate optional vsix is not that straightforward because integration with the NuGet vsix would be harder to accomplish and maintain (there's not really an official release of the NuGet VisualStudio libraries for instance, not to mention the integration with NuGet.Config).

Apr 2, 2012 at 4:58 AM

Thanks for your willingness to contribute and for spending time on the prototype. It looks very impressive.

I'll let other people on the team chime in; meanwhile I'll play devil's advocate:

I feel that this package cache sync feature can be accomplished outside of NuGet. I can set up a normal package source within the company, and schedule a periodic task (or even a post-build task) to push all packages in the solution to that central package source. Does the package cache sync offer extra capabilities that I may have overlooked?

 

Developer
Apr 2, 2012 at 5:30 AM

@dotnetjunky Your story also applies to "Package Restore", yet it showed up in the UI. Why? Because it brought the power of package restore to the masses, giving them an easier way to work with the feature. This cache sync feature may be the same. And since a lot of the groundwork for pushing packages from VS seems to have been done, maybe we can even push packages from VS directly in the future :-)

Developer
Apr 2, 2012 at 6:03 AM

@dotnetjunky that's a good point and one I've left open intentionally.

The main reason for having this available as a NuGet option in Visual Studio (note: it works directly on the user's NuGet.config in the user profile) is that its main goal is to sync the user's NuGet cache, not the solution's installed packages. It didn't feel right to have to change project files because of a setting that applies to the user profile.

For instance, instructing a team to sync their caches to a central repo and register the pkg source as fallback should be possible ad-hoc (in case of emergency, or just as a precaution) and turned on/off very easily. 

I do see some merit in an MSBuild approach within the scope of a solution, say to sync only the pkgs installed into a project/sln. I'm not entirely sure it would be necessary, though, as I assume at least one developer has installed the package locally (in the context of Visual Studio) before the change makes it to the build server.

Also, as @maartenba mentions, you can now very easily set an API key from within Visual Studio (linked to a package source), which is then available system-wide (nuget.exe setapikey logic). And yes, I had to composite some stuff to enable pkg pushing from within the vsix :)

Very interested in hearing the team's opinion here!

Cheers,

Xavier

Apr 2, 2012 at 6:59 AM

@maartenba: The package cache sync is not exactly comparable to Package Restore, because Package Restore requires modifying the project files and hence can't be accomplished outside of VS. Package Restore started as an independent nuget package that dfowler wrote, but it still required invoking from within the nuget console. Then because of its popularity, we decided to integrate it into NuGet.

On the other hand, as I said, the package cache sync can be achieved outside of VS/NuGet.

Developer
Apr 2, 2012 at 8:08 AM
dotnetjunky wrote: On the other hand, as I said, the package cache sync can be achieved outside of VS/NuGet.

Not questioning that, it definitely can be (using the following gist for instance https://gist.github.com/2007200).

The main difference with Package-Restore is that Package-Restore requires nuget.exe, because it must be able to run on the build server (or any environment, for that matter). I designed the package cache sync to run on the developer machine only (which presumably has Visual Studio + the NuGet vsix installed). I also wanted to enable devs who are not too comfortable with MSBuild, PowerShell or the NuGet command line.

I really hope no one would want to run this on the build server, because from a process point of view it's already too late: the external package source can go down before the build is triggered. The vsix syncs immediately (with auto-sync enabled): as soon as you have the package in your cache, it's synced. That is why I think this could be a NuGet vsix feature only (in other cases, refer to the gist and do this manually).

FYI, I took the same approach as Package-Restore in v1.7, using a consent.

Apr 2, 2012 at 6:14 PM

I haven't looked at your code, but at what time do you do the synchronization? We wouldn't want to slow down VS in any case.

Having all the devs sync with the central location seems to be a waste, because most of the time they will sync the same packages. IMO, if we have the CI machine sync periodically, it will be sufficient.

Having said that, after the last failure we have learned valuable lessons that will help us ensure this won't happen very often.

Developer
Apr 2, 2012 at 7:43 PM

If auto-sync is enabled, a pkg is synced during pkg installation only if it is new in the user cache and not yet available on the remote. This keeps the impact minimal, and it can be turned on/off with a simple checkbox, no file changes involved.

The Invoke-PackageCacheSync cmdlet is a different beast, which could be extended further. Currently the cmdlet syncs all pkgs in the user cache that are not yet available on the remote. Running it is a conscious decision, since it is triggered manually.
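
The two selection rules described above can be summarized in a small Python sketch. This is only an illustration of the rules as stated in this post, not the vsix code: packages are modelled as (id, version) tuples, the cache and the remote as plain sets, and both function names are hypothetical.

```python
def should_auto_sync(package, cache_before_install, remote):
    """Auto-sync rule: push only a package that is new to the user cache
    and not yet available on the remote."""
    return package not in cache_before_install and package not in remote


def manual_sync_candidates(cache, remote):
    """Manual (cmdlet) rule: every package in the user cache that the
    remote does not have yet, in a stable order."""
    return sorted(cache - remote)


if __name__ == "__main__":
    cache = {("AutoMapper", "2.1.265"), ("Foo", "1.0")}   # "Foo" is a made-up package
    remote = {("Foo", "1.0")}
    print(manual_sync_candidates(cache, remote))
    print(should_auto_sync(("AutoMapper", "2.1.265"), set(), remote))
```

In both cases the remote is only written to when it is actually missing the package, which is what keeps the cost per installation low.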

I still think a build server should not be charged with syncing pkgs that should already be available on the pkg src it uses for pkg restore. That would miss the point of having a central cache as a fallback repository.

Apr 3, 2012 at 6:58 AM

I'm still not convinced that the synchronization should be done within VS. If you don't think the build server should be charged with the syncing task (which I think it should be), then it can be done on any other machine (and done outside of VS). Now I understand your point that if it is not done immediately after package installation, nuget.org may go down before the CI build kicks in. However, as I said, this event should be very rare and hence it doesn't justify complicating NuGet (and taxing VS).

I'll ask other members of the team to chime in with their opinions.

Developer
Apr 3, 2012 at 7:59 AM

I'm sure we share the same concerns with regard to complicating NuGet and taxing VS. Really, I don't want to make things slower or unnecessarily complex.

My point against charging the build server with this is exactly the one you mention. Unless everyone in the world is using gated check-ins, the damage is already done once the changes make it to the VCS and anyone (build server or developer) pulls the latest sources. If any pkg is missing, compilation fails (and the developer is kinda blocked), unless a fallback repository is in place.

I agree it is a rare case, but nevertheless one that potentially has a huge impact on some businesses. The conditions that make it a rare case:

  • nuget.org is unavailable (<-- this is what really makes it rare, the below are just extra conditions required to really have an issue)
  • no-commit strategy for pkgs
  • no mirror of used pkgs available

The main goal of this feature is twofold:

  • anticipate downtime (risk mgmt)
  • give a means to very quickly set up a fallback solution (even when nuget.org already is down!)

The first one really requires hooking into the pkg install event to reduce risk to the maximum. The latter obviously can be achieved outside of VS as well, but while I was at it, I also made it possible from within VS via a simple cmdlet. It's a trade-off (as always). And that's what this discussion is for :)
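
As a rough illustration of why the fallback matters, the sketch below models package restore as a lookup over an ordered list of sources. The Source class and the restore function are hypothetical and only mirror the conditions listed above; they say nothing about how NuGet itself resolves packages.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple

Package = Tuple[str, str]  # (id, version)


@dataclass
class Source:
    name: str
    packages: Set[Package]
    available: bool = True  # False models downtime of that source


def restore(package: Package, sources: List[Source]) -> Optional[str]:
    """Return the name of the first reachable source that holds the package,
    or None if the restore (and therefore the build) would fail."""
    for source in sources:
        if source.available and package in source.packages:
            return source.name
    return None


if __name__ == "__main__":
    pkg = ("AutoMapper", "2.1.265")
    gallery = Source("nuget.org", {pkg}, available=False)  # gallery is down
    central = Source("central cache", {pkg})               # kept in sync beforehand
    print(restore(pkg, [gallery]))           # no fallback registered
    print(restore(pkg, [gallery, central]))  # fallback registered
```

The second lookup only succeeds because the package reached the central cache before the gallery went down, which is the argument for syncing at install time.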

 

Developer
Apr 13, 2012 at 6:50 AM

Created a work item on the issue tracker so people can vote for it:

http://nuget.codeplex.com/workitem/2101