6
Vote

PackageHash changes on Nuget.Server

description

I'm running an internal remote repository via the Nuget.Server as per the documentation.

I keep getting the dreaded 'Failed to download package correctly. The contents of the package could not be verified.' error message at various times.

After some debugging (yay open source!) it would appear that nuget is failing due to the packagehash not matching. The hash is in fact created properly, however the packagehash in the xml feed is wrong, sometimes. If I recycle the AppPool on IIS7 that contains the Nuget.Server then the hash resets back to the proper hash, for a while.

Once the hash is reset on the server, nuget works fine from my local machine.

I haven't yet come up with a solid repro for this but I'm hoping someone else has experienced something similar?

No files are attached

comments

pranavkm wrote Jan 26 at 5:28 PM

Following up in a discussion - http://nuget.codeplex.com/discussions/287527

Could you respond there since I feel your issue might be unrelated

stevescotthome wrote Jan 26 at 4:44 PM

@pranavkm
This is a brand new install, I used exsiting nuget packages up until today when I tried to create my own...and some work, some fail with this error

I do an UnInstall and them Install with the PM, it says it's installing Nuget.Server 1.6.1, the .dll version though everywhere is 1.6.0?

Where can I get 1.6.2? Install-Package NuGet.Server 1.6.2 seems to crap out :)

(thanks for the help by the way!)

pranavkm wrote Jan 26 at 4:04 PM

@stevescotthome are you running the newest version of NuGet.Server? It should be 1.6.2. Also, did this issue start surfacing all of a sudden?

stevescotthome wrote Jan 26 at 3:22 PM

I'm getting this problem right now on seemly every new package I create...copy them to the repository, then it fails when I try and read it...Luckily I have a couple "good" packages backed up to use for testing

pranavkm wrote Nov 2 2011 at 7:54 PM

Awesome find! I kept suspecting something was amiss with the cache, never suspected the CryptoHashProvider.

Haacked wrote Nov 2 2011 at 7:46 PM

Yes, please do submit a patch. :)

stovellp wrote Nov 2 2011 at 6:43 PM

OK, I think I know the issue.

1) IIS is asleep, the app pool is shut down
2) 10 requests for the feed come in at once
3) We spin through all the packages and hash them

The hash provider is done by a single, shared instance of this:

public class CryptoHashProvider : IHashProvider
{
private static readonly HashAlgorithm _defaultHashAlgorithm = SHA512.Create();
private readonly HashAlgorithm _hashAlgorithm;

So they all end up using the same instance of SHA512Managed (return value of SHA512.Create).

Why is this a problem? Here's the top of SHA512Managed:

public class SHA512Managed : SHA512
{
private byte[] _buffer;
private ulong _count; // Number of bytes in the hashed message
private UInt64[] _stateSHA512;
private UInt64[] _W;

Nothing like shared state :) Obviously this guy isn't thread safe. Here is the proof:

https://gist.github.com/1334451

If you run that you'll find you get different hashes printed. However you'll notice that you'll often get the same hash - it appears SHA512Managed resets itself once it is done, so it usually works, but under many calls they overwrite each other.

I don't think this is a problem with NuGet.Server, but rather, NuGet.Core. Since SHA512Managed isn't thread safe, CryptoHashProvider probably shouldn't assume it can re-use it. I suggest simply creating a new SHA512.Create before hashing, and not storing it as a private instance member.

I'm more than happy to submit a patch for this if you agree with that change!

PS: Documentation on SHA512Managed says it isn't guaranteed thread safe:

http://msdn.microsoft.com/en-us/library/system.security.cryptography.sha512managed.aspx

Paul

stovellp wrote Nov 2 2011 at 6:22 PM

Here is another way to reproduce it.

1) Create a package, and add it to your server
2) Restart IIS
3) Use the code below to perform 10 queries at the same time
4) On the NuGet.Server, add a breakpoint in GetPackageMetadata. You'll have 10 entries in _derivedDataLookup

Inspect the entries. They'll have the same name, access date, and size. But they'll all have different hashes! Note that the file didn't change on disk during this process, so something appears wrong with the hashing.

Code:

for (var i = 0; i < 10; i++)
{
ThreadPool.QueueUserWorkItem(q =>
{
var repository = PackageRepositoryFactory.Default.CreateRepository(feedUri);
repository.GetPackages()
.Where(p => p.Id.Contains("Test"))
.Take(20)
.ToList();
});
}

In my situation, I can see how this happens - I have a page that queries NuGet.Server for the latest version of a list of a bunch of packages, each one being a separate query. All of those queries at once (on a server that has probably had its application pool shutdown) during warmup mean invalid cache entries being stored in _derivedDataLookup.

Paul

stovellp wrote Nov 2 2011 at 6:14 PM

Hi,

I build a product on NuGet (octopudeploy.com) and this bug causes a problem for some of my customers. I've been debugging it but I'm not able to find the root cause of the problem.

One way to reproduce it is to:

1) Create a package
2) Duplicate it
3) Edit the duplicate by removing files from the zip
4) Add the edited version to NuGet.Server
5) Check the hash
6) Copy the good version over the top, but make sure the 'last modified date' of the file is the same or newer than the file you are replacing
7) The hash won't have changed even though the file did

I'm still looking into what could cause this and I'll update this thread, but I just wanted to add a +1 so the OP knows he isn't crazy - it's happened to 3-4 of my customers too :)

Paul

cstavro wrote Sep 8 2011 at 6:08 PM

I thought something similar so I looked at it and that code looks like this:

_derivedDataLookup[package] = CalculateDerivedData(package, path);

The DerivedPackageData class that it is returning has no implementation for Equals or GetHashCode so sure, adding this to the dictionary is likely going to create multiple instances and that definitely needs to be fixed. But that means that the initial 'find' isn't actually finding the original entry in the first place which is causing it to add the new entry.

Could this be a threading issue? When the app pool is torn down by IIS, the search occurs simultaneously while rebuilding the package cache?

What I really can't understand is why I'm seeing different PackageHash values for packages being added with the same path.

Haacked wrote Sep 8 2011 at 4:37 PM

Ah! Could you look at any code that adds packages to _derivedDataLookup? We're probably missing a uniqueness check there or something like that.

cstavro wrote Sep 8 2011 at 3:31 PM

Alright so I don't have the actual cause/solution exactly yet but I do have more info, I'm basically using this thread as a dumping ground for my findings. Excuse the mess. ;)

The problem appears to have something to do with the _derivedDataLookup cache in the ServerPackageRepository. Somehow, it is getting multiple instances of various packages in there. I had 169 packages in there in my debug session when the problem occurred and I was able to identify multiple instances of several packages which all had different package hashes but pointed to the same physical path. Upon recycling the app pool, the package list has dropped to a more accurate 141 with no duplicates.

My current best guess is that when IIS kills the app pool due to non-use, something funny is going on when it brings it back up to serve up a request after new packages have been added to the packages folder. Unfortunately I'm not entirely sure how to debug that as I'm working with the remote debugger but my debug session dies off as soon as IIS kills the inactive w3wp process and thus I cannot attach the debugger to the process until after the first request recreates the process.

dfowler wrote Sep 8 2011 at 10:05 AM

We don't monitor for file change events, we just check to see if LastWriteTime of the package file changed. If you could debug the issue it would be awesome :)

cstavro wrote Sep 8 2011 at 1:42 AM

I believe that to be correct.

Keeping in mind the fact that the hash resets back to the expected hash after recycling the app pool, I assumed maybe it was reading the file prior to it being fully copied or something and grabbing a hash of that. But I don't believe the log4net package has changed at all (I will be double checking that however).

For now my workaround is to just recycle the app pool whenever I need to, but it's a pain in the butt with multiple developers and a build server all trying to grab packages.

I haven't spent much time digging into how Nuget.Server works but I imagine it's just monitoring the packages folder for change events. I'll see if I can do some more debugging and perhaps reproduce this on my actual dev machine.

Stay tuned for updates!

Haacked wrote Sep 8 2011 at 12:29 AM

Ok, just so I'm clear. You copy another package that depends on log4net into the Packages folder of the website *without changing the log4net package itself*, but the package hash for log4net changes? that's really weird.

cstavro wrote Sep 7 2011 at 2:06 PM

I don't think it's a caching issue. It seems to occur when I add a new package to the Packages folder.

My specific example at the moment is as follows:
1) I have a local copy of log4net 1.2.10 copied from the official repository with a package hash of yP7PCpxPt5cB9yiRAvCwyW5YRqXV4jM4tpVLnm/ls2nJt786l5srbSdm5xIEVMXz/kOYqLKP177Gyqey1wdP/A==

2) I copy a new internally created package into the packages folder. This package has a dependency on log4net

3) The log4net packagehash displayed in the xmlfeed has now changed to pIwWtskrwLHTxttzNyM1kLQ/IPDEhs1Lb9p8E3j+Cur/gg74PRrgHD06glqE2BkjVMhKD/5Tll1BD3ewvdm7QQ==

4) Recycling the app pool restores the hash back to the original (expected) hash

I don't believe it to be a caching issue because this new hash does not match any other package in my packages folder and it is also different each time this problem occurs.

Haacked wrote Sep 6 2011 at 9:07 PM

Thanks for the report. It's possibly a caching issue. If you can narrow it down and submit a pull request, we'd definitely look at it. :)