Automated Build workflow question

Dec 22, 2010 at 11:07 AM

I would like to use NuGet with the following workflow:

1. Developer finds and adds package to project.

2. Developer tells version control to ignore the packages folder and commits.  We don't want to commit these binary packages to version control.  (Why keep them in VCS if they are already in the package repository, right?)

3. Automated build system runs (ours happens to be NAnt) and calls a NuGet.exe task to pull the packages down from the feed repository.  No packages will be updated to new versions, as we will pin to specific versions (which the developer has already tested on his box).  Then the normal compile/run unit tests/deploy steps happen.

4. Another developer checks out the source code.  He will then need to run the NAnt build task that calls NuGet.exe in order for the project to be usable in Visual Studio.

Similar threads are here: http://nuget.codeplex.com/Thread/View.aspx?ThreadId=231541 and here: http://nuget.codeplex.com/Thread/View.aspx?ThreadId=233392.  I get the feeling that the original intention was to commit these dependency DLLs to source control, but I would like to avoid this.  Does anyone foresee any technical reason this would not be possible?

Does anyone have this workflow going?  I think that step #3 would simply need to parse the packages.config file and install the packages at the versions listed there.  However, I don't see that NuGet.exe supports Install-Package the way the console in the UI does.  I guess I'd have to grab the API and write my own code to do this.
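
If I did end up writing that glue myself, something like this rough sketch is what I have in mind; the NAnt build could just <exec> it before compiling.  The NuGet.exe switches shown (install, -Version, -OutputDirectory) are guesses on my part, so treat this as a sketch of the idea rather than something guaranteed to work against today's NuGet.exe:

using System;
using System.Diagnostics;
using System.Xml.Linq;

class RestorePackages
{
    // Reads a packages.config and shells out to NuGet.exe once per pinned package.
    static void Main(string[] args)
    {
        string configPath = args.Length > 0 ? args[0] : "packages.config";
        string outputDir  = args.Length > 1 ? args[1] : "packages";

        // packages.config entries look like: <package id="SomeLib" version="1.2.3" />
        foreach (var package in XDocument.Load(configPath).Descendants("package"))
        {
            string id      = (string)package.Attribute("id");
            string version = (string)package.Attribute("version");

            // Assumed command line: install <id> -Version <version> -OutputDirectory <dir>
            var startInfo = new ProcessStartInfo
            {
                FileName        = "NuGet.exe",
                Arguments       = string.Format("install {0} -Version {1} -OutputDirectory {2}",
                                                id, version, outputDir),
                UseShellExecute = false
            };

            using (var process = Process.Start(startInfo))
            {
                process.WaitForExit();
                if (process.ExitCode != 0)
                    throw new InvalidOperationException(
                        "Failed to restore " + id + " " + version);
            }
        }
    }
}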

Like one of the guys in the linked threads, I also have a Maven background, and this workflow is very similar to what Maven would do (or Ivy for that matter).

Thanks!

Dec 22, 2010 at 2:34 PM
I know there is a huge debate about whether you should keep packages in source control or not, but what you are asking for seems like an error-prone way of getting the same packages without keeping them in source control.

What do you really gain by NOT keeping the packages in source control?

The first step toward a repeatable build should be to NOT have much variance. There is a world of scenarios that would increase variance by NOT keeping your packages in source control. What happens if you don't have an internet connection or there is a hiccup? What happens when someone removes a package you depend on from the feed? What about when the feed goes down for some reason?

The term is hermetic build (described by Jeff Brown here: http://blog.bits-in-motion.com/2010/02/setting-up-distributed-build-part-1.html)

To quote:

Hermetic Build Environment

Hermetic means “sealed.”

A hermetic build environment is one which encapsulates all of its dependencies so that the result of the build is entirely reproducible.

Think for a moment about the dependencies of your own build process.

  • Source code.
  • 3rd party libraries and tools.
  • SDKs, compilers, frameworks, header files.
  • Test tools, coverage tools, build automation tools, documentation tools, linting tools.
  • Scripting languages, command-line tools, scripts.
  • Configuration files, registry entries.
  • Custom patches, tweaks.
  • Operating System revision and patch level.
  • Crazy glue.
  • Position of the mouse pointer on the screen. (Please don’t depend on this!)
  • Time of day. (Ditto.)

The more dependencies you control, the more reproducible will be the build.

The idea of establishing a hermetic build environment is to ensure that the dependencies are tightly controlled. If I change line 101 of Foo.cs to fix a bug, I need to know that the next build will be identical to the first except that it will contain my specific bug fix because that is the only change in the source tree.


____
Rob
"Be passionate in all you do"

http://devlicio.us/blogs/rob_reynolds
http://ferventcoder.com
http://twitter.com/ferventcoder


Dec 22, 2010 at 3:23 PM
ferventcoder wrote:
I know there is a huge debate about whether you should keep packages in source control or not, but what you are asking for seems like an error prone way of going about getting the same packages without keeping them in source control.
What do you really gain by NOT keeping the packages in source control?

The first step to a repeatable build should be to NOT have much variance. There are a world of scenarios that would increase variance by NOT keeping your packages in source control. What happens if you don't have an internet connection or there is a hiccup? What happens when someone removes the package from the feed that you depend on? What about when the feed goes down for some reason?

Thanks for the reply, Rob.  I see your point.  I guess there is a fundamental divide between the Java camp (don't ever store binary data in source control) and the .NET camp (it's OK to store binaries).

Without trying to sound preachy about Maven, here is how it happens:

  • The pom.xml file contains everything one needs to know about the project dependencies (similar to packages.config).
  • Developer checks out code, runs "mvn clean install".  This resolves all dependencies from the central repository, sets the classpath, and syncs the local cache on the developer's machine.  I could see a command like "NuGet.exe install" that could do something similar.
  • The CI build server does pretty much the same thing before the build.

What happens if you don't have an internet connection or there is a hiccup?  [Maven keeps a local copy of the repository binaries that you have previously used. You cannot add a *new* dependency to your project using Maven if you are offline, but you cannot do this using NuGet either, assuming your feed is on a server somewhere.]

What happens when someone removes the package from the feed that you depend on?  [This is a cardinal sin in Maven dependency management.  Once committed to the repository as a non-snapshot build, ThouShaltNotDeleteOrChangeEver.  Yes, this requires a bit of governance and care, since automated builds and deploys depend on this.  ]

What about when the feed goes down for some reason?   [Again, cached local copy takes care of this.]

Coming from the Java world, I guess I feel a bit dirty about storing binaries in source control when they could easily be resolved at build time.  The same goes for generated code, really, which can also be produced at build time.  The biggest issue, I guess, is space.  When a solution has hundreds of dependencies (and dependencies of those dependencies), checking out / cloning projects becomes very slow; that could be a lot of stuff going into source control.  To be fair, running some command to resolve these dependencies after checkout could also be a bit slow.

In the end, I guess I'm not too worked up about storing the packages, but it just doesn't feel quite right.

-Alan-

Coordinator
Dec 22, 2010 at 3:33 PM

Hi Alan,

One key scenario for NuGet is to not require NuGet for developers who want to get started working on a project. For example, say I start an OSS project and install 5 packages. Later, another developer, Jill, comes along, and she doesn't have NuGet installed (what is wrong with her!?). When she gets latest from my source repository, she can actually build the solution and even start working on it right away.

If she wants to add a dependency, she'll need to install NuGet of course, but until then, there are no additional requirements for her to get going.

However, as NuGet becomes more ubiquitous, the value of this scenario becomes less and less. That's why we have an issue in our tracker that addresses the scenario you describe: "add-package from packages.config" (http://nuget.codeplex.com/workitem/165).

So it's something we'll look at in the future, but hopefully I've described why we have the system the way it is today in a way that makes sense.

Phil

Dec 22, 2010 at 3:39 PM
Thanks Alan. I agree with the "thou shalt never delete a package" rule. Hopefully it will become more evident as we move forward that an update to a package is really a new package. :D
____
Rob
"Be passionate in all you do"

http://devlicio.us/blogs/rob_reynolds
http://ferventcoder.com
http://twitter.com/ferventcoder

Dec 22, 2010 at 5:20 PM

It's been said before, so I'm beating a dead horse here, but I agree with Alan and the Java community: binaries in SCM are a big no-no, especially with distributed SCM solutions. The talk about network hiccups and bad feeds is a bit off, since most CI builds already have network dependencies anyway, and as Alan points out it's pretty simple to handle this the same way DVCS systems do (i.e. with local caches) and rarely need to hit the feed/network.

Phil's point about not requiring NuGet to get started on a project is quite a valid scenario that should be supported, but that does NOT mean this other scenario should not be supported. Individual users should be able to determine which tradeoff is appropriate for their project(s). I'm glad this seems to be understood and plans do exist to support this scenario, but these continued responses about "in the future", coupled with the insistence that checking in binaries is the only way to ensure sane builds, make it come across like you're dragging your feet, and this isn't something you should drag your feet over.

Just my .02

Coordinator
Dec 22, 2010 at 5:32 PM

This is just not a scenario that has come up as a very high priority in our own usage or in the usage of most people we talk with. However, there appears to be a small, passionate group of people for whom this is an important feature.

This is the beauty of open source. People tend to work on the problems that are most important to them.

Every good work of software starts by scratching a developer's personal itch. – Eric Raymond, The Cathedral and the Bazaar

Feel free to chip in, or motivate someone else who finds this important to chip in, and help get it implemented sooner rather than later. :)

Dec 22, 2010 at 8:54 PM

This is actually a scenario that has been discussed a number of times before, and I do think it is worthwhile to support.  We should think through what the design might be.  In essence, it's about having a process that looks through the packages.config for all the projects in the solution and then brings those packages down into the packages folder.  So technically, it's probably not all that difficult.

Some potential tricky things:

  • Assembly references point to the assembly in the packages folder.  So until the packages are brought down, VS will view it as a broken reference.  Sometimes, VS has a tendency to try to look for that reference in other places to fix it up, which can cause trouble if it in fact finds it.  Maybe not a big issue for assemblies that are not installed anywhere globally on the machine.
  • If the exact package version is not available, it'll have to settle for the next best.  But doing this requires first uninstalling files that could have been copied into the project (e.g. aspx files).  And unfortunately, this can't be done without having the original package.  Maybe it's best to say that if the exact version is not available on the feed, then things just do not work.
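
To make that concrete, here is a very rough sketch of what such a restore step could look like against the NuGet.Core API, failing hard when the exact pinned version is no longer on the feed.  The type and member names used below (PackageManager, SemanticVersion, FindPackage) are assumptions on my part and may not match exactly what ships today, so treat it as a sketch of the design rather than working code:

using System;
using System.Xml.Linq;
using NuGet;   // NuGet.Core assembly

class SolutionRestore
{
    // Restores every pinned package from one packages.config into the packages folder.
    static void RestoreFromConfig(string packagesConfig, string feedUrl, string packagesFolder)
    {
        IPackageRepository sourceRepository =
            PackageRepositoryFactory.Default.CreateRepository(feedUrl);
        var packageManager = new PackageManager(sourceRepository, packagesFolder);

        foreach (var entry in XDocument.Load(packagesConfig).Descendants("package"))
        {
            string id = (string)entry.Attribute("id");
            var version = new SemanticVersion((string)entry.Attribute("version"));

            // Strict handling: if the exact pinned version is gone from the feed,
            // fail the build instead of silently settling for "the next best".
            if (sourceRepository.FindPackage(id, version) == null)
                throw new InvalidOperationException(
                    id + " " + version + " is no longer available on the feed.");

            packageManager.InstallPackage(id, version);
        }
    }
}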
Dec 22, 2010 at 9:13 PM
davidebbo wrote:

This is actually a scenario that has been discussed a number of times before, and I do think it is worthwhile to support.  We should think through what the design might be.  In essence, it's about having a process that looks through the packages.config for all the projects in the solution and then brings those packages down into the packages folder.  So technically, it's probably not all that difficult.

Some potential tricky things:

  • Assembly references point to the assembly in the packages folder.  So until the packages are brought down, VS will view it as a broken reference.  Sometimes, VS has a tendency to try to look for that reference in other places to fix it up, which can cause trouble if it in fact finds it.  Maybe not a big issue for assemblies that are not installed anywhere globally on the machine.
  • If the exact package version is not available, it'll have to settle for the next best.  But doing this requires first uninstalling files that could have been copied into the project (e.g. aspx files).  And unfortunately, this can't be done without having the original package.  Maybe it's best to say that if the exact version is not available on the feed, then things just do not work.
  • Yes, the project file would be broken until the install is run.  As you say, maybe not a big issue as long as it is not installed globally, which is a bad practice anyway.
  • I think strict handling here would be best.  The developer who installed the dependency in the first place would be using the same repository as the developer who just checked out the project (or the CI server, for that matter).  If the package is not found, would that not be a potentially huge problem?  There may be a very valid reason a specific version is needed.  The repository feed should be treated as sacred: once a package at a specific version goes in, it must never be deleted or changed in any way, or builds start breaking.  IMO, the package should fail to install in this scenario, forcing the developer to find out why the package was removed.

-Alan-

Jan 12, 2011 at 6:28 PM

Adding another vote for this capability.  Source control systems store source code.  Binary repositories store binaries.

I realize this is a religious debate.  I have a hard time with the following statement:

"However there appears to be a small passionate group of people for whom this is an important feature"

I think there are more of us than you think.