Why is the remote package list only showing A-F?

Dec 24, 2010 at 3:24 PM

When I issue:

Get-Package -Remote

in the Package Manager console, I'm only getting entries from A through F (the last one is Facebook 4.1.0).  How do I continue with the rest of the list?  Am I missing something incredibly obvious?

TIA

Developer
Dec 24, 2010 at 4:26 PM

The OData service limits the number of results. Are you looking for a specific set of packages, or do you just want to see all of them? You can filter packages with Get-Package -Remote {package name} or Get-Package -Remote -Filter {package name}.

Dec 24, 2010 at 4:37 PM

I want to see the whole list.  I'm browsing. :-)  Yeah, I discovered the -Filter qualifier.  Thanks for your help. :-)

Dec 24, 2010 at 7:01 PM

Note that the reason the OData service limits the results is to avoid having the entire feed downloaded at once.  Right now we have 300+ items and the whole feed is already somewhat large.  When it reaches several thousand, downloading the entire feed just won't work well.

However, I wonder if we should support some way of getting more items.  Note that if you use the NuGet dialog, you can page through the entire list.  The problem from the console is that there is no paging support (which would be hard since we never prompt).

Dec 24, 2010 at 7:02 PM
This discussion has been copied to a work item.
Dec 24, 2010 at 7:59 PM

To support paging in the console, we can add the -First and -Skip parameters to the Get-Package cmdlet.
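
Under the hood, those parameters would presumably just translate into Skip/Take calls on the underlying query before it is sent to the server. A rough sketch (the helper name and parameters are illustrative, not actual NuGet code):

using System.Linq;

// Illustrative only: apply hypothetical -Skip/-First parameter values to
// the package query before it goes over the wire.
public static class PagingExtensions
{
    public static IQueryable<T> ApplyPaging<T>(this IQueryable<T> query, int skip, int? first)
    {
        if (skip > 0)
        {
            query = query.Skip(skip);          // translated to $skip on an OData feed
        }
        if (first.HasValue)
        {
            query = query.Take(first.Value);   // translated to $top
        }
        return query;
    }
}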

Dec 24, 2010 at 10:07 PM

Probably -Take instead of -First?  Yes, we could do this, at least as a low-level building block for higher-level operations.  One issue is that if you don't pass -Take, you'd expect to get everything, rather than run into server-side paging limits.

Dec 25, 2010 at 7:48 PM

I follow the same convention as the built-in Select-Object cmdlet, which uses -First instead of -Take.

To work around server-side paging limits, can we keep querying the server until we reach the end? Each query would start where the previous one left off, and we could then output all results to the console.
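
A simplified sketch of the shape I have in mind (illustrative names, not actual code):

using System.Collections.Generic;
using System.Linq;

// Simplified sketch: keep issuing queries, each picking up where the
// previous one left off, until the server returns an empty page. The
// server's paging limit caps how many items each request returns.
public static class QueryExtensions
{
    public static IEnumerable<T> GetAllPages<T>(this IQueryable<T> source)
    {
        int skip = 0;
        while (true)
        {
            List<T> page = source.Skip(skip).ToList();
            if (page.Count == 0)
            {
                yield break;   // an empty page means we've reached the end
            }
            foreach (T item in page)
            {
                yield return item;
            }
            skip += page.Count;
        }
    }
}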

Developer
Dec 26, 2010 at 6:36 AM
Edited Dec 26, 2010 at 6:54 AM

I like dotnetjunky's idea. Get-Package would work inconsistently depending on the kind of feed (physical repository versus data service repository) it's working against. We shouldn't impose the limitations of our querying protocol on our users.

We could use the count to figure out the number of packages and present a consolidated list when working against an OData feed. Admittedly this would be slow, but the scenario is pretty rare.

Dec 26, 2010 at 7:15 AM

Ok, let's go with -First if there is PowerShell precedent.

We can certainly loop through to avoid the server limit, and we in fact already have code to do something like that (see BufferedEnumerable).  But the question is what should be the default behavior when no -First is specified.  The natural thing is to return everything, but the concern is that it would be too slow when the feed becomes too large.  Since it's what most people will use, we want to make sure it's a sensible default.

I think we can start with displaying everything and see how that goes.  We can always change the default in the future if the feed becomes too large.

Developer
Dec 26, 2010 at 10:29 AM

We should just always display everything. In the command-line world, the way you'd do paging is to pipe the results to more (or less).

Get-Package | more

The only reason we can't do this now is that we can't prompt, but once we can, we should have some support for this.

Dec 26, 2010 at 5:15 PM
PowerShell likes to stream. Since the remote protocol isn't a streaming protocol, you end up simulating it by making several partial requests. Seems like a reasonable thing to me.
Dec 26, 2010 at 5:48 PM

Hopefully by the time the feed becomes large, we can implement asynchronous command invocation, which will allow us to prompt and do other useful things.

Meanwhile, I'll try this technique out and see how it goes.

Dec 26, 2010 at 5:49 PM
This discussion has been copied to a work item.
Dec 27, 2010 at 1:22 AM

I had already linked this to a bug earlier in this thread, so feel free to dup one to the other :)

Developer
Dec 27, 2010 at 3:21 AM

@dotnetjunky you shouldn't have to write much logic, as we already have most of it (as davidebbo said).

Dec 27, 2010 at 6:53 AM

Are you referring to the BufferedEnumerable class? I tried it at first, but it didn't work for me for a couple of reasons:

First, it doesn't work well with Skip and Take.

Second, it requires a buffer size, and it assumes that if the source ever returns less than the buffer size, then the source has no more items. This is incorrect for the OData case.

So I ended up writing a small extension method that handles the job for me. That took little time; the troublesome part was writing unit tests. I ended up writing a QueryProvider to simulate the OData service's paging limit behavior. I managed to copy most of the code from SafeQueryable though :)
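
Much simplified, the test setup boils down to a fake source that enforces a page limit the way the real service does (the class below is a made-up stand-in; the real version is a full QueryProvider):

using System.Collections.Generic;
using System.Linq;

// Made-up test double: an in-memory "server" that honors the skip offset
// but never returns more than its page limit per request, mimicking the
// OData service's server-side paging behavior.
public class PageLimitedSource<T>
{
    private readonly List<T> _items;
    private readonly int _pageLimit;

    public PageLimitedSource(IEnumerable<T> items, int pageLimit)
    {
        _items = items.ToList();
        _pageLimit = pageLimit;
    }

    public List<T> Query(int skip)
    {
        return _items.Skip(skip).Take(_pageLimit).ToList();
    }
}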

I'm sending out code reviews shortly.

Dec 27, 2010 at 7:00 AM

Why doesn't it work well with Skip/Take?  You should be able to take any normal query and then call AsBufferedEnumerable() on it.  As for the buffer size, you can just use a small size like 30, which it's safe to assume will always be less than the server limit.  BTW, we're already making the same assumption in the dialog.

Though technically, BufferedEnumerable is not very efficient for this scenario, because we really don't need buffering.  I wrote a more efficient non-buffered version recently, which may be similar to what you wrote.  I'll forward you the code.

Dec 27, 2010 at 7:14 AM

The problem is that BufferedEnumerable internally also uses Skip/Take. So if I pass in an IQueryable that already includes Skip/Take calls, it blows up.

Yes, I forgot to mention the buffering issue too.

Developer
Dec 27, 2010 at 11:53 AM

I'd like to avoid having two different pieces of code doing the exact same thing with minor tweaks. We should either extend BufferedEnumerable or have a really good reason why we don't use it.

Developer
Dec 27, 2010 at 11:54 AM
dotnetjunky wrote:

The problem is that BufferedEnumerable internally also uses Skip/Take. So if I pass in an IQueryable that already includes Skip/Take calls, it blows up.

Yes, I forgot to mention the buffering issue too.

By "blows up" you mean it throws an exception?

Dec 27, 2010 at 6:02 PM
dfowler wrote:

By "blows up" you mean it throws an exception?

Yes, it does. 

Dec 27, 2010 at 6:26 PM
Edited Dec 27, 2010 at 6:28 PM
dfowler wrote:

I'd like to avoid having two different pieces of code doing the exact same thing with minor tweaks. We should either extend BufferedEnumerable or have a really good reason why we don't use it.

I understand your point. We have a few options:

  1. Don't add -First and -Skip for paging. Use BufferedEnumerable as is, calling it with a buffer size smaller than the server limit. (Note that this will result in chatty server calls.)
  2. Same as 1, but modify BufferedEnumerable to avoid buffering the results. (We can use Ebbo's code here.)
  3. Add -First and -Skip for paging. Modify BufferedEnumerable to work well with -First/-Skip. Can also incorporate Ebbo's change to avoid buffering. This will require more testing to make sure it doesn't break any other areas.
  4. Continue with my code. :)

Please vote :) Or feel free to add any other option.

Dec 27, 2010 at 7:23 PM

I actually like your new code.  I initially suggested BufferedEnumerable because it was there, but it really is not a great fit here, as it's way too heavyweight, especially when going over the entire feed (which will be the default scenario here).  BufferedEnumerable is meant for the dialog scenario where:

  • You only go through a small subset of the items (since you have to click next manually)
  • You need to be able to quickly go back in the list, hence the buffering

Your new GetAll method is pretty simple, so I'd go with it.

Developer
Dec 27, 2010 at 10:41 PM

I actually don't agree at all (surprise). BufferedEnumerable is exactly for this scenario, but it's only "heavy" because we keep the list around (on purpose), so that if you choose to hold onto the enumerable, the results don't have to be re-queried. I'm not sure why there is so much pushback on whether we should use it. I'd love to see the efficiency of both methods measured and see if one is a lot slower.

Dec 27, 2010 at 11:14 PM

It's not an issue of speed. I believe both are more or less the same in terms of speed. It's an issue of buffering. We certainly don't want to store all items in the buffer. As Ebbo said, BufferedEnumerable is designed more for the dialog scenario. And as I said, BufferedEnumerable doesn't work with Skip and Take.

If you look at the GetAll() method, it's actually quite simple, with basically just a while loop. And it works very well with Skip/Take. It's not the same code pattern as BufferedEnumerable. So I'm not sure why you don't like it.

Dec 27, 2010 at 11:27 PM

Yes, I don't get all the arguing about something so minor.  GetAll is indeed a very small and simple method, and it does exactly what's needed to enumerate through all items optimally, no matter what server paging limits are set.  It seems cleaner to use it than to add logic to BufferedEnumerable to optionally not buffer (which would make its name nonsensical!).

Let's just move on :)

Developer
Dec 28, 2010 at 3:48 AM

Sure...

Dec 28, 2010 at 6:19 PM

I did find an issue with the new code: it always makes an extra query.  See http://nuget.codeplex.com/workitem/517

Dec 28, 2010 at 7:03 PM

Yes, I'm aware of it. We could request fewer items than the server page limit, but then it wouldn't work reliably with every server. Maybe that's not a problem.

Personally, I don't think this is a major issue though, because the extra query always returns empty results.

Dec 28, 2010 at 8:37 PM

My concern is that this extra HTTP request may add a small yet not negligible delay to the command.  It also doubles the number of HTTP requests the server needs to respond to when people look for packages (since the result set is typically small).

Note that OData has client support for its server paging (see this post). Though that would mean doing things that are not general LINQ, which could add some pain.
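
For reference, the client-side continuation pattern looks roughly like this (using the WCF Data Services client types; the Package entity and the context/URI names below are placeholders):

using System;
using System.Data.Services.Client;

// Placeholder entity type for illustration.
public class Package
{
    public string Id { get; set; }
}

public static class FeedReader
{
    // Rough sketch of the OData client's continuation support: the server
    // embeds a next-page link in each page of the feed, so the client can
    // follow it without issuing an extra empty request to detect the end.
    public static void ListAll(DataServiceContext context, Uri packagesUri)
    {
        var response = (QueryOperationResponse<Package>)context.Execute<Package>(packagesUri);
        while (true)
        {
            foreach (Package package in response)
            {
                Console.WriteLine(package.Id);
            }

            DataServiceQueryContinuation<Package> token = response.GetContinuation();
            if (token == null)
            {
                break;   // no next-page link: we've read the whole feed
            }
            response = context.Execute(token);
        }
    }
}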

Developer
Dec 30, 2010 at 5:04 AM

The command-line tool (NuGet.exe) also needs to be updated to list all of the packages (there is a command: nuget.exe l).

Dec 30, 2010 at 6:48 AM

Good point, I opened http://nuget.codeplex.com/workitem/525 for tracking.  The cheap fix is to just call AsBufferedEnumerable on it, though as we discussed, it's not memory efficient.  We could have another public extension that does the right thing, but it may be overkill for now.

Developer
Dec 30, 2010 at 4:40 PM

Last night Ebbo and I discussed changing our OData layer to bypass all server paging logic; this gives a cleaner solution that works across all clients.

Jan 6, 2011 at 7:39 AM

Are you planning to disable paging in the OData packages feed? I have some PowerShell utility scripts that currently implement paging logic and am wondering if I should keep that in or hold off.

Jan 6, 2011 at 7:46 AM

No, we are not disabling server paging.  It's important to make sure that going to the feed in the browser doesn't dump the entire feed, which could become very large.

What we did is add some client logic such that if you go through our package API, everything will work as if there is no server page limit.  So if your scripts go against our model, you don't need that paging logic.

BTW, you can try the latest 1.1 today by installing from here.

Developer
Jan 6, 2011 at 7:01 PM

I think the best way to do paging in PowerShell is to write a cmdlet that works like more but understands IQueryables, so it can do the right thing behind the scenes.
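
Entirely hypothetical, but the shape would be something like this (the cmdlet name, page size, and prompting are all invented for illustration; a host that can't prompt would still need a fallback):

using System.Linq;
using System.Management.Automation;

// Hypothetical sketch of a "more"-like cmdlet that understands IQueryable:
// it fetches one page per request via Skip/Take and prompts between pages,
// so only the pages the user actually views cross the wire.
[Cmdlet(VerbsData.Out, "PagedQuery")]
public class OutPagedQueryCommand : PSCmdlet
{
    private const int PageSize = 30;   // assumed to be under any server page limit

    [Parameter(Mandatory = true, Position = 0)]
    public IQueryable<object> Query { get; set; }

    protected override void ProcessRecord()
    {
        int skip = 0;
        while (true)
        {
            // Only one page is requested from the server per iteration.
            var page = Query.Skip(skip).Take(PageSize).ToList();
            foreach (object item in page)
            {
                WriteObject(item);
            }
            if (page.Count < PageSize)
            {
                break;   // short page: nothing more on the server
            }
            skip += PageSize;
            if (!ShouldContinue("Show the next page?", "Out-PagedQuery"))
            {
                break;
            }
        }
    }
}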