NuGet tagging abuse

Mar 14, 2012 at 5:00 AM
I was just searching for something on NuGet and I came across several of these:

Normally not something that bothers me, but is this really relevant?
Mar 14, 2012 at 5:04 AM

Thanks for pointing it out. I think we should limit the length of the Tag field.

Mar 14, 2012 at 5:07 AM
Or limit the number of tags it takes by default to something like the first 20 tags in a nupkg. Not sure I would ever need more than 10ish, but you never know.
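
A cap like that could look roughly like the Python sketch below. It is only an illustration: `limit_tags` and `max_tags` are made-up names, and it assumes the tags arrive as the single space-delimited string that a nuspec `<tags>` element holds.

```python
from __future__ import annotations


def limit_tags(raw_tags: str, max_tags: int = 20) -> list[str]:
    """Keep only the first `max_tags` distinct tags from a space-delimited string."""
    seen = set()
    kept = []
    for tag in raw_tags.split():
        if len(kept) >= max_tags:
            break  # cap reached; ignore everything after it
        key = tag.lower()  # treat "MVC" and "mvc" as the same tag
        if key in seen:
            continue
        seen.add(key)
        kept.append(tag)
    return kept
```

Dropping case-insensitive duplicates before counting is a design choice here, so repeated keywords don't eat into the cap.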

Mar 14, 2012 at 5:16 AM
In the guy's defense, there is no statement on what you should be using the tagging system for, so it's not really abuse, since there is no real policy on it (or if there is, I haven't seen it).
Mar 14, 2012 at 5:28 AM

I personally hate having to use so many tags, but until NuGet starts supporting newlines, it's hard to put the required keywords into the description, and unless the right keywords are present, users can't find the package.

I love NuGet, but searching for packages right now is nearly impossible unless you know the exact name.

I opted for the approach of using Tags and leaving the description readable, but I'll be thinning those tags out as soon as NuGet supports Markdown or HTML. 

All of the tags listed are relevant to the package. If there are some you don't think are relevant, let me know, and I'll explain. Some may be too generic or too specific, perhaps, but they're all relevant to features of the package series. 

I went through each tag and made sure each one was needed to support searching on a known search query - the only tag I should probably remove is 'redirect', as I don't remember why that was in there. Perhaps it was part of the CloudFront redirect feature...

I'm all for hiding all but the first 10-20 tags, but we need a better way to make packages searchable.

Mar 15, 2012 at 1:50 AM

We made some first steps toward improving our search by switching to Lucene.NET, but there's certainly more we can do. @nathanaeljones, can you give me some specific examples of where search is failing, so that we can use those scenarios to continue to tweak how we index and search packages? Thanks.

Mar 15, 2012 at 2:12 AM

Nathanael does point out something lacking that would be nice in the description: formatting. I noticed the release notes field recognizes line endings. It would be nice to have something similar applied to the description.

Mar 15, 2012 at 3:43 AM

@drewmiller I'm not sure if you've enabled stemming yet in your Lucene analyzers, but that could help somewhat considering the problem you are up against, namely searching 2-sentence descriptions of complex packages that need a page or 10.
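
To make the stemming point concrete, here is a toy Python sketch. It is not Lucene code and not what the gallery does: `naive_stem` is a crude, made-up suffix stripper (real stemmers such as Porter's are far more careful), and `matches` is a hypothetical helper. It only shows why stemming lets a query for "resizing" hit a description that says "resize".

```python
import re

# Deliberately tiny suffix list; longer suffixes are tried first.
SUFFIXES = ("ing", "ed", "es", "s", "e")


def naive_stem(word):
    """Strip one common English suffix, keeping at least three characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tokenize(text):
    """Split text into lowercase alphanumeric tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())


def matches(query, description):
    """True if every stemmed query term appears among the stemmed description terms."""
    stems = {naive_stem(t) for t in tokenize(description)}
    return all(naive_stem(t) in stems for t in tokenize(query))
```

With exact word matching, "resizing images" would miss a description reading "Resize an image on the fly."; with the stems compared instead, both terms match.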

I think Markdown support is a no-brainer, but since it's xml, you could use an attribute to specify the content format, and even start with just allowing newlines. Until that's done nobody can really use all of the right words to make their stuff 'findable'. 
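
A rough sketch of that idea in Python, using the standard library's ElementTree. The `format` attribute and its values are hypothetical (nothing here is taken from the actual nuspec schema), the XML is a stripped-down fragment rather than a full nuspec, and flattening line breaks for the default case is an assumption about current behavior.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the "format" attribute is invented for illustration.
SAMPLE = """<metadata>
  <description format="text-with-newlines">Resizes images on the fly.
Supports disk caching.</description>
</metadata>"""


def read_description(metadata_xml):
    """Return (format, text) for the description element of a metadata fragment."""
    root = ET.fromstring(metadata_xml)
    node = root.find("description")
    if node is None or node.text is None:
        return ("plain", "")
    fmt = node.get("format", "plain")  # no attribute means legacy behavior
    text = node.text
    if fmt == "plain":
        # Assumed legacy behavior: collapse all whitespace, including newlines.
        text = " ".join(text.split())
    return (fmt, text)
```

Packages that never set the attribute would keep rendering exactly as before, which is what makes starting with newlines only a low-risk first step.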

Right now the only user-side solution is to cram the tags full of keywords, which has the unfortunate side effect of making the page ugly. Displaying only the top 10 most popular and 10 most rare tags seems like a reasonable solution that wouldn't require too much committee work. I know supporting markdown/HTML/X is a much more controversial topic. 
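
The display rule could be sketched in Python along these lines. `tags_to_display` is a made-up name, and `usage_counts` is a hypothetical mapping from tag to the number of packages using it across the gallery; where that data would come from is left open.

```python
def tags_to_display(package_tags, usage_counts, n_popular=10, n_rare=10):
    """Pick the most popular and the rarest tags of one package for display."""
    # Most used first; ties broken alphabetically so the output is stable.
    ranked = sorted(set(package_tags), key=lambda t: (-usage_counts.get(t, 0), t))
    if len(ranked) <= n_popular + n_rare:
        return ranked  # few enough tags to show them all
    rare = ranked[-n_rare:] if n_rare > 0 else []
    return ranked[:n_popular] + rare
```

All tags would still be indexed for search; only the rendered list gets shorter.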

Mar 15, 2012 at 4:31 AM

@nathanaeljones I love Markdown and I'd love to see us use it for descriptions, but the potential problem is that there are numerous places the description is shown, some of which we don't control, and some of which aren't as flexible as the website. That's not to say we can't do something better; I'm just saying it's something we need to think through. I'll create a work item (though I'd be surprised if there isn't one already, seeing as we've talked about using Markdown, or in general improving display of the description, before).

As for the Lucene impl, I didn't work on it. I assume we're leveraging Lucene's inflection support, but I don't know off-hand. I'll definitely look into it when we revisit search, but we have other work queued for the next two iterations. Specific examples that you had to work around would be helpful.

Mar 15, 2012 at 4:32 AM
This discussion has been copied to a work item.