NothingButSharePoint.com
Tuesday, April 20, 2010

SharePoint Content Structure – Let a thousand content types bloom?

Guest Author: Stephanie Lemieux

“How many content types should you have?”

This is the question that came up in a conference call last week on SharePoint architecture. This organization had implemented their corporate portal on SharePoint 2007 and was interested in going forward with more portal sites but had some concerns about the approach to information architecture they had undertaken.

I answered what I would answer no matter what technology it was – “Only as many as you really need to implement the appropriate level of metadata, workflow and templates.” Which is of course vague, as most good consultant-ese is. I followed up with some stats: when we work on web content management implementations, we typically end up with about 10-15 content types for a site of medium complexity. We always try to keep the structure simple and number of content types few for many good reasons, ranging from ease of content structure management to content publisher user experience.

The folks on the phone were quiet for a minute… You see, the previous consultant they had worked with had a bit of a different (read: opposite) approach. The philosophy they described was that SharePoint content types should be created to the maximum degree of granularity (e.g. one content type per library) so as to reduce the need for content publishers to select a content type and tag metadata values. For example, if you had a site for human resources forms, you would have one library and content type for medical forms, one library and content type for dental forms, etc. Each content type would be extremely specific and require little tagging. “If you need 30,000 content types, then so be it” is the idea. (Insert eye twitch.)

The intent behind this – to reduce uncertainty and effort for content publishers – is noble and good, and in some specific cases might be the right approach. But in general, overly granular content types seem to be in the realm of using a sledgehammer to kill a fly. To help explain why, I thought I’d enlist the help of a couple of friends and colleagues.

First, I emailed content management guru Bob Boiko, author of the Content Management Bible, to see if he agreed. His response?

“How many content types is the right number? The fewest possible to squeeze the most value out of the info you possess. If it were my system, I would create a generic type and put all the info that I could not find a business justification for into that bucket. It’s not worth naming if you can’t say clearly why you are managing it. Then I would start with the info we have decided is most valuable and put real energy into naming the type and fleshing out the metadata behind it. Then on to the next most valuable and so on till I ran out of resources. In that way, the effort of typing is spent on the stuff that is most likely to repay the effort.”

Amen to that! But I also wanted to get a tool-specific view from my colleague and SharePoint expert friend Shawn Shell. So I skyped him…

So, what do you think?

Well, having a content type for every document library is certainly an interesting approach, though I think your SharePoint administrators, as well as your users, might go quite mad.

So, I think the argument is that having this many content types is supposed to make it easier on the users by presetting all choices and removing the potential for error. If you never have to choose a content type because each library has a very specific default that matches the content you are creating, then there’s no confusion, the idea seems to be… From a general content management perspective, this is flawed. But what about from a SharePoint-specific standpoint?

I can understand why this might make sense on the surface. Unfortunately, I think you end up exchanging one kind of confusion for another. Further, there’s a huge maintenance implication as well. For example, if you have a content type for each library, you are, for all practical purposes, requiring the user to decide where to physically store a document. This physical storage then implies your classification, regardless of whether a default content type is applied.

So, you’re basically recreating all the ills of a fileshare folder structure.

In essence, yes. To make matters worse, more complex SharePoint environments will necessarily include multiple applications and multiple site collections. Because content types are site collection bound, administrators will have lots more administration to create, maintain and ensure consistency across the applications and site collections. This would normally be true, but when you have such an overload of content types and libraries, the complexities of management are compounded.

So, if you have 50 content types, and you need to use them in 2 or 3 site collections, you’d have to create 100 to 150 content types. Good argument to keep your use of content types judicious. Is there a hard limit to the number of content types one can manage in a site collection?

The answer is “sort of.”  There’s no specific hard limit to the number of content types in a site collection, but there are some general “soft limits” in the product around numbers of objects (generally 2000). This particular limit is an interface limit where users will see slower performance if you’re trying to display more than 2000 items.  The condition won’t typically manifest itself for normal users, but it will for administration. The other real limit is the content type schema can’t exceed 2 Gb.  While this seems like a pretty high limit, if you have a content type for each library, loads of libraries in a site collection and robust content types, there’s certainly a chance to hit this limit.

What about search? I assume that a plethora of content types would have adverse effects on search.

It absolutely does. Like everything we’ve discussed here, the impact is primarily twofold: 1) administration and 2) user experience. Content types, as well as columns, can be used as facets for search. If you have an overwhelming number of facets in results, the value facets bring is reduced. Plus, as I mentioned before, having large numbers of content types could also produce performance problems when trying to enumerate all of the types included in the search result.

From an administrative standpoint, we’re back to managing all of these content types across site collections, ensuring that the columns in those content types are mapped to managed properties (a requirement for surfacing the metadata in search results) and, if you have multiple Shared Services Providers, that this work is done across all SSPs.

I expect there will also be a usability issue for those trying to create content outside of the SharePoint interface. Wouldn’t users have to choose from the plethora of content types if they started in Word, for example?

This is another excellent point. Often, when discussing solutions within SharePoint, we think only of the web interface. When developing any solution, however, you need to keep both the Office and Windows Explorer interfaces in mind as well. Interestingly, using multiple document libraries, with a content type for each library, makes a little more sense from the end user’s perspective, since it’s similar to physical file shares and folders.

However, the same challenges that many organizations are facing related to management of file shares can manifest themselves when using the multiple library and matching content type approach as well — putting these organizations back in the same unmanageable place they started.

Great, thanks Shawn for your insights! I’ll be sure to spread the word to avoid a content type pandemic.

So there you have it folks. As a general rule, less is more. Standardize, simplify and don’t let your content types multiply needlessly. Your content contributors and SharePoint administrators will thank you.

Guest Author: Stephanie Lemieux

Stephanie has a Masters in Library and Information Studies (MLIS) from McGill University, specializing in knowledge and content management, taxonomy, and information architecture. For the past several years, she has been working on taxonomy & knowledge management contracts and research projects for a variety of clients.

15 Responses to “SharePoint Content Structure – Let a thousand content types bloom?”
  1. Xene says:

    Stephanie, I really enjoyed this article. Information architecture is a hot topic in our environment. The timing of this piece is perfect for our implementation. Thanks!

  2. What an enjoyable read. I especially liked reading the conversation and the reasoning behind why a plethora of content types can lead to significant administrative load and complications.

    A part of me wants to caution against the idea that there isn’t considerable benefit to many content types, though. Especially if it’s done in a well thought out and planned manner. It is my belief that the concerns listed can be outweighed (and often) by the benefits of more content types (within reason etc).

    Here are a couple short thoughts:

    A) Content types have a hierarchy for a reason. – This can significantly reduce the issues in managing content types if carefully planned and executed. This can really be leveraged for a lot of extra management ease, and some very interesting benefits when it comes to ‘digging through’ or exploring content.

    B) Reasonably if you are deploying content types across site collections it makes sense that a feature would be developed to accomplish this. Realistically this can all be done using XML so it’s a very simple (in terms of complexity) task. So if we assume content types are deployed across site collections in this manner it sort of removes that big bubble of concern.

    C) Realistically it’s not as though every ‘type of content’ within an organization requires this level of complexity. In fact many will not, and as the organization evolves, learns, and matures in its understanding of SharePoint users will begin using metadata more often as well as the other features of SharePoint. While I can see the concern of having hundreds or thousands of content types I would suggest: VERY few organizations could come up with hundreds of distinct types of content worth making into their own content types (architecturally speaking). So the very large number seems unreasonable as a serious concern.

    D) The reason it’s useful to have the initial content type structures (based on your initial evaluations as a consultant with the client) implemented as a default is that it also makes it much clearer when a piece of content is ‘not in the right place’. This can significantly help getting people used to putting their content where it belongs. Which from a SECURITY standpoint becomes absolutely critical. A simple way of handling security (the simplest) is making sure people put things where they should be and where permissions can be automatically assigned and inherited correctly.

    So my argument here is that it would actually help users in knowing where to put content since it has an impact on search and the way content rolls up in various places.

    E) The power of content types, and their respective limitations, has changed in SharePoint 2010. I would suggest that this alone (if we think of where things are going) is reason enough to carefully consider that sometimes more content types (if well thought out and structured) lead to better content management, discoverability, and eventually a more structured Intranet.

    (Note: There are definitely technical complexities with managing content types across multiple site collections, and I am not discounting these. Whoever is the taxonomy architect, or the individual planning for content, needs to fully understand the limitations and gotchas of SharePoint content types before trying to use many across many site collections.)

    I admit it’s definitely something people can do way too much of (especially if they don’t fully understand those limitations) but I also honestly believe there are many scenarios where more than 15 content types is an intelligent and effective decision based on some of the reasons I briefly outlined above and your own references to their advantages.

    Would love to hear more though, these were just my first thoughts for the minute or two I could spare at lunch today. :)
    Richard Harbridge

  3. Brian Bedard says:

    This argument may be different for a team collaboration site. We use a content type for each list we want to provision. Sometimes these lists become more specialized and we then derive from the content type and create a more specialized one. So we use content types to stamp our lists. We try to eliminate the fields collection in a list schema and connect directly to the site content type and control all list fields from site columns. We’ve only on one occasion had multiple content types declared for a single list schema. We do this because our solution is a provisioning system that generates a team site with a number of lists upon request. We solved the ALM crisis by patching the content types and site columns and pushing the changes down to all the lists in all the provisioned sites. We have a lot of content types but no InfoPath nor publishing layout pages.

  4. Ruven Gotz says:

    This article really made me think, thanks for posting it Stephanie.

    After reading it through a couple of times, I think that there is a little bit of “extremism” that may send our thinking in extreme directions.

    The idea of one-and-only-one unique content type per library (and only one library per content type) is an extreme example that I have not come across in real-life projects (but it obviously has happened). In fact, one of the tricky parts of my job is working with stakeholders to define what the libraries are, what content types go into each library and what folders (yes, you heard me, folders) may be useful within those libraries. The trick is to find a balance that best serves the users (both content consumers as well as creators).

    For example, my customer does product testing at their internal lab and also using one of four outside labs. They are looking for a place to store the resulting lab reports. In this case, we settled on a single document library with two content types: “Internal Lab Report” and “External Lab Report”. When uploading a document, the user picks either internal or external. When choosing “external”, the user is required to select which of the four external labs was used (a piece of data not asked for when an “internal” lab report is uploaded). For either type of lab report, they are also required to choose the value for the result of the test (Pass or Fail). Now, we could add content types called “External Lab Report – Passed” and “External Lab Report – Failed”, which would reduce the requirement for the user to enter that piece of metadata, but I do not believe that this would add value here.

    As Richard points out above, the even better solution would be to make “Internal Lab Report” and “External Lab Report” sub-types of “Lab Report” as that would allow for more flexible searching and simplifies metadata management.
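
As a rough illustration of the sub-type idea, here is a minimal Python sketch. It is not SharePoint code: the `ContentType` class is a hypothetical stand-in, and the column names "Test Result" and "External Lab" are assumptions made up for this example. Only the three content type names come from the scenario above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ContentType:
    """Simplified stand-in for a content type: a name, a parent, and its own columns."""
    name: str
    parent: Optional["ContentType"] = None
    own_columns: tuple = ()

    def columns(self) -> list:
        """Columns inherited from ancestors first, then this type's own columns."""
        inherited = self.parent.columns() if self.parent else []
        return inherited + list(self.own_columns)

    def is_a(self, other: "ContentType") -> bool:
        """True if this type is `other` itself or descends from it."""
        node = self
        while node is not None:
            if node is other:
                return True
            node = node.parent
        return False


# Hypothetical column names; the type names follow the lab report example.
lab_report = ContentType("Lab Report", own_columns=("Test Result",))
internal = ContentType("Internal Lab Report", parent=lab_report)
external = ContentType(
    "External Lab Report", parent=lab_report, own_columns=("External Lab",)
)

print(internal.columns())  # ['Test Result']
print(external.columns())  # ['Test Result', 'External Lab']

# One query against the parent finds both sub-types.
print([ct.name for ct in (internal, external) if ct.is_a(lab_report)])
```

The shared column is defined once on the parent, so a change to it reaches both sub-types, and a search for "Lab Report" can pick up either kind without listing them individually.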

    Whether you structure your content with folders/sub-folders, sites/sub-sites/libraries, content types or combinations of all of these, you can create different architectures that are logically equivalent. However, some will be much easier to manage and use than others. The goal is to find the optimal mix.

    The issue of content type management across site collections is real, but there are tools that can help with this (and it goes away in SharePoint 2010).

    Every project is an ‘it depends’ scenario, but on a recent project, which was a small to medium size corporate portal, we had about five main sites (not including the team/project collaboration sites), 20 document libraries/lists, and about 20 ‘top-level’ content types. If you include content sub-types the total number of content types comes out to about forty. I’d be interested in hearing some statistics and more comments from other readers.

    -Ruven

    • I really liked how specific these examples were. Maybe what we need more of is real scenarios clearly described to help users, managers, and administrators understand how they should design or work with their taxonomies.

      I too am interested in hearing more statistics and comments from other readers. :)

  5. Andrew Burns says:

    It’s always an interesting question, as the answer isn’t black or white – it’s a shade of gray.

    Having loads of content types so that your users don’t have to fill in any fields is silly. You’ll end up overwhelmed with Content Types. It’s also not unreasonable to expect users to fill in field data. Yes, there is a training issue there, and yes, you might be able to do things to make that experience better – but sadly it’s difficult to magically ‘know’ what their intention for that document is. And yes, I’ve seen this in customers’ systems.

    Having very few content types has the opposite problem. You still need to capture certain metadata, so you end up with a small number of content types containing a large number of (often irrelevant) fields. I’ve seen this happen, too – content types with an ‘Edit Properties’ page 3 or 4 screens tall. This is actually worse – users simply avoid filling in the data, even the bits that are relevant. And yes, I’ve seen this in customers’ systems too.

    I find that I totally agree with Richard above, that the inheritance hierarchy of Content Types is a useful way of finding this middle ground.

    It is worth noting that whatever you do, each List or Library will have its own Content Types – what you see used on Lists and Libraries are actually children of the Content Types defined at the site level. That’s why if you add a column to a Library which has ‘Advanced Management of Content Types’ enabled, you can ‘Update all content types on this list’. See http://www.novolocus.com/2008/03/28/content-types-whos-your-daddy/ and http://www.novolocus.com/2008/03/27/what-happens-to-content-types-when-you-add-a-column-to-a-list-in-sharepoint/ for more info.
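
One way to picture this parent/child relationship is through content type IDs. The Python sketch below is illustrative only and does not call any SharePoint API. It assumes the common convention that a child ID is the parent's ID followed by "00" and a GUID, with 0x0101 being the built-in Document type; the helper names `new_child_id` and `is_descendant` are made up for this example.

```python
import uuid

DOCUMENT_ID = "0x0101"  # ID of the built-in Document content type


def new_child_id(parent_id: str) -> str:
    """Derive a child ID by appending '00' and a 32-digit hex GUID to the parent ID."""
    return parent_id + "00" + uuid.uuid4().hex.upper()


def is_descendant(candidate_id: str, ancestor_id: str) -> bool:
    """A type descends from another if its ID strictly extends the ancestor's ID."""
    candidate, ancestor = candidate_id.lower(), ancestor_id.lower()
    return candidate != ancestor and candidate.startswith(ancestor)


# A site-level type derived from Document, and the child copy that a
# library would hold once that site-level type is added to it.
site_type_id = new_child_id(DOCUMENT_ID)
list_type_id = new_child_id(site_type_id)

print(is_descendant(list_type_id, site_type_id))  # True
print(is_descendant(list_type_id, DOCUMENT_ID))   # True
print(is_descendant(site_type_id, list_type_id))  # False
```

Under that assumption, the library's copy can always be traced back to its site-level parent by its ID prefix, which is what makes pushing a change down from the parent to its children possible.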

    I’m not that convinced about the ‘Search’ arguments – yes, faceted search would suffer with lots of facets, but little else would. In fact, (in SP2007 at least) Search is unaware of Content Type as anything other than another crawled field. And the problem with mapping crawled columns to managed properties isn’t one of Content Types but rather a problem of Site Columns. Often more Content Types means more Site Columns, but not always. We’d a customer who needed a lot of Document Content Types (they *had* to have lots of different templates), but they all used a fairly small set of Site Columns. Search admin wasn’t an issue.

    Oh, and I’ve not built an entire Intranet – but some of the applications we’ve built in SharePoint use up to 120 Content Types. We had to build ways of helping manage that. But that was exceptional. Normally, solutions I build have up to around 10.

    • Much better stated than what I meant to say around search especially. Thanks!

      The search point I think is a really crucial one.

      If we let a single solution’s feature (in the case of SP2007, an unsupported CodePlex one) influence how we structure (or don’t structure) certain content, it can lead to considerable difficulty down the road.

      (P.S. – Just to put this faceted search point to bed: Faceted Search is awesome, but it is also a CodePlex project, which means it’s not too hard to modify/improve, removing that potential issue.)

      There is a considerable discussion (and progress being made) around Automatic tagging and application of metadata using Search. (Since search can crawl a document’s content it can also infer many interesting things about the content.) Something to think about anyways.

      • Richard – I enjoyed reading your posts as always. I think the idea of easily customizing the Faceted Search is relative to the skill set and commitments of the person/people that have to do it. I have no doubt that the Faceted Search feature is often implemented in an environment with a limited SharePoint developer skill set. Having said that I love the tool!

        Regarding automatic tagging, I use the Calais Tagaroo plugin on my blog that is semantically driven and it’s quite incredible and gives me hope for automatic tagging in the not too distant future.

        I tend to look at Content Types from a simplistic point of view. How can I search the information, how do I need to view it and how easy is it to propagate the inevitable changes?

        This was an extremely informative, well-written article.

  6. Jason Lochan says:

    Wow, my eye twitched at the same time as yours.

  7. Bjørn says:

    Personally, I think the name holds the answer to this question quite clearly. It is content type, in other words, a type of content. Not a type of field, a type of list, or anything like that. It is a type of content.

    Having a content type for each field of metadata makes no sense, nor does having a content type per list or library, unless that list or library contains only one type of content (there’s that reversal of the two words again – go figure). If your list or library contains more types of content (amazing how often this appears) you have more content types.

    You should have one content type per type of content. For immature installations and for file share to SharePoint kind of installations, most likely the type of content will be document and that’s it. For more mature solutions, both the business and the users will have clearer understandings of which types of content they want, and you’d create your content types according to that.

    At its core, however, it’s as simple as reading the two words: Content Type.

    .b

  8. Amy says:

    Stephanie, I really enjoyed this article. Information architecture is a hot topic in our environment. The timing of this piece is perfect for our implementation. Thanks!

  9. Mike says:

    Great article. Thanks to everyone who posted.

    Six months ago, SharePoint was deployed in the company where I work. In one of our internal sites, I have a document library with 20 content types, one for each letter type. These letters are very similar (termination of service templates). Most have the same site columns; others have 1-2 additional site columns.

    What’s the best approach for this scenario?

    1 content type called ‘Termination of Service’, with all the possible site columns
    20 content types. One for each letter even if the site columns are identical (or nearly identical)

    thanks for any insight!
    Mike

    I’m a SharePoint user getting ready to establish content types, and I’d like to know if someone can provide some advice.

    I have a library that consists of 20 letters.

  10. Michael A. "Lucky" LaChance says:

    Great article. I enjoyed reading the debate in conversational style. It also helped me as I struggle with a contract library concept tied to CRM where I am not only managing SP metadata, Content types but also relationships and entities in CRM. Keep the Content types for contracts to a minimum! I agree. The devil is always in the details though :-)

    Thanks for a great explanation of the impacts of faulty content type logic.

  11. Ruth says:

    I’ve been round in circles over the best way to use Content Types in our organisation. The concept of one content type per business content works in theory until you start introducing templates. A content type can only be associated with a single template, and the hierarchy is a limiting factor because you need “parent” content types to support the content types with associated templates.
    Has anyone got any best practice on this sort of scenario?

    Thanks in advance

  12. Ruth – This sounds like the foundation for a good article. Could you please state your question, include a full description of what you are trying to accomplish and then email it to me. I will look at it with an eye towards publishing it to get community input. Thanks in advance. — Mark

