SharePoint 2010 – What You Need to Know About Taxonomy, Metadata & Information Architecture
Guest Author: Jeff Carr
As a follow-up to last week’s guest post on AIIM’s Digital Landfill blog, and with the official launch of SharePoint 2010 only weeks away (May 12 at 11 a.m. EST), I thought I’d put together a series of posts that dig into further detail on each of the 8 things you need to know about taxonomy, metadata and information architecture in SharePoint 2010. Topics in the series will include:
- Using Taxonomy and Controlled Vocabularies for Content Enrichment
- Using Social Features for Personal Classification and Improved Findability
- Using Taxonomy and Metadata to Improve Navigation and Browsing
- Using Taxonomy and Metadata to Improve Search and Discovery
- Sharing Content Types Across Site Collections
- Using Retention Stages to Manage the Information Lifecycle
- Administering Taxonomy Using Term Store Management
- Importing Taxonomy Using the Managed Metadata Import File
Before jumping in, however, I’d like to take a step back and look at the problem of taxonomy and information architecture in general from a more strategic perspective. As digital content within our organizations continues to grow at almost unmanageable rates, providing users with intuitive access to the right information at the right time is rapidly becoming a significant challenge. Left unchecked, this challenge is sure to translate into significant costs in lost productivity, as time is spent looking for relevant information.
Oftentimes, our IT departments attempt to solve the problem by procuring new technologies that promise to improve findability, only to find that the more technology we throw at the problem, the more complex it becomes and the further behind we fall.
Successful information management requires strategic initiatives that lie outside the realm of enterprise systems and focus on understanding and developing a consistent set of organizing principles to be applied across all technologies. Understanding the intricacies of knowledge domains enables enterprises to fully leverage technical capability. When this is not done, chances are that new systems and tools will not fully meet the needs of the business.
SharePoint is no different. Organizations use SharePoint for a variety of purposes, from intranets, extranets and customer portals to document management and team collaboration. There’s been significant excitement about new product functionality introduced as part of the SharePoint 2010 platform for taxonomy implementation and management across sites and site collections. SharePoint 2007 and its predecessors have had their challenges with the implementation and management of taxonomy, including:
- A lack of cross site collection synchronization of content types, metadata and vocabularies;
- An inability to create and manage taxonomic relationships between terms;
- No concept of hierarchical metadata, resulting in programmatic customizations for tagging; and
- An inability to easily surface and leverage metadata through search and navigation.
Although SharePoint 2010 has taken a number of strides toward solving some of these problems, our estimation is that many of the same information management challenges will persist, primarily because SharePoint itself is not intended to be an enterprise taxonomy management tool.
To get to a point where information assets are fully exploited and working to meet the needs of the organization, time and effort must be spent building an appropriate foundation for the information ecosystem – through design, development and application of foundational information architectures and enterprise taxonomy. A well planned and intelligently constructed foundation is the basis for successful information applications and high quality user experiences.
Fundamental Principles of Enterprise Taxonomy
How taxonomy is applied to a body of knowledge depends on the technologies used within a domain. Different systems leverage taxonomy in different ways, and taxonomy management in the typical information environment is fragmented and inconsistent, with each application using a separate instance of an oftentimes similar vocabulary.
True enterprise taxonomy is intended to be centrally managed and pushed out for consumption by our enterprise systems. SharePoint is but one of many systems required to consume taxonomy in an effort to provide a better user experience, and is rarely the only such system in use within an organization. Even though it is often the centralized access point to enterprise information, the need to establish common vocabularies across systems (or at the very least, mappings of similar vocabularies) is still an important organizational requirement.
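To make the idea of mapping similar vocabularies across systems a little more concrete, here is a minimal sketch in Python. It is not a SharePoint API or any vendor's feature. The system names, local labels, enterprise terms and the `to_enterprise_term` helper are all hypothetical, chosen only to illustrate how local labels in separate systems might resolve to one centrally managed term.

```python
# Hypothetical centrally managed vocabulary (the preferred enterprise terms).
ENTERPRISE_TERMS = {"Human Resources", "Finance", "Information Technology"}

# Hypothetical per-system mappings from local labels to enterprise terms.
SYSTEM_MAPPINGS = {
    "sharepoint": {"HR": "Human Resources", "IT": "Information Technology"},
    "crm": {"Personnel": "Human Resources", "Accounting": "Finance"},
}


def to_enterprise_term(system, local_label):
    """Return the enterprise term for a system's local label, or None if unmapped."""
    # A label that is already a preferred enterprise term needs no mapping.
    if local_label in ENTERPRISE_TERMS:
        return local_label
    # Otherwise look the label up in that system's mapping table.
    return SYSTEM_MAPPINGS.get(system, {}).get(local_label)


# Two systems using different local labels resolve to the same enterprise term.
same_concept = (
    to_enterprise_term("sharepoint", "HR") == to_enterprise_term("crm", "Personnel")
)
```

In this sketch an unmapped label returns `None` rather than raising an error, so that gaps in the mappings can be collected and reviewed by whoever maintains the central vocabulary.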
Only after we have designed and constructed a solid foundation with respect to the organizing principles of our information can we consider how it is to be managed, implemented and consumed by the technologies we employ. As we work our way through this series, please keep these fundamental principles of taxonomy in mind, as they are a key element in strategic information management.
Understanding Core Taxonomic Concepts in SharePoint 2010
The new taxonomy functionality in SharePoint 2010 brings with it a whole new set of terminology that needs to be defined before we proceed with our discussion. The core concepts are (via MSDN Library):
- Managed Metadata - A hierarchical collection of predefined centrally managed terms that are applied by publishers as metadata attributes for content items.
- Term Store - A database that is used to house both Managed Terms and Managed Keywords.
- Managed Term - A predefined word or phrase created and managed by a user with appropriate permissions and often organized into a hierarchy (controlled vocabularies, taxonomic in nature).
- Managed Keyword - A non-hierarchical word or phrase that has been added to the keyword set directly by a system user (uncontrolled vocabularies, folksonomic in nature).
- Group - From a taxonomy perspective, a group is a flat list or hierarchical collection of related attributes made up of one or more Term Sets.
- Term Set - A flat list or hierarchical collection of related Terms that belong to a Group.
- Term - A word or phrase that can be applied by publishers and system users as metadata to content.
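The containment relationships in the definitions above (a Term Store holds Groups, a Group holds Term Sets, a Term Set holds Terms, and Managed Keywords sit in a flat set) can be sketched as a small data model. This is only an illustration in Python under stated assumptions. The class names mirror the terminology, but they are not the SharePoint object model, and the sample group, term set, terms and keyword are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Set, Tuple


@dataclass
class Term:
    """A word or phrase; child terms make the vocabulary hierarchical."""
    name: str
    children: List["Term"] = field(default_factory=list)

    def add(self, name: str) -> "Term":
        """Create a child term, attach it, and return it."""
        child = Term(name)
        self.children.append(child)
        return child


@dataclass
class TermSet:
    """A flat list or hierarchy of related terms."""
    name: str
    terms: List[Term] = field(default_factory=list)


@dataclass
class Group:
    """A collection of one or more term sets."""
    name: str
    term_sets: List[TermSet] = field(default_factory=list)


@dataclass
class TermStore:
    """Houses managed terms (via groups) and flat managed keywords."""
    groups: List[Group] = field(default_factory=list)
    keywords: Set[str] = field(default_factory=set)


def paths(term: Term, prefix: Tuple[str, ...] = ()) -> Iterator[Tuple[str, ...]]:
    """Yield the full path of a term and each of its descendants, depth first."""
    here = prefix + (term.name,)
    yield here
    for child in term.children:
        yield from paths(child, here)


# Hypothetical sample data: one group, one term set, a two-level hierarchy.
north_america = Term("North America")
north_america.add("Canada")
north_america.add("United States")
regions = TermSet("Regions", [north_america])
store = TermStore(groups=[Group("Geography", [regions])])
store.keywords.add("budget-2010")  # a folksonomic keyword has no hierarchy
```

Calling `list(paths(north_america))` walks the hierarchy and returns one tuple per term, which is a convenient shape for showing the full path of a managed term next to a flat keyword.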
Armed with an understanding of this new terminology we can now move on to the enrichment of content through the application of taxonomy in SharePoint 2010.
Jeff Carr is an Information Architect and Search Consultant with Earley & Associates specializing in user centered information design. Working with SharePoint since 2003, he has been involved in the design, development and integration of web-based solutions from intranets and extranets to public facing websites for a variety of large enterprises across a wide range of industries.