Friday, January 15, 2010

How to Use SharePoint Metadata to Improve Search and Control Content – Part 3: Classifying SharePoint Content to Improve Search and Control Content

Guest Author: Mark Klinchin
MetaVis Technologies

Document and item findability (and most workflows) in SharePoint rely on the actual metadata values associated with the content.  Efficient metadata models improve search and navigation and enable generic workflows to be applied to content.  These benefits, however, will not be realized unless the metadata is reliably populated for all content.  A metadata model architect (information architect) needs to strike a fine balance between too little metadata, which leaves the content hard to find, and excessive metadata, which places too much burden on users working with the content.

To encourage accurate and complete entry of metadata, architects should simplify data capture for the user.  For example, field values could be populated from a pre-defined vocabulary.  Both the data entry control and the vocabulary could be configured to simplify the task.  For instance, you could display a hierarchical tree of terms for the user to select from.  Field value selections could also be grouped in cascading relationships, so that selecting a value in one field limits the list of available values in another.
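The cascading relationship described above can be sketched in a few lines of Python. This is only an illustration of the idea, not SharePoint code: the field names "Region" and "Country", the vocabulary contents, and the helper names are all assumptions made for the example.

```python
# Hypothetical controlled vocabulary: each value of the parent field
# ("Region") lists the values allowed in the dependent field ("Country").
CASCADE = {
    "Europe": ["France", "Germany"],
    "North America": ["Canada", "United States"],
}


def allowed_values(parent_value):
    """Return the choices the dependent field should offer for this parent value."""
    return CASCADE.get(parent_value, [])


def is_valid(region, country):
    """Check that a (Region, Country) pair respects the cascading relationship."""
    return country in allowed_values(region)
```

A data entry form built on such a mapping would repopulate the second drop-down whenever the first one changes, so users never see terms that cannot apply.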

Authors are often unenthusiastic about populating metadata (despite having potentially written a large, time-consuming document).  Nevertheless, the author is precisely the right person to do this task, since he/she usually has the most knowledge about the document. 

The methods listed above do not address the issue of entering metadata values for the many items already in the SharePoint environment that have no associated metadata, or for files on a file store that are to be imported into a library (e.g. faxes, scans, etc.).

Figure 15 shows a simple lookup drop-down box that helps users select a value for the Information Category field.


Figure 15

Figure 16 shows a drop-down box with a Countries taxonomy that helps the user select the right value for the Country field.  Taxonomy is a new feature of SharePoint 2010 that allows the selection of a value from complex hierarchical lists of terms.


Figure 16

Several approaches are available for assigning metadata to a large number of items that have little or no existing metadata.  One of these is to analyze the text of the document itself and then use an algorithm to extract and assign usable metadata values.  These algorithms may be simple and rely on conventions that authors can adopt, or they may be based on the complex logic of a specialized text analytics engine.

An example of a simple algorithm is to retrieve the ‘Title’ field from a cell in a spreadsheet.  Text analytics strategies may be very complex and based on dictionaries that require separate maintenance, as well as time to learn data patterns and relationships.  In some cases they produce reliable results with little human intervention.  However, considering the wide variety of content that can appear in documents, and that many types of content (e.g. pictures, drawings, engineering models) have no text at all, successes in text analytics are rare.
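The simple, convention-based case can be sketched as follows. The sketch assumes the spreadsheet has been exported to CSV and that authors follow a hypothetical convention of putting the document title in the second cell of the first row; the function name and the cell position are illustrative assumptions, not part of any SharePoint API.

```python
import csv
import io


def extract_title(csv_text, row=0, col=1):
    """Return the text of the agreed 'Title' cell, or None if it is missing or empty."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    try:
        value = rows[row][col].strip()
    except IndexError:
        # The sheet does not follow the convention, so no title can be extracted.
        return None
    return value or None


# Sample sheet that follows the assumed convention.
sample = "Title,Quarterly Sales Report\nAuthor,J. Smith\n"
title = extract_title(sample)
```

The extraction only works for documents whose authors actually follow the convention, which is why such algorithms stay simple but cover a limited share of the content.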

Mass document tagging could be seen as a reasonable balance between direct metadata entry and text analytics.  Mass tagging involves filtering a specific set of documents based on criteria that include existing metadata and then updating the metadata values for all of these items en masse.  Mass tagging can be done by typing a new value for a field, selecting a value from a lookup list, or mapping one field to another so that the current value of the source field is copied into the target.  The site, list and folder locations of a SharePoint item are also part of its metadata set and could be changed during a mass tagging exercise.  This provides a powerful way to relocate content based on its metadata.
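The filter-then-update pattern behind mass tagging can be sketched in Python with items modelled as plain dictionaries. This is a minimal illustration under stated assumptions, not the behavior of any particular tool: the function `mass_tag`, its parameters, and the sample field values are all hypothetical.

```python
def mass_tag(items, matches, new_values=None, field_map=None):
    """Update every item accepted by `matches`; return how many items were changed."""
    changed = 0
    for item in items:
        if not matches(item):
            continue
        # Field mapping: copy the current value of each source field into its target.
        for source, target in (field_map or {}).items():
            if source in item:
                item[target] = item[source]
        # Typed or looked-up values: assign the same value to every selected item.
        item.update(new_values or {})
        changed += 1
    return changed


docs = [
    {"Name": "fax-001.tif", "Author": "scanner", "Folder": "Inbox"},
    {"Name": "plan.docx", "Author": "mlee", "Folder": "Projects"},
]
count = mass_tag(
    docs,
    matches=lambda d: d["Author"] == "scanner",
    new_values={"Information Category": "Fax", "Folder": "Faxes"},
    field_map={"Name": "Title"},
)
```

Because the folder is treated as just another field, the same call that classifies the scanned fax also relocates it, which mirrors the point that location is part of an item's metadata set.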

Figure 17 shows a third-party mass classification tool that can change the content type and metadata values for many selected items and documents at the same time.  Such tools allow organizations to gradually evolve their SharePoint information architecture, keeping pace with increasing user adoption.


Figure 17

Mass tagging, which can also be referred to as metadata enhancement, relies on the pre-existence of some metadata.  This can be as simple as the name, author and creation date that are set automatically when the item is created; it can be the result of manual metadata assignment by authors; or it can come from an automated process such as text analytics.  Mass tagging can be applied many times to different document sets, so that different aspects of metadata can be applied to documents selected by different filters.
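Repeated passes with different filters can be sketched as an ordered list of rules, each pairing a filter on existing metadata with the values to assign. The rule contents, field names such as "Retention", and the function `apply_rules` are assumptions made for illustration only.

```python
from datetime import date

# Each hypothetical rule: (filter on existing metadata, values to assign).
RULES = [
    (lambda d: d["Name"].lower().endswith(".tif"), {"Information Category": "Scan"}),
    (lambda d: d["Created"] < date(2009, 1, 1), {"Retention": "Archive"}),
]


def apply_rules(items, rules):
    """Run one tagging pass per rule; an item may be enriched by several passes."""
    for matches, values in rules:
        for item in items:
            if matches(item):
                item.update(values)
    return items


library = apply_rules(
    [
        {"Name": "invoice.TIF", "Created": date(2008, 5, 2)},
        {"Name": "memo.docx", "Created": date(2009, 7, 1)},
    ],
    RULES,
)
```

Here both filters rely only on automatically created metadata (file name and creation date), and each pass adds a different aspect of classification to the items it selects.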

Conclusion

The need for standardized content search and workflow dictates that metadata structures should be standardized and consistent across the SharePoint environment in an organization.  Different evolutionary approaches can be taken to design and maintain these structures.  Coupled with multiple mechanisms for entering and updating metadata values for your SharePoint content, these structures will result in an effective, consistent and reliable search experience and an efficient automation of business processes through workflows.

About MetaVis Technologies:

MetaVis provides software solutions to help organize SharePoint environments for improved search, findability and e-discovery. MetaVis takes the complexity out of designing, deploying and managing content within SharePoint by offering reusable taxonomies, metadata management and migration software and services. The benefit is an organized SharePoint environment that is easily understood and well documented.

The company believes that taxonomy management within SharePoint should not be complicated to implement and use. MetaVis products are based on intuitive, graphical interfaces that are easy to use and easy to install. Drag-and-drop features allow information architects to design SharePoint metadata models and reuse them, saving valuable time and resources. As a result, MetaVis products improve search optimization, consistency, content migration, and workflows across corporate SharePoint sites.

You can follow MetaVis on Twitter @metavistech

Mark Klinchin directs the technological vision and product development for MetaVis Technologies.  Mark joined the company with 15 years of experience as a software product architect. As CTO of MetaVis he has led the development of MetaVis Architect Suite to take the complexity out of designing, deploying and managing content within Microsoft SharePoint 2003, 2007 and 2010 by offering reusable taxonomies, metadata management and migration software and services. You can follow him on Twitter @mklinchin.  You can download a trial of MetaVis Architect to see your metadata model at www.metavistech.com.

 

2 Responses to “How to Use SharePoint Metadata to Improve Search and Control Content – Part 3: Classifying SharePoint Content to Improve Search and Control Content”
  1. Gigi Tarasow says:

    Do you have any information on what companies (oil and gas in particular) are using SharePoint and/or other collaboration/document management tools?
