Thursday, February 19, 2009

Create SharePoint Document Taxonomies with MindManager – Part 1

Introduction

Over the past four weeks I have introduced you to the concept of Mind Mapping for SharePoint and how to use MindManager to build and document SharePoint projects including: Navigation; Project prioritization; and Early-stage brainstorming.

Next week I will show you how I use MindManager for one of the most difficult aspects of a SharePoint project: Building out the taxonomy of Document Libraries, Content Types and Metadata that will be used by a site.

In this week’s installment, you won’t see any Mind Mapping as I describe my approach to explaining the meaning and value of metadata to stakeholders (Site owners, content owners and contributors).

Metadata: What a concept

You are going to ask your users/stakeholders to tell you about their metadata, but they don’t know what that is. Explaining the concept clearly is difficult, and even if you do a great job, most people won’t really get it right away. Don’t let that stop you. Think of this as a process that starts at the edge of a target. Together with your stakeholders, you will spiral in towards the bull’s eye.

[Figure: The Metadata Target]

I have found that, quite often, one of the key drivers behind a SharePoint implementation is to clean up the chaotic “S-Drive”. What the client often wants to do is create a new and better folder structure in SharePoint. If this happens to you, DON’T LET THEM DO IT! A deeply nested folder system in SharePoint is even worse than one in Windows because it is slower and harder to navigate.

The Metadata Workshop

At the Metadata Workshop, I explain that metadata is “data about data”. In the case of documents, it means information about a document that will help you to identify it later without having to open it. People use metadata every day without realizing it.

I start the workshop by reviewing something that everyone already understands (and probably hates), the folder based file system on the “S-Drive”, or whatever the shared network drive is called in the customer organization.

I explain that because the Windows file system gives you very little information beyond the file’s name, size and last-modified date, people use the filename itself to capture metadata like date, customer, document type and version. You will often see files with names like “IBM-Proposal-May-2008-Ver2aFinalFinal.doc”. All of that information crammed into the filename is really metadata.
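To make the point concrete, here is a minimal Python sketch that pulls the hidden metadata back out of such a filename. It assumes the name follows a strict hyphen-delimited pattern (customer, type, month, year, version); the function name and the field names are hypothetical, and real-world filenames are rarely this consistent.

```python
from pathlib import PurePath


def parse_filename(filename: str) -> dict:
    """Split a hyphen-delimited filename into assumed metadata fields."""
    stem = PurePath(filename).stem  # drop the ".doc" extension
    # Assumed order: customer, document type, month, year, version.
    customer, doc_type, month, year, version = stem.split("-", 4)
    return {
        "Customer": customer,
        "Type": doc_type,
        "Month": month,
        "Year": int(year),
        "Version": version,
    }


metadata = parse_filename("IBM-Proposal-May-2008-Ver2aFinalFinal.doc")
print(metadata)
```

Every value that the parser recovers is a candidate for a proper metadata column, which is where this discussion is heading.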

The use of structured folder names is another type of metadata that people use every day. If you have a file structure that looks like the sample below, then you are using metadata to help you identify your files.

[Figure: Sample Folder Hierarchy]

The metadata is telling you that “file1.doc” is a Sales Document for the Year 2007; Western Region; Industrial Division for a Customer named BBN Inc. That is all useful metadata, but here comes some trouble.
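The same idea can be sketched in a few lines of Python: each folder level is really a metadata column. The level order below (Department, Year, Region, Division, Customer) and the example path are assumptions made for illustration; they are not taken from the figure.

```python
from pathlib import PurePosixPath

# Assumed meaning of each folder level; a real hierarchy may differ.
LEVELS = ["Department", "Year", "Region", "Division", "Customer"]


def path_to_metadata(path: str) -> dict:
    """Turn each folder in a path into a named metadata value."""
    *folders, filename = PurePosixPath(path).parts
    if len(folders) != len(LEVELS):
        # A file saved at the wrong depth cannot be classified reliably.
        raise ValueError(f"Expected {len(LEVELS)} folder levels, got {len(folders)}")
    record = dict(zip(LEVELS, folders))
    record["Name"] = filename
    return record


record = path_to_metadata("Sales/2007/West/Industrial/BBN Inc/file1.doc")
print(record)
```

Note that the mapping only works when every file sits at exactly the expected depth. A file saved one or two levels too high raises an error here, and on a real shared drive it simply loses part of its metadata.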

Folder Pain

In workshops you may find that the stakeholders will tell you that they have a well-structured and efficient folder hierarchy and that everyone knows where to find the documents they are looking for. If you ask them what happens when someone new joins the team, they’ll tell you “Oh, that’s a mess at first; they put all sorts of stuff in the wrong places”.

Here’s a typical story: The intern we hired last summer dug down the hierarchy and could not find where to save files for CBS Inc. He knew that CBS is a national client (he didn’t realize that we divide all clients regionally by delivery location).

So the intern, taking some initiative, decided to create the CBS folder directly under the year. By the end of the summer, there were too many files in there and they were a combination of East, West, Industrial and Consumer files. It was going to take too much effort to fix everything, so we just left it. From now on, we have to remember that we have to look in multiple, non-intuitive, locations for sales documents.


[Figure: Problems with the Hierarchy]

An Alternative to Folders

Everyone is pretty much aware of the problem with folders. The solution is to identify the relevant metadata fields ahead of time and then assign the metadata values to the document at the time that it is saved. The documents can then all be saved in the same place. This means no more guessing about which folder to use (and how many layers down you need to dig).

In the following diagram, each column represents a piece of metadata:


[Figure: Files with metadata]

You can see that we have defined metadata fields for: Customer (Customer Name); Type (Invoice, Proposal, Contract); Division (Consumer, Industrial); Year; and Region (East, West). Every time we save a new file, we are prompted by the system to enter values for each of these metadata fields.

By entering this data at the time the file is saved, we will be able to find, sort and filter our documents much more easily. You may notice that there’s a catch here: A bit of extra work has to be done every time you create a new file. For this reason, it is critical to keep the number of metadata fields to a minimum; otherwise people will find ways to get around the system (e.g. entering fake data). Research has shown that about six metadata fields is about the maximum that you can ask people to fill-out on a regular basis.

A great advantage of using metadata rather than nested folders is the use of views for filtering and ordering your documents. For example, if your boss wants you to pull all 2008 contracts from the industrial division in the East, you no longer need to hunt up and down the levels of folders; you just create a view that gives you only those files.


[Figure: Files with metadata and a view showing a subset of the documents]
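For readers who think in code, a view behaves roughly like a saved filter over a flat list of records. The Python sketch below is only an analogy, not how SharePoint implements views; the sample documents and the `view` helper are made up for illustration.

```python
# Hypothetical documents stored in one flat library, each with metadata.
documents = [
    {"Name": "file1.doc", "Customer": "BBN Inc", "Type": "Contract",
     "Division": "Industrial", "Year": 2008, "Region": "East"},
    {"Name": "file2.doc", "Customer": "CBS Inc", "Type": "Proposal",
     "Division": "Consumer", "Year": 2008, "Region": "West"},
    {"Name": "file3.doc", "Customer": "BBN Inc", "Type": "Contract",
     "Division": "Industrial", "Year": 2007, "Region": "East"},
]


def view(docs, **criteria):
    """Return only the documents whose metadata matches every criterion."""
    return [
        doc for doc in docs
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


# "All 2008 contracts from the Industrial division in the East."
result = view(documents, Type="Contract", Division="Industrial",
              Year=2008, Region="East")
print([doc["Name"] for doc in result])
```

The documents never move. Changing the criteria gives a different slice of the same library, which is something a folder hierarchy cannot do.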

At this point in the workshop, if I’ve done a good job, some of the participants will “get it”, but most will be flirting with the edge of the yellow ring of the target. They still don’t get it completely, but they’re starting to understand that there may be a way to organize documents other than folders.

NOTE: This is also where I explain SharePoint versioning and Check-in/out. But those are topics for a different article.

Workshop Homework

At the end of the workshop, I give the attendees a spreadsheet to use so that they can catalogue the metadata that they will need for storing documents in their own SharePoint sites. The spreadsheet has a couple of relevant examples to help guide them through the process.

I don’t worry about telling them to limit the metadata fields to six. I want them to get as much information as possible. Later, we will work together to determine what is really essential.

Here is a sample of the spreadsheet that I give the users. In this case I am showing two quite different types of documents as examples. Normally, I tailor the example to the department that I am working with.


[Figure: Sample Document Inventory Spreadsheet]

Conclusion

In this week’s column I’ve shown you how I educate stakeholders about metadata and how to get them started on a document inventory.

This is only one possible approach that has worked quite well for me, but this can be a difficult process. I would be very interested in hearing about methods that have worked for you.

In next week’s article, I will demonstrate how to use MindManager to take the results of the document inventory and build out the taxonomy of document libraries, content types and metadata.

Author: Ruven Gotz

Ruven Gotz is a senior consultant with Ideaca, a Microsoft Gold Partner based in Toronto. For the past five years he has been focused on delivering award-winning SharePoint solutions (most recently, a Microsoft Impact Award for Information Worker Solution of the Year, 2008).

Ruven’s blog is at http://spinsiders.com/ruveng and you can follow him at http://twitter.com/ruveng.


Please Join the Discussion

11 Responses to “Create SharePoint Document Taxonomies with MindManager – Part 1”
  1. Ruven, this is a great article. I have followed the same approach in educating stakeholders regarding metadata. It is one of the capabilities of SharePoint that people do not completely comprehend. Excellent article!

    Using a client’s sample directory definitely helps with communicating how it will help their organization. Depending on the site and the number of documents (as there are limits to SharePoint), I might recommend multiple document libraries, along with content query web parts to display documents from those libraries.

  2. Ruven,

    Another approach I use in workshops is to identify replicated folder structures, such as those in your example. As we look at the folder structure, we name a column after each of the folders in the hierarchy; sales, year, region, industry. Many times, it is a direct one-to-one correlation.

    By using the existing folder structure as a blueprint, we can then show immediate results for flattening out the hierarchy of large document sets by dynamic filtering using column headers and then setting up static views.

    An additional benefit is that everyone is already familiar with the names being used for each column because that is what they have been using in their folder structure.

    Nice article. I hope it gets people thinking.

  3. Ruven Gotz says:

    Kanwal, Thanks for your comment. You are correct, I have simplified the issue a bit here, talking only about one document library. In next week’s post I’ll talk a little bit about the work of building the structure of the site based on the number of document types that the user lists in the “homework” spreadsheet.

    The goal is to find a reasonable grouping of content types into multiple (but not too many) document libraries while keeping performance issues in mind.

    Mark, your approach is correct, but there can be wrinkles that make this one-to-one mapping difficult. I have seen situations where the client has forty folders and under some of those folders, there are dozens of sub-folders.

    Along with keeping the number of metadata fields from growing out of control, you have to worry about a metadata field that has a drop-down with forty items in it (this is probably unusable for the average user, resulting in irrelevant/useless metadata).

    Thanks for the feedback: I’d love to see more comments from people with their experiences about what has worked (or not worked) for them.

    -Ruven

  4. Joan says:

    Ruven, thank you. I so have the File Share 2.0 problem at the company I work for, even though I have continuously preached no, no, no… use metadata instead. I am in the process of structuring some 1-hour training sessions I volunteered to do (OMG what was I thinking) and the one on metadata gets a front-and-center link to this article.

  5. Pez says:

    Where is part 2? I’ve been waiting for weeks.

  6. Barry says:

    “Research has shown that about six metadata fields is about the maximum that you can ask people to fill-out on a regular basis.”

    I’d be interested in reading this research. Have you got a reference for that please? Thanks!

  7. Hi Ruven,

    I was reading your article and, believe me, you were just describing what is happening to us in our Institute. Using SharePoint as an online folder-based file structure is always the main problem. I was wondering if you were thinking about publishing the second part of this article, as we are sure to find it very useful.

    Regards,

  8. Bil Simser says:

    One thing to avoid falling into is the trap of using a folder structure to reverse engineer requirements. Remember that any folder structure was created (probably) organically and was created within the restrictions of what you can’t do in folders (documents in multiple locations for example). It’s a starting point for a discussion but I wouldn’t use it as a blueprint to map into SharePoint. Doing that leads you down the path of replicating one technology for another. There needs to be some thought and discussion behind what makes sense to the business unit and the documents along with where they’re going. It’s far easier to fix metadata than folders so it’s great you get it into that form but be wary of using folders as your blueprint. It can be used as a roadmap or reference but think twice about using it to build the foundation. That foundation should have some pseudo-science behind it and that requires some thought around what the user needs out of the taxonomy. What was valid in folder land 2 years ago might not be valid in metadata land today or going forward.

  9. Pat Kennedy says:

    Bil, I agree with you, but at this point I think Ruven is just getting started gathering the requirements. Part of the requirements process is getting confirmation and buy-in from the client. Of course the taxonomy is part of the structure that supports the document management, and since, as you point out, most of the management is done organically now without regard to duplication or version control mechanisms, this is just a place to start that the client inherently knows. Then will come the explanation of why the suggested taxonomy structure was chosen and, eventually, how it supports the business processes.

  10. Pat Kennedy says:

    By the way I love MindManager for this type of requirements gathering. Building structure and laying it out so a client can see is easier and modifiable as things change and the requirements become tighter and tighter.

  11. Ruven says:

    Wow, 2.5 years later, this post is getting a bunch of new activity. I am working on an update (and the never-completed part 2 [oops, sorry]) for this post. For the most part, I still use these tools in these ways, but I have developed some better ways of explaining the concepts.

    Bil, when I work with clients, I don’t look at the old folder structures – I ask them to classify the types of documents that they work with. Once they bring that info back, we work on building out structures (including metadata) that will work best.

    Pat: I’m still loving MindManager for this and many other tasks.

