EndUserSharePoint.com: Can I store terabytes of data in SharePoint?
The question of the day comes from Rudy:
Is it feasible to use SharePoint as a document management system with several terabytes of documents and less than 50 concurrent users?
Is it possible… yeah, I guess so. But my real question would be “Why?” Is your current data storage system working for you? Could you access it through the Business Data Catalog (BDC) and use SharePoint as an interface to the existing system? I think there should be a needs analysis done to see why you would tackle this thing and if it is really necessary.
My gut tells me that you would probably be better off with something like Documentum, using SharePoint as a work environment for document creation, collaboration and project management.
Give us some more background and we’ll see what the DMS people have to say.
Mark –
I completely agree with your sentiments about “Why”. The one thing that’s surprised me about MOSS as a document repository is how MS has somewhat changed their tune from the days of SPS 2003 about SharePoint as a replacement for the file share. Some of their points make sense, but sometimes I find myself questioning their motivation for advocating a move away from file shares.
That being said, I have a current customer who is aggressively planning to migrate their document corpus away from file shares and local hard drives to SharePoint. Since they have thousands of users world-wide and multiple terabytes of documents, they’ve been pressing MS for some estimates around SharePoint’s capacity for storage. Below are the approximate numbers I’ve seen for their system; keep in mind that they are somewhat specialized for this large, global MOSS environment.
- No content DBs larger than 200 GB.
- No more than 10-12 TB of data per SQL Server 2005 content DB instance (one instance per server, so assume 50-60 content DBs per instance).
- No more than two content DB instances per MOSS farm.
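To put those limits in perspective, here is a back-of-the-envelope sketch of the arithmetic. The three limits are the ones quoted above; the 5 TB corpus is purely a hypothetical stand-in for Rudy's "several terabytes", the function names are made up for illustration, and 1 TB is treated as 1000 GB for simplicity.

```python
import math

# Limits quoted above for this particular MOSS environment.
MAX_CONTENT_DB_GB = 200        # no content DB larger than 200 GB
DBS_PER_INSTANCE = (50, 60)    # assumed 50-60 content DBs per SQL Server instance
INSTANCES_PER_FARM = 2         # no more than two instances per farm
GB_PER_TB = 1000               # assumption: decimal terabytes


def farm_capacity_tb(dbs_per_instance):
    """Upper bound on farm storage in TB for a given number of DBs per instance."""
    return MAX_CONTENT_DB_GB * dbs_per_instance * INSTANCES_PER_FARM / GB_PER_TB


def content_dbs_needed(corpus_tb):
    """Minimum number of content DBs needed to hold a corpus of the given size."""
    return math.ceil(corpus_tb * GB_PER_TB / MAX_CONTENT_DB_GB)


if __name__ == "__main__":
    low, high = (farm_capacity_tb(n) for n in DBS_PER_INSTANCE)
    print(f"Farm ceiling: {low:.0f}-{high:.0f} TB")

    hypothetical_corpus_tb = 5  # hypothetical figure, not from Rudy's question
    print(f"Content DBs for {hypothetical_corpus_tb} TB: "
          f"{content_dbs_needed(hypothetical_corpus_tb)}")
```

Under those assumptions a single farm tops out somewhere around 20-24 TB, so a corpus of a few terabytes fits comfortably, though it would already be spread across a couple of dozen content databases.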
This environment, when it’s fully implemented, will probably have far more documents, storage usage, sites, and concurrent users than the example posed by Rudy, so I would say that the answer for him is yes, it’s possible. But (and this is a big “but”), he should consider whether MOSS is the most reliable, efficient, and cost-effective solution for his problem. In my opinion, nine times out of ten that answer is going to be no.
John
John – Good to ’see’ you again.
For enterprise-level data storage, I think we’re in agreement. The defined limitations will be too restrictive for anyone with exceptionally large datasets, which pretty much rules out most global-sized corporations.
Another issue is the limit to the amount of exposed data in the library. The current restriction is between 1500 – 2000 documents (Lawrence Liu, New York City User Group, 2007). The ‘answer’ to that problem is to create folders within the libraries to chunk out the information.
No way! The whole idea is to flatten out the structure as much as possible and manage content through views. Why is the End User asked to make the paradigm shift from folders to flat libraries and then told “Sorry, you have to use folders if you have any real amount of content”? Bite me!
The one instance where SharePoint will be the most useful is to get the data off of local drives. This presupposes the admin has set up an intuitive infrastructure that people will know how to use; otherwise the solution might be as big a mess as the problem.
Good topic to expand on.
Regards,
Mark
The other area where I think SharePoint shines for enterprise document management is around the application of metadata to your content. I don’t know of any built-in way to contextually tag documents in a file share, whereas with SharePoint you can not only do it, but enforce the creation of that data throughout your entire site through Features and Content Types.