Save money with smart content storage in SharePoint
How to manage unstructured data with an eye to cutting investment
Published 13:20, 02 March 12
An effective storage infrastructure must fulfil several important needs: it must support scalability, offer simple and robust management capabilities, and not break the bank. Scalability is essential, as reliability and performance are part and parcel of ensuring that Service Level Agreements (SLAs) and user expectations are met at all times.
Optimising storage to meet these key criteria can have significant productivity and cost benefits for any business, and users of Microsoft SharePoint are no exception.
SharePoint automatically stores data in SQL Server content databases - high performance, high cost Tier 1 storage. As a relational database, SQL Server is highly efficient at storing structured data, but is significantly less so when dealing with larger, non-relational data streams such as Word documents, PDF files and video files, which are also known as Binary Large Objects (BLOBs).
As BLOBs can account for up to 95 percent of all data in a typical organisation, the efficiency of data storage in SharePoint can be negatively impacted. So, what can be done to prevent that roadblock to user adoption and business productivity?
Adopters of SharePoint have several options to keep the cost of storage in check. One option would be to apply aggressive content lifecycle management policies to archive or delete SharePoint content that has not been accessed or modified in a specified time frame, potentially using SharePoint’s Information Management policies to do so. Another would be to set stringent site quotas and locks, and to limit versioning settings, to prevent users from uploading significant amounts of data to various sites, lists or libraries.
Organisations that wish to optimise SharePoint storage while keeping content accessible to end users are limited to two main options, both of which leverage Microsoft’s BLOB storage APIs (EBS or RBS).
The first option is for businesses to develop systems internally to move BLOBs away from SQL Server onto lower-tier, cheaper physical storage. The second option is to turn to a third party solution provider to optimise SharePoint storage using similar, but possibly more advanced, technology for BLOB storage management.
BLOB externalisation in and of itself does not reduce the total storage footprint of a SharePoint infrastructure. However, it is an important first step because it enables businesses to transfer the storage burden to more cost-effective tiers. The cost savings can be tremendous: some large organisations report savings of millions of pounds a year from storage optimisation efforts focused on BLOB externalisation.
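The arithmetic behind that claim can be sketched in a few lines of Python. This is an illustration only: the function name, the per-gigabyte prices and the 10 TB farm size are hypothetical figures chosen for the example, not vendor pricing or measured data. The 95 percent BLOB share is the upper bound quoted earlier in this article.

```python
def monthly_storage_cost(total_gb, blob_percent, tier1_cost_per_gb,
                         tier2_cost_per_gb, externalised):
    """Estimate a monthly storage bill (hypothetical cost model).

    total_gb     -- all SharePoint content, structured data plus BLOBs
    blob_percent -- share of that content made up of BLOBs (0-100)
    externalised -- if True, BLOBs are billed at the cheaper tier
    """
    blob_gb = total_gb * blob_percent / 100
    structured_gb = total_gb - blob_gb
    if externalised:
        # Metadata and structured data stay on Tier 1; BLOBs move down a tier.
        return structured_gb * tier1_cost_per_gb + blob_gb * tier2_cost_per_gb
    # Without externalisation everything sits in SQL Server on Tier 1.
    return total_gb * tier1_cost_per_gb


# Assumed figures: 10,000 GB of content, 95% BLOBs,
# 2.00 per GB per month on Tier 1 and 0.50 on Tier 2.
before = monthly_storage_cost(10_000, 95, 2.0, 0.5, externalised=False)
after = monthly_storage_cost(10_000, 95, 2.0, 0.5, externalised=True)
print(f"before: {before:.2f}  after: {after:.2f}  saving: {before - after:.2f}")
```

Note that the total number of gigabytes is identical in both cases. Only the tier on which the BLOBs are billed changes, which is why externalisation lowers cost without shrinking the footprint.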
Native SharePoint functionality does include some tool sets that enable businesses to externalise BLOBs - the External BLOB Store (EBS) API and the Remote BLOB Storage (RBS) API, which is leveraged by Microsoft’s free FILESTREAM provider. However, introducing either the EBS or RBS APIs via a custom provider or the FILESTREAM provider requires intense customisation in order to properly manage the communication between SharePoint and the non-SQL Server ‘BLOB store’, making management complex and time consuming.
With Microsoft Office SharePoint Server 2007 SP1, EBS was introduced so organisations could extend storage in SharePoint’s SQL Server content databases to other media. EBS can take ownership of BLOBs and move them to cheaper, more efficient file-based storage while leaving a token or stub in the SQL Server, allowing the object to be easily retrieved and accessed when required. The intention is for someone to be able to navigate a SharePoint document library, click on a piece of content, and be able to access it without knowing whether or not it is stored in SQL.
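The token-and-stub idea can be illustrated with a toy model in Python. This is a conceptual sketch only, not the EBS interface: the class `StubbedStore`, its methods and the use of a SHA-256 digest as the token are all assumptions made for the example. A dictionary stands in for the SQL Server content database and a directory stands in for the cheaper file-based storage.

```python
import hashlib
from pathlib import Path


class StubbedStore:
    """Toy model: the 'database' keeps metadata and a token, never the bytes."""

    def __init__(self, blob_dir):
        self.blob_dir = Path(blob_dir)  # stands in for cheaper file storage
        self.rows = {}                  # stands in for the content database

    def save(self, name, data):
        # Derive a token from the content (an assumption; a real provider
        # chooses its own identifier) and write the bytes outside the database.
        token = hashlib.sha256(data).hexdigest()
        (self.blob_dir / token).write_bytes(data)
        # Only the stub is kept in the database row.
        self.rows[name] = {"size": len(data), "token": token}

    def open(self, name):
        # The caller asks by document name and never sees where the bytes live.
        token = self.rows[name]["token"]
        return (self.blob_dir / token).read_bytes()
```

A caller would create the store against any writable directory, call `save("report.pdf", data)` and later `open("report.pdf")`. The retrieval path is the same whether the bytes are held locally or elsewhere, which is the transparency the paragraph above describes.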
However, EBS is not granular: it can only be deployed at the farm level. Administrators would need to deploy EBS on every SharePoint web front end server, which requires a deep knowledge of scripting and coding - not to mention a vast amount of time and resources - in order to be successful.
SharePoint 2010 improved on this with support for RBS, a SQL Server-based API. More flexible than EBS, RBS enables storage of all content in a specified content database on the file system (with metadata retained in the SQL Server content database). However, significant coding is still required to utilise RBS effectively and to achieve the more advanced capabilities that third parties may offer out of the box.
To address this, Microsoft created FILESTREAM, which is included free-of-charge with the RBS installation files. FILESTREAM allows you to externalise BLOBs to the local file system of the SQL Server. This works well in some environments, but less so in others, such as cloud environments. It is recommended that you test BLOB externalisation providers in your environment to identify performance differences and, more importantly, to determine whether such differences have a material impact on overall SharePoint performance.
Businesses that do not have the internal resources to manage EBS and RBS have the option of working with third party solution providers to optimise SharePoint storage. This approach offers organisations the option to offload specified data to Tier 2 or 3 storage based on customisable filters, such as content properties, file type or file size. More static SharePoint data can even be stored in the cloud, which is not recommended out-of-the-box due to the latency restrictions required for Microsoft support.
In reality, working with third party tools to improve the lifecycle of content in SharePoint can help businesses manage these large unstructured files without making unnecessary up-front investments in the Tier 1 storage systems that would otherwise be required to support additional SQL Server bloat, offsetting cost in the long term.
For example, if a particular file has not been accessed for 12 months, third party software can automatically offload this static data or BLOB to a lower tier storage device depending upon customisable triggers including file size or time, while keeping the core metadata and security permissions within SharePoint.
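A rule of this kind might look like the following Python sketch. It is not any vendor's implementation: the `Document` record, the function names and the default thresholds (365 days idle, 1,000,000 bytes) are hypothetical, and combining the time and size triggers with a logical AND is an assumption, since a real product could just as well treat them as alternatives.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Document:
    name: str
    size_bytes: int
    last_accessed: datetime


def should_offload(doc, now, max_idle=timedelta(days=365),
                   min_size_bytes=1_000_000):
    """Return True when a document meets both triggers (assumed AND logic)."""
    idle = now - doc.last_accessed
    return idle >= max_idle and doc.size_bytes >= min_size_bytes


def plan_offload(docs, now, **rules):
    """List the names of documents that would move to a lower storage tier."""
    return [doc.name for doc in docs if should_offload(doc, now, **rules)]


now = datetime(2012, 3, 2)
docs = [
    Document("old_video.mp4", 50_000_000, datetime(2010, 1, 1)),
    Document("old_note.txt", 2_000, datetime(2010, 1, 1)),
    Document("new_deck.pptx", 8_000_000, datetime(2012, 2, 1)),
]
print(plan_offload(docs, now))
```

Only the document names are returned here. In keeping with the paragraph above, the metadata and security permissions would stay in SharePoint, and only the BLOB itself would be moved.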
This way, the data remains easily searchable and accessible, preventing SQL Server from being inundated with BLOBs that can quickly occupy space, increase storage costs and slow down the performance of the SharePoint platform. Automating the process of archiving BLOBs through the aforementioned triggers saves time and allows IT managers to adhere to SharePoint best practices, as well as reduce the cost and complexity of SharePoint environments while promoting a culture of IT governance.
Data growth is expected to continue at a rapid rate over the next decade, with IDC predicting that the world’s data will double every two years. IT administrators charged with managing SharePoint deployments will need to be savvy with storage to cope with this exponential data growth.
Optimising storage will enable businesses to make the most of existing assets while ensuring that IT infrastructure is available and scalable as business needs dictate. BLOB externalisation is a key enabler of this optimisation process in SharePoint environments, allowing for increased application performance at the Tier 1 level through improved data and storage management.
By Mary Leigh Mackie, AvePoint