Archiving: Not Just for Emails Anymore
Just as companies have implemented email archiving systems to better manage their ever-growing volumes of email records, an increasing number are doing the same for their SharePoint records. SEC Rule 17a-4, HIPAA and the Sarbanes-Oxley Act are just a few of the laws and regulations requiring companies to actively monitor employee communications. That includes SharePoint, which has grown into a $1 billion business for Microsoft as organizations discover it can be a much better employee collaboration tool than trading attachments via email. However, moving intellectual property into SharePoint libraries creates the same operational risks that exploding Exchange email volumes do, as well as a new set of challenges. This article is the first of a three-part series that will examine why organizations should implement archiving for SharePoint, just as many have done for their email systems. Today, we’ll examine the storage optimization benefits of archiving SharePoint.
Records created, managed and stored in SharePoint constitute electronic communications that fall under the same compliance rules requiring the retention and availability of communications records organizations must apply to email. There is not only limited space for storing such unstructured content but there’s also more than a little confusion about exactly what should be stored and for how long. To make matters even more confusing, those same SharePoint records cannot simply be dumped onto backup tapes and sent to a storage facility down the road where they remain untouched to gather dust. Specific records must also be accessible at a moment’s notice in the event of a lawsuit or regulatory investigation.
SharePoint presents IT administrators with the same management and storage issues as email, as well as some new ones. SharePoint versioning is a useful tool, but it stores a completely new item for each new version created. So if you create a PowerPoint presentation and a colleague makes just one change to one slide, SharePoint saves that change as an entirely new slide deck. You can imagine how quickly SharePoint can consume expensive primary storage space.
The number of SharePoint sites within an organization can grow very quickly as sites are often created for even short-term projects. These sites are usually temporary and are active only for the life of the specific project.
Benefits of Archiving
Implementing an archiving system for SharePoint records enables IT to optimize its storage resources by moving older records off the SharePoint server and expensive primary storage to less expensive options. After a project reaches completion, or when a document is no longer being modified on a regular basis, SharePoint documents can be archived to preserve storage space, using features such as compression and single-instance store (SIS) to further reduce costs. Moving content from primary to secondary storage also accelerates SharePoint maintenance tasks such as backups and virus scans of the database.
Single-instance store eliminates the cost of storing multiple copies of the same document. For example, if the same document exists in multiple SharePoint document libraries, just one copy will be stored in the archive. If that same document also exists in the file system archive or in email, still only one copy will be stored.
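To make the idea concrete, here is a minimal sketch of single-instance storage. It assumes the archive identifies duplicates by hashing item content, which is one common approach and not necessarily how any particular archiving product works. The class name, the in-memory dictionaries and the sample locations are all hypothetical.

```python
import hashlib


class SingleInstanceStore:
    """Toy in-memory archive that keeps one copy of each distinct content."""

    def __init__(self):
        self._blobs = {}       # content hash -> stored bytes
        self._references = {}  # source location -> content hash

    def archive(self, location, content):
        """Record `location`; store `content` only if it is not already held."""
        digest = hashlib.sha256(content).hexdigest()
        self._blobs.setdefault(digest, content)
        self._references[location] = digest
        return digest

    def retrieve(self, location):
        """Return the content that was archived from `location`."""
        return self._blobs[self._references[location]]

    @property
    def stored_bytes(self):
        """Total bytes physically held, counting each distinct item once."""
        return sum(len(blob) for blob in self._blobs.values())


# The same deck archived from two libraries and one email attachment.
store = SingleInstanceStore()
deck = b"quarterly results deck" * 1000
store.archive("sharepoint/sales/deck.pptx", deck)
store.archive("sharepoint/marketing/deck.pptx", deck)
store.archive("email/attachments/deck.pptx", deck)
```

After the three calls, the store holds three references but only one physical copy of the deck, so `stored_bytes` equals the size of a single copy.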
When an item is archived, it is first compressed and then metadata is added to it. As a general rule, the item is compressed to half its original size, and the metadata comprises approximately 5 KB. When an item is shared, only the metadata is added. To estimate the archive storage required, halve the total size of the items to be archived, divide by the average number of sharers or collaborators (if any) and add 5 KB multiplied by the total number of items. The compression ratio may vary considerably, and you must also factor in growth in the number and size of documents. A more conservative method is to assume that the space used by the archive equals the space used by SharePoint to store the same items.
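The rule of thumb above can be written as a short calculation. This is only a sketch of the estimate as described: the function names and the sample figures are illustrative, and the 50 percent compression ratio and 5 KB of metadata are the general-rule values from the text, not measured numbers.

```python
METADATA_KB_PER_ITEM = 5   # approximate metadata added per archived item
COMPRESSION_RATIO = 0.5    # general rule: items compress to half their size


def estimate_archive_kb(total_size_kb, item_count, average_sharers=1):
    """Rule-of-thumb estimate of the archive space needed, in KB."""
    if average_sharers < 1:
        raise ValueError("average_sharers must be at least 1")
    compressed_kb = total_size_kb * COMPRESSION_RATIO
    return compressed_kb / average_sharers + METADATA_KB_PER_ITEM * item_count


def conservative_archive_kb(total_size_kb):
    """Conservative estimate: the archive uses as much space as SharePoint."""
    return total_size_kb


# Illustrative figures: 2,000 items totaling 1,000,000 KB, shared by
# two collaborators on average.
estimate = estimate_archive_kb(1_000_000, 2_000, average_sharers=2)
```

With these sample figures the estimate comes to 260,000 KB: 250,000 KB of compressed, shared content plus 10,000 KB of metadata. The conservative method would budget the full 1,000,000 KB.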
Additionally, the SharePoint administrator can set policies to further reduce the number of files stored within SharePoint, such as establishing a rule to automatically delete MP3 or video files from a SharePoint site collection. Documents can also be removed automatically as they become outdated by creating a rule at the site or subsite level to archive and remove documents from SharePoint based on the last modified date (while maintaining an archived copy).
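The two kinds of rules described above can be sketched as a simple classification step. This is a hypothetical illustration, not the policy engine of SharePoint or of any archiving product: the blocked extensions, the one-year age threshold, the `Document` type and the sample paths are all assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from pathlib import PurePosixPath

# Hypothetical policy settings, chosen only for illustration.
BLOCKED_EXTENSIONS = {".mp3", ".avi", ".wmv"}
MAX_AGE = timedelta(days=365)


@dataclass
class Document:
    path: str
    last_modified: datetime


def classify(doc, now):
    """Return 'delete', 'archive' or 'keep' for one document."""
    if PurePosixPath(doc.path).suffix.lower() in BLOCKED_EXTENSIONS:
        return "delete"   # disallowed file type: remove outright
    if now - doc.last_modified > MAX_AGE:
        return "archive"  # outdated: move to the archive, then remove
    return "keep"         # recently modified: leave in SharePoint


now = datetime(2008, 12, 9)
docs = [
    Document("projects/launch/theme.mp3", datetime(2008, 11, 1)),
    Document("projects/2006/plan.doc", datetime(2006, 5, 17)),
    Document("projects/launch/budget.xls", datetime(2008, 12, 1)),
]
decisions = {doc.path: classify(doc, now) for doc in docs}
```

Here the MP3 file is marked for deletion, the 2006 plan for archiving and the recently edited budget is kept. A real rule would be scoped to a site collection, site or subsite, as described above.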
As previously mentioned, SharePoint versioning is a useful tool for preserving every modification to a given document; however, each change is stored as a separate document. Archiving solutions can ease this storage burden by providing a way to archive the older versions to cheaper storage while maintaining the current version in SharePoint for easy modification. Versions can still be accessed by the user via shortcuts or stubs, but the actual file will reside within the archive rather than within SharePoint, freeing up valuable space.
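A minimal sketch of that arrangement follows, using plain dictionaries to stand in for the SharePoint library and the archive. The `Stub` type, the key format and the function names are hypothetical. Real products implement stubs inside SharePoint itself.

```python
from dataclasses import dataclass


@dataclass
class Stub:
    """Placeholder left in the library that points at the archived copy."""
    archive_key: str


def archive_older_versions(versions, archive):
    """Move every version except the newest into `archive`.

    `versions` maps version number -> content bytes. Returns a new library
    mapping in which older versions are replaced by Stub objects.
    """
    current = max(versions)
    library = {}
    for number, content in versions.items():
        if number == current:
            library[number] = content            # current version stays put
        else:
            key = f"v{number}"
            archive[key] = content                # full file goes to the archive
            library[number] = Stub(archive_key=key)
    return library


def open_version(library, archive, number):
    """Return a version's content, following a stub if one is present."""
    item = library[number]
    return archive[item.archive_key] if isinstance(item, Stub) else item


archive = {}
versions = {1: b"draft", 2: b"draft with edits", 3: b"final"}
library = archive_older_versions(versions, archive)
```

After archiving, only version 3 occupies space in the library, while `open_version` still returns the content of versions 1 and 2 by following their stubs.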
End-User Access to Archived Items
Shortcuts, or “stubs,” are essential in maintaining the end-user experience once items have been archived and removed from SharePoint. A shortcut or stub links to the original document in the archive and provides seamless access for the end user. A good archiving solution will also maintain the original icon or a variation of the original icon to minimize the change to the user experience.
End-user search is also essential to provide an easy way to find items in the archive (especially if shortcuts are not in use). End users must be able to search on both the metadata and the content within the document, and should be able to restore items to SharePoint if they wish to edit the document.
Improving Search and Recovery
An archiving system’s search and retrieval functionalities make finding and recovering specific records for business or discovery purposes much faster, more accurate and far less costly than manually searching through piles of backup tapes. Backup tapes were not designed to do what archives do. Backups offer safeguards against unexpected data loss and application errors, while archiving is the process of systematically saving copies of unstructured files to reduce primary storage, better manage retention and enable the discovery of information on a per-item basis. Backups will always be an important part of an organization’s data protection and disaster recovery strategy, but they are less than efficient when used for archiving purposes. Nevertheless, if tapes are the only source of SharePoint, email, IM and other electronic communications backups, then they are fair game should a lawsuit demand their discovery. And simply saying they’re inaccessible will not hold up in court.
A proactive archiving system reduces the costs associated with the search and collection of electronic data by creating a centralized and indexed archive that can be searched on demand. With archiving in place, organizations can save on discovery costs by eliminating the fire drills and time required to respond to requests, decreasing the resources needed to carry out legal holds and significantly reducing the cost of manual collection, de-duplication, imaging, password cracking and tape restores.
To summarize, there are multiple benefits to expanding your existing email archive to incorporate SharePoint records, including:
- automating the archiving of older business-critical data on Microsoft SharePoint Portal Server to more cost-effective online stores
- moving documents from SharePoint workspaces to provide maximum storage benefit, including entire workspaces when a project ends
- maximizing the online knowledge store and increasing information sharing
- enabling SharePoint to remain a constant size, free of storage “bloat,” to facilitate rapid disaster recovery and optimal backup processing
- storing all data in one location
Implementing a software-based archiving solution is one way to keep storage costs from spiraling upwards without limiting how employees use SharePoint. Newer, more frequently accessed content is retained within SharePoint with older, less frequently accessed content being moved to the archive while still remaining available to the end user. This keeps storage growth focused, enables SharePoint Portal Server scalability and keeps backup/restore windows under control without impacting the end user.
The second part of this three-part series will focus on archiving for SharePoint to ensure compliance with laws and regulations requiring the retention and availability of electronic communications records and to facilitate the e-discovery process. Part three will examine how extending your archiving system to include SharePoint will significantly improve the end-user’s experience with SharePoint.



09. Dec, 2008 







