Clean up item and file versions in SharePoint using PowerShell

At the SharePoint Conference 2011 I had a discussion with Christian Ståhl and Laura Curtis about whether it would be possible to clean up old versions in SharePoint using a script or command line application. I thought to myself that this should not be a big task to accomplish, but a useful one. The slow way to clean up old versions is to loop through all the site collections, webs, lists and items, check the versions of each item and delete them. In SharePoint Server 2010 there is a much smarter way to do this by using a helper class that ships with SharePoint Server. The class I’m talking about is called ContentIterator and can be found in the Microsoft.Office.Server.Utilities namespace.

The smart thing about the ContentIterator is that it prevents SharePoint from blocking the database with requests. The only side effect is that items can be modified in the meantime, because this method is not thread safe.

How does it work?

The class I mentioned before can be used in command line applications and PowerShell scripts without any problem. The MSDN example shows a good usage pattern that I reused in my code, adding the version deletion task. The ContentIterator contains a number of methods for batch processing in SharePoint; the one used here is ProcessListItems.

The benefit of using this kind of delegate instead of walking the list with the normal object model is that the list items are paged, which avoids blocking the database objects affected by the query. A paged result means that the query gets a batch of results back and processes those items; after that, the next batch is returned from the database. A CAML query can also be specified in the ProcessListItems method. That allows me to filter every list for items and documents that were last modified more than three months ago, for example.
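To make the paging idea a bit more concrete, here is a minimal PowerShell sketch of a ProcessListItems call. It is not the code from the downloadable script; the function name Invoke-PagedListWalk and its parameters are made up for illustration, and it assumes it is run in the SharePoint 2010 Management Shell on a server where Microsoft.Office.Server.dll (version 14.0.0.0) is installed. The function is only defined here, not called.

```powershell
# Hypothetical helper (name and parameters are assumptions, not part of the original script).
function Invoke-PagedListWalk {
    param(
        [Parameter(Mandatory = $true)] $List,   # an SPList, e.g. taken from (Get-SPWeb $url).Lists
        [string] $Where = ''                    # optional CAML <Where>...</Where> fragment
    )

    # ContentIterator lives in Microsoft.Office.Server.dll (SharePoint Server 2010).
    Add-Type -AssemblyName 'Microsoft.Office.Server, Version=14.0.0.0, Culture=neutral, PublicKeyToken=71e9bce111e9429c'

    $query = New-Object Microsoft.SharePoint.SPQuery
    # The MSDN example appends this ORDER BY clause so the iterator can page through the list.
    $query.Query = $Where + [Microsoft.Office.Server.Utilities.ContentIterator]::ItemEnumerationOrderByNVPField

    # Called once per list item; here it only reports how many versions the item has.
    $processItem = [Microsoft.Office.Server.Utilities.ContentIterator+ItemProcessor] {
        param($item)
        Write-Host ("{0}: {1} version(s)" -f $item.Url, $item.Versions.Count)
    }

    # Called when processing an item throws. The return value tells the iterator
    # whether to rethrow (assumed here: $false means "log it and continue").
    $onError = [Microsoft.Office.Server.Utilities.ContentIterator+ItemProcessorErrorCallout] {
        param($item, $exception)
        Write-Warning $exception.Message
        return $false
    }

    $iterator = New-Object Microsoft.Office.Server.Utilities.ContentIterator
    $iterator.ProcessListItems($List, $query, $processItem, $onError)
}
```

Casting a script block to the delegate type is what lets plain PowerShell stand in for the C# delegates; wrapping C# in a new type, as done later in this article, is the alternative.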

 


The code for cleaning up the versions is also quite simple. SPListItemVersionCollection has its own methods for that, called DeleteAll and RecycleAll. The difference between the two is that RecycleAll deletes the versions and moves them to the recycle bin, while DeleteAll deletes the versions immediately so that they cannot be restored from inside SharePoint. In my case I used the DeleteAll method, and the code in the script looks like this:
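As a rough illustration of the two calls (a sketch, not the listing from the script itself), a small PowerShell helper could look like the following. The function name Remove-OldItemVersions and the -Recycle switch are hypothetical; only Versions, Count, DeleteAll and RecycleAll come from the SharePoint object model. The sketch assumes that the version collection also contains the current version, so a count of 1 means there is no history to remove.

```powershell
# Hypothetical helper (name and switch are assumptions for illustration).
function Remove-OldItemVersions {
    param(
        [Parameter(Mandatory = $true)] $Item,   # an SPListItem
        [switch] $Recycle                       # move versions to the recycle bin instead of deleting them
    )

    $versions = $Item.Versions

    # Assumption: the collection includes the current version, so nothing to do at 1 or less.
    if ($versions.Count -le 1) { return 0 }

    $removed = $versions.Count - 1
    if ($Recycle) {
        $versions.RecycleAll()   # restorable from the recycle bin
    }
    else {
        $versions.DeleteAll()    # permanent, cannot be restored from inside SharePoint
    }
    return $removed
}
```

Trying it with -Recycle first on a test list is a cheap safety net, because the removed versions can still be restored from the recycle bin.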

Wrap the C# code in a PowerShell script

The easiest way to use the SharePoint object model inside a PowerShell script is to wrap native C# or VB.NET code in a new type. The basic structure of the PowerShell script looks like this:

This can be done with any C# code. Once the type is registered it can be used.
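The general pattern can be sketched with the Add-Type cmdlet. The example below is deliberately SharePoint-free so that it runs in any PowerShell session; the namespace Demo.Tools, the class Greeter and the method Describe are placeholders, not names from the VersionCleaner script. For real SharePoint code, the SharePoint assemblies would presumably have to be passed through the -ReferencedAssemblies parameter.

```powershell
# C# source kept in a single-quoted here-string so PowerShell does not expand anything in it.
$source = @'
using System;

namespace Demo.Tools
{
    public static class Greeter
    {
        // Placeholder method; a real script would call the SharePoint object model here.
        public static string Describe(string url)
        {
            return "Would process " + url;
        }
    }
}
'@

# A type can only be registered once per session, so skip Add-Type if it already exists.
if (-not ('Demo.Tools.Greeter' -as [type])) {
    Add-Type -TypeDefinition $source -Language CSharp
}

# Call the static method using the [Namespace.Class]::Method() syntax.
[Demo.Tools.Greeter]::Describe('http://intranet')
```

Because the compiled type stays loaded until the session ends, changes to the C# source only take effect in a new PowerShell session.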

The usage

The complete PowerShell script can be found here. To use it, first execute the PowerShell script. A new type will be registered in your current PowerShell session. After that you can use it like a “normal” PowerShell command. The trick is to specify the namespace, class and method you want to use.

In our case here I want to filter for files that were last modified more than three months ago and delete their old versions. The call looks like this:
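The CAML filter for "last modified more than three months ago" can be built in PowerShell before it is handed to the registered type. The sketch below only constructs the query string; the function name New-ModifiedBeforeQuery is hypothetical, and how the string is passed on depends on the method signature in the script, which is not reproduced here. As far as I know, CAML compares only the date part of a DateTime value by default, so the time of day in the cutoff should not matter.

```powershell
# Hypothetical helper that builds a CAML <Where> clause for items modified before a cutoff date.
function New-ModifiedBeforeQuery {
    param([Parameter(Mandatory = $true)] [datetime] $Cutoff)

    # Format the cutoff as an ISO 8601 string, the format CAML expects for DateTime values.
    $iso = $Cutoff.ToString("yyyy-MM-dd'T'HH:mm:ss'Z'", [System.Globalization.CultureInfo]::InvariantCulture)

    "<Where><Lt><FieldRef Name='Modified' /><Value Type='DateTime'>$iso</Value></Lt></Where>"
}

# Everything last modified more than three months before today.
$caml = New-ModifiedBeforeQuery -Cutoff (Get-Date).AddMonths(-3)
$caml
```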

 

In my case I get a result like this, because no file was last modified more than three months ago.

Result With CAML Query specified


If the CAML query is left out, all files are queried and the result looks like this.

Result Without CAML


Summary

The utilities in Microsoft.Office.Server.Utilities are great for updating a lot of items, webs and sites at once. They are even faster than doing it with the normal object model. They are not thread safe, which means they shouldn’t be used in normal SharePoint development, but they can handle a lot of administration and optimization tasks. Cleaning up versions in SharePoint does not shrink the database files, but it frees up space inside the database. A database file can only be shrunk through database administration. Check out this great Database Maintenance article that provides more information.

In SharePoint 2007 I used to create command line applications in Visual Studio. Nowadays I still use Visual Studio to create command line applications for tasks like this, but with PowerShell I can run the code on the fly in a PowerShell session, and I can also update scripts on the fly instead of recompiling. Depending on the use case I use native PowerShell scripting or C# code inside PowerShell.

One last word of warning: I’ve tested this script a couple of times on my development machine and it worked great, but I don’t guarantee that it works safely in a production environment. It should, but it should be tested anyway, like all things from the web. The script will also be executed over all content databases attached to a web application and all sites, webs and nearly all lists.

This script helps to clean up old versions after a migration, but planning document versioning should always be the preferred solution in SharePoint. For this check out Versioning.

Download: VersionCleaner PowerShell Script