Synchronising EPiServer VPP folders using the Microsoft Sync Framework
Nov 09, 2009
I've worked on EPiServer projects that often involve load balancing an EPiServer site, either for scalability or resilience. While EPiServer itself is relatively easy to set up in a load balanced configuration, the VPP folder configuration appears to be forgotten about. It appears to be left to individuals to decide how to deal with their VPP folders in a load balanced environment.
One approach in a load balanced environment is for each site to point to a common VPP folder location accessible on the network. Whilst this is a valid approach, it means losing resiliency as the infrastructure hosting the VPP folder share is a single point of failure. Personally I have never used or recommended this approach.
Another approach is to write your own VPP provider that backs onto a database. However, this would involve writing a reasonable amount of code and would mean storing site assets in the database. In my personal experience DBAs never like you storing a lot of binary data in their databases!
The alternative approach is to use an external tool to synchronise the VPP folder structure on each machine in the load balanced environment. However there are a number of problems with this approach:
- This approach tends to favour a master/slave configuration, which means only a single server can be used by content editors
- An external sync tool needs to be selected, configured and supported (and perhaps purchased)
- Files are not available on the other machines in the cluster until the sync tool has run, which risks 404s
Requirement
I have recently worked with a client that uses four load balanced EPiServer installs across two physical sites. I was never happy with recommending using a single server for editing and using an external tool to keep the VPP folder structure in sync. This didn't appear to present EPiServer in a particularly "enterprise" light. After several searches that returned no results I decided to do something about it. The requirements for my solution were:
- Keep custom code down to a minimum
- Be extensible to allow any number of servers in a load balanced environment
- Allow EPiServer VPP functionality to work untouched
- Provide immediate sync of VPP folders
- Allow a "light touch" implementation to an existing EPiServer site with little or no code changes
- Allow users to use edit mode on any server
Assumptions about VPP folders
Please be aware of the following assumptions about VPP folder structures before reading on:
- All VPP folders come under a common root location on disk. E.g. D:\websites\MySite\VPP\Global and D:\websites\MySite\VPP\Documents etc
- All servers in the cluster are connected by a fast network
The solution
The first requirement is to get a notification that a file has been added, moved or deleted in the EPiServer file store. This is easy to achieve by hooking into standard EPiServer UnifiedFile events. But how to synchronise the files themselves? I didn't want to write code to manually synchronise files on disk as this can be fraught with problems. Enter Microsoft Sync Framework 2.0 http://msdn.microsoft.com/en-us/library/bb902854%28SQL.105%29.aspx. The Microsoft Sync Framework has been around for a little while and is currently at version 2.0. Out of the box it includes a provider to handle file synchronisation, so I felt this was an ideal candidate to handle the actual file sync part of my solution.
Firstly I need to define a configuration section which can be entered in web.config to define which folders should be part of our VPP sync. This is a fairly standard .NET configuration handler, as seen below:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Configuration;
namespace EPiTools.VPPFolderSync.Configuration
{
public class VPPSyncConfiguration
{
/// <summary>
/// The VPP folder sync section of the .config file. No ConfigurationProperty
/// attribute is needed here as the section is read directly via GetSection.
/// </summary>
public static VPPLocationsGroup VPPFolderSync
{
get
{
VPPLocationsGroup group = (VPPLocationsGroup)System.Configuration.ConfigurationManager.GetSection("vppFolderSync");
return group;
}
}
}
/// <summary>
/// The element to contain VPP folders to sync
/// </summary>
public class VPPLocationsGroup : ConfigurationSection
{
[ConfigurationProperty("vppFolders")]
public VPPFolderLocationCollection SyncFolders
{
get { return (VPPFolderLocationCollection)this["vppFolders"]; }
set { this["vppFolders"] = value; }
}
[ConfigurationProperty("localPath")]
public string LocalPath
{
get { return (string)this["localPath"]; }
set { this["localPath"] = value; }
}
}
/// <summary>
/// A collection of VPPFolderLocation elements in the config file and the root physical location of the VPP folders
/// </summary>
public class VPPFolderLocationCollection : ConfigurationElementCollection
{
protected override ConfigurationElement CreateNewElement()
{
return new VPPFolderLocation();
}
protected override object GetElementKey(ConfigurationElement element)
{
return ((VPPFolderLocation)element).Location;
}
public List<string> ToList()
{
List<string> rtn = new List<string>(this.Count);
foreach (VPPFolderLocation folder in this)
rtn.Add(folder.Location);
return rtn;
}
}
/// <summary>
/// An individual VPP folder entry in the configuration file
/// </summary>
public class VPPFolderLocation : ConfigurationElement
{
public VPPFolderLocation()
{
}
public VPPFolderLocation(String location)
{
Location = location;
}
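[ConfigurationProperty("networkPath", IsRequired = true)]
public String Location
{
get
{
// Trim here as well as in the setter: values loaded from web.config do not
// pass through the setter, and callers append a path that starts with "\".
return ((String)this["networkPath"]).TrimEnd('\\');
}
set
{
this["networkPath"] = value.TrimEnd('\\');
}
}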
}
}
If you want to read more about .NET configuration files, see the following documentation on MSDN: http://msdn.microsoft.com/en-us/library/system.configuration.aspx
To add this to web.config add the following sections:
<configuration>
<configSections>
...
<section name="vppFolderSync" type="EPiTools.VPPFolderSync.Configuration.VPPLocationsGroup, EPiTools.VPPFolderSync"
allowLocation="true" allowDefinition="Everywhere"></section>
...
</configSections>
<vppFolderSync localPath="C:\EPiServer\VPP\MyEPiServerSite">
<vppFolders>
<add networkPath="\\server2\VPP$" />
<add networkPath="\\server3\VPP$" />
...
</vppFolders>
</vppFolderSync>
...
</configuration>
OK, so now we need to hook into the appropriate UnifiedFile events to receive notification that a file has been added, changed or deleted. This is achieved via an HttpModule:
using System.Collections.Generic;
using System.IO;
using System.Web;
using EPiServer.Web.Hosting;
using EPiTools.VPPFolderSync.Configuration;
namespace EPiTools.VPPFolderSync.Modules
{
public class VPPSyncModule : IHttpModule
{
#region { Constants }
/// <summary>
/// Boolean to determine whether the module has been initialised
/// </summary>
private static bool isInitialized;
#endregion
#region { IHttpModule Members }
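public void Dispose()
{
// Intentionally left empty. The UnifiedFile events are static and stay subscribed
// for the lifetime of the application, so resetting isInitialized here would let a
// later Init call (ASP.NET pools HttpApplication instances) subscribe them twice.
}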
/// <summary>
/// Load the Unified file events
/// </summary>
/// <param name="context"></param>
public void Init(HttpApplication context)
{
if (!isInitialized)
{
isInitialized = true;
UnifiedFile.UnifiedFileCheckedIn += new UnifiedFileEventHandler(UnifiedFile_VPPSyncHandler);
UnifiedFile.UnifiedFileMoved += new UnifiedFileEventHandler(UnifiedFile_VPPSyncHandler);
UnifiedFile.UnifiedFileCopied += new UnifiedFileEventHandler(UnifiedFile_VPPSyncHandler);
UnifiedFile.UnifiedFileDeleted += new UnifiedFileEventHandler(UnifiedFile_VPPSyncHandler);
}
}
#endregion
#region { Event handlers }
private void UnifiedFile_VPPSyncHandler(UnifiedFile sender, UnifiedVirtualPathEventArgs e)
{
this.performSync(sender);
}
#endregion
#region { Methods }
/// <summary>
/// Synchronise VPP folders
/// </summary>
/// <param name="vppFile">The Unified file that we want to sync the rest of the servers with</param>
private void performSync(UnifiedFile vppFile)
{
SyncVPPFolders sync = new SyncVPPFolders();
sync.PerformFileSync(vppFile);
}
#endregion
}
}
This can be plugged into EPiServer by modifying web.config as follows (IIS7/IIS7.5):
<configuration>
...
<system.webServer>
...
<modules runAllManagedModulesForAllRequests="true">
...
<add name="VPPSync" type="EPiTools.VPPFolderSync.Modules.VPPSyncModule, EPiTools.VPPFolderSync"/>
</modules>
...
</system.webServer>
...
</configuration>
At this point we know a file has changed in a VPP folder, so it should be sync'd with all other VPP folders. But wait, the VPP folder structure can be quite large so we don't want to sync the entire structure every time. The following code ensures that we only sync the folder in which the event occurred:
/// <summary>
/// Synchronise the unified file
/// </summary>
/// <param name="sourceFile">The unified file to sync</param>
public void PerformFileSync(UnifiedFile sourceFile)
{
string sourceFolder = Path.GetDirectoryName(sourceFile.LocalPath);
string rootPhysicalPath = VPPSyncConfiguration.VPPFolderSync.LocalPath.TrimEnd(@"\".ToCharArray()) + @"\";
string relativePath = @"\" + sourceFolder.Replace(rootPhysicalPath, string.Empty);
List<string> vppFolders = VPPSyncConfiguration.VPPFolderSync.SyncFolders.ToList();
this.PerformSync(sourceFolder, vppFolders, relativePath);
}
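The Replace call above assumes the folder path starts with the configured root in exactly the same casing, and it gives an odd result when the event occurs in the root folder itself. A slightly more defensive way to derive the relative path might look like the sketch below. VppPaths and GetRelativePath are hypothetical names that are not part of the project, and the sketch assumes Windows-style backslash separators.

```csharp
using System;

public static class VppPaths
{
    // Hypothetical helper: returns the part of "folder" that lies below "root",
    // with a leading backslash, or an empty string when folder is the root itself.
    public static string GetRelativePath(string root, string folder)
    {
        if (root == null) throw new ArgumentNullException("root");
        if (folder == null) throw new ArgumentNullException("folder");

        string normalisedRoot = root.TrimEnd('\\');
        string normalisedFolder = folder.TrimEnd('\\');

        // Windows paths are case-insensitive, so ignore case when comparing.
        if (string.Equals(normalisedFolder, normalisedRoot, StringComparison.OrdinalIgnoreCase))
            return string.Empty;

        // Require a separator after the root so "C:\VPP\MySiteOther" is not
        // mistaken for a folder under "C:\VPP\MySite".
        string prefix = normalisedRoot + "\\";
        if (!normalisedFolder.StartsWith(prefix, StringComparison.OrdinalIgnoreCase))
            throw new ArgumentException("Folder is not under the configured VPP root.", "folder");

        // Keep the leading separator so the result can be appended to a network path.
        return normalisedFolder.Substring(normalisedRoot.Length);
    }
}
```

For example, GetRelativePath(@"C:\EPiServer\VPP\MyEPiServerSite\", @"C:\EPiServer\VPP\MyEPiServerSite\Global\Images") returns \Global\Images, which can be appended to each networkPath value.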
The code for the file sync can be seen below. In theory conflicts should never happen as the physical location of each file in the VPP structure is held in the database. However, should this ever happen, I think it's safe to say that we should always take the newest file. The OnItemConflicting and OnItemConstraint methods take care of this by letting the source of each one-way pass win (SourceWins); the first pass runs from the server where the change was just made, so its copy is normally the newest. The FileSyncProvider conveniently maintains metadata about each VPP folder location. Standard log4net functionality is used to allow administrators to see exactly what's going on in the sync process. The full code for the sync class can be seen below:
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.Synchronization.Files;
using Microsoft.Synchronization;
using EPiServer.Web.Hosting;
using EPiTools.VPPFolderSync.Configuration;
using log4net;
namespace EPiTools.VPPFolderSync
{
public class SyncVPPFolders
{
private static readonly ILog _log = LogManager.GetLogger(typeof(SyncVPPFolders));
/// <summary>
/// Synchronise the unified file
/// </summary>
/// <param name="sourceFile">The unified file to sync</param>
public void PerformFileSync(UnifiedFile sourceFile)
{
string sourceFolder = Path.GetDirectoryName(sourceFile.LocalPath);
string rootPhysicalPath = VPPSyncConfiguration.VPPFolderSync.LocalPath.TrimEnd(@"\".ToCharArray()) + @"\";
string relativePath = @"\" + sourceFolder.Replace(rootPhysicalPath, string.Empty);
List<string> vppFolders = VPPSyncConfiguration.VPPFolderSync.SyncFolders.ToList();
this.PerformSync(sourceFolder, vppFolders, relativePath);
}
/// <summary>
/// Synchronise the source folder with all items in the folder sync list
/// </summary>
/// <param name="sourceFolder">The folder on the local disk we want to synchronise. E.g. C:\websites\mysite\VPP\Global\path\to\my\file</param>
/// <param name="folderSyncList">A list of folders we want to sync with. E.g \\machine1\VPP$, \\machine2\VPP$</param>
public void PerformSync(string sourceFolder, List<string> folderSyncList)
{
PerformSync(sourceFolder, folderSyncList, string.Empty);
}
/// <summary>
/// Synchronise the source folder with all items in the folder sync list
/// </summary>
/// <param name="sourceFolder">The folder on the local disk we want to synchronise. E.g. C:\websites\mysite\VPP\Global\path\to\my\file</param>
/// <param name="folderSyncList">A list of folders we want to sync with. E.g \\machine1\VPP$, \\machine2\VPP$</param>
/// <param name="relativePath">A relative path to tag onto each item in the folderSyncList. E.g \path\to\my\file</param>
public void PerformSync(string sourceFolder, List<string> folderSyncList, string relativePath)
{
try
{
// Set options for the synchronization session. In this case, options specify
// that the application will explicitly call FileSyncProvider.DetectChanges, and
// that items should be moved to the Recycle Bin instead of being permanently deleted.
FileSyncOptions options = FileSyncOptions.ExplicitDetectChanges |
FileSyncOptions.RecycleDeletedFiles | FileSyncOptions.RecyclePreviousFileOnUpdates |
FileSyncOptions.RecycleConflictLoserFiles;
// Ensure that the filesync data store objects are not sync'd
FileSyncScopeFilter filter = new FileSyncScopeFilter();
filter.FileNameExcludes.Add("filesync.metadata");
// Detect changes on the local drive and all folders in the replication list and
// synchronize the source folder with all replicas.
DetectChangesOnFileSystemReplica(sourceFolder, null, options);
foreach (string replicationLocation in folderSyncList)
{
// If the target replication location cannot be accessed then the
// server is offline or mis-configured therefore do not attempt replication
if (Directory.Exists(replicationLocation))
{
//Detect changes on the replication location
DetectChangesOnFileSystemReplica(
replicationLocation + relativePath, filter, options);
// Sync the replica with the source folder and vice versa. The third parameter
// (the filter value) is null because the filter is specified in
// DetectChangesOnFileSystemReplica()
SyncFileSystemReplicasOneWay(
sourceFolder, replicationLocation + relativePath, null, options);
SyncFileSystemReplicasOneWay(
replicationLocation + relativePath, sourceFolder, null, options);
}
}
}
catch (Exception ex)
{
_log.Error("Exception performing sync", ex);
}
}
// Create a provider, and detect changes on the replica that the provider represents.
public void DetectChangesOnFileSystemReplica(
string replicaRootPath,
FileSyncScopeFilter filter, FileSyncOptions options)
{
FileSyncProvider provider = null;
try
{
//Ensure the destination folder exists
if (!Directory.Exists(replicaRootPath))
Directory.CreateDirectory(replicaRootPath);
provider = new FileSyncProvider(replicaRootPath, filter, options);
provider.DetectChanges();
}
catch (Exception ex)
{
logError("Exception detecting file changes", ex);
}
finally
{
// Release resources.
if (provider != null)
provider.Dispose();
}
}
public void SyncFileSystemReplicasOneWay(
string sourceReplicaRootPath, string destinationReplicaRootPath,
FileSyncScopeFilter filter, FileSyncOptions options)
{
FileSyncProvider sourceProvider = null;
FileSyncProvider destinationProvider = null;
try
{
// Instantiate source and destination providers, with a null filter (the filter
// was specified in DetectChangesOnFileSystemReplica()), and options for both.
sourceProvider = new FileSyncProvider(
sourceReplicaRootPath, filter, options);
destinationProvider = new FileSyncProvider(
destinationReplicaRootPath, filter, options);
// Register event handlers so that we can write information
// to the log.
destinationProvider.AppliedChange +=
new EventHandler<AppliedChangeEventArgs>(OnAppliedChange);
destinationProvider.SkippedChange +=
new EventHandler<SkippedChangeEventArgs>(OnSkippedChange);
// Use SyncCallbacks for conflicting items.
SyncCallbacks destinationCallbacks = destinationProvider.DestinationCallbacks;
destinationCallbacks.ItemConflicting += new EventHandler<ItemConflictingEventArgs>(OnItemConflicting);
destinationCallbacks.ItemConstraint += new EventHandler<ItemConstraintEventArgs>(OnItemConstraint);
SyncOrchestrator agent = new SyncOrchestrator();
agent.LocalProvider = sourceProvider;
agent.RemoteProvider = destinationProvider;
agent.Direction = SyncDirectionOrder.Upload; // Upload changes from the source to the destination.
//Console.WriteLine("Synchronizing changes to replica: " + destinationProvider.RootDirectoryPath);
agent.Synchronize();
}
catch (Exception ex)
{
logError("Exception sync'ing files", ex);
}
finally
{
// Release resources.
if (sourceProvider != null) sourceProvider.Dispose();
if (destinationProvider != null) destinationProvider.Dispose();
}
}
// Provide information about files that were affected by the synchronization session.
private void OnAppliedChange(object sender, AppliedChangeEventArgs args)
{
switch (args.ChangeType)
{
case ChangeType.Create:
logInfo("Applied CREATE for file " + args.NewFilePath);
break;
case ChangeType.Delete:
logInfo("Applied DELETE for file " + args.OldFilePath);
break;
case ChangeType.Update:
logInfo("Applied OVERWRITE for file " + args.OldFilePath);
break;
case ChangeType.Rename:
logInfo("Applied RENAME for file " + args.OldFilePath + " as " + args.NewFilePath);
break;
}
}
// Provide error information for any changes that were skipped.
private static void OnSkippedChange(object sender, SkippedChangeEventArgs args)
{
string logMessage = "-- Skipped applying " + args.ChangeType.ToString().ToUpper()
+ " for " + (!string.IsNullOrEmpty(args.CurrentFilePath) ?
args.CurrentFilePath : args.NewFilePath) + " due to error";
if (args.Exception != null)
logError(logMessage, args.Exception);
else
logError(logMessage);
}
// Resolve conflicts in favour of the source replica of the current one-way pass.
private void OnItemConflicting(object sender, ItemConflictingEventArgs args)
{
args.SetResolutionAction(ConflictResolutionAction.SourceWins);
}
private void OnItemConstraint(object sender, ItemConstraintEventArgs args)
{
args.SetResolutionAction(ConstraintConflictResolutionAction.SourceWins);
}
#region Logging helper methods
private static void logInfo(string message)
{
if (_log.IsInfoEnabled)
{
_log.Info(message.Replace(@"\", @"\\"));
}
}
private static void logWarning(string message)
{
if (_log.IsWarnEnabled)
{
_log.Warn(message.Replace(@"\", @"\\"));
}
}
private static void logError(string message)
{
if (_log.IsErrorEnabled)
{
_log.Error(message.Replace(@"\", @"\\"));
}
}
private static void logError(string message, Exception exception)
{
if (_log.IsErrorEnabled)
{
_log.Error(message.Replace(@"\", @"\\"), exception);
}
}
#endregion
}
}
The code was adapted from code supplied on MSDN "How to: Synchronize Files by Using Managed Code": http://msdn.microsoft.com/en-us/library/ee617386%28SQL.105%29.aspx
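The conflict handlers in the listing use SourceWins rather than comparing file dates. If an explicit "newest file wins" comparison were wanted, the rule itself could be sketched as below. NewestFileRule and PickNewest are hypothetical names, this is plain System.IO code rather than Sync Framework API, and wiring it into the conflict handlers is not shown. It also assumes the clocks on the servers are reasonably in step.

```csharp
using System;
using System.IO;

public static class NewestFileRule
{
    // Hypothetical helper: returns whichever of the two paths was written to
    // most recently. When the timestamps are equal the first path is returned.
    public static string PickNewest(string firstPath, string secondPath)
    {
        // GetLastWriteTimeUtc reports a date in the year 1601 for a file that
        // does not exist, so an existing file always beats a missing one.
        DateTime firstWritten = File.GetLastWriteTimeUtc(firstPath);
        DateTime secondWritten = File.GetLastWriteTimeUtc(secondPath);

        return secondWritten > firstWritten ? secondPath : firstPath;
    }
}
```

Calling PickNewest with the local copy and the replica copy of a file returns the path whose contents should be kept.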
When sync is first run on a folder, a file called "filesync.metadata" will be placed in the root of the folder being synchronised. This file is the store that the SyncOrchestrator and FileSyncProvider use to keep track of changes to files and to decide which files have changed since the last sync.
One other thing to think about is that we are in a load balanced environment, so one or more of the machines in the cluster may be unavailable. While these machines are unavailable, changes may occur in VPP folders on the servers that are still serving the site. So we need to add a hook into the EPiServer application start event to ensure that we sync the local VPP folder with all of the other VPP folders on start up. This is easily achieved by modifying the Application_Start method of Global.asax:
public class Global : EPiServer.Global
{
protected void Application_Start(Object sender, EventArgs e)
{
...
SyncVPPFolders sync = new SyncVPPFolders();
sync.PerformSync(
VPPSyncConfiguration.VPPFolderSync.LocalPath,
VPPSyncConfiguration.VPPFolderSync.SyncFolders.ToList(),
string.Empty);
}
}
Other implementation options
Other possible options could include an EPiServer scheduled job instead of hooking into UnifiedFile events. This would present the same problem as the external sync tool I mentioned at the start of this post, where files would not be available until the job has run. However, it would mean that VPP folder synchronisation is integrated (and therefore visible) in EPiServer. This would present its own problems: unless specified, it's not possible to define which machine a job runs on, so folders would need to be sync'd in a round robin fashion with each server sync'ing its entire VPP folder structure with every server in the cluster. This would be slower than an "on demand" sync via a UnifiedFile event as demonstrated here, and offers little over a standard external file sync tool.
Conclusion
The solution can slow down publishing of content to VPP folders, but I think the slight performance hit is a reasonable price to pay for an always up-to-date set of VPP folders. The solution provides an extensible framework that uses the Microsoft Sync Framework to solve the age-old problem of keeping EPiServer VPP folders in sync in a load balanced environment. It ensures that synchronisation is transparent to the EPiServer installation and can be plugged in as required to any EPiServer 5 installation.
The full source of the project can be downloaded here: [Download]
Downloads
- Microsoft Sync Framework v2.0 SDK http://www.microsoft.com/downloads/details.aspx?familyid=89ADBB1E-53FF-41B5-BA17-8E43A2E66254&displaylang=en
- Microsoft Sync Framework 2.0 Redistributable Package http://www.microsoft.com/downloads/details.aspx?familyid=109DB36E-CDD0-4514-9FB5-B77D9CEA37F6&displaylang=en