
Synchronization Framework

An update to “Remote File Sync using WCF and MSF”

This is a follow-up to Bryant Likes’ post, where he gave a prototype solution for file synchronization over WCF. I converted the code to Microsoft Sync Framework 2.0, so it now compiles and seems to run well enough. Keep in mind, though, that it isn’t a complete example (neither was the original): it only does upload sync, it has no conflict resolution logic, etc. In my opinion this isn’t really worth pursuing any further, because one would need to develop two complete custom providers – and all that just for copying files, when the framework already contains a FileSyncProvider that knows how to cooperate with other providers and so should be able to communicate over WCF. On the other hand, if you do pull off the heroic act of completing this code, please let me know, because I’m (obviously) very interested in it. I tried to keep the modified code as close to the original as possible so that a simple diff (e.g. WinMerge) can show what I’ve done, because I’m not sure I got everything right (and I’m afraid Bryant wasn’t either).

Here’s the complete solution: http://www.8bit.rs/download/samples/RemoteSync converted to MSF 2.0.zip

Replicating self-referencing tables and circular foreign keys with Microsoft Sync Framework

Self-referencing tables – or, more generally, circular foreign key references between tables – are probably common in all but the simplest database designs. Yet the Microsoft Sync Framework has no clear strategy for replicating such data. I found various suggestions on the net. One is to order the rows so that parent records come before children: this is usable for self-referencing tables (although not endorsed by Microsoft, because the framework doesn’t guarantee it will respect this order), but it is nowhere near good enough for circular references – if two rows in two tables point at each other, no ordering can solve the problem. On an MSDN forum there was a suggestion to temporarily disable foreign key constraints. This I cannot take seriously, because it opens the database to corruption: all it takes is one faulty write while the constraint is down and I have invalid data in the database (unless I lock the tables before synchronization, and I’m not sure how to do that from within the Sync Framework).

So, when all else fails, you have to sit and think: what would be the general principle for solving this, Sync Framework notwithstanding? Exactly – do it in two passes. The problem appears only when inserting rows: if a row contains a reference to another row that hasn’t been created yet, we get a foreign key violation. So the strategy is to insert all rows without setting the foreign key fields, then do a second pass that only connects the foreign keys. If we do this after all tables have finished their first pass (inserts, updates, deletes and all), we also cover circular references, because by then the required rows are present in all tables.

That was fairly easy to figure out (and not much harder to implement, but more on that later). There is another, less obvious issue here: deleting rows. There may be other rows referencing the one we are deleting that haven’t been replicated yet. Since the Sync Framework applies deletes first, we can be fairly certain that the referencing rows are yet to be replicated – they will either be deleted or updated to reference something else. So we can put a null value into all fields that reference our row. (Note that this will probably mark the other rows as modified and cause them to be replicated back. This is an issue I won’t go into in this post, but I’m quite certain there needs to be a global mechanism for disabling change tracking while we’re writing replicated data. I currently use a temporary “secret handshake” solution: I send a special value – the birth date of Humphrey Bogart – in the row’s creation/last update date fields, which disables the change tracking trigger.)
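To illustrate the “secret handshake”, here is a minimal sketch of what such a change tracking trigger could look like. This is my assumption for illustration only – the trigger name, the setupCommand object and the exact tracking logic are not from my actual code – shown as a setup command in the style of the adapters below:

		// Hypothetical change tracking trigger honoring the 'Humphrey Bogart'
		// sentinel: writes coming from replication are skipped, so they are not
		// marked as local changes and replicated back. Assumes the database
		// option RECURSIVE_TRIGGERS is OFF (the default).
		setupCommand.CommandText = @"CREATE TRIGGER trg_Item_TrackChanges
ON Item AFTER INSERT, UPDATE
AS
BEGIN
	-- the sync adapters put the sentinel into one of the two date columns
	IF EXISTS (SELECT * FROM inserted
		WHERE CreatedDate = '1899-12-25 00:00:00.000'
		OR LastUpdatedDate = '1899-12-25 00:00:00.000')
		RETURN
	-- a genuinely local change: stamp it so the incremental selects pick it up
	UPDATE Item SET LastUpdatedDate = GETDATE()
	WHERE ID IN (SELECT ID FROM inserted)
END";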

Ok, on to the code. I won’t give you a working example here, just sample lines with comments. You’ve probably figured out by now that the SQL commands for the sync adapter have to be written by hand. I don’t know about you, but I’m no longer surprised by this: many of the tools and components we get in the .Net framework packages solve only the simplest problems and provide nice demos – if you need anything clever, you code it by hand. My solution was to create my own designer/code generator, and now I’m free to support any feature I need (I’m also able to do it much faster than Microsoft, for whatever reason: it took me a couple of days to add this feature. It may be that I’m standing on the shoulders of giants, but the giants could really have spared a couple of days to do this themselves). For simplicity, I’ll show how to replicate a circular reference: there’s an Item table that has an ID, a Name, and a ParentID referencing the Item table itself. For replication, I split the table into two SyncAdapters: Item, which inserts only ID and Name and has a special delete command that eliminates foreign references beforehand, and Item2ndPass, which has only an insert command – and the only thing that insert command does is wire up the ParentIDs; it doesn’t insert anything. I’ve deleted all the usual command creation and parameter addition code; the point is only to show the SQL statements, since they hold the key to the solution.
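For reference, the sample schema could look something like this (a sketch with assumed column types, again shown as a setup command):

		// Assumed schema for the example: ParentID references the table itself
		setupCommand.CommandText = @"CREATE TABLE Item (
	ID int IDENTITY(1,1) NOT NULL PRIMARY KEY,
	Name nvarchar(100) NOT NULL,
	ParentID int NULL REFERENCES Item (ID), -- the circular/self reference
	CreatedDate datetime NOT NULL,
	LastUpdatedDate datetime NULL
)";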

[Serializable]
public partial class ItemSyncAdapter : Microsoft.Synchronization.Data.Server.SyncAdapter
{
	partial void OnInitialized();

	public ItemSyncAdapter()
	{
		this.InitializeCommands();
		this.InitializeAdapterProperties();
		this.OnInitialized();
	}

	private void InitializeCommands()
	{
		// InsertCommand
		// 1899-12-25 00:00:00.000 is a 'Humphrey Bogart' special value telling
		// the change tracking trigger to skip this row
		this.InsertCommand.CommandText =  @"SET IDENTITY_INSERT Item ON
INSERT INTO Item ([ID], [Name], [CreatedDate], [LastUpdatedDate]) VALUES (@ID, @Name,
@sync_last_received_anchor, '1899-12-25 00:00:00.000') SET @sync_row_count = @@rowcount
SET IDENTITY_INSERT Item OFF";

		// UpdateCommand
		this.UpdateCommand.CommandText = @"UPDATE Item SET [Name] = @Name,
CreatedDate='1899-12-25 00:00:00.000', LastUpdatedDate=@sync_last_received_anchor WHERE
([ID] = @ID) AND (@sync_force_write = 1 OR ([LastUpdatedDate] IS NULL OR [LastUpdatedDate]
<= @sync_last_received_anchor)) SET @sync_row_count = @@rowcount";		

		// DeleteCommand
		this.DeleteCommand.CommandText = @"UPDATE Item SET [ParentID] = NULL
WHERE [ParentID] = @ID DELETE FROM Item WHERE ([ID] = @ID) AND (@sync_force_write = 1 OR
([LastUpdatedDate] <= @sync_last_received_anchor OR [LastUpdatedDate] IS NULL))
SET @sync_row_count = @@rowcount";

		// SelectConflictUpdatedRowsCommand, SelectConflictDeletedRowsCommand
		// skipped because they are not relevant

		// SelectIncrementalInsertsCommand
		this.SelectIncrementalInsertsCommand.CommandText = @"SELECT [ID],
[Name], [ParentID], [CreatedDate], [LastUpdatedDate] FROM Item WHERE ([CreatedDate] >
@sync_last_received_anchor AND [CreatedDate] <= @sync_new_received_anchor)";

		// SelectIncrementalUpdatesCommand
		this.SelectIncrementalUpdatesCommand.CommandText = @"SELECT [ID],
[Name], [ParentID], [CreatedDate], [LastUpdatedDate] FROM Item WHERE ([LastUpdatedDate] >
@sync_last_received_anchor AND [LastUpdatedDate] <= @sync_new_received_anchor AND
[CreatedDate] <= @sync_last_received_anchor)";

		// SelectIncrementalDeletesCommand
		this.SelectIncrementalDeletesCommand.CommandText = @"SELECT FirstID
AS ID FROM sys_ReplicationTombstone WHERE NameOfTable = 'Item' AND DeletionDate >
@sync_last_received_anchor AND DeletionDate <= @sync_new_received_anchor";
	}

	private void InitializeAdapterProperties()
	{
		this.TableName = "Item";
	}

} // end ItemSyncAdapter

[Serializable]
public partial class Item2ndPassSyncAdapter : Microsoft.Synchronization.Data.Server.SyncAdapter
{
	partial void OnInitialized();

	public Item2ndPassSyncAdapter()
	{
		this.InitializeCommands();
		this.InitializeAdapterProperties();
		this.OnInitialized();
	}

	private void InitializeCommands()
	{
		// InsertCommand
		this.InsertCommand.CommandText =  @"UPDATE Item SET [ParentID] = @ParentID,
CreatedDate='1899-12-25 00:00:00.000', LastUpdatedDate=@sync_last_received_anchor WHERE ([ID] =
@ID) AND (@sync_force_write = 1 OR ([LastUpdatedDate] IS NULL OR [LastUpdatedDate] <=
@sync_last_received_anchor)) SET @sync_row_count = @@rowcount";

		// SelectIncrementalInsertsCommand
		this.SelectIncrementalInsertsCommand.CommandText = @"SELECT  [ID],
[ParentID] FROM Item WHERE ([CreatedDate] > @sync_last_received_anchor AND [CreatedDate] <=
@sync_new_received_anchor)";

	}

	private void InitializeAdapterProperties()
	{
		this.TableName = "Item2ndPass";
	}

} // end Item2ndPassSyncAdapter

In this case, it would be enough to set up the second-pass sync adapter to execute after the first one. For circular references, I put all second-pass adapters at the end, after all first-pass adapters. Notice that the commands for selecting incremental inserts and updates read all columns – this is probably suboptimal because some fields will not be used, but it’s much more convenient to have all field values handy than to rework the whole code generator template for each minor adjustment.
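With the classic hub-and-spoke classes, that ordering is simply the order in which the adapters are added to the server provider’s collection. A minimal sketch (the rest of the agent/provider wiring is omitted):

// The adapters run in the order they appear in the collection, so all
// second-pass adapters are added after all first-pass ones.
DbServerSyncProvider serverProvider = new DbServerSyncProvider();
serverProvider.SyncAdapters.Add(new ItemSyncAdapter());        // first pass
// ... first-pass adapters for the other tables would go here ...
serverProvider.SyncAdapters.Add(new Item2ndPassSyncAdapter()); // second pass

// the agent needs matching table entries (names match SyncAdapter.TableName);
// local provider, remote provider and sync direction are omitted for brevity
SyncAgent agent = new SyncAgent();
agent.Configuration.SyncTables.Add(new SyncTable("Item"));
agent.Configuration.SyncTables.Add(new SyncTable("Item2ndPass"));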

UPDATE (24.11.2010): I’ve added a source file illustrating this solution. I haven’t tested it (although it does compile and may well work), but it could be useful in showing the overall picture. It was created by extracting one generated sync adapter from my application, hacking away most of our specific code and then making it compile: http://www.8bit.rs/download/samples/ItemSyncAgentSample.cs Note that the file contains a couple of things that stray from the standard implementation, like using one table for all tombstone records, using the sample SQL Express client sync provider, etc. Just ignore these. One thing, though, may be of interest (I may even do a blog post about it one day): the insert command knows how to restore the autoincrement ID after replication, so that different autoincrement ranges can exist in different replicated databases (no GUIDs are used for primary keys) and identity insert is possible. This is necessary because SQL Server automatically sets the current autoincrement seed to the largest value inserted into the autoincrement column. Keep in mind that this (as well as the whole class, for that matter) may not be the best way to do things – but I’ve been using it for some time now and haven’t had any problems.
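The general idea can be sketched roughly like this (my reconstruction for illustration, not the exact code from the linked file): remember the current seed, do the identity insert, then reseed:

		// Rough sketch of an insert command that preserves the local
		// identity range after an identity insert of a replicated row
		this.InsertCommand.CommandText = @"DECLARE @seed int
SET @seed = IDENT_CURRENT('Item')
SET IDENTITY_INSERT Item ON
INSERT INTO Item ([ID], [Name], [CreatedDate], [LastUpdatedDate])
VALUES (@ID, @Name, @sync_last_received_anchor, '1899-12-25 00:00:00.000')
SET @sync_row_count = @@rowcount
SET IDENTITY_INSERT Item OFF
-- put the seed back so local autoincrement continues from its own range
DBCC CHECKIDENT ('Item', RESEED, @seed)";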

Sync Framework 2 CTP2 – no SqlExpressClientSyncProvider yet

Ok, I’ve finally gotten around to installing CTP2 of the Sync Framework. Let’s see what new and interesting stuff I got with it:

1. A headache.
2. Erm... anything else?

All witticisms aside, a lot of details have probably changed, but my main gripe still stands: there is no support for hub-and-spoke replication between two SQL Servers. (Oh, and the designer is still unusable… make that two gripes.)

As for the first gripe, I hoped I was finally going to get rid of my debugged SqlExpressClientSyncProvider demo, but no such luck. It turns out that nothing of the sort is (yet?) included in the sync framework. Judging by a forum post, v2 is due to be released soon, but any question regarding the SQL Express provider is met with dead silence. It doesn’t seem it will be included this time (9200 downloads of the SqlExpressClientSyncProvider demo are obviously not significant to these guys). You almost literally have to read between the lines: the only information about this CTP was a very sparse announcement, and a somewhat misleading one at that (and since this is the only information you get, any ambiguity can lead you in the wrong direction).

The CTP2 release announcement said:

# New database providers (SqlSyncProvider and SqlCeSyncProvider)
Enable hub-and-spoke and peer-to-peer synchronization for SQL Server, SQL Server Express, and SQL Server Compact.

So, does SqlSyncProvider work in a hub-and-spoke scenario? I thought yes. How would you interpret the above sentence?

The truth is that SqlSyncProvider cannot be used as the local sync provider in a SyncAgent (that is, in a hub-and-spoke scenario as I know it) because it does not derive from ClientSyncProvider. The SyncAgent explicitly denies it – and throws a very unhelpful exception that essentially says “ClientSyncProvider”… Translated, this means: “use Reflector to see what happened”, which I did. The code in the SyncAgent looks like this:

public SyncProvider LocalProvider 
{ 
    get 
    { 
        return this._localProvider; 
    } 
    set 
    { 
        ClientSyncProvider provider = value as ClientSyncProvider; 
        if ((value != null) && (provider == null)) 
        { 
            throw new InvalidCastException(typeof(ClientSyncProvider).ToString()); 
        } 
        this._localProvider = provider; 
    } 
}

(Someone was too lazy to write a meaningful error message… How much effort does it take? I know I wouldn’t tolerate this kind of behavior in my company.)

So there’s no chance of it working (unless I’m somehow using an old version of the SyncAgent). Is there any other way to do a hub-and-spoke sync, without a SyncAgent? None that I know of. But the docs for CTP2 say:

SqlSyncProvider and SqlCeSyncProvider can be used for client-server, peer-to-peer, and mixed topologies, whereas DbServerSyncProvider and SqlCeClientSyncProvider are appropriate only for client-server topologies.

I thought that client-server was the same as hub-and-spoke; now I’m not so sure… In the end, after hours spent researching, I still don’t know what to think.

Debugging SQL Express Client Sync Provider

How do you finally get the SQL Express Client Sync Provider to work correctly? It’s been almost a year since it was released, and it still has documented bugs. One was detected by Microsoft more than a month after release and documented on the forum, but the fix was never included in the released version. We could analyze this kind of shameless negligence in the context of Microsoft’s overall quality policies, but that’s a broad (and also well documented) topic, so we’ll leave it at that. It wouldn’t be such a problem if nobody were interested in using the provider, but people are, very much so. So, what else is there to do than to try to fix what we can ourselves… You can find the source for the class here. To use it, you may also want to download (if you don’t already have it) the original SQL Express provider source, which has the solution and project files I didn’t include. (UPDATE: the original source seems to have been removed from the MSDN site, and my code was updated – see the comments on this post to download the latest version.)

The first (and solved, albeit only on the forum) problem was that the provider reversed the sync direction. This happens because the client provider basically simulates client behavior by internally using a server provider. In hub-and-spoke replication, the distinction between client and server is important, since only the client databases keep track of synchronization anchors (that is, remember what was replicated and when). I also incorporated the support for datetime anchors I proposed in the mentioned forum post, which wasn’t present in the original source.

But that is not all that’s wrong with the provider: it seems that it also swaps client and server anchors, and that is a very serious blunder, because it’s very hard to detect. It effectively uses client time/timestamps to detect changes on the server and vice versa. I tested it using datetime anchors, and this is the most dangerous situation, because if the server clocks aren’t perfectly synchronized, data can be lost. (It might behave differently with timestamps, but I doubt it.) The obvious solution is to swap the anchors again before and after running synchronization. This can be done by modifying the ApplyChanges method like this:

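// Swap each table's sent/received anchors: the inner server provider must
// see them from the client's point of view.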
foreach (SyncTableMetadata metaTable in groupMetadata.TablesMetadata)
{
    SyncAnchor temp = metaTable.LastReceivedAnchor;
    metaTable.LastReceivedAnchor = metaTable.LastSentAnchor;
    metaTable.LastSentAnchor = temp;
} 

// this is the original line
SyncContext syncContext = _dbSyncProvider.ApplyChanges(groupMetadata, dataSet, syncSession); 

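// Swap the anchors back so they are stored for the correct side after sync.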
foreach (SyncTableMetadata metaTable in groupMetadata.TablesMetadata)
{
    SyncAnchor temp = metaTable.LastReceivedAnchor;
    metaTable.LastReceivedAnchor = metaTable.LastSentAnchor;
    metaTable.LastSentAnchor = temp;
}

This seems to correct the anchor confusion, but for some reason the @sync_new_received_anchor parameter still receives an invalid value in the update/insert/delete stage, so it shouldn’t be used there. The reason could be that both the client and the server use the same sync metadata, and the server sync provider posing as a client probably doesn’t think it is required to leave valid anchor values after it’s finished. I promise to post some more information I gathered poking around the sync framework innards in the future.

Note that this version is by no means fully tested nor without issues, but its basic functionality seems correct. You have to be careful to use @sync_new_received_anchor only in queries that select changes (either that, or modify the provider further to correct this behaviour: I think it could be done by storing and restoring the anchors in the metadata during ApplyChanges, but I’m not sure whether that is compatible with the provider’s internal logic). Another minor issue I found was that the trace log reports both the client and the server provider as servers. If you find and/or fix another issue with the provider, please post a comment here so that we can one day have a fully functional provider.

Microsoft Sync Framework: where will it end?

It seems that the Microsoft Sync Framework is being developed in a hurry. It is quite a big task – or at least it should become big if we’re to have a serious data distribution framework – and therefore it probably merits some patience, but the first thing to be sacrificed in such cases is documentation. So what we now have is a piece of software for which it is not easy to figure out how it works – and once you do, there are cases where you’re left to your own power of deduction to figure out how it’s supposed to work.

I haven't dug into the framework deeply enough yet, but I'm digging. And I intend to document the findings, even if it means just sketching everything in short sentences. There are many things that aren't obvious until you start disassembling (and Reflector/ILSpy-ing, of course) the innards of the DLLs. And even then, you have to keep notes, because it's not a simple system. I intend to come back to this subject in the posts to come (and get much more technical), so be sure to check back if you're interested.

I cannot say for sure, but given the complexity of the problem the Sync Framework set out to solve, it is commendably (somewhat bravely, even) comprehensive, well thought out – and quite stable for a Microsoft V1. There's a V2 in the air right now, but it's a technology preview, and it mostly contains a more mature version of the stuff we've already seen. But even V2 or V3 will only be a small first step: what the framework currently does – copying database rows back and forth between PCs – is not a mechanism that will let us easily build distributed systems one day. Even the Sync Framework guys themselves acknowledge that the biggest obstacle is replication conflicts – irregularities that occur when the same piece of data is changed in multiple locations at the same time. Microsoft cannot help but give us a simplified solution in the form of record-by-record detection and resolution, and that is because the framework is in its very early stages: I don't know if (or when) it will grow smart enough to handle more serious conflict resolution.

The thing is, record-by-record resolution cannot help you enforce business rules. For example, if your business logic depends on an invoice not containing the same product multiple times, how do you prevent that in a distributed system? Two users working with two different databases can each add a record for a single item, but when the records replicate you get two of them. This really needs to be detected, and not in a way that requires programming a separate copy of the validation logic just for synchronization issues (which, when you think about it, would have to validate data that was already successfully written to the database… I shudder to think of it). The synchronization framework would need to somehow integrate with the validation logic: in this respect (probably the biggest issue, but only one of many), the Microsoft Sync Framework is much closer to the start line than to the finish. But at least it's moving…

The IT industry has so far moved on mostly in a step-by-step fashion, by implementing better solutions than the existing ones. This is where the Sync Framework will be of most use: to finally help us start thinking in terms of distributed data. Once there's a working system for data distribution, most people will be interested in having it. And once they have it, it will be much easier to persuade them that they need to structure their data and/or applications differently. Hopefully we'll move towards an application design philosophy in which distributed data is a "good thing", just like object-orientedness, layered structure, etc. are "good things" today.

There's a good chance CRUD will be one of the things we'll start getting rid of. Because, once you look at it, storing only the current state of the data (which is the essence of the CRUD – Create, Read, Update, Delete – philosophy) is the major factor in causing replication conflicts. If databases stored operations – that is, changes to the data – besides the data itself, it would be much easier to resolve conflicts, many of them automatically. The logic would know what the two mentioned users did – added the same product to the invoice – and could act with this knowledge. In this concrete example, the situation would be much clearer for conflict resolution: the system could replay the operations, so the second one would get a chance to detect that a record is already present and act accordingly – be it to add the second quantity to the first or to raise an error. Note that a common validation logic for the operation could now be employed… This is light years away from getting "fait accompli" duplicated rows and having to play Sherlock Holmes to discover what happened. Of course, it's also light years from where we currently are, but when you think about it, database servers are way overdue for serious feature upgrades – and in any case, they already store something resembling this in their transaction logs.

So, it seems we're making the first step in the right general direction, even if we're not sure what precise direction to take. Trying to wrap our heads around the distributed data philosophy is good – and seeing this practice widely deployed will be even better.
