Tech blog (42)

Fluent API vs. named arguments for readability

Fluent APIs are great to make code readable - should I say, at the expense of verbosity? Because verbosity is, actually, one of the positive traits of being fluent. It is very useful in complex and hard to follow scenarios like data transformation (think LINQ), configuration, testing etc., leading the developer to the solution but also documenting what's going on.

One example: I have a LINQ-like method that processes an array. It takes as parameters a criterion for determining duplicates and separate operations for new and repeated entries. It could look like this:

    r => r.Name.Trim(),
    x => new Entry() { Name = x.Name.Trim(), Date = x.Date },
    (a, b) => { b.Date = a.Date > b.Date ? b.Date : a.Date; }

But it isn't obvious what's going on here... The code isn't descriptive at all - and only partially due to the fact that the variable naming scheme stinks. If we make it fluent, the purpose is much clearer:

    .WithKey(r => r.Name.Trim())
    .SelectFirstAppearance(x => new Entry() { Name = x.Name.Trim(), Date = x.Date })
    .ProcessRepeatedAppearance((a, b) => { b.Date = a.Date > b.Date ? b.Date : a.Date; });

But what is the real difference here? In the second example we introduced methods just to describe the parameters from the first example. These parameters had names, though, didn't they, and in this respect the biggest flaw of the first example was that they were invisible. But, we can make them visible using named arguments. Like this:

    key: r => r.Name.Trim(),
    selectFirstAppearance: x => new Entry() { Name = x.Name.Trim(), Date = x.Date },
    processRepeatedAppearance: (a, b) => { b.Date = a.Date > b.Date ? b.Date : a.Date; }

Arguably, this makes the code as readable as the fluent example - aesthetic concerns aside. Of course, fluent can be much more powerful than this and allow more flexibility by providing different methods for different scenarios. But, when describing parameters is the primary concern, named arguments may work just as well.

Naturally, standard rules for code readability also apply, and the first one is naming. If we give descriptive names to arguments, we get something like this:

    key: sourceRow => sourceRow.Name.Trim(),
    selectFirstAppearance: sourceRow => new Entry() { Name = sourceRow.Name.Trim(), Date = sourceRow.Date },
    processRepeatedAppearance: (sourceRow, previousEntry) => { previousEntry.Date = sourceRow.Date > previousEntry.Date ? previousEntry.Date : sourceRow.Date; }

This could be enough for someone familiar with the API to understand the intention.

New site, new technology, old blog

Just a short notice: We have migrated our site to different technology (from WordPress to Joomla K2) and new design. I think I managed to reconstruct all blog posts and comments so it should work as before. Of feed URLs I'm not sure, I added a redirection for the main feed but category and comment feeds are probably unusable, so please update to new ones. If you find something wrong with the blog, please contact us at office (at)

Building customizable applications in .Net

Most modern applications today have at least some degree of modularity and customizability. By customizability I mean its simplest form - having one “vanilla” application with standard features and many derived versions adapted with minimal modifications to customer’s needs. It is common practice to use dependency injection (DI) and with it we can influence the behaviour of our application by being able to replace one component with another. But DI alone does not provide everything needed to make the application truly customizable. Each aspect of an application needs to support modularity and customizability in its own way. For example, how do you maintain a customized relational database?

This post is intended to be the first of a series regarding issues and solutions we found developing our own customizable application, a kind of an introduction to the subject so that I can write about concrete solutions to concrete problems, as/when they pop up. Here I’ll give an overview of the general architecture, problems and possible solutions.

Our product is a .Net WinForms line-of-business application. I think WinForms is still the best environment for LOB apps as the others are still not mature enough (like WPF) or simply not suitable (like the web). I would say it’s also good for customizability because it doesn’t impose an overly complex architecture: customizability will make it complex by itself.

As I said, DI fits naturally at the heart of such a system. DI can be used for large-grained configuration in the sense that it can transparently replace big components with other components implementing the same features. It’s generally suitable for assembling interfaces and possibly bigger parts of the business logic. It’s not very suitable for a data access layer: even if you used a superfast DI container that puts no noticeable overhead when used with thousands of records, that would be just part of the solution. A bigger question would be how you can customize queries or the database schema. So, for data access we need a different solution, and I believe that this part of the puzzle is the most slippery since it’s so heterogeneous. Firstly, there’s usually the relational database that’s not modular at all: how do you maintain different versions of the same database, each with part of its structure custom-built for a specific client? (It would be, I suppose, a huge relief if an object-oriented database was used, but this is rarely feasible in LOB). Then, there are SQL queries which you cannot reuse/override/customize unless you parse the SQL. Then the data access classes, etc.

Get content from WPF DataGridCell in one line of code (hack)

How do you get the text displayed in a WPF DataGridCell? It should be simple, but incredibly it doesn’t seem it is: all the solutions given on the ‘net contain at least a page of code (I suppose the grid designers didn’t think anyone would want to get the value from a grid cell). But when you quick-view a DataGridCell in the debugger, it routinely shows the required value in the “value” column. It does this by calling a GetPlainText() method, which, unfortunately, isn’t public. We can hack it by using reflection – and, absurdly, this solution seems more elegant than any other I’ve seen.

DataGridCell cell = something;

var value = typeof(DataGridCell).GetMethod("GetPlainText", 
System.Reflection.BindingFlags.NonPublic | System.Reflection.BindingFlags.Instance)
.Invoke(cell, null);

CruiseControl.Net Missing Xml node (sourceControls) for required member (ThoughtWorks . CruiseControl . Core . Sourcecontrol . MultiSourceControl . SourceControls).

It’s a silly error but the solution is not very obvious or logical… I modified my CruiseControl.Net configuration to include multiple source control nodes, but it started complaining that the XML was malformed. The error was something like

[CCNet Server] ERROR CruiseControl.NET [(null)] - Exception: 
  Unable to instantiate CruiseControl projects from configuration document.
  Configuration document is likely missing Xml nodes required for properly populating CruiseControl configuration.
  Missing Xml node (sourceControls) for required member (ThoughtWorks.CruiseControl.Core.Sourcecontrol.MultiSourceControl.SourceControls).
  Xml: <sourcecontrol><sourcecontrols><hg><executable>C:\Program Files\TortoiseHg\hg.exe</executable> […]

It complains of a missing sourceControls node for required member SourceControls – but it’s present in the xml. The problem is the capital “C”: 1. XML is case-sensitive. 2. CruiseControl config tags are inconsistent in that sourcecontrol is not camel-cased but sourceControls is. That’s what confused me - hopefully this post will help someone else.

Workaround for HQL SELECT TOP 1 in a subquery

HQL doesn’t seem to support clauses like “SELECT TOP N…”, which can cause headaches when for example you need to get the data for the newest record from a table. One way to resolve this would be to do something like “SELECT * FROM X WHERE ID in (SELECT ID FROM X WHERE Date IN (SELECT MAX(Date) FROM X))”, a doubly nested query which looks complicated even in this simple example and gets out of control when query conditions need to be more complex.

What is the alternative? Use EXISTS – as in “a newer record doesn’t exist”. It still looks a bit ugly but at least it’s manageable. The above query would then look like this: “SELECT * FROM X AS X1 WHERE NOT EXISTS(SELECT * FROM X AS X2 WHERE X2.Date > X1.Date)”

Note that this works only for “SELECT TOP 1”. For a greater number there doesn’t seem to be a solution at all.

How to create a temporary table that can be seen by different SqlCommands on the same connection?

The unexpected answer (that I learned the hard way) is: well, it depends on whether you have parameters on your command or not. The point being, if you execute a parameterless SqlCommand, the sql gets executed directly, the same way as if you entered it into the query analyzer. If you add a parameter, the things change in that a call to sp_execsql stored procedure gets inserted in the executed sql. The difference here is the scope: if you create a temporary table from within the sp_execsql, it's scope will be the stored procedure call and it will be dropped once the stored procedure finishes. In that case, you cannot use different commands to access it. If you execute a parameterless command, the temporary table will be connection-scoped and will be left alive for other commands to access. In that case, the other commands can have parameters because their sp_execsql call will be a child scope and will have access to parent scope's temporary table.

As to why they did it this way, I can't say I understand.

Lessons learned: migrating a complicated repository from subversion to mercurial

Migrating from SVN to Mercurial is a simple process only if the SVN repository has a straight-and-square structure - that is, there are trunk, branches and tags folders in the root and nothing else, not in its present state or ever before. If you used your SVN repository in a way that was convenient in SVN but not in Mercurial – for example, you created branches in various subdirectories, it still shouldn’t be too hard to migrate. But if you, like me, decided late in the game to create the mentioned folders in SVN and then moved an renamed your folders, you will need to invest serious time if you don’t want to lose parts of your history. You need to plot your migration very thoroughly and do a lot of test runs.

The reason for this is that Mercurial’s ConvertExtension is somewhat of a low-level tool. (In other words, although reliable it is not too bright). Browsing the internet you may get the impression that it’s an automated conversion system: it isn’t. It does fully automated migration only for straight SVN repositories, but for the rest it’s more like something to use in your migration script. It seems to do its primary purpose – converting revisions from one repository format to another – quite well but the rest of the tool is not so intelligent and it needs help. So, lesson number one: if you have a complex repository, don’t take the migration lightly.

A small disclaimer is in order: this post is not intended to be a complete step-by-step guide to migration. Rather, it’s something to fill in the blanks left by what little is available on the internet. I’ve done a complex migration and I want to do a brain dump for my future reference or for “whomeverother it may concern”.

Between the two alternatives I perceived as most promising, hgsubversion and the convert extension, i chose the latter. Hgsubversion was claimed by some to be the better tool for this job, but it was somewhat troublesome. The problem with hgsubversion was that it had a memory leak and broke easily in the middle of the conversion (note that this happened a couple of months ago: things may have changed in the meantime). The solution, they say, was to do hg pull repeatedly until it finishes. I wanted to do a hg clone with a filemap, but when the import broke I was in trouble because hg pull doesn’t accept filemaps. (It could be that the filemap was cached somewhere inside of the new repository and my worries were unfounded, I don’t really know). I may try that in the future. One other way around it would be to do a straight clone of SVN – no branches or anything – into an intermediate mercurial repository and then split that into separate final repositories. In that case, hgsubversion could be a viable solution, maybe even better than the conversion extension. I had more success with the conversion extension so this is what we’ll talk about here.

Subscribe to this RSS feed