Linq is really handy. Once you’re familiar with the syntax and the way it works, it becomes very useful.
Recently I had to write a front-end to compare lists of financial instruments coming from two different sources. One source was a SQL Server database, the other source was an in-house market data system.
The result set from the database was extracted by a stored procedure into a collection. The collection’s type is IEnumerable<Result1> where Result1 is a type automatically generated by the Linq designer using the stored procedure definition.
The other result set comes from the in-house market data system. It is extracted with an API that constructs Result2 objects and accumulates them into a collection List<Result2>.
The purpose of the little app is to show the instruments present in the first result set and not the other one. The types Result1 and Result2 are defined as follows:
Fill both collections with some dummy data:
I want to compute result1 MINUS result2. In Linq this is done with Except. So I’d like to write something like this:
…but that does not compile because result1 and result2 have two different types. Therefore I need to do a projection, like this:
The above compiles fine, but there is a problem: the result of the diff is the original result1 collection. It doesn’t remove the elements present also present in result2! This is because at this stage Except does not know how to test Result1 objects for equality.
Except takes an IEqualityComparer<T> as second argument. Let’s define one:
The diff now looks like this:
It’s still not working! Why? When Except compares the result sets, it uses the comparer’s GetHashCode first. If two objects have two different hashcodes then it doesn’t bother calling Equals().
In the code above the GetHashCode() override calls the default Object.GetHashCode(). Object.GetHashCode() is useless: it generates an index as a function of the object reference. Even if two objects have the same value Object.GetHashCode()returns two different hashes. And Except thinks the objects are different!
No need to burn too much brainpower on writing the hash function as long as:
- it’s fast
- two objects with the same value return the same hashcode.
The following should do:
No the MINUS operation with Except works.
Limits of Linq:
The SQL-like way of comparing collections is elegant but it’s nothing more than a nice way of writing for loops. Don’t expect miracles on the performance side…
The best place to perform queries is still in the database.
In the case of the app described above performing queries on the client is just fine. However there are situations where importing into the database and having all the queries take place in SQL Server would be faster.