Continuous Testing for Visual Studio

The other night I was playing around with a side project. I try to use a rather strict TDD approach for these projects, so I run my tests a lot as I move forward, and spend quite some time waiting for the results. This is a tedious and frankly unnecessary manual step; what I needed was continuous testing: unit tests that run themselves continuously, making sure I don’t break anything.

I remembered reading about JUnit Max by Kent Beck, a continuous testing plugin for Eclipse, that runs your unit tests in the background and unobtrusively tells you when a test fails, allowing you to do what you do best: write code. JUnit Max seems like a great thing, and now I needed the same thing for Visual Studio. A quick Google didn’t yield any add-ins, extensions or packages, so I decided to create one.

The result is Continuous Testing for Visual Studio, a small extension which runs your unit tests each time you build your solution, and reports failing tests to the error list so you can navigate to the line that failed and make the test pass. The extension significantly improves my workflow by removing a tedious manual step of running unit tests, so I encourage you to take it for a spin. Continuous Testing can be downloaded for Visual Studio 2008 and Visual Studio 2010. Future updates and versions will be announced on the Continuous Testing home page.

UPDATE Jun 17th, 2010: I’ve received a lot of feedback through various channels online. To be able to help you and/or improve Continuous Testing for Visual Studio, I need samples from you that reproduce the problems you are experiencing. Do not hesitate to leave a comment here; if you provide your e-mail address when commenting, you will receive a reply.

Copyable available on GitHub

People actually download and use Copyable, and they tend to use it in scenarios I haven’t used it in. This results in bug reports and patch submissions. So far, these have been given to me by e-mail or by blog comment, neither of which is a particularly great way of receiving them. So after receiving another one today, I finally got around to putting Copyable on GitHub.

The version I put up includes several enhancements from the latest release:

  • It uses FormatterServices.GetUninitializedObject and hence does not depend on a parameterless constructor or custom instance provider (but you can of course still create an instance provider if you want to control object initialization)
  • The bug with copy semantics for already visited objects submitted by Walter Oesch has been fixed
  • The bug with inherited fields found by Alex, and the patch submitted for it, has been incorporated

Bleeding edge Copyable can be found at http://github.com/havard/copyable. The clone URL is git://github.com/havard/copyable.git. Now go fix your own bugs! Or even better, enhance the framework.

Minimalistic MapReduce in .NET 4.0 with the new Task Parallel Library (TPL)

Among the new features in .NET 4.0 are several additions by the Parallel Computing Platform Team. As I wandered through the documentation of the Task library with cloud computing and parallelism buzz in the back of my head, I got the idea of using tasks to create a minimalistic MapReduce. Here’s the result, a rather crude and simple, but efficient MapReduce for you to play with and utilize!

What is MapReduce?

For those of you who don’t know it: MapReduce is a simplified interface for parallel data processing. It was initially described by the Google engineers Jeffrey Dean and Sanjay Ghemawat in the 2004 paper titled “MapReduce: Simplified Data Processing on Large Clusters”.

MapReduce processes data by splitting the processing into a set of transformations. In functional programming, this is called the “map” function: it maps, or transforms, an input to an output. The results of the transformations are then combined into a single result. In functional programming, this is called the “reduce” function: it reduces a set of values to a single value. On a side note, LINQ has equivalent functions, but the names are different, presumably to make them more familiar to people with SQL knowledge. In LINQ, map is called Select, and reduce is called Aggregate.
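To make the Select/Aggregate correspondence concrete, here is a minimal sequential sketch. The class and method names are made up for illustration; only Select and Aggregate come from LINQ itself.

```csharp
using System;
using System.Linq;

public static class LinqMapReduceExample
{
    // "Map" each value to its square with Select, then "reduce"
    // the squares to a single sum with Aggregate (seeded with 0).
    public static int SumOfSquares(int[] values)
    {
        return values
            .Select(v => v * v)
            .Aggregate(0, (sum, square) => sum + square);
    }
}
```

Calling LinqMapReduceExample.SumOfSquares(new[] { 1, 2, 3, 4, 5 }) returns 55. Nothing runs in parallel here; the point is only the shape of the two steps.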

In short, to process a huge set of data, you split the data into chunks and process each chunk in parallel. This creates a new set of intermediary results, which are then reduced to a single result.

Implementing a minimalistic MapReduce in .NET 4.0

The signature of my MapReduce function is


static Task<TResult> Start<TInput, TPartial, TResult>(
  Func<TInput, TPartial> map, 
  Func<TPartial[], TResult> reduce, 
  params TInput[] inputs);

In other words, to start a MapReduce run, you supply a map function, a reduce function, and a set of inputs. Each input will be turned into an intermediate result (of type TPartial). Inputs are transformed concurrently. When all inputs are transformed, the reduce function is called to transform the partial results into a final result (of type TResult). Cool!

The map part is implemented by starting a task for each supplied input using Task.Factory.StartNew().


Task.Factory.StartNew(() => map(input));

The reduce part is implemented as a continuation of all the map tasks, meaning that the reduce task waits for all the map tasks to complete, and then executes. This is achieved using Task.Factory.ContinueWhenAll.


Task.Factory.ContinueWhenAll(
  mapTasks, 
  tasks => PerformReduce(reduce, tasks));

As you can see, the implementation is minimalistic and simple, and so is its usage.
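Putting the two snippets together, the whole function might look roughly like the sketch below. This is not the downloadable source: the class name MapReduceSketch is made up here, and the PerformReduce helper from the snippet above is replaced by an inline lambda that simply collects each map task's Result.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class MapReduceSketch
{
    public static Task<TResult> Start<TInput, TPartial, TResult>(
        Func<TInput, TPartial> map,
        Func<TPartial[], TResult> reduce,
        params TInput[] inputs)
    {
        // Map: start one task per input. Select hands each input to the
        // lambda as a parameter, so every task works on its own value.
        Task<TPartial>[] mapTasks = inputs
            .Select(input => Task.Factory.StartNew(() => map(input)))
            .ToArray();

        // Reduce: runs once all map tasks have completed, and receives
        // the partial results in the same order as the inputs.
        return Task.Factory.ContinueWhenAll<TPartial, TResult>(
            mapTasks,
            tasks => reduce(tasks.Select(t => t.Result).ToArray()));
    }
}
```

If a map function throws, reading t.Result in the continuation rethrows the failure, so the returned task ends up faulted rather than silently producing a result.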

Here’s a simple example using MapReduce to calculate the root mean square (RMS) of a set of values:


var task = MapReduce.Start<int, int, double>(
  i => i * i,
  s => Math.Sqrt(s.Aggregate((a, b) => a + b) / (double)s.Length),
  1, 2, 3, 4, 5);
// Wait for result
task.Wait();
// Prints 3.3166...
Console.WriteLine(task.Result);
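For reference, the printed value can be checked by hand. The map step squares each input and the reduce step takes the square root of the mean of the squares:

```latex
\[
\mathrm{RMS} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^2}
= \sqrt{\frac{1^2 + 2^2 + 3^2 + 4^2 + 5^2}{5}}
= \sqrt{\frac{55}{5}}
= \sqrt{11} \approx 3.3166
\]
```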

Actual applications of MapReduce are of course far more interesting than this simple example.

Applications of MapReduce

MapReduce can essentially be applied to any problem where you need a number of things to be done in parallel. It can even be applied in cases where you don’t need a final result. Just return an arbitrary value as the result (or even better, implement a variant of my MapReduce which uses Action<T>).
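A variant along those lines could look something like the following. This is only a sketch of the idea; the class name ParallelActions is hypothetical and not part of the downloadable source.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class ParallelActions
{
    // Runs the action once per input, each in its own task, and returns
    // a task that completes when all of them have finished.
    public static Task Start<TInput>(
        Action<TInput> action,
        params TInput[] inputs)
    {
        Task[] tasks = inputs
            .Select(input => Task.Factory.StartNew(() => action(input)))
            .ToArray();

        // WaitAll returns immediately here because every task is already
        // done; it is called so that failures in the actions surface as
        // an AggregateException on the returned task.
        return Task.Factory.ContinueWhenAll(
            tasks,
            completed => Task.WaitAll(completed));
    }
}
```

Since there is no reduce step, any results have to be communicated through side effects, so the action must be safe to run concurrently.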

A few obvious use cases:

  • Distributed search
  • Distributed sort
  • Tokenization
  • Indexing
  • Log processing
  • Machine learning
  • General artificial intelligence
  • General data mining
  • Large scale image processing

The list goes on and on, these are just a few things off the top of my head.

You can grab the source code for MapReduce here. Since this is done in .NET 4.0, it requires Visual Studio 2010 Beta 2 or later.

As usual, play around with it, have fun, and let me know if you find it useful!

RSA using BouncyCastle

Trying to do RSA using BouncyCastle, but struggling to find your way around the API? In a previous post (see here) I pondered why the RSA implementation in System.Security.Cryptography is restricted to only the most common usage scenarios. I mentioned BouncyCastle as an alternative for those who wanted a more flexible API, but never got around to providing examples where BouncyCastle was used. By request, this post provides usage examples by building a crude and simple, but efficient set of methods for RSA key generation, encryption, and decryption, all built on top of BouncyCastle.

NOTE: The general cryptographic security of the presented method is beyond the scope of the article. The code presented is not cryptographically secure for large data sets. If you’re here looking for a way to do cryptographically secure RSA in the general case, you should look into more complicated approaches including padding, blinding, and more sophisticated block cipher modes. Cryptography is a topic undergoing constant research, so stay up to date and be sure to evaluate the strength of your solution for the scenarios in which you apply it.

BouncyCastle provides flexibility and control over your encryption approach, which comes at a cost. The BouncyCastle API might be a bit hard to cope with at first, but if you know encryption in general you should be able to find your way around the API without too much effort. This post will be focusing on RSA, since that was my original need, but it should be mentioned that BouncyCastle provides many other asymmetric (and symmetric) algorithms for which the usage is similar to what you find below.

Creating RSA keys

Creating RSA keys is a simple task. The method below lets you specify the key size in bits, and creates a key pair for you.


public AsymmetricCipherKeyPair GenerateKeys(int keySizeInBits)
{
  RsaKeyPairGenerator r = new RsaKeyPairGenerator();
  r.Init(new KeyGenerationParameters(new SecureRandom(),
    keySizeInBits));
  AsymmetricCipherKeyPair keys = r.GenerateKeyPair();
  return keys;
}

That’s all there is to it.

Encryption

Now that we have a key pair, we are ready to encrypt and decrypt using RSA. In the example below, we use a key (public or private) to encrypt a byte sequence. To encrypt a string, simply convert the string to a byte array first, for example using Encoding.UTF8.GetBytes.


public byte[] Encrypt(byte[] data, AsymmetricKeyParameter key)
{
  RsaEngine e = new RsaEngine();
  e.Init(true, key);

  int blockSize = e.GetInputBlockSize();

  List<byte> output = new List<byte>();

  for (int chunkPosition = 0; chunkPosition < data.Length;
    chunkPosition += blockSize)
  {
    // chunkPosition is already a byte offset, so the number of
    // remaining bytes is data.Length - chunkPosition.
    int chunkSize = Math.Min(blockSize, data.Length -
      chunkPosition);
    output.AddRange(e.ProcessBlock(data, chunkPosition,
      chunkSize));
  }
  return output.ToArray();
}

The approach above uses a list to gather output for the sake of simplicity. Note that the RSA engine can only process a limited block size at a time (block size depends on the key size). The approach above processes a data set of an arbitrary size.
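The chunking itself is independent of BouncyCastle, and it can help to look at it in isolation. The helper below is a hypothetical illustration (it is not part of the post's code or of the BouncyCastle API): it lists the offset and length of each chunk for a given data length and block size, where the length of the last chunk is whatever remains after the current offset.

```csharp
using System;
using System.Collections.Generic;

public static class ChunkingSketch
{
    // Returns (offset, length) pairs that cover dataLength bytes
    // in chunks of at most blockSize bytes.
    public static List<KeyValuePair<int, int>> Chunks(
        int dataLength, int blockSize)
    {
        if (blockSize <= 0)
            throw new ArgumentOutOfRangeException("blockSize");

        var chunks = new List<KeyValuePair<int, int>>();
        for (int offset = 0; offset < dataLength; offset += blockSize)
        {
            // The offset already counts bytes, so the remaining length
            // is dataLength - offset.
            int length = Math.Min(blockSize, dataLength - offset);
            chunks.Add(new KeyValuePair<int, int>(offset, length));
        }
        return chunks;
    }
}
```

With an assumed block size of 255 bytes, 600 bytes of data are split into chunks at offsets 0, 255 and 510, with lengths 255, 255 and 90.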

The above method does not impose constraints on which key you use for encryption. Use the public key or the private key as you see fit for your solution.

Decryption

The Decrypt method is very similar to the Encrypt method:


public byte[] Decrypt(byte[] data, AsymmetricKeyParameter key)
{
  RsaEngine e = new RsaEngine();
  e.Init(false, key);

  int blockSize = e.GetInputBlockSize();

  List<byte> output = new List<byte>();

  for (int chunkPosition = 0; chunkPosition < data.Length;
    chunkPosition += blockSize)
  {
    // chunkPosition is already a byte offset, so the number of
    // remaining bytes is data.Length - chunkPosition.
    int chunkSize = Math.Min(blockSize, data.Length -
      chunkPosition);
    output.AddRange(e.ProcessBlock(data, chunkPosition,
      chunkSize));
  }
  return output.ToArray();
}

Again, it’s up to you which key you choose to use. If you want to use the common approach, encrypt using a symmetric cipher, hash the data, and sign the hash with your private key using the above Encrypt method. If you want to use another approach like encrypting the actual data using your private key, you are free to do so.
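As mentioned in the note at the top of this post, padding is one of the things to look into for real-world use. As a rough sketch of what that could look like, the example below wraps RsaEngine in BouncyCastle's Pkcs1Encoding and round-trips a short message. The class name RsaRoundTripSketch and the 2048-bit key size are my own choices for illustration, the message must fit in a single block, and this is not a vetted design.

```csharp
using System;
using System.Text;
using Org.BouncyCastle.Crypto;
using Org.BouncyCastle.Crypto.Encodings;
using Org.BouncyCastle.Crypto.Engines;
using Org.BouncyCastle.Crypto.Generators;
using Org.BouncyCastle.Security;

public static class RsaRoundTripSketch
{
    // Encrypts a short message with the public key and decrypts it
    // again with the private key, using PKCS#1 v1.5 padding.
    public static string RoundTrip(string message)
    {
        // Key generation as in GenerateKeys; 2048 bits is an assumption.
        RsaKeyPairGenerator generator = new RsaKeyPairGenerator();
        generator.Init(
            new KeyGenerationParameters(new SecureRandom(), 2048));
        AsymmetricCipherKeyPair keys = generator.GenerateKeyPair();

        byte[] plain = Encoding.UTF8.GetBytes(message);

        // The padded cipher exposes the same Init/ProcessBlock calls
        // as the raw engine.
        IAsymmetricBlockCipher encryptor =
            new Pkcs1Encoding(new RsaEngine());
        encryptor.Init(true, keys.Public);
        byte[] cipher = encryptor.ProcessBlock(plain, 0, plain.Length);

        IAsymmetricBlockCipher decryptor =
            new Pkcs1Encoding(new RsaEngine());
        decryptor.Init(false, keys.Private);
        byte[] decrypted =
            decryptor.ProcessBlock(cipher, 0, cipher.Length);

        return Encoding.UTF8.GetString(decrypted);
    }
}
```

Because the padded cipher has the same interface as the raw engine, the chunked Encrypt and Decrypt methods could be adapted in the same way, with GetInputBlockSize reporting the smaller block size that padding leaves room for.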

I hope this post helps those of you who want to apply RSA (or any other asymmetric cipher) to more subtle cases than those supported by the .NET framework.