Samples

MapReduceRoles4Azure beta-2 contains the following samples.

  • WordCount
  • Smith-Waterman sequence distance calculation
  • CAP3 sequence assembly
  • BLAST sequence alignment (under construction)

Prerequisites
1. Visual Studio 2010
2. Azure SDK (http://msdn.microsoft.com/en-us/windowsazure/cc974146.aspx)
All the samples are configured to run using the Azure development storage.

Running/Debugging using Azure local development fabric
Run Visual Studio 2010 as administrator (required for deploying to the development fabric) and open the sample solution. Run or debug the solution. The web-based monitoring console will open in a browser window.

Deploying in Azure
To deploy in Azure, first edit the “DataConnectionString” setting of each role in the solution so that it contains your storage account information. For each role:
1. Right-click the role and select Properties.
2. Go to Settings.
3. Select the “DataConnectionString” setting (add this setting if it is not there) and click the “...” button on the right-hand side.
4. Enter your Azure Storage Account information.
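
For reference, the value stored in “DataConnectionString” is a plain semicolon-separated string. The following Python sketch shows its shape, assuming the standard Azure Storage connection string format. The helper names, the account name, and the key are placeholders for illustration and are not part of the samples.

```python
# Hypothetical helpers for illustration; not part of MapReduceRoles4Azure.

# Value that points a role at the local development storage.
DEVELOPMENT_STORAGE = "UseDevelopmentStorage=true"


def build_connection_string(account_name, account_key, protocol="https"):
    """Compose a storage connection string from account details."""
    if not account_name or not account_key:
        raise ValueError("account name and key are both required")
    return (
        f"DefaultEndpointsProtocol={protocol};"
        f"AccountName={account_name};"
        f"AccountKey={account_key}"
    )


def parse_connection_string(value):
    """Split a connection string into a dict of its key=value parts."""
    parts = {}
    for item in value.split(";"):
        if item:
            # Split on the first "=" only: account keys may end in "=".
            key, _, val = item.partition("=")
            parts[key] = val
    return parts


if __name__ == "__main__":
    # Placeholder account name and key; substitute your own.
    print(build_connection_string("mystorageaccount", "<your-account-key>"))
```

When deploying, the development storage value is replaced by a string of the second form for every role.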

Follow the MSDN tutorials to learn how to deploy Azure applications from Visual Studio.

WordCount

This sample counts the occurrences of every word in a set of text files stored in an Azure storage container, similar to the Hadoop WordCount sample. The sample contains a web-role-based client as well as a command-line client.
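
As a rough illustration of what the sample computes, here is a minimal in-memory word count in Python, split into a map step and a reduce step. It is only a sketch: the function names, the tokenization rule, and the use of a dictionary in place of blob storage are assumptions, and it does not reflect how MapReduceRoles4Azure is implemented.

```python
# Illustrative only: an in-memory word count, not the MapReduceRoles4Azure code.
import re
from collections import Counter


def map_words(text):
    """Map step: emit a (word, 1) pair for every word in one file's text."""
    # Assumption: words are runs of letters, digits, or apostrophes,
    # compared case-insensitively.
    return [(word, 1) for word in re.findall(r"[A-Za-z0-9']+", text.lower())]


def reduce_counts(pairs):
    """Reduce step: sum the counts emitted for each word."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return totals


def word_count(files):
    """Run the map step over every file, then reduce the combined output."""
    pairs = []
    for text in files.values():
        pairs.extend(map_words(text))
    return reduce_counts(pairs)


if __name__ == "__main__":
    # The dict stands in for text files in a storage container.
    sample = {"a.txt": "the quick brown fox", "b.txt": "The lazy dog and the fox"}
    for word, count in sorted(word_count(sample).items()):
        print(word, count)
```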
1. Upload the data file from “sample_input.zip”, replicated as many times as you like (or any other text files you want to process), to a container in Blob storage. You can use an external program (e.g., CloudBerry Explorer) to upload the data and to manage the storage account.
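
One possible way to produce the replicated input locally before uploading is a short Python script like the sketch below. The function name and the numbered file-naming scheme are hypothetical choices, and the upload itself is still left to an external tool.

```python
# Hypothetical helper: make N local copies of one input file before uploading.
import shutil
from pathlib import Path


def replicate_file(source, output_dir, copies):
    """Copy `source` into `output_dir` `copies` times with numbered names."""
    source = Path(source)
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    created = []
    for index in range(copies):
        # For example, input.txt becomes input_0000.txt, input_0001.txt, ...
        target = output_dir / f"{source.stem}_{index:04d}{source.suffix}"
        shutil.copyfile(source, target)
        created.append(target)
    return created
```

For example, `replicate_file("input.txt", "upload", 100)` writes 100 copies into an `upload` folder, where `input.txt` is a placeholder for the file extracted from the sample archive. The folder can then be uploaded to the container.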

To be completed.....