There are quite a few differences between running an on-premises data center and using the cloud. One of these differences is the (guaranteed) uptime and the latency between the different servers. In a local on-premises data center you will have a pretty stable, and probably very fast, network connection between the servers. The cloud can be pretty fast as well, especially when your machines are located in the same datacenter/container. However, you don’t have any real guarantees on where your services get deployed, let alone the latency or connection quality between the servers.

This can lead to small problems when you have to hit a different server, like when accessing files from storage or querying a database. This is why the Microsoft Patterns & Practices team has published the Transient Fault Handling Application Block. This application block defines several transient exceptions which can occur when you try to access a service. These exceptions are known to sometimes resolve themselves within a short period of time, so retrying after a small delay might fix the problem.

Basically, all this application block does is help you add a retry mechanism, so you don’t have to worry much about these transient errors yourself. A couple of retry strategies are available out of the box when using Entity Framework. The SQL team has also provided a new retry mechanism to be used with the Elastic Scale libraries.
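
For completeness, this is roughly what enabling the built-in Entity Framework 6 retry strategy looks like. Consider it a minimal sketch of the EF feature mentioned above, not part of the Elastic Scale samples; the MyDbConfiguration class name is just an example.

using System;
using System.Data.Entity;
using System.Data.Entity.SqlServer;

// EF6 picks up this configuration automatically when it lives in the
// same assembly as your DbContext.
public class MyDbConfiguration : DbConfiguration
{
    public MyDbConfiguration()
    {
        // Retry up to 10 times, waiting at most 5 seconds between attempts.
        SetExecutionStrategy(
            "System.Data.SqlClient",
            () => new SqlAzureExecutionStrategy(10, TimeSpan.FromSeconds(5)));
    }
}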

When you have downloaded the Elastic Scale sample applications, you’ll have access to the SqlDatabaseUtils class, in which you will find the retry policy defined for the sample application.

/// <summary>
/// Gets the retry policy to use for connections to SQL Server.
/// </summary>
public static RetryPolicy SqlRetryPolicy
{
	get { return new RetryPolicy<SqlDatabaseTransientErrorDetectionStrategy>(10, TimeSpan.FromSeconds(5)); }
}

If an exception occurs within the scope of this policy, the policy checks whether the exception is transient. If so, the command is executed again after 5 seconds, up to 10 times. To determine whether a SqlException is transient, the policy checks if the error number matches one in a specified list. Below is an excerpt of the decompiled code.

foreach (SqlError error in sqlException.Errors)
{
    switch (error.Number)
    {
        case 40501:
            ThrottlingCondition throttlingCondition = ThrottlingCondition.FromError(error);
            sqlException.Data[(object) throttlingCondition.ThrottlingMode.GetType().Name] = (object) throttlingCondition.ThrottlingMode.ToString();
            sqlException.Data[(object) throttlingCondition.GetType().Name] = (object) throttlingCondition;
            return true;
        case 40540:
        case 40613:
        case 10928:
        case 10929:
        case 40143:
        case 40197:
        case 233:
        case 10053:
        case 10054:
        case 10060:
        case 20:
        case 64:
            return true;
        default:
            continue;
    }
}

If you want to know the details of these errors, just query your SQL Server.

SELECT * 
FROM [master].[dbo].[sysmessages]
WHERE error in (40501, 40540, 40613, 10928, 10929, 40143, 40197, 233, 10053, 10054, 10060, 20, 64)
AND msglangid = 1033

This will give you a list, in English, of these errors as known to the SQL Server you are running on. If you need a different language, just change or remove the msglangid constraint.

Using this retry policy is rather straightforward. The basics are shown in the sample applications.

SqlDatabaseUtils.SqlRetryPolicy.ExecuteAction(() =>
{
    using (var db = new ElasticScaleContext<int>(sharding.ShardMap, tenantId2, connStrBldr.ConnectionString))
    {
        var blog = new Blog { Name = name2 };
        db.Blogs.Add(blog);
        db.SaveChanges();
    }
});

When executing this piece of code and you are faced with a transient SqlException, the request is executed again on the same connection/context until the constraints of the policy are met.
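
If you want some insight into what the policy is actually doing, the application block also exposes a Retrying event you can subscribe to for logging. Below is a small sketch (the console logging is just an example). Note that the policy has to be stored in a variable first, since the SqlRetryPolicy property creates a new instance on every call.

var retryPolicy = SqlDatabaseUtils.SqlRetryPolicy;
retryPolicy.Retrying += (sender, args) =>
{
    // Invoked before each retry attempt; handy for diagnostics.
    Console.WriteLine("Retry {0} after {1}: {2}",
        args.CurrentRetryCount, args.Delay, args.LastException.Message);
};

retryPolicy.ExecuteAction(() =>
{
    // Your database work goes here.
});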

The code runs a bit differently in my project, since I’m not creating the database connection and the database context in the same place. The reason for this is that I sometimes have to run a query/command on multiple shards, so I had to resort to a small workaround to get the proper connections and database contexts.

One of my connection managers now contains the following piece of code:

return shards.Select(s =>
{
	return DatabaseUtility.SqlRetryPolicy.ExecuteAsync(() =>
	{
		var task = s.OpenConnectionAsync(credentialsConnectionString, ConnectionOptions.Validate);
		return task.ContinueWith(t => (IDbConnection) t.Result, TaskContinuationOptions.OnlyOnRanToCompletion | TaskContinuationOptions.ExecuteSynchronously);
	});
});

At this place, I’m creating all of the IDbConnections for the necessary shards. All of this is done within the scope of the retry policy.
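
Since each element of the returned sequence is a Task<IDbConnection>, the caller still has to await them before use. Hypothetically (the connectionManager and GetShardConnections names are just for illustration), that could look like this inside an async method:

// Materialize all shard connections; each open already happened
// within the scope of the retry policy.
IEnumerable<Task<IDbConnection>> connectionTasks = connectionManager.GetShardConnections();
IDbConnection[] connections = await Task.WhenAll(connectionTasks);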

Creating the database context and executing the request is done in this piece of code.

public async Task<IList<T>> InvokeQuery<T>(IDbConnection connection, Func<CustomerDatabaseEntities, T> command)
{
	var resultSet = new List<T>();
	Exception exception = null;
	return await TransientDatabase.SqlRetryPolicy.ExecuteAsync(async () =>
	{
		CustomerDatabaseEntities context = null;
		try
		{
			context = await this.SetCurrentContext(connection);

			var result = await Task<T>.Factory.FromAsync(command.BeginInvoke, command.EndInvoke, context, null);
			if (result != null)
			{
				resultSet.Add(result);
			}
		}
		catch (Exception ex)
		{
			exception = ex;
		}
		finally
		{
			if (context != null && this.TransactionManager == null)
			{
				this.contexts.Remove(context);
				context.Dispose();
			}
		}
	})
	.ContinueWith(t =>
	{
		this.ThrowIfRealException(exception);
		return resultSet;
	});
}

private void ThrowIfRealException(Exception exception)
{
	if (exception != null && !TransientDatabase.SqlRetryPolicy.ErrorDetectionStrategy.IsTransient(exception))
	{
		throw exception;
	}
}

You might notice a rather strange piece of code here in the catch block and the ContinueWith method. There is a reason for this, though. When I ran my tests on this piece of code, it became clear that exceptions thrown within the retry policy scope were swallowed, even when they weren’t transient exceptions.
As you can imagine, this isn’t a desirable scenario, so I had to think of a workaround. Now, when an exception occurs, I’m able to store it locally and throw it later on, outside the scope of the retry policy.

Using the above examples, it’s rather doable to create a resilient connection to the database. Hope it helps!

Now that you have configured Elastic Scale for your solution, there are still some changes to be made in your application.
At the moment there are two sample applications available, one using plain old SQL queries (ADO.NET) and another using Entity Framework. I would suggest checking out these samples before doing any serious Elastic Scale work.

You will probably notice there are some small differences between the ADO.NET sample and the Entity Framework sample. One of them is that the Entity Framework sample doesn’t show how to do multi-shard queries. This is because Entity Framework (or any other ORM, for that matter) can’t use the multi-shard connection objects provided by the Elastic Scale libraries. I’ve developed a small workaround (hack) for this, which is shown at the end of this post.

First, let’s step through the basics on how the Elastic Scale library is supposed to be used.

As stated in my earlier posts about this subject, Elastic Scale should be used when you are creating different shards, each with a subset of the data. In order to connect to a shard, all you need to do is use the Elastic Scale library.

// Looks up the key in the shard map and opens a connection to the shard
using (SqlConnection conn = shardMap.OpenConnectionForKey(customerId, credentialsConnectionString))
{
	// Create a simple command that will insert a new order
	SqlCommand cmd = conn.CreateCommand();

	// Execute your commands the way you normally would.
}

As you can see, all you need is the shard map you want to use, the identifier to search for (the shardlet key), and a credentialsConnectionString. The shardlet mapping is specified in the ShardMap of the ShardMapManager. The connection string is something you will have to create yourself. It should only contain the credentials to log in to a database, as the Elastic Scale library will figure out the server and database by itself. The sample application has a nice helper method for this, which you can of course reuse.

For reference, this is the implementation of the method in the sample applications.

/// <summary>
/// Returns a connection string to use for Data-Dependent Routing and Multi-Shard Query,
/// which does not contain DataSource or InitialCatalog.
/// </summary>
public static string GetCredentialsConnectionString()
{
    // Get User name and password from the app.config file. If they don't exist, default to string.Empty.
    string userId = ConfigurationManager.AppSettings["UserName"] ?? string.Empty;
    string password = ConfigurationManager.AppSettings["Password"] ?? string.Empty;

    // Get Integrated Security from the app.config file. 
    // If it exists, then parse it (throw exception on failure), otherwise default to false.
    string integratedSecurityString = ConfigurationManager.AppSettings["IntegratedSecurity"];
    bool integratedSecurity = integratedSecurityString != null && bool.Parse(integratedSecurityString);

    SqlConnectionStringBuilder connStr = new SqlConnectionStringBuilder
    {
        // DDR and MSQ require credentials to be set
        UserID = userId,
        Password = password,
        IntegratedSecurity = integratedSecurity,

        // DataSource and InitialCatalog cannot be set for DDR and MSQ APIs, because these APIs will
        // determine the DataSource and InitialCatalog for you.
        //
        // DDR also does not support the ConnectRetryCount keyword introduced in .NET 4.5.1, because it
        // would prevent the API from being able to correctly kill connections when mappings are switched
        // offline.
        //
        // Other SqlClient ConnectionString keywords are supported.

        ApplicationName = "ESC_SKv1.0",
        ConnectTimeout = 30
    };
    return connStr.ToString();
}

Now, if you want to use Entity Framework (or some other ORM), you can use the same methods to open up a connection. Just pass this connection to your database context and you are set to go. It couldn’t be any easier!
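
To make that concrete, here’s a minimal sketch, assuming an EF6 code-first context. The BlogContext class is hypothetical; the important part is the DbContext constructor overload which accepts an existing connection.

public class BlogContext : DbContext
{
    // Hand the already-opened shard connection to the base DbContext.
    // contextOwnsConnection: false leaves closing/disposing to the caller.
    public BlogContext(DbConnection existingConnection, bool contextOwnsConnection)
        : base(existingConnection, contextOwnsConnection) { }

    public DbSet<Blog> Blogs { get; set; }
}

// Usage: route to the correct shard, then query it through EF.
using (SqlConnection conn = shardMap.OpenConnectionForKey(customerId, credentialsConnectionString))
using (var db = new BlogContext(conn, contextOwnsConnection: false))
{
    var blogs = db.Blogs.ToList();
}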

The next thing, which is pretty awesome, is doing queries on all shards at once. The SQL team has created a new MultiShardConnection object for us to use in such a scenario.

When using this MultiShardConnection a query is executed on all shards at once so you don’t have to create all of these connections by yourself. The code sample below is another excerpt from the Elastic Scale sample applications.

// Create the multi-shard connection
using (MultiShardConnection conn = new MultiShardConnection(shards, credentialsConnectionString))
{
	// Create a simple command
	using (MultiShardCommand cmd = conn.CreateCommand())
	{
		// Because this query is grouped by CustomerID, which is sharded,
		// we will not get duplicate rows.
		cmd.CommandText = @"
            SELECT 
                c.CustomerId, 
                c.Name AS CustomerName, 
                COUNT(o.OrderID) AS OrderCount
            FROM 
                dbo.Customers AS c INNER JOIN 
                dbo.Orders AS o
                ON c.CustomerID = o.CustomerID
            GROUP BY 
                c.CustomerId, 
                c.Name
            ORDER BY 
                OrderCount";

		// Append a column with the shard name where the row came from
		cmd.ExecutionOptions = MultiShardExecutionOptions.IncludeShardNameColumn;

		// Allow for partial results in case some shards do not respond in time
		cmd.ExecutionPolicy = MultiShardExecutionPolicy.PartialResults;

		// Allow the entire command to take up to 30 seconds
		cmd.CommandTimeout = 30;

		// Execute the command. 
		// We do not need to specify retry logic because MultiShardDataReader will internally retry until the CommandTimeout expires.
		using (MultiShardDataReader reader = cmd.ExecuteReader())
		{
			//Do some logic over here
		}
	}
}

As you can see, querying all of your shards is quite easy when using the MultiShardConnection. Just keep in mind the queries could perform a bit slowly, depending on the latency between the shards.
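
As a rough sketch of what the “Do some logic over here” part could look like: the MultiShardDataReader behaves like a regular data reader, and because IncludeShardNameColumn was set, an extra $ShardName column tells you which shard each row came from.

while (reader.Read())
{
    var customerName = reader.GetString(reader.GetOrdinal("CustomerName"));
    var orderCount = reader.GetInt32(reader.GetOrdinal("OrderCount"));
    var shardName = reader.GetString(reader.GetOrdinal("$ShardName"));

    Console.WriteLine("{0}: {1} orders (from {2})", customerName, orderCount, shardName);
}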

There is one big problem when you are using an ORM and need to do these multi-shard queries: the MultiShardConnection cannot be used within a database context. So if you really need to run queries on all shards, you’ll have to create connections to all shards yourself, run the queries on the different shards, and aggregate the results into your data set. This is quite cumbersome and expensive, so don’t do it too often. I’ve had to create something which does this exact thing in my project, so I’ll just share the (very simplified) code below.

var resultSet = new List<T>();
var shardMap = shardMapManager.GetListShardMap<int>("MyShardMapName");
var allMappings = shardMap.GetMappings();
var shards = allMappings.Select(a => a.Shard).Distinct();
var credentialsConnectionString = this.databaseConnection.GetCredentialsConnectionString();
foreach (var shard in shards)
{
	var connection = shard.OpenConnection(credentialsConnectionString, ConnectionOptions.Validate);
	using (connection)
	{
		using (var context = new CrossPointStoreDatabaseEntities(connection))
		{
			var result = command.Invoke(context);
			if (result.Any())
			{
				resultSet.AddRange(result);
			}
		}
	}
}
return resultSet;

This isn’t the safest way to get results, but it’s the only one we’ve got at the moment. Hopefully this will be resolved in later releases of the Elastic Scale libraries. Be warned: use this ‘solution’ with caution.

When implementing a sharding solution, you will need something which knows in which shard a specific shardlet exists. This is something you will want to store in a single location, so you know for sure you are always using the most recent information. When using the Elastic Scale libraries, this is called the Shard Map Manager. The Shard Map Manager keeps track of the location and state of the shardlets and shards. As you can imagine, this is quite an important aspect of the sharding solution.

In a perfect world you generate the Shard Map Manager (SMM) once, telling it which shardlets reside in which shard, and never update it again. Since the Shard Map Manager only exists in one location and hardly ever changes, it’s a great candidate for caching. This is why the Elastic Scale libraries make sure the contents of the Shard Map Manager are cached right after the first call to the database. This way the latency between the SMM and the remote location is only paid once; after this first call the data is held in memory at the invoking location.

In the real world, however, the SMM will receive some changes from time to time. For example, if you are sharding by continent you might decide to narrow the shards down a bit by changing US to West US and East US. When sharding with ranges (0..100, 100..200, etc.) you might have to add some new ranges from time to time.
When such updates happen, you want these changes to be reflected at all remote locations. When using the Elastic Scale libraries, this cache invalidation and the fetching of new data for the cache is done automatically for you. This way you only have to focus on your data access code; all this management is handled for you.

When using Elastic Scale, one of the first steps is creating a SMM and a shard. This can be done by calling the appropriate library methods, as in the example below.

// Get shard map manager database connection string
// Try to get a reference to the Shard Map Manager in the Shard Map Manager database. If it doesn't already exist, then create it.
ShardMapManager shardMapManager;
var shardMapManagerExists = ShardMapManagerFactory.TryGetSqlShardMapManager(shardMapManagerConnectionString, ShardMapManagerLoadPolicy.Lazy, out shardMapManager);

if (shardMapManagerExists)
{
	ConsoleUtils.WriteInfo("Shard Map Manager already exists");
}
else
{
	// The Shard Map Manager does not exist, so create it
	shardMapManager = ShardMapManagerFactory.CreateSqlShardMapManager(shardMapManagerConnectionString);

	ConsoleUtils.WriteInfo("Created Shard Map Manager");
}

This will create your first SMM for the specified connection string (`shardMapManagerConnectionString`).

The newly created SMM is still empty, as it doesn’t have any reference to a shard database and doesn’t know about any shard map (the routing table which tells where a specific shardlet resides).

The first thing you want to do is add a new shard map. This shard map is necessary to add new shards and will also contain information about the shardlets. A shard map is created like so:

var shardMap = this.shardMapManager.CreateOrGetListShardMap<int>("MyExampleShardMap");

A new shard map with the name “MyExampleShardMap” is created and can be used to add new shards.

Adding a new, empty shard is also done by using the appropriate library calls. The sample below is an excerpt from my own management application.

// Choose the shard name
var databaseName = string.Format(CreateShard.ShardNameFormat, shardMap.GetShards().Count());

var serverName = this.configurationManager.GetAppSetting(ConfigurationManager.AppSettings.Database.ShardMapManagerServerName);

// Only create the database if it doesn't already exist. It might already exist if
// we tried to create it previously but hit a transient fault.
if (!this.sqlDatabaseUtils.DatabaseExists(serverName, databaseName))
{
	this.sqlDatabaseUtils.CreateDatabase(serverName, databaseName, this.log);
}

// Create schema and populate reference data on that database
// The initialize script must be idempotent, in case it was already run on this database
// and we failed to add it to the shard map previously
this.sqlDatabaseUtils.ExecuteSqlScript(serverName, databaseName, CreateShard.InitializeShardScriptFile, this.log);

// Add it to the shard map
var shardLocation = new ShardLocation(serverName, databaseName);
var shard = this.shardManagementUtils.CreateOrGetShard(shardMap, shardLocation);

The basis of this code can also be found in the sample application of Elastic Scale.

As you can see, the code first checks whether the database already exists. If not, it is created.

Once it is confirmed the empty database exists, a T-SQL script is executed which populates the database with all the necessary tables, stored procedures, functions, etc. This new database (shard) is then added to the specified shard map.

At this time, the system still doesn’t know which shardlets have to be stored in a specific shard. You still have to create (several) mappings (routes) in your shard map.

When using a ListShardMap you have to create a PointMapping.

shardMap.CreatePointMapping(sampleId, shard);

This code will add a new mapping to the specified shard with the specified identifier. Creating a mapping for a RangeShardMap works about the same.

var sampleRange = new Range<int>(0, 100);
var mappingForNewShard = shardMap.CreateRangeMapping(sampleRange, shard);

After you have added all mappings to the shard map, there is still one thing to do: adding the database schema information to the shard map manager.

As I’ve written earlier in this post and the previous post, when using Elastic Scale you have the power to distribute your data across multiple databases. You also have the opportunity to split and merge existing shard maps. For example, if you have a shard map defined with the range 0..100, you are able to change this on-the-fly to a 0..50 and a 50..100 shard map. All data which has to move is then migrated to a new shard. In order to do these complex mutations, the shard map manager has to know about the schema of the database. This is necessary to determine which records have to be moved.
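
For illustration, the metadata side of such a split looks roughly like the sketch below; the actual movement of rows is what the Split-Merge service handles. This assumes a RangeShardMap<int> containing the 0..100 mapping mentioned above.

var rangeShardMap = shardMapManager.GetRangeShardMap<int>("MyExampleShardMap");

// Find the mapping covering the 0..100 range and split it at 50.
// Both halves initially keep pointing at the same shard; only after
// this can the Split-Merge service move one half elsewhere.
var mapping = rangeShardMap.GetMappingForKey(0);
var splitMappings = rangeShardMap.SplitMapping(mapping, 50);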

When defining a schema you have to specify whether a table is a Reference Table or a Sharded Table. A Reference Table is a table which contains information that’s the same on all shards, for example a Users table. A Sharded Table is a table which contains a subset of the data. The data in these tables can be split and merged across different shards, depending on your defined shard maps.

The following code block creates a new database schema and adds it to the specified shard map in the shard map manager.

var schemaInfo = new SchemaInfo();

schemaInfo.Add(new ReferenceTableInfo("News"));
schemaInfo.Add(new ShardedTableInfo("Companies", "Id"));
schemaInfo.Add(new ShardedTableInfo("Orders", "CompanyId"));
schemaInfo.Add(new ReferenceTableInfo("Users"));

// Register it with the shard map manager for the given shard map name
this.shardMapManager.RegisterSchemaToShard("MyExampleShardMap", schemaInfo);

Keep in mind, for a Sharded Table you also have to specify the foreign key column name. This means all sharded tables have to contain a column with a foreign key relation to their matching shardlet. The Elastic Scale libraries aren’t smart enough to check for cascading foreign key relations yet. Maybe this will be added in the future, maybe not.

Now that you have also added a proper schema to the shard map manager you are good to go and are able to do some more advanced stuff, like querying on your databases.

I would advise checking out the sample applications provided in the Visual Studio Gallery and examining the specifics. After having done this, you can add this awesome library to your own solution.

These days we all want to build the next big thing, deployed and used across the world. This is of course all fun and games, but there are also some technical difficulties to overcome when creating a software platform which has to be available from everywhere in the world with a responsive interface.

One of the difficulties you will have to face is getting the required data near your customers. Most of the time we use a database to store this data. Traditionally, this will likely be an on-premises database somewhere in a datacenter (with all disaster recovery measures in place, of course). Note, I’m talking about traditional relational databases here, but most of this also applies to non-relational databases.

In order to deploy your software globally, the data has to move with it. It doesn’t make much sense to deploy your software solution in a datacenter on the other side of the world when all data remains far away, because the latency will slow down the experience. There are of course multiple solutions you can think of to solve this problem; one of them is to use sharding in your database.

Sharding is a concept which is widely used in today’s NoSQL database systems, although most of the time the sharding is done locally. It means you are splitting your data across multiple databases.

Wikipedia on sharding:

Horizontal partitioning is a database design principle whereby rows of a database table are held separately, rather than being split into columns (which is what normalization and vertical partitioning do, to differing extents). Each partition forms part of a shard, which may in turn be located on a separate database server or physical location.

http://en.wikipedia.org/wiki/Shard_%28database_architecture%29

As you can imagine, this is a great solution if you want to make data available in a datacenter near a customer. You can put the data of your European customers in a database deployed in Europe and the data of an American customer somewhere in the USA.
I’ve found a nice, simple picture to make the concept a bit more visual.


In the above image, the left side shows one massive database holding all of the data. On the right, this data is sharded across multiple databases. This is what sharding is all about.

The main advantage of this principle is that you ensure the smallest possible latency between the customer and the data. Another advantage is that you can create ‘premium’ databases and ‘standard’ databases, depending on the customer and the load it produces. You can even split customer data into several smaller databases (shards) and micromanage these even better.

As always, there are also several disadvantages you will need to consider. One of them is the increased latency when querying data from multiple shards (databases). Such a query has to run on all shards, so it will be as slow as the slowest connection/database in the system.
Another disadvantage, which is more important to the development team, is the complexity of a proper sharding solution. When implementing a sharding solution you have to make sure all data is stored consistently, create your own transactions, copy the data of reference tables across all shards, handle failure scenarios, migrate shardlets (data) between shards, etc.

Managing all of this yourself is doable, but not something you or your team want to do unless it’s explicitly necessary.

Lucky for those of us who live in the Microsoft Azure space, Microsoft has been releasing its Elastic Scale library since October 2014, and I have had the pleasure of using this library for a customer of mine. As of April 2015 (BUILD 2015) this library carries the 1.0 version number, so it’s ‘safe’ for everyone to use.

The Elastic Scale library saves us developers from the hassle of creating our own sharding implementation. The Elastic Scale library works great with SQL Azure databases, and with the Split-Merge Service package you also get a web application which is able to move, merge & split shardlets (data) between your shards. This Split-Merge service uses an eventual consistency technique, so you don’t have to worry (much) about corrupt data when moving shardlets around.

There is quite a bit of documentation (and samples) available via the documentation map. The getting started guide is also a nice place to begin checking out this library. Keep in mind, you do need a Windows Azure subscription in order to use this solution.

I’ll describe some of the implementation details in the upcoming posts about this subject.

For our automated deployments we have several Azure Organizational accounts in place. These are created within the Azure Active Directory.

Because these accounts are meant for services, we don’t want them to be subject to the default password policy, which forces a password renewal every X days. Lucky for us, you can configure this via PowerShell. A short how-to is written on MSDN.

The thing that isn’t written (or referenced) over there is how to run the MSOL cmdlets.

I kept getting the message `The term 'Set-MsolUser' is not recognized`. Searching a bit on this error, I found a thread on the Office 365 community forums where someone mentioned the “Microsoft Online Service Module for Windows PowerShell”. This set me off searching in the right direction. Apparently you need to install a (new/extra) PowerShell module on your system in order to use the MSOL cmdlets. These cmdlets are part of the Office 365 and Exchange Online services. A page with download links is provided by Microsoft Support. It links to the Microsoft Online Service Sign-in Assistant for IT Professionals and the Azure Active Directory Module for Windows PowerShell (32-bit and 64-bit).

Once installed, you are finally able to use the MSOL cmdlets. Keep in mind though, you have to connect to the MSOL services first using the connection cmdlet.

Connect-MsolService -Credential $azureADCredentials

After connecting to the service, you can change the service account’s password behavior to `Password Never Expires`.

For reference, this is the script I’ve used when changing the service account password policies:

function Set-CustomerAzureSubscription($subscriptionName)
{
    $azureSubscriptionSecurePassword  = ConvertTo-SecureString -String $azureSubscriptionPassword -AsPlainText -Force
    $azureCredentials = New-Object System.Management.Automation.PSCredential($azureSubscriptionUsername, $azureSubscriptionSecurePassword)

    Get-AzureAccount
    Add-AzureAccount -Credential $azureCredentials
    Get-AzureSubscription | % { Write-Host "Customer subscription: $($_.SubscriptionName)."}
    
    Write-Host "Selecting $($subscriptionName) as default Customer subscription."
    Select-AzureSubscription -SubscriptionName "$($subscriptionName)"
}

function Set-PasswordNeverExpiresForServiceAccounts($serviceAccountUsername, $serviceAccountPassword)
{
    $azureADCredentialsSecurePassword  = ConvertTo-SecureString -String $serviceAccountPassword -AsPlainText -Force
    $azureADCredentials = New-Object System.Management.Automation.PSCredential($serviceAccountUsername, $azureADCredentialsSecurePassword)
    Write-Host "Connecting to MSOL"
    Connect-MsolService -Credential $azureADCredentials

    Write-Host "Password never expires status of $($serviceAccountUsername)"
    Get-MSOLUser -UserPrincipalName $serviceAccountUsername | Select PasswordNeverExpires
    Write-Host "Setting password never expires status of $($serviceAccountUsername) to 'true'"
    Set-MsolUser -UserPrincipalName $serviceAccountUsername -PasswordNeverExpires $true
    Write-Host "Password never expires status of $($serviceAccountUsername)"
    Get-MSOLUser -UserPrincipalName $serviceAccountUsername | Select PasswordNeverExpires
}

Set-CustomerAzureSubscription $devSubscription
Set-PasswordNeverExpiresForServiceAccounts $devServiceAccount $devPassword

Set-CustomerAzureSubscription $accSubscription
Set-PasswordNeverExpiresForServiceAccounts $accServiceAccount $accPassword

Set-CustomerAzureSubscription $prodSubscription
Set-PasswordNeverExpiresForServiceAccounts $prodServiceAccount $prodPassword