Observations from a Full-Scale Migration to Windows Azure, Part 1 (Highlights)


Over the past several years, we have been designing and developing our systems in preparation for getting them up “into the cloud”. Whether this meant Microsoft, Amazon, or someone else was unimportant, as the architecture needed to allow for high-availability and load-balanced deployments of our systems; the cloud-specific issues could be figured out later. About a year and a half ago, we deployed some minor systems to Azure and consumed some of its services (most importantly queueing and blob storage). Over the past month and a half, we’ve been making changes specific to Azure. And last weekend, a co-worker of mine (to whom I can’t express enough gratitude) and I spent a grueling 72 hours, beginning Friday morning, migrating all of our databases and systems to Azure. We learned a lot through our various successes and failures during this migration, and in the time leading up to it.

For our system, we have a single set of internal WCF services hitting the database and half a dozen internal applications hitting those internal services. One of those internal applications is a set of externally-accessible WCF services, and we have some custom applications, built on our customers’ behalf, consuming those “public” services. Technologies/systems that we employ include the following:

  • SQL Server (50GB database where 23GB exists in a single table)
  • SQL Server Reporting Services
  • SQL Server Analysis Services
  • SQL Server Integration Services
  • WCF
  • .NET 4.0/MVC 2.0/Visual Studio 2010
  • Claims-Based Authentication (via ADFS)
  • Active Directory
  • Probably some more that I’m forgetting. If they’re important, I’ll add them back here.

By the end of the weekend, we had successfully migrated all critical systems to Azure (that we planned to), and only a couple of non-critical apps still needed migration. We (temporarily) pulled the plug on one of our non-critical applications, in part due to migration difficulties and in part due to a pre-existing bug that needs to be fixed ASAP, so we decided to just tackle both at once the following week after getting some sleep. I can’t say the migration went without a hitch. While we had some unexpected major victories in some high-risk areas, we also had some unexpected major problems in some low-risk areas.

I’ll go over some specific experiences in follow-up posts, but here are some key points we took away from this experience that might help others. Some of these we knew about in advance and were prepared to deal with; others caught us by surprise and caused problems during our migration.

  1. If you have a large database (anything more than 5GB), do a LOT of testing before you start the migration! Backups, dropping/recreating indexes on large tables, etc.! For instance, we have one table that we can’t drop and recreate an index on, and the default ways of creating backups take 8-10 hours for our database!
  2. When migrating your database to Azure, don’t use your local environment as your “base system”. Upload a .bak backup file to Azure blob storage using a tool like Cerebrata’s Cloud Storage Studio (which uploads in small chunks, letting you easily recover from errors and improve bandwidth utilization; see the sketch after this list), create a medium-sized Azure Virtual Machine with a SQL Server Evaluation image, and base all of your data migration work from there. You’ll save so much time doing it this way unless you get everything working perfectly on your very first try (unlikely). For just a couple of bucks (it literally cost us ~$2 for the entire weekend’s worth of VMs we used), it’s totally worth it!
  3. AUTOMATION!! Automation, automation, AUTOMATION! You do the same thing over and over and over so many times, so really, have a solid build server with automated build scripts for this! Do NOT use Visual Studio or any manual process! The investment in a build server will most likely pay off before your 5th deployment, regardless of how complex or simple your system is!
  4. No heaps! You must have a primary key/clustered index on every single table. No exceptions! Period! Exclamation mark! (A quick query to hunt down offending heaps is sketched after this list.)
  5. Getting Data Sync up and running is a major pain in the ass! Azure’s Data Sync has stricter limitations than SQL Server’s Data Sync (for instance, computed columns don’t play nicely at all in Azure, but SQL Server has no problem with them). There are just enough nuances, and they take so long to find, that you can spend quite a bit of time just figuring this out. And then figuring out how to automate around these nuances is yet another topic of discussion, since the tools are so poor right now.
  6. Use the SQL Database Migration Wizard to migrate your data from your “base system” to an Azure database. But be gentle with it; it likes to crash, and that’s painful when it happens 3 hours into the process! Also, realize that it turns nullable booleans with a NULL value into FALSE and doesn’t play nicely with some special characters, so be prepared to deal with these nuances!
  7. Red Gate SQL Compare and SQL Data Compare are GREAT tools to help you make sure your database is properly migrated! SQL Data Compare fixes up the problems from the SQL Database Migration Wizard very nicely and SQL Compare gives you reassurance that indexes, foreign keys, etc. are all migrated nicely.
  8. As I said before, test test test with your database! For us, 8-10 hour database backups were unacceptable. Our current solution for this problem is to use Red Gate’s Cloud Services Azure Backup service. With the non-transactionally-consistent backup option, we can get it to run in ~2 hours. Since we can have nightly maintenance windows, this works for us.
  9. Plan on migrating to MVC 3.0 if you want to run on Server 2012 instances.
  10. If you’re changing opened endpoints in the Azure configuration (i.e. opening/closing holes in firewalls), you have to delete the entire deployment (not service) and deploy again. Deploying over an existing deployment won’t work but also won’t give you any errors. Several hours were wasted here!
  11. MiniProfiler is pretty awesome! But the awesomeness stops and becomes very confusing if you have more than 1 instance of anything! Perhaps there’s a fix for this but we haven’t yet found one.
  12. If you have more than just one production environment, it’s very handy to have different subscriptions to help you keep things organized! Use one subscription for Dev/QA/etc, one for Production, one for Demo, one for that really big customer who wants their own dedicated servers, etc. Your business folks will also appreciate this as it breaks billing up into those same groups. Money people like that.
  13. Extra Small instances are dirt cheap and can be quite handy! But don’t force things in there that won’t fit. We found that, with our SOA, Extra Small instances were sufficient for everything except two of our roles. For everything else, we actually get much better performance from 7 (or fewer) Extra Small instances than from 2 Small instances, at a lower price (1 Small costs the same as 6 Extra Small).
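
To illustrate the chunked-upload idea from point 2 above, here is a minimal sketch of pushing a .bak file up to blob storage in small, individually-retryable blocks. This assumes the 1.x-era Windows Azure StorageClient library, and the connection string, container name, and file path are all placeholders; a tool like Cloud Storage Studio does this for you, with parallelism and retries on top:

    using System;
    using System.Collections.Generic;
    using System.IO;
    using Microsoft.WindowsAzure;
    using Microsoft.WindowsAzure.StorageClient;
 
    public class BakUploader
    {
        public static void Main()
        {
            // Placeholders: substitute your own account credentials and paths.
            var account = CloudStorageAccount.Parse("DefaultEndpointsProtocol=https;AccountName=YOURACCOUNT;AccountKey=YOURKEY");
            var container = account.CreateCloudBlobClient().GetContainerReference("backups");
            container.CreateIfNotExist();
            var blob = container.GetBlockBlobReference("MyDatabase.bak");
 
            const int blockSize = 4 * 1024 * 1024; // 4MB blocks are cheap to re-send on failure
            var blockIds = new List<string>();
            var buffer = new byte[blockSize];
 
            using (var file = File.OpenRead(@"C:\Backups\MyDatabase.bak"))
            {
                int blockNumber = 0;
                int bytesRead;
                while ((bytesRead = file.Read(buffer, 0, blockSize)) > 0)
                {
                    // Block IDs must be base64-encoded and all the same length.
                    string blockId = Convert.ToBase64String(BitConverter.GetBytes(blockNumber++));
                    using (var blockData = new MemoryStream(buffer, 0, bytesRead))
                    {
                        // If this throws, only this one 4MB block needs to be re-sent.
                        blob.PutBlock(blockId, blockData, null);
                    }
                    blockIds.Add(blockId);
                }
            }
 
            // Committing the block list is what makes the blob visible.
            blob.PutBlockList(blockIds);
        }
    }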
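
And for point 4: SQL Server marks heaps with index_id = 0 in sys.indexes, so hunting them down before a migration is a quick catalog query. Here’s a small sketch (the connection string is a placeholder) that lists every user table lacking a clustered index:

    using System;
    using System.Data.SqlClient;
 
    public class HeapFinder
    {
        public static void Main()
        {
            // Placeholder: point this at the database you intend to migrate.
            const string connectionString = "Server=.;Database=MyDatabase;Integrated Security=true";
 
            // index_id = 0 in sys.indexes means the table is stored as a heap.
            const string findHeapsSql = @"
                SELECT s.name AS SchemaName, t.name AS TableName
                FROM sys.tables t
                JOIN sys.schemas s ON s.schema_id = t.schema_id
                JOIN sys.indexes i ON i.object_id = t.object_id
                WHERE i.index_id = 0
                ORDER BY s.name, t.name";
 
            using (var connection = new SqlConnection(connectionString))
            using (var command = new SqlCommand(findHeapsSql, connection))
            {
                connection.Open();
                using (var reader = command.ExecuteReader())
                {
                    while (reader.Read())
                        Console.WriteLine("Heap needing a clustered index: {0}.{1}",
                            reader.GetString(0), reader.GetString(1));
                }
            }
        }
    }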

In the next post, we’ll go over the things we did leading up to this migration to prepare for everything. From system architecture to avoiding SessionState like the plague to retry logic in our DAL, we’ll cover the things we did that helped (or that we thought would help) make this an easier migration. And I will also highlight the things we didn’t do that I wish we had done!
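
As a small taste of that retry logic: the pattern is nothing fancier than catching a transient SqlException and retrying with a short back-off. This is just a hand-wavy sketch; a real implementation should inspect the specific transient error numbers SQL Azure returns rather than retrying on every SqlException:

    using System;
    using System.Data.SqlClient;
    using System.Threading;
 
    public static class RetryHelper
    {
        /// <summary>
        /// Runs an operation, retrying on SqlException up to maxAttempts times
        /// with a linearly increasing delay between attempts.
        /// </summary>
        public static T Execute<T>(Func<T> operation, int maxAttempts = 3)
        {
            for (int attempt = 1; ; attempt++)
            {
                try
                {
                    return operation();
                }
                catch (SqlException)
                {
                    // Out of attempts? Let the exception bubble up.
                    if (attempt >= maxAttempts)
                        throw;
 
                    // SQL Azure connection drops are often transient and
                    // succeed on the very next attempt after a short pause.
                    Thread.Sleep(TimeSpan.FromMilliseconds(200 * attempt));
                }
            }
        }
    }

Every database call in the DAL then goes through something like RetryHelper.Execute(() => LoadCustomer(id)).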



Use Lambda Expressions for Strongly-Typed Goodness, and Serialize it too


With my recent venture into MVC2, I’ve fallen in love with the strongly-typed helpers, and I’m attempting to provide similar mechanisms in many different areas of my code so I can stop using hardcoded/constant strings all over the place and get better compile-time error checking.

This post shows how you can use Lambda Expressions to accomplish this sort of thing in an N-Tier system, in a way where you can even serialize the expressions. I’m only handling a very basic scenario here, but the tools I use are VERY flexible and can handle much more complex scenarios out-of-the-box.

The setting where I’m using this is within MVC, where a grid displays paged data that I’d like to filter and sort. For this post, let’s look at the dynamic sorting options. Not only do I want to sort the data, but I also want to support multi-column sorting. However the UI implements this, I don’t really care, but I want to expose a strongly-typed way for it to do so if it chooses (i.e. as the default, so future coders are less likely to create bugs that sneak into production).

In order to do this, I’ve created a fairly simple class to store these sorting options. I share it on both the UI side and the server side of WCF, as this is acceptable in my system. However, I do so in a way that COULD be consumed by any other technology, losing the strongly-typed functionality but not losing all functionality. Here is my initial version of this class:

    public class SortOption<TModel, TProperty> where TModel : class
    {
        public SortOption(Expression<Func<TModel, TProperty>> property, bool isAscending = true, int priority = 0)
        {
            Property = property;
            IsAscending = isAscending;
            Priority = priority;
        }
 
        public SortOption()
            : this(null)
        {
        }
 
        public Expression<Func<TModel, TProperty>> Property { get; set; }
 
        public bool IsAscending { get; set; }
 
        public int Priority { get; set; }
    }
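
To make the usage concrete, here’s a quick illustration (the Person class is purely hypothetical) of requesting a multi-column sort, LastName descending and then FirstName ascending:

    public class Person
    {
        public string FirstName { get; set; }
        public string LastName { get; set; }
    }
 
    // Inside some method: LastName descending first, then FirstName ascending.
    var sortOptions = new[]
    {
        new SortOption<Person, string>(p => p.LastName, isAscending: false, priority: 0),
        new SortOption<Person, string>(p => p.FirstName, priority: 1)
    };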

Pretty simple, right? This actually is a VERY simple class, with the exception of that pesky Property member with the nasty Expression<Func<TModel, TProperty>> signature. In case you didn’t know, that nasty signature is what lets us write code such as foo.Property = (x => x.FirstName); to say that we want the “FirstName” property off of whatever object we instantiated foo with as TModel (whether it’s a Person class or a Pet class or whatever). Because of that nasty Expression signature, there’s no way this is going to serialize nicely, or at all. Google it all you want; everybody tells you that you simply cannot serialize Lambda Expressions, so don’t even try!

Luckily for us, they are only partially correct. Whether you really can serialize a Lambda Expression depends on what you want to do with it. All we’re using it for is to have strongly-typed code at development time. At run-time it does nothing for us and is actually a slight performance hindrance. But this is the age of sacrificing performance for developer productivity, so this is okay (at least for some). So with the help of the Dynamic LINQ libraries (haha!!), we can actually serialize the expressions we want to use in this scenario! ScottGu has a nice post introducing these and I encourage you to read it. What we are using, as ScottGu mentions, is the code in the Dynamic.cs file in the “\LinqSamples\DynamicQuery” project.

Take a looksee through the Dynamic.cs file and you’ll see a static DynamicExpression class with some ParseLambda functions in there; those are the pieces we care about most here. As you can see, these functions are quite powerful and flexible.
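
For example, using the hypothetical Person class from earlier, ParseLambda can rebuild a strongly-typed expression from nothing but a string at run-time:

    // Turn the string "FirstName" back into a compilable, strongly-typed expression.
    Expression<Func<Person, string>> lambda =
        DynamicExpression.ParseLambda<Person, string>("FirstName");
 
    // The parsed expression behaves just like one written by hand:
    Func<Person, string> compiled = lambda.Compile();
    string name = compiled(new Person { FirstName = "Jane" }); // "Jane"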

Once we make some slight modifications to our SortOption<TModel, TProperty> class to implement ISerializable, we can leverage one of those ParseLambda functions to do the hard part of our custom serialization. Below is a newer version of this class, with nice comments, that does exactly what we want:

    /// <summary>
    /// This defines a framework to pass, across serialized tiers, sorting logic to be performed.
    /// </summary>
    /// <typeparam name="TModel">This is the object type that you are sorting.</typeparam>
    /// <typeparam name="TProperty">This is the property on the object that you are sorting by.</typeparam>
    [Serializable]
    public class SortOption<TModel, TProperty> : ISerializable where TModel : class
    {
        /// <summary>
        /// Convenience constructor.
        /// </summary>
        /// <param name="property">The property to sort.</param>
        /// <param name="isAscending">Indicates if the sorting should be ascending or descending</param>
        /// <param name="priority">Indicates the sorting priority where 0 is a higher priority than 10.</param>
        public SortOption(Expression<Func<TModel, TProperty>> property, bool isAscending = true, int priority = 0)
        {
            Property = property;
            IsAscending = isAscending;
            Priority = priority;
        }
 
        /// <summary>
        /// Default Constructor.
        /// </summary>
        public SortOption()
            : this(null)
        {
        }
 
        /// <summary>
        /// This is the property on the object to sort by.
        /// </summary>
        public Expression<Func<TModel, TProperty>> Property { get; set; }
 
        /// <summary>
        /// This indicates if the sorting should be ascending or descending.
        /// </summary>
        public bool IsAscending { get; set; }
 
        /// <summary>
        /// This indicates the sorting priority where 0 is a higher priority than 10.
        /// </summary>
        public int Priority { get; set; }
 
        #region Implementation of ISerializable
 
        /// <summary>
        /// This is the constructor called when deserializing a SortOption.
        /// </summary>
        protected SortOption(SerializationInfo info, StreamingContext context)
        {
            IsAscending = info.GetBoolean("IsAscending");
            Priority = info.GetInt32("Priority");
 
            // We just persisted this by the PropertyName. So let's rebuild the Lambda Expression from that.
            Property = DynamicExpression.ParseLambda<TModel, TProperty>(info.GetString("Property"), default(TModel), default(TProperty));
        }
 
        /// <summary>
        /// Populates a <see cref="T:System.Runtime.Serialization.SerializationInfo"/> with the data needed to serialize the target object.
        /// </summary>
        /// <param name="info">The <see cref="T:System.Runtime.Serialization.SerializationInfo"/> to populate with data. </param>
        /// <param name="context">The destination (see <see cref="T:System.Runtime.Serialization.StreamingContext"/>) for this serialization. </param>
        public void GetObjectData(SerializationInfo info, StreamingContext context)
        {
            // Just stick the property name in there. We'll rebuild the expression based on that on the other end.
            info.AddValue("Property", Property.MemberWithoutInstance());
            info.AddValue("IsAscending", IsAscending);
            info.AddValue("Priority", Priority);
        }
 
        #endregion
    }
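
One note: MemberWithoutInstance() is not part of the BCL; it’s a little extension method that pulls the member name out of the expression (so (x => x.FirstName) becomes “FirstName”). A minimal sketch of what such a helper can look like, assuming System and System.Linq.Expressions are imported:

    public static class ExpressionExtensions
    {
        /// <summary>
        /// Extracts the member name from a simple property-access lambda,
        /// e.g. (x => x.FirstName) yields "FirstName".
        /// </summary>
        public static string MemberWithoutInstance<TModel, TProperty>(
            this Expression<Func<TModel, TProperty>> expression)
        {
            MemberExpression memberExpression = expression.Body as MemberExpression;
 
            // Value-type members sometimes arrive wrapped in a Convert node.
            if (memberExpression == null && expression.Body is UnaryExpression)
                memberExpression = ((UnaryExpression)expression.Body).Operand as MemberExpression;
 
            if (memberExpression == null)
                throw new ArgumentException("Expression must be a simple property access.", "expression");
 
            return memberExpression.Member.Name;
        }
    }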

With this simple custom serialization implementation, we can now have strongly-typed, serializable Lambda Expressions and hopefully prevent sneaky bugs from getting in there. Granted, when we serialize/deserialize this, we’re converting it to simple, bug-tolerant strings, but this is the ugly plumbing work that you tend not to change much. And when you do break it, it tends to be pretty obvious and not very sneaky, so that’s okay. And from another perspective, good luck serializing anything if you can’t convert it into a string of some sort!
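
To see it work end-to-end, here’s a hypothetical round-trip through BinaryFormatter (any serializer that honors ISerializable would do); the Property expression goes across the wire as the string “LastName”, and ParseLambda rebuilds it on the way out:

    var original = new SortOption<Person, string>(p => p.LastName, isAscending: false);
 
    var formatter = new BinaryFormatter();
    using (var stream = new MemoryStream())
    {
        formatter.Serialize(stream, original);
        stream.Position = 0;
 
        var roundTripped = (SortOption<Person, string>)formatter.Deserialize(stream);
 
        // The expression was rebuilt from the string "LastName" by ParseLambda.
        Console.WriteLine(roundTripped.Property);    // prints something like: Param_0 => Param_0.LastName
        Console.WriteLine(roundTripped.IsAscending); // False
    }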



SOA-based Architecture in progress


I’ve been tasked with coming up with a standard architecture for us to apply to our future projects. Yeah, I know, there’s no blanket answer for everything, but given the requirements I expect, I think a single architecture can handle 95% of what we’ll be doing, and we can deviate/improve as necessary for the other 5%.

Ultimately, I don’t yet know what this is going to look like, but this is the direction I’m leaning in:
(more…)



IndyTechFest registration is now open!


IndyTechFest registration is now open! This year there is a limit of 500 registrations (I believe last year’s limit was around 400, and it booked up within just a couple of weeks). So I strongly encourage you to register sooner rather than later!

There is a great lineup of speakers and sessions at this year’s IndyTechFest! Some of the speakers I have seen speak before include Paul Hacker, Larry Clarkin, Michael Eaton, Arie Jones, Tom Pizzato, Dan Rigsby, and Bill Steele. There are many other great speakers that I know or have heard of. This should be an excellent event and one that is worth a good long drive to get to!

Some of the sessions that I’m really looking forward to include Test Driven Development (TDD) w/ VS 2008, Tips and Tricks for the New C#, Tips and Tricks for the New VB .NET, Duplexing WCF in the Enterprise, and Virtualization of SQL Server. There are many other sessions that I hope to get to, but alas, with it being a one-day event, I doubt I’ll get to most of the ones I really want to see. :P

Props to the people who worked hard to make this event possible, including Brad Jones, Dave Leininger, John Magnabosco, Mark McClellan, and Bob Walker, as well as all of the support of the local user groups to help drive the event!

Just as I was wrapping this post up, I received a phone call. Apparently, as of 1pm (1 hour after registration opened), nearly HALF of all available registration slots were filled! If you’re reading this post and have not registered, go register NOW; don’t wait or you’ll be left out!

