Database Availability Group (DAG) and Client Access Array (CAS Array) brought a huge value to Exchange 2010 as a product. Both technologies compliment each other to provide high availability and seamless client experience. Switch over or failover of mailbox databases or an entire server in previous versions to Exchange 2010 were always a nightmare for the administrators to maintain the client experience. Exchange 2010 does a magnificent job to reconnecting a disconnected client back in 30-45 seconds. This is possible only because of the CAS array which works as an entry point to outlook and other native exchange clients.
Although DAG and CAS array together were an extremely successful model, they posed a complexity of operation in a multi datacenter environment. To manage a site switchover or failover; an administrator still needs to consider a lot of factors so that the clients do not experience downtime or disconnections. In an event where entire datacenter fails, an administrator must change several things in the configuration including DAG, mailbox database properties, CAS arrays, DNS, load balancing appliances etc. I have personally seen some disaster recovery drills performed at several customer places and the effort involved to get everything working could not be reduced lesser than 15 minutes. In all cases, external clients were the most affected ones. A disconnected executive user needing to send an urgent email containing some business details to partners can be even worse.
Most of all dissatisfying experiences are due to the way CAS arrays work with DAG.
I must admit how detailed study the Microsoft Exchange Team has done to overcome the challenges faced by customers (although the primary driving factor was to make office 365 a huge success 🙂 ). Exchange 2013 overcomes the site switchover or failover challenges by introducing something new called “Single Namespace”. I am not really going to detail everything about it since there are decent articles available around this topic, a good one of those exists here. Instead, I am going to write a little more about how does it work and what happens under the hood. So, let’s get started!
Windows DNS Client
Exchange Single Global Namespace requires more than one CAS servers and indeed more than one IP addresses to be specified for a single name that will be used by the clients to connect. Just similar to DNS load balancing using a single host name mapped to multiple IP addresses. Outlook clients will then learn these IP addresses with the help of DNS client. In order to retrieve these IP addresses DNS resolver is called and the received response is cached in the DNS resolver cache and the HTTP cache both. For example microsoft.com is the fqdn used for your exchange servers, when you query microsoft.com using nslookup, you find something like below"
DNS Client and HTTP Caching
While the addresses returned by querying a DNS server are cached by the DNS client, they are also cached by the HTTP client. WinInet can very well use the DNS client cache to query this cached response however, it requires a RPC call to be made to DNS cache which is still slower than querying its own cache. This way the cached addresses can be retrieved much faster and they can be used without sending a DNS query to the DNS server. Cache ignores any TTL specified by the DNS server and defaults it to 30 minutes. IE 10 and later hold up to 256 entries in the cache.
The HTTP Client
Windows clients run a built in HTTP client that is nothing but a set of DLL files sitting inside System32 folder. WinInet is an acronym used for Windows Internet and provides APIs for the application development. Internet Explorer uses this library to get you the internet resources like websites and ftp sites. WINHTTP is also provides client side APIs but its primarily used by services. HTTP is now the primary protocol of communication between clients and Exchange CAS servers, WinInet is the base for these communications. However applications which are affected by IE settings also honor and refer the WinInet for HTTP calls. Outlook is certainly one of them.
DNS Round and Robin
This is the most interesting topic for the exchange guys. This how outlook determines how to connect to the addresses returned by a DNS server. Let us take the example shown above.:
A host named microsoft.com returned two IP addresses let us call it a list of IP addresses:
The http client will attempt the connection to first address in this list, if the connection to the first IP fails, it is marked bad and next IP address is attempted. This process continues until either a successful response is received from the server, or until it reaches the end of list of IP addresses. Value of an option flag INTERNET_OPTION_CONNECT_RETRIES determines the behavior of connection attempts to each IP returned in the list of IP addresses.
Sets or retrieves an unsigned long integer value that contains the number of times WinINet attempts to resolve and connect to a host. It only attempts once per IP address. For example, if you attempt to connect to a multihome host that has ten IP addresses and INTERNET_OPTION_CONNECT_RETRIES is set to seven, WinINet only attempts to resolve and connect to the first seven IP addresses. Conversely, given the same set of ten IP addresses, if INTERNET_OPTION_CONNECT_RETRIES is set to 20, WinINet attempts each of the ten only once. If a host has only one IP address and the first connection attempt fails, there are no further attempts. If a connection attempt still fails after the specified number of attempts, the request is canceled. The default value for INTERNET_OPTION_CONNECT_RETRIES is five attempts. This option can be used on any HINTERNET handle, including a NULL handle. It is used by InternetQueryOption and InternetSetOption.
Each connection attempt could take up to 21 seconds if the DNS query returns too many number of IP addresses. This will perhaps justify why exchange team claims to have 20 seconds delay in the client reconnection. The reconnect delay may also occur due to sluggish internet connection. When you have a multi site deployment and want to use a single namespace across the sites, you can very well specify multiple IP addresses for namespace you are planning to use and the http client on the windows computers will take care of this for you
Firefox and other clients
Exchange 2013 supports multiple browsers for outlook web app. Firefox, Google Chrome, Safari are very widely used and supported web clients for OWA. NeitherI could find a lot of literature around these products nor have much interest to run 100s of tools to understand their behavior; but if you are interested to learn how firefox deals with all the stuff that this article explained so far, a good read is available at How does firefox cache DNS requests/replies?
Although Single Namespace support by Exchange 2013 leverages most of the client side features to provide a seamless experience for the outlook clients, there is a lot that servers have to do in backend when a site switchover or failover happens.
I will discuss more about what happens inside your datacenter during the site *over in the next part of this post.