Coming from a background of WinTel Enterprise Solution Architecture, it is tempting to approach the challenge of improving scalability and resilience in a web-based solution while still wearing the same hat. Conventional wisdom dictates that the ‘right’ thing to do is to spec a nice big SAN, a few beefy tier-1-manufacturer servers, load balancers, multiple firewall products and some VMware licenses. Operating systems should all be the latest Microsoft server product, with a separate Domain Controller (probably offsite) to ensure no issues with Active Directory replication. That site should probably be replicated to a second, geographically separate location for Disaster Recovery reasons, and a tape-based backup system implemented to make sure we follow ‘best practice’…
The truth is that even the most rigorous enterprise architecture design is not necessarily suitable for an internet-based web application. The use case is different, for a start: rather than a few tens of thousands of corporate users in key head office locations, we have a practically unlimited number of users to serve from all parts of the world, around the clock. Some of those users attack internet-based services for pleasure or financial gain, and others simply enjoy disrupting service and causing chaos. Neither is as large a threat inside the corporate network.
Then there is the work of the paradigm shifters: organisations like Google that shun the cluster of tier-1-manufacturer servers in favour of designing redundancy into vast arrays of disposable, desktop-computer-spec servers. When something breaks, it is replaced with something equally cheap and replaceable. This is almost the exact opposite of the ‘enterprise’ design, where each component of the solution is more like a high-performance, high-stress cog in an intricately meshed set of gears.
Luckily, starting with a blank canvas meant we were able to pick and choose the elements of each type of solution that made the most sense to us. With the success of Google-like solutions in mind, along with the aims of the enterprise architecture, we were able to craft something that scales with our customers’ needs without becoming bloated or unwieldy to support.
As a result, the TalentBond solution comprises individually resilient servers, each normally handling one of around ten different server roles, but each capable of handling any or all roles when required. Every customer has primary and secondary server ‘clusters’ defined in their component snippets, helping to ensure that load is balanced across the available hardware at all times. Data replication, backup and version management are all catered for as an intrinsic part of the software architecture rather than being delegated to an underlying hardware or software technology. We only build on technology we know will be available as a commodity, in both our hardware and software choices. All software and management tasks are therefore written in cross-platform languages, so that the system has no affinity to any particular operating system. Because the system is really composed of a number of separate, discrete systems, we can add capacity to the whole by expanding it sideways, that is, by adding more nodes.
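To make the primary/secondary idea concrete, here is a minimal sketch of how a request for a given role might be routed. It is an illustration only, not the TalentBond implementation: the names `Node`, `Customer` and `pick_node`, the `load` counter and the "least loaded healthy node" rule are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    """A hypothetical server that usually serves one role but can serve any."""
    name: str
    preferred_role: str
    healthy: bool = True
    load: int = 0  # assumed metric: requests currently assigned


@dataclass
class Customer:
    """A hypothetical customer record with primary and secondary clusters."""
    name: str
    primary: List[Node]
    secondary: List[Node]


def pick_node(customer: Customer, role: str) -> Node:
    """Return a healthy node for `role`, trying the primary cluster first."""
    for cluster in (customer.primary, customer.secondary):
        candidates = [n for n in cluster if n.healthy]
        if candidates:
            # Prefer nodes that normally serve this role, then the least loaded.
            # Any healthy node is acceptable, since every node can take any role.
            return min(candidates, key=lambda n: (n.preferred_role != role, n.load))
    raise RuntimeError(f"no healthy node available for {customer.name}")


if __name__ == "__main__":
    web = Node("web-1", "web", load=3)
    mail = Node("mail-1", "mail")
    spare = Node("web-2", "web")
    acme = Customer("acme", primary=[web, mail], secondary=[spare])

    print(pick_node(acme, "web").name)   # web-1
    web.healthy = False
    print(pick_node(acme, "web").name)   # mail-1 steps in for the web role
    # Scaling sideways is just adding another node to a cluster.
    acme.primary.append(Node("web-3", "web"))
    print(pick_node(acme, "web").name)   # web-3
```

In this sketch, adding capacity means appending a node to a cluster; nothing else in the routing logic has to change, which is the property the sideways-scaling design relies on.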
We can continue to scale in this way, in datacentres located anywhere in the world, without degrading performance or adding complexity, precisely because we were mindful of the limitations of the enterprise architecture model from the very start.