Enterprise IT now senses an existential threat from the Public Cloud, especially as more business users have begun to use it. The vendors that will truly benefit from this trend and grow are those that enable the technology and IT consumption model transitions, as well as those that usher Cloud and hyper-scale technologies and architectures into the enterprise.
A key element of architectures deployed by the large web-scale players such as Google and Facebook and large Cloud Players such as Amazon is the abstraction of all intelligence and function from commodity hardware into a software layer across all components of the data center stack – server, storage and networking. Since all functions required to manage the architecture are resident in the software, unprecedented levels of automation are possible.
The non-blocking, software-based architecture also means that capacity can keep growing without a redesign. As more capacity is needed, new commodity hardware is added in a parallel, scale-out fashion. The software (controller) that manages the servers, network switches and storage is itself hosted on virtual machines. This is why Google, Facebook and Amazon can serve enormous numbers of concurrent users and continue to add infrastructure to manage their growth. Another key benefit of an architecture built on commodity hardware, controlled and managed by scale-out software, is that infrastructure can be provisioned very quickly and to the specific dynamic requirements of several customers at the same time. There is no need to physically move any hardware when a new customer needs infrastructure provisioned to support a new application.
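The scale-out idea described above can be illustrated with a toy model. The sketch below is not how any of the companies named here actually implement their controllers; the `Node` and `Controller` classes, their fields and the "fill the emptiest node first" policy are all hypothetical, chosen only to show capacity being pooled in software and handed out to tenants without touching hardware.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One commodity server (hypothetical model: CPU count only)."""
    name: str
    cpus: int
    allocated: int = 0

    @property
    def free(self) -> int:
        return self.cpus - self.allocated


@dataclass
class Controller:
    """Toy software controller that treats all nodes as one capacity pool."""
    nodes: list = field(default_factory=list)
    placements: dict = field(default_factory=dict)

    def add_node(self, node: Node) -> None:
        # Scale-out: capacity grows by adding a node, not by replacing one.
        self.nodes.append(node)

    def free_capacity(self) -> int:
        return sum(n.free for n in self.nodes)

    def provision(self, tenant: str, cpus: int) -> dict:
        """Reserve CPUs for a tenant, spreading across nodes if needed."""
        if cpus > self.free_capacity():
            raise RuntimeError("pool exhausted; add more nodes")
        grant = {}
        remaining = cpus
        # Assumed policy: fill the emptiest nodes first to keep load even.
        for node in sorted(self.nodes, key=lambda n: n.free, reverse=True):
            if remaining == 0:
                break
            take = min(node.free, remaining)
            if take:
                node.allocated += take
                grant[node.name] = take
                remaining -= take
        # Record the grant so the controller knows where each tenant runs.
        tenant_map = self.placements.setdefault(tenant, {})
        for name, count in grant.items():
            tenant_map[name] = tenant_map.get(name, 0) + count
        return grant
```

In this model, when a request exceeds the pool, the only remedy is `add_node`, after which the same `provision` call succeeds. That mirrors the point that growth comes from adding hardware in parallel rather than from moving or upgrading what is already there.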
How Enterprises Crunch Big Data Bytes?
Progressive enterprises are recognizing that the ability to collect, process and analyze more data about customers, products and business operations in real time is likely to be a strong competitive advantage. We expect this Big Data trend to drive the adoption of a completely new architecture based on commodity hardware and open source software such as Hadoop and Cassandra.
Which Companies Power Hyperscale Computing?
A fundamental requirement of a software-defined datacenter is that all the hardware in the datacenter (servers, network switches and storage) be virtualized. The next step is automation and management of the virtual infrastructure created through virtualization. VMware is the undisputed leader in server virtualization, with over 70% of all virtual machines in production in the data center running VMware. It also has the technology lead in network virtualization via its Nicira acquisition and appears to have developed a very strong partner ecosystem to take the technology to its installed base of server virtualization customers.
Most commodity hardware architectures involve some level of clustering or parallel aggregation. For instance, most Big Data solutions are being delivered as clusters of servers and storage running the Hadoop Distributed File System (HDFS). These clusters require software such as a hypervisor or a distributed file system at the front end to aggregate the hardware into a single pool of capacity. They also require a back-end interconnect to provide a pathway for data to move from one server or storage node to another within the same cluster. InfiniBand is the lowest-latency interconnect available today. Mellanox is the market leader in Ethernet and RDMA technology powering massively parallel hyper-scale computing infrastructure.
How Google Serves Results Instantly?
Most hyper-scale players such as Google do not use networked shared storage (SAN, NAS) to deliver “hot” data to systems. The typical architecture is a cluster of servers, each with several terabytes of local disk or flash storage, that is virtualized and shared across the whole cluster by shared file system software deployed on every server. The advantage of this architecture is that the data does not have to traverse layers of network switches and adapters to reach the server; instead, it resides right next to the server processor. The same kind of infrastructure is available to anybody through the Google Cloud: Compute Engine, Cloud Virtual Network, Load Balancing and CDN.
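The data-locality argument above can be sketched as a simple placement rule: run the work on a server that already holds the data, and fall back to a remote read only when no such server is available. This is a minimal illustration, not Google's scheduler; the function `pick_server`, the `replicas` and `load` dictionaries and the "least-loaded wins" tie-break are all assumptions made for the example.

```python
def pick_server(block_id, replicas, load):
    """Choose where to run a task that reads block_id.

    replicas: dict mapping block id -> set of server names storing a copy
    load:     dict mapping server name -> number of tasks already running

    Returns (server, is_local). Prefers the least-loaded server that holds
    the block on its own disk or flash; otherwise falls back to the
    least-loaded server overall, which implies a read over the network.
    """
    # Keep only replica holders that are known, live servers.
    local = {s for s in replicas.get(block_id, ()) if s in load}
    candidates = local if local else set(load)
    # Sort key (load, name) makes tie-breaking deterministic.
    server = min(candidates, key=lambda s: (load[s], s))
    return server, bool(local)
```

For example, with `replicas = {"b1": {"s1", "s2"}}` and `load = {"s1": 3, "s2": 1, "s3": 0}`, a task reading `b1` is placed on `s2` even though `s3` is idle, because `s2` can serve the data without crossing the network.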
The volume, velocity, variety and complexity of the data handled by these web-scale and Public Cloud players are unprecedented, and they have given birth to the notion of Big Data, which you may have been hearing about everywhere 🙂
Image Courtesy: Open Compute Project