In cloud computing, attention is often paid to moving compute workloads from the enterprise data center to public clouds. While the technology enables a number of innovative solutions around mobile and social environments, analytics is often seen as an in-house solution that requires highly specialized skills.
Cloud providers have upped their game in data analytics. No longer are they just providing you with an environment and managing the underlying machines; they are now managing the systems that run on them, systems specific to your needs, like big data. This fits well with mid-sized companies that are hungry for analytic solutions that let them compete with large enterprise competitors.
Recently, I sat down with Jonathan Pickard of Analyzer 1. Jonathan has worked on both the business side and the IT side to create game-changing business intelligence solutions for some of the largest enterprises. For Analyzer 1, which focuses on rapid delivery, the cloud has been a game changer. As cloud BI solutions have begun to mature, Jonathan said, “we can now safely and effectively deliver to our clients at a pace never seen before.” I put together some questions for him:
Should an organization build big data systems itself or use the cloud?
It depends. Let’s consider a database your company will use for big data analytics and ask these questions. Do you have a data security plan established in case of a machine outage, building fire, or network attack? Do you have a fast, non-disruptive plan to update all the layers of your system, including hardware, network, OS, and database updates? Do you know the best practices associated with configuring your specific machine and database? Do you have multiple people who can be fully committed to managing the database and fixing any unforeseen issues? This is just the tip of the iceberg for managing one component of your analytic system, not including the associated management and planning issues.
So how can the cloud help a company with big data analytics?
Well, again, let’s look at a database, in this case an analytical database: Amazon Redshift. Redshift is a technology disruptor for several reasons: analytic processing power, ease of management, and price, roughly $1,000 per TB per year. People who are used to working in a larger company may be shocked at this price point. This is not only the price for the database per terabyte; it includes the management of the underlying systems: facility, redundancy, maintenance, upgrades, and security features. Plus, like other systems in Amazon Web Services, it offers an easy management system that can be up and running within minutes. Typically, I demo the setup of a complete analytical system in less than 2 hours using AWS, including a secure environment, ETL, an analytical database, and an analytical application.
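To put that price point in perspective, here is a minimal back-of-the-envelope sketch in Python. It assumes only the approximate figure quoted above (about $1,000 per TB per year) and simple linear scaling; the function names are hypothetical, and actual AWS pricing depends on node type, region, and reservation terms, so treat the numbers as rough illustrations rather than a quote.

```python
# Rough cost estimate based on the approximate figure quoted in the interview.
# ASSUMPTION: about $1,000 per terabyte per year, scaling linearly with size
# and time. Real pricing varies by node type, region, and reservation terms.
PRICE_PER_TB_YEAR_USD = 1000.0


def estimate_cost_usd(terabytes, years=1.0, price_per_tb_year=PRICE_PER_TB_YEAR_USD):
    """Return the estimated total cost in USD for the given size and duration."""
    if terabytes < 0 or years < 0:
        raise ValueError("terabytes and years must be non-negative")
    return terabytes * years * price_per_tb_year


def estimate_monthly_cost_usd(terabytes, price_per_tb_year=PRICE_PER_TB_YEAR_USD):
    """Return the estimated cost per month in USD (annual cost divided by 12)."""
    return estimate_cost_usd(terabytes, 1.0, price_per_tb_year) / 12.0


if __name__ == "__main__":
    for size_tb in (1, 5, 20):
        yearly = estimate_cost_usd(size_tb)
        monthly = estimate_monthly_cost_usd(size_tb)
        print(f"{size_tb:>3} TB: ~${yearly:,.0f} per year (~${monthly:,.0f} per month)")
```

Under these assumptions, a mid-sized company with a 5 TB warehouse would be looking at something on the order of $5,000 per year, with the management of the underlying systems included.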
What kind of organization is the cloud for?
An all-in cloud strategy is not for everyone. If you are a very large company and want to make BI and analytics a core competency, you may want to consider developing in house. However, this includes paying for the best people in the industry, as well as all the infrastructure costs that go into running such a system. While it may seem that you will save money in the long run, you still have to dedicate your time, resources, and staff to managing that infrastructure instead of putting those resources towards projects that differentiate your business. Speed is another issue. Do you already have local systems associated with big data? Then you may need to move those systems into the cloud as well; otherwise you could face bandwidth or even speed-of-light issues. Finally, though many cloud providers are approved to hold most data, including military and financial data, laws requiring you to have personal oversight of the machines holding your data may restrict you.
What are the benefits of practicing analytics in the cloud vs. on-premises?
For most, the cloud offers big data practitioners a less expensive and faster way to get into the big data game. But as Uncle Ben said, “with great power comes great responsibility,” and the same holds true for data in the cloud. Cloud providers have made it so easy to get data into the cloud, utilize it, and secure it, but some users don’t spend the small amount of time needed to ensure the latter. So if you are doing a proof of concept with a cloud provider, use dummy data, or make sure you know which security protocols are appropriate for your situation and always use them.
Analytics is becoming the core differentiator for business success. The cloud, specifically AWS, allows companies of all sizes to take advantage of this analytic power. Judging by the AWS re:Invent event in Las Vegas this week, interest in Amazon’s Redshift offering has grown significantly, considering the product was released just a year ago. Similar products from large enterprise software vendors are sparse and not yet ready for prime time. If mid-sized companies want to leverage analytics, cloud-based solutions should definitely be on the short list. For larger companies, there may be some workloads that fit this model.