DIY or Buy? Weighing up Infrastructure Options for Big Data

By Vicky Falconer

Choosing how to implement your Big Data capabilities is critical; buying a complete solution could reap rewards

Building your own big data capabilities from scratch does seem tempting to many companies. They see it as a way to tailor the technology to their specific business needs and to give their IT department complete oversight and control of the processes and capabilities.

There is some truth in this, but when it comes down to it, this approach is fraught with challenges that could mean business goals take a long time to be delivered, are only partially met, or aren’t achieved at all.

In this blog, we’ll explore the challenges of taking the DIY route with big data and how buying the capabilities packaged together on a pre-built system can overcome them.

Time to value

It’s the age-old dilemma for IT departments: how can you do more with less? This is particularly true of big data, which for many companies is a new set of capabilities.

When tackling big data for the first time, most IT departments will have little experience or expertise to tap into. This makes the task of building capabilities with technology from a range of vendors all the more challenging.

And it’s not just building the infrastructure – it’s also about evaluating, testing, developing, integrating and tuning. This all takes time and, with the added issue of a small knowledge base, will take even longer.

The issue of expertise also applies in the longer term. It’s likely that the people who worked on building the big data capability will no longer be in the team after a few years. This means the expertise that has been built up will be gone by the time organisations want to evolve what they’re doing with big data.

There may also be a basic lack of resources to devote to the task, meaning the project takes longer to complete, delaying the business benefits that big data is meant to deliver — also known as ‘time to value’.

And if there are problems with the implementation of the new technology, this could lead to further complications and delays that will require numerous hours and investment to correct.

For example, poor network performance can often take months to resolve under the DIY approach, especially given the multiple vendors involved. With the pre-built approach, a single vendor takes responsibility for tracking down and fixing the issue, and can do so much more quickly.

For many businesses, the ‘buy’ approach will be the better option. By investing in a suite of big data technologies packaged together, either deployed on-premise or via the cloud, companies can address all of the time to value challenges.

The technology is already tested, integrated and optimised for the task, with the vendor providing the expertise and support that may be lacking in the IT department, and which will evolve with the technology.

And while even well thought-out DIY implementations take months to make production-ready, Oracle’s Big Data Appliance can, in theory, be up and running on-premise in a matter of hours.

Counting the pennies

Some organisations may see the DIY approach as a way to save money, as they can seek the best deal for each component of the stack they are building. This may also be in the belief that they are paying a premium for the work that a vendor puts into packaging the technology for an engineered system.

But according to research conducted by the Enterprise Strategy Group (ESG) and commissioned by Oracle, taking the pre-built approach when ramping up your big data capabilities is likely to save you money, and a significant amount at that.

For a medium-sized Hadoop-oriented big data project, ESG found that a pre-built system, like the Oracle Big Data Appliance, could be around 45 percent cheaper than the DIY equivalent.

As an example of the savings that a pre-built solution provides, Oracle includes the annual subscription licence for Cloudera Enterprise as part of the fully tested and integrated hardware and software solution. Buying the same Cloudera licence separately would incur an annual fee, increasing the overall cost of ownership.

By taking the ‘buy’ approach, Belgian media group De Persgroep was able to deploy its big data project in a mere three months. The Big Data Appliance also proved to be more cost-effective than an internally built Apache Hadoop cluster, which would have required multiple servers and software licences, as well as greater maintenance resources.

De Persgroep analysed customer behaviour, such as website interactions and payment behaviour, so that it was able to predict subscription churn for its newspaper business with an accuracy of 92 percent.

Future proofing

Open source big data technologies are developing so quickly that organisations wanting to stay at the cutting edge will need to continually re-evaluate and integrate new open source projects, all whilst delivering enterprise-grade platforms and services.

For example, there is currently a move towards the Apache Spark cluster computing framework. This shift means a significant migration and integration activity for Hadoop users to ensure the most relevant technology is being used.

With Cloudera’s distribution for Hadoop coming as part of Oracle’s Big Data Appliance, the platform can be easily and quickly updated as the technology evolves. The testing, integration and support efforts are part of the service that Oracle delivers to its customers.

The cloud-ready nature of Oracle’s capabilities also means that organisations can easily test their big data capabilities in the cloud, and then migrate the services on-premise if and when they feel the time is right. In contrast, the DIY approach will make this a hugely complicated and time-consuming process.

Pre-built systems, such as Oracle’s Big Data Appliance or Big Data Cloud Service, avoid many of the issues faced by organisations building their own big data capabilities, while also bringing a host of additional benefits. Building big data systems is not what gives value to businesses; the value comes from the use of analytics. By choosing the pre-built route, businesses can slash the time to value, save money and future-proof their capabilities. And by doing so, they will ensure their big data strategies are successful, both now and in the future.

The writer is the Big Data Solutions Lead, Oracle Australia

IMG Credit: thebluediamondgallery