OREANDA-NEWS. October 09, 2015. A headline from an InfoWorld article on the 2015 HP Big Data Conference in Boston caught our eye: “Is the cloud the right spot for your big data?” The article gives no easy answer to this question, and most enterprises don’t seem to have one, either. For instance, online retailer Etsy recently decided against a move to the public cloud, saying it would be “too big” a move given its existing private big data infrastructure.

Still, we see many cloud platform and infrastructure vendors, including Equinix partners Google, Amazon Web Services, Aliyun (Alibaba Group’s cloud computing business) and Microsoft Azure, regularly bringing big data and analytics solutions to market. And the vendors behind major big data capture and analysis platforms, such as Hadoop, are pointing their customers to the cloud.

But there’s no doubt the issues behind migrating big data to the cloud are as diverse and complex as big data itself. One is data portability across various data types, protocols, applications and analytics. Data latency, security and disaster recovery are other major enterprise concerns. Yet another is where to put all of that data and how to interconnect with it.

Local Interconnection Is Key to Big Data – Even in the Cloud

Big data, especially unstructured payloads, is often generated outside the enterprise data center. It comes from websites, sensors, mobile phones – basically anything that can place a bit on the Internet.

Security and governance standards for this data are still undetermined, and the big question on most companies’ minds is, “Where in the world is the data in the cloud being stored?” If it’s not being stored and analyzed locally, that tends to be a showstopper for many businesses considering big data cloud migration. If the data is held locally, the next question for the cloud provider is, “Is it accessible and secure?”

Keeping data where it originates and enabling secure access to it is what private, direct interconnection is all about. No matter which big data or analytics application you use, on-premises or in the cloud, a secure interconnection strategy elastic enough to scale with the expanding needs of big data is critical to sustaining that strategy. Take the Human Genome Project, for example: what started out as a gigabyte-scale project quickly grew to petabyte scale, demanding unprecedented scalability from the infrastructure storing the data and the networks connecting to it.

In addition, many enterprises will be in “discovery mode” when it comes to big data in the cloud, so the infrastructure supporting this phase of their migration needs to let them quickly and efficiently turn big data and analytics cloud services up and down as needed.

Finally, the big data applications and analytics most enterprises require will likely span multiple clouds, integrated to support a variety of workload types and performance demands. That will require proximate, low-latency, high-speed connectivity among ecosystems of cloud vendors, enabling the creation of data lakes, data oceans and a secure fabric of interconnected data streams and analytics.

In our article, “Big Data Poses a Big Dilemma,” we discussed one option for the enterprise: integrating high-performance computing and legacy systems with existing cloud infrastructure service providers. Access to and real-time analysis of large datasets require massive amounts of bandwidth and low-latency connections, and many datasets contain sensitive data that shouldn’t be transmitted over the public Internet. But colocating inside Equinix data centers and interconnecting big data systems via the Equinix Cloud Exchange can address those issues. The Cloud Exchange’s direct connections bypass the public Internet for improved security. And since Cloud Exchange is available in 21 global markets, it can bring the enterprise close to big data systems for the fast, high-performance connections needed for real-time analytics.