Intel Responds to Calxeda/HP ARM Server News: Xeon Still Wins for Big Data

Intel’s Radek Walcyzk, head of PR for the chipmaker’s server division, called Wired today with Intel’s official response to the ARM-based microserver news from Tuesday. In a nutshell, Intel would like the public to know that the microserver phenomenon is indeed real, and that Intel will own it with Xeon, and to a lesser extent with Atom.

Now, you’re probably thinking, isn’t Xeon the exact opposite of the kind of extreme low-power computing envisioned by HP with Project Moonshot? Surely this is just crazy talk from Intel? Maybe, but Walcyzk raised some valid points that are worth airing.

More at Cloudline.

Big Data, Fast & Slow: Why HP’s Project Moonshot Matters

In his presentation, which describes how Twitter’s Storm project complements Hadoop in the company’s analytics efforts, Marz says in essence (and here I’m heavily paraphrasing and expanding) that there are really two types of “Big Data”: fast and slow.

Fast “Big Data” is real-time analytics, where messages are parsed and checked for some kind of significance as they come in at wire speed. In this type of analytics, you apply a set of pre-developed algorithms and tools to the incoming datastream, looking for events that match certain patterns so that your platform can react in real time. A few examples: Twitter runs real-time analytics on the Twitter firehose in order to identify trending topics; Topsy runs real-time analytics on the same Twitter firehose in order to identify new topics and links that people are discussing, so that it can populate its search index; a high-frequency trader runs real-time analytics on market data in order to identify short-term (often in the millisecond range) market trends so that it can turn a tiny, quick profit.
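To make the "match patterns as messages arrive" idea concrete, here is a minimal Python sketch of a sliding-window trending-topic counter. It is only an illustration under my own assumptions, not how Twitter, Topsy, or Storm actually do it: the `TrendingTracker` class, the hashtag-matching rule, and the 60-second default window are all hypothetical.

```python
from collections import Counter, deque


class TrendingTracker:
    """Counts hashtags seen in the last `window_seconds` of a message stream.

    Hypothetical illustration only; all state is held in memory.
    """

    def __init__(self, window_seconds=60.0):
        self.window_seconds = window_seconds
        self.events = deque()    # (timestamp, hashtag) pairs, oldest first
        self.counts = Counter()  # hashtag -> occurrences inside the window

    def observe(self, timestamp, text):
        """Process one incoming message; timestamps must be non-decreasing."""
        # The "pre-developed pattern" here is simply: a word starting with '#'.
        for word in text.split():
            tag = word.lower().rstrip(".,!?")
            if tag.startswith("#") and len(tag) > 1:
                self.events.append((timestamp, tag))
                self.counts[tag] += 1
        self._expire(timestamp)

    def _expire(self, now):
        # Drop events that have fallen out of the time window.
        while self.events and now - self.events[0][0] > self.window_seconds:
            _, tag = self.events.popleft()
            self.counts[tag] -= 1
            if self.counts[tag] == 0:
                del self.counts[tag]

    def top(self, n=3):
        """Return the n most frequent hashtags currently in the window."""
        return self.counts.most_common(n)


if __name__ == "__main__":
    tracker = TrendingTracker(window_seconds=10)
    tracker.observe(0, "loving #ARM servers")
    tracker.observe(1, "#arm vs #xeon")
    tracker.observe(2, "more #ARM news!")
    print(tracker.top(1))
```

Every message is inspected once on arrival and the working set lives entirely in RAM, which is the trade-off that real-time systems of this kind tend to make.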

Real-time analytics workloads have a few common characteristics, the most important of which is that they are latency sensitive and compute-bound. These workloads are also bandwidth intensive in that the compute part of the platform can process more data than storage and I/O can feed it (hence the compute bottleneck). People doing real-time analytics need lots and lots of CPU horsepower (and even GPU horsepower in the case of HFT), and they keep as much data as they can in RAM so that they’re not bottlenecked by disk I/O.

I’ve drawn a quick and dirty diagram of this process, above. As you can see, the bottlenecks for Hadoop are the disk I/O from the data archive and the human brain’s ability to form hypotheses and turn them into queries. The first bottleneck can be addressed with SSD, while fixing the second is the job of the growing stack of more human-friendly tools that now sits atop Hadoop.

More at Cloudline

The Opposite of Virtualization: Calxeda’s New Quad-Core ARM Part for the Cloud

On Tuesday, Austin-based startup Calxeda launched its EnergyCore ARM system-on-chip (SoC) for cloud servers. At first glance, Calxeda’s chip looks like something you’d find inside a smartphone, but the product is essentially a complete server on a chip, minus the mass storage and memory. The company puts four of these EnergyCore SoCs onto a single daughterboard, called an EnergyCard, which is a reference design that also hosts four DIMM slots and four SATA ports. A systems integrator would plug multiple daughterboards into a single mainboard to build a rack-mountable unit, and then those units could be linked via Ethernet and scaled out into a single system that’s home to some 4,096 EnergyCore processors (or a little over 1,000 four-processor EnergyCards).

More at Cloudline