Intel Responds to Calxeda/HP ARM Server News: Xeon Still Wins for Big Data

Intel’s Radek Walczyk, head of PR for the chipmaker’s server division, called Wired today with Intel’s official response to the ARM-based microserver news from Tuesday. In a nutshell, Intel would like the public to know that the microserver phenomenon is indeed real, and that Intel will own it with Xeon, and to a lesser extent with Atom.

Now, you’re probably thinking, isn’t Xeon the exact opposite of the kind of extreme low-power computing envisioned by HP with Project Moonshot? Surely this is just crazy talk from Intel? Maybe, but Walczyk raised some valid points that are worth airing.

More at Cloudline.

Big Data, Fast & Slow: Why HP’s Project Moonshot Matters

In Marz’s presentation, which describes how Twitter’s Storm project complements Hadoop in the company’s analytics efforts, Marz says in essence (and here I’m heavily paraphrasing and expanding) that there are really two types of “Big Data”: fast and slow.

Fast “Big Data” is real-time analytics, where messages are parsed and scanned for some kind of significance as they come in at wire speed. In this type of analytics, you apply a set of pre-developed algorithms and tools to the incoming datastream, looking for events that match certain patterns so that your platform can react in real time. A few examples: Twitter runs real-time analytics on the Twitter firehose in order to identify trending topics; Topsy runs real-time analytics on the same firehose in order to identify new topics and links that people are discussing, so that it can populate its search index; and a high-frequency trader runs real-time analytics on market data in order to identify short-term (often in the millisecond range) market trends so that it can turn a tiny, quick profit.

Real-time analytics workloads have a few common characteristics, the most important of which is that they are latency-sensitive and compute-bound. These workloads are also bandwidth-intensive, in that the compute part of the platform can process more data than storage and I/O can feed it, so keeping the CPUs fed becomes the central engineering problem. People doing real-time analytics need lots and lots of CPU horsepower (and even GPU horsepower in the case of HFT), and they keep as much data as they can in RAM so that they’re not bottlenecked by disk I/O.
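The "fast" pattern described here, applying pre-developed rules to an incoming stream and reacting the moment an event matches, can be sketched in a few lines of Python. This is a toy illustration, not Storm itself; the hashtag-counting rule, window size, and threshold are invented for the example:

```python
from collections import Counter, deque

class TrendDetector:
    """Toy real-time analytics: count hashtags over a sliding window
    of recent messages and flag any tag that crosses a threshold,
    i.e. a pre-developed rule applied to the stream at wire speed."""

    def __init__(self, window=1000, threshold=50):
        self.window = deque(maxlen=window)  # most recent messages only
        self.counts = Counter()
        self.threshold = threshold

    def ingest(self, message):
        # Evict the oldest message's tags once the window is full.
        if len(self.window) == self.window.maxlen:
            for tag in self.window[0]:
                self.counts[tag] -= 1
        tags = [w for w in message.split() if w.startswith("#")]
        self.window.append(tags)
        self.counts.update(tags)
        # React in real time: return tags that just became "trending."
        return [t for t in tags if self.counts[t] == self.threshold]

detector = TrendDetector(window=1000, threshold=3)
stream = ["#ows rally downtown", "closing my account #oct15",
          "#oct15 video posted", "more #oct15 arrests"]
for msg in stream:
    hits = detector.ingest(msg)
    if hits:
        print("trending:", hits)  # fires on the third #oct15 message
```

A real system like Storm distributes this same shape (ingest, update state, emit) across a cluster, but the latency-sensitive, keep-it-in-RAM character is the same.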

I’ve drawn a quick and dirty diagram of this process, above. As you can see, the bottlenecks for Hadoop are the disk I/O from the data archive and the human brain’s ability to form hypotheses and turn them into queries. The first bottleneck can be addressed with SSD, while fixing the second is the job of the growing stack of more human-friendly tools that now sits atop Hadoop.
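The "slow" side, Hadoop-style batch analytics over an archive, boils down to a map step over every record followed by a reduce step per key. A minimal pure-Python sketch of that shape (no Hadoop involved; the word-count query is just a stand-in for a real hypothesis turned into a job):

```python
from collections import defaultdict
from itertools import chain

def map_phase(records, mapper):
    # Each mapper call emits (key, value) pairs for one record.
    return chain.from_iterable(mapper(r) for r in records)

def reduce_phase(pairs, reducer):
    # Group all values by key, then fold each group with the reducer.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return {key: reducer(values) for key, values in groups.items()}

# A stand-in "query" over the archive: count word occurrences.
archive = ["big data fast", "big data slow", "fast and slow"]
pairs = map_phase(archive, lambda line: [(w, 1) for w in line.split()])
totals = reduce_phase(pairs, sum)
print(totals["big"], totals["fast"], totals["slow"])  # 2 2 2
```

In real Hadoop, the map and reduce phases run in parallel across machines and the archive lives on distributed disk, which is exactly why disk I/O (and the human forming the query) end up as the bottlenecks.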

More at Cloudline.

The Opposite of Virtualization: Calxeda’s New Quad-Core ARM Part for the Cloud

On Tuesday, Austin-based startup Calxeda launched its EnergyCore ARM system-on-chip (SoC) for cloud servers. At first glance, Calxeda’s chip looks like something you’d find inside a smartphone, but the product is essentially a complete server on a chip, minus the mass storage and memory. The company puts four of these EnergyCore SoCs onto a single daughterboard, called an EnergyCard, a reference design that also hosts four DIMM slots and four SATA ports. A systems integrator would plug multiple daughterboards into a single mainboard to build a rack-mountable unit, and those units could then be linked via Ethernet to scale out into a single system that’s home to some 4,096 EnergyCore processors (or a little over 1,000 four-processor EnergyCards).
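The arithmetic behind that scale-out figure is easy to check using the counts given in the post:

```python
socs_per_card = 4        # EnergyCore SoCs per EnergyCard daughterboard
max_socs = 4096          # stated ceiling for a single scaled-out system

cards = max_socs // socs_per_card
print(cards)             # 1024 EnergyCards, i.e. "a little over 1,000"

# Each card also carries one DIMM slot and one SATA port per SoC,
# so the full system tops out at:
print(cards * 4, "DIMM slots and", cards * 4, "SATA ports")
```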

More at Cloudline.

Interview: Topsy Co-Founder on Twitter, Uprisings, Authority, and Journalism

Ghosh: Around 2005, people used to do this thing called “Google bombing,” where they would put links [on other sites to manipulate search rankings]. One of the responses from Google was to require that all websites put a “nofollow” tag on links that are not created by the website itself.

So if you had a link that was posted in the comments, or posted by a user — which includes things like Wikipedia or all social media — which has not been created by the website [then you had to add a “nofollow” tag]. So the authority model — where, when a website links to something else, it gives its authority to that thing — that model breaks down because the website is no longer controlling who puts that link on its pages. So for all links of those types, they were forced to add this nofollow tag so that [the links] could be ignored for the purpose of computing authority. What that means, though, is that, while it was breaking the earlier authority model of Google, [Google] did not change their authority model in response to the way the web was changing.

And the web changed so that the authority model of the new web is that people are the sources of authority. This was always really the authority model, but 10 or 15 years ago, a website and a person were pretty much the same thing.

Wired.com: A website was a useful proxy for a person or a collection of people (an institution, say).

Ghosh: Yes. And that changed when you had different people posting on the same website, or the same people posting on different websites — that proxy didn’t work anymore. But Google didn’t change their authority model.
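The nofollow mechanism Ghosh describes is simple to see in miniature: any link a site's users contribute gets a rel="nofollow" attribute, and the crawler skips those links when computing authority. A sketch of the crawler's side of that bargain (toy code; the regex parsing is for brevity, and a real crawler would use a proper HTML parser):

```python
import re

HTML = '''
<a href="https://example.org/editorial-pick">our pick</a>
<a href="https://spam.example/junk" rel="nofollow">user comment</a>
'''

def authority_links(html):
    # Keep only links that count toward authority: those WITHOUT
    # rel="nofollow". Everything user-generated is ignored.
    kept = []
    for attrs in re.findall(r'<a\s+([^>]*)>', html):
        if 'rel="nofollow"' in attrs:
            continue
        m = re.search(r'href="([^"]+)"', attrs)
        if m:
            kept.append(m.group(1))
    return kept

print(authority_links(HTML))  # ['https://example.org/editorial-pick']
```

This is exactly Ghosh's complaint in code form: once most links on a page are user-generated and therefore nofollowed, the site-to-site authority graph stops reflecting how the web actually works.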

More at Cloudline.

Beyond Google’s Reach: Tracking the Global Uprising in Real Time

On Oct. 15, groups of protesters affiliated with the Occupy Wall Street movement began filing into the branch offices of their banks to close their accounts. Later that day, videos began to show up online of those protesters being arrested. Irate branch managers had called the cops, claiming that these customers were being disruptive, so police began hauling the protesters away for booking.

The spectacle of citizens being arrested for attempting to close out their personal bank accounts made a splash in all of the usual corners of the internet. Except one: Google.

Like the larger Occupy Wall Street movement, which is often referenced online via the Twitter hashtag #OWS, the Oct. 15 protest was organized using #oct15. A search for #oct15 on the day of the protest yielded nothing but garbage results, and my searches as late as a day later yielded similar output. But despite allegations that Google — especially Google News, which still doesn’t have any worthwhile results for #oct15 — is censoring protest-related material, the more straightforward answer to the question of why the world’s largest search engine can’t produce useful results for current events in real time is that it’s simply not designed to.

As I found out on the day of Oct. 15, if you want quality information about events as they unfold in real time, then you can forget about the Google search box. Instead, you have to turn to alternative search engines, and specifically to Topsy, which had links to blog posts, videos, and pictures of the protest on the day of the protest, often mere minutes after the information was posted online. I’ve been a Topsy user for the past six months, and on Oct. 15, when Google searches were turning up garbage, I typed “#oct15” into the Topsy search box and was able to track events as they happened.

More at Cloudline.

Meet ARM’s Cortex A15: The Future of the iPad, and Possibly the Macbook Air

In addition to unveiling the Cortex A7 processor on Wednesday, ARM used the press event as a sort of second debut for the Cortex A15. The A15 will go into ARM tablets and some high-end smartphones during the second half of 2012, and it’s by far the best candidate for an ARM-based Macbook Air should Apple choose to take this route. Just as importantly, the A15 will also go into the coming wave of ARM-based cloud server parts that have yet to be announced.

As part of the press materials for the A7 launch, ARM also released the first detailed block diagram—at least that I’ve been able to find—of the Cortex A15. The company also had the first working silicon of the A15 on display running Android. So let’s take a look at the A15 from top to bottom, because it is the medium-term future not only of the mobile gadgets that we all know and crave, but possibly of some of the servers that those devices will connect to.

More at Cloudline.

ARM’s Cortex A7 Is Tailor-Made for Android Superphones

The A7’s design improvements over the older A8 core are possible because ARM has had the past three years to carefully study how the Android OS uses existing ARM chips in the course of normal usage. Peter Greenhalgh, the chip architect behind the A7’s design, told me that his team did detailed profiling in order to learn exactly how different apps and parts of the Android OS stress the CPU, with the result that the team could design the A7 to fit the needs and characteristics of real-world smartphones. So in a sense, the A7 is the first CPU that’s quite literally tailor-made for Android, although those same microarchitectural optimizations will benefit any other smartphone OS that uses the design.

The high-level block diagram for the A7 released at the event reveals an in-order design with an 8-stage integer pipeline. At the front of the pipeline, ARM has added three predecode stages, so that the instructions in the L1 are appropriately marked up before they go into the decode phase. Greenhalgh told me that A7 has extremely hefty branch prediction resources for a design this lean, so I’m guessing that the predecode phase involves tagging the branches and doing other work to cut down on mispredicts.
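For readers unfamiliar with the term: a branch predictor guesses which way each conditional jump will go so the pipeline can keep fetching, and a mispredict means flushing the in-flight instructions. The classic textbook building block is a 2-bit saturating counter per branch. The sketch below is generic machinery to illustrate the idea, not a description of the A7's actual predictor:

```python
class TwoBitPredictor:
    """Classic 2-bit saturating counter, one per branch address.
    States 0-1 predict not-taken, 2-3 predict taken; once saturated,
    two wrong guesses in a row are needed to flip the prediction,
    which smooths over a loop's occasional odd-one-out iteration."""

    def __init__(self):
        self.counters = {}  # branch address -> state in 0..3

    def predict(self, addr):
        return self.counters.get(addr, 0) >= 2  # True means "taken"

    def update(self, addr, taken):
        state = self.counters.get(addr, 0)
        state = min(3, state + 1) if taken else max(0, state - 1)
        self.counters[addr] = state

# A 10-iteration loop branch: taken nine times, then falls through.
p = TwoBitPredictor()
outcomes = [True] * 9 + [False]
hits = sum(1 for taken in outcomes
           if p.predict(0x40) == taken or p.update(0x40, taken))
```

(The one-liner abuses `or` to call `update` after each prediction, since `update` returns `None`; after warming up, the counter predicts the loop branch correctly on every iteration except the final exit.)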

More at Cloudline.

Why NIST Should Scrap “the Cloud” Entirely

I’d like to humbly suggest that if NIST really wants to help government agencies save money, it should get out ahead of the private sector and do what all of us will eventually do one day, which is completely scrap “the cloud” as a useful abstraction.

Instead of framing a discussion of the proper allocation of scarce government IT resources around some notion of “cloud,” NIST should focus first on the kinds of things that users want to do with computers, i.e., e-mail, telephony, archival storage, data distribution, publication, content creation, content management, etc. Then, for each application type, the agency should use an agreed-upon set of metrics to evaluate the full range of options that modern computing supplies, from old-fashioned shrink-wrapped software to apps to client-server to SaaS to roll-your-own using PaaS or IaaS.

More at Cloudline.

Tea Party vs. OWS: The psychology and ideology of responsibility

I think what’s interesting in this Will Wilkinson piece is that all parties—conservatives, libertarians, liberals—are so focused on explaining what causes people to fail or “fall behind.” I’m much more interested in the success outliers, i.e. the top 0.1%, than in the bottom 50% or so. This is because I think that the difference between being in the top 5% and being in the top 1% is mostly luck, and the difference between being in the top 1% and being in the top 0.1% is entirely luck.

This isn’t to say that hard work and individual initiative don’t matter—they’re essential for entry into a new global elite that’s no longer based on inheritance. My point is that these virtues aren’t sufficient in and of themselves to get you into the elite. You also need a large dose of luck. In this respect, I’m in complete agreement with Taleb’s famous quote: “Hard work will get you a professorship or a BMW. You need both work and luck for a Booker, a Nobel or a private jet.”
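That claim about the extreme tail is easy to probe with a toy simulation. Suppose everyone's outcome is bounded skill plus heavy-tailed luck; every number below (population size, the Pareto tail parameter, the additive model itself) is an invented assumption for illustration, not a measurement:

```python
import random

random.seed(42)
N = 100_000

people = []
for _ in range(N):
    skill = random.random()               # bounded: talent, hard work
    luck = random.paretovariate(2.0) - 1  # heavy-tailed: rare windfalls
    people.append((skill + luck, skill, luck))

people.sort(reverse=True)

def luck_share(top_fraction):
    # What fraction of the winners' total outcome came from luck?
    k = int(N * top_fraction)
    total = sum(outcome for outcome, _, _ in people[:k])
    lucky = sum(luck for _, _, luck in people[:k])
    return lucky / total

for frac in (0.05, 0.01, 0.001):
    print(f"top {frac:>5.1%}: luck share {luck_share(frac):.0%}")
```

Because skill is capped while luck is not, the further out you slice the tail, the more of the winners' outcomes is explained by their luck draw, which is the shape of the argument above: work gets you into the top few percent, but the top 0.1% is where the windfalls live.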

The main place where I’m in agreement with #OWS is that I do not want to live in a “winner take all” society, where the winners who are taking it all have lucked their way into that winning spot (despite what they tell themselves about how much they deserve their spoils due to their brains and hard work). I’d be happy with something like “hardworking winners take most,” “lucky winners take some extra,” and both types of winners make sure that hardworking and unlucky losers have the basics (healthcare, affordable housing, food, etc.).

Tea Party vs. OWS: The psychology and ideology of responsibility | The Moral Sciences Club | Big Think:

One of the most robust finding in political psychology is that liberals tend to explain both poverty and wealth in terms of luck and the influence of social forces while conservatives tend to explain poverty and wealth in terms of effort and individual initiative…

…But, having lived most of my adult life among them, experience tells me that when it comes to the explanation of poverty and wealth libertarians are close cousins to conservatives. It’s my view that this shared sense of robust agency and individual responsibility for success and failure is the psychological linchpin of “fusionism”–that this commonality in disposition has made the long-time alliance between conservatives and libertarians possible, despite the fact that libertarians are almost identical to liberals in their unconcern for the conservative binding foundations. That’s why controversial “social issues” like abortion and gay marriage are generally pushed to the side when libertarians and conservatives get together. As long as they stick to complaining about handouts for poor people sitting on their asses and praising rich people working hard to make civilization possible, libertarians and conservatives get along fine.

(Via bigthink.com)

A Tale of two customer bases: Amazon and Ebay

Fascinating behind-the-scenes look at the differences between these two popular retail platforms.

A Tale of two customer bases: Amazon and Ebay:

My company is four years old this week, and while we do have our own web site where we sell from, we need to be on Amazon and Ebay as well to make sure we get the largest audience possible for our products. Going through the data over the last year, and over the last four years some interesting if trivial data points are showing up in terms of how successful someone can be on someone else’s system. Admitted there is no way I could spend the money on advertising to reach 88 million visitors a month like Ebay or Amazon, what is interesting though is the customer behavior exhibited by buyers on both of these systems…

(Via CloudAve)