Monthly Archives: June 2011

Behind the scenes: Using Cassandra & Acunu to power Britain’s Got Talent

In some previous posts, I’ve talked about how we scaled Django up to cope with the loads for Britain’s Got Talent. One area I haven’t talked about yet is the database.

For BGT, we were planning for peak voting loads of 10,000 votes/second. Our main database runs on MySQL using the Amazon Relational Database Service. Early testing showed there was no way we could hit that level using RDS – we were maxing out at around 300 votes/s on an m1.large database instance. Even though there are larger instances, they’re not 30x bigger, so we knew we needed to do something different.
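
Just to give a flavour of what that early testing looked like, here's a minimal sketch (not our actual harness; the table, columns and credentials are made up for illustration) of a crude write load test: a handful of threads inserting votes into MySQL as fast as they can, counting commits per second.

    import threading
    import time

    import MySQLdb  # the MySQL-python driver (illustrative choice of client)


    def worker(host, counter, stop_at):
        # Each thread gets its own connection and inserts votes flat out.
        conn = MySQLdb.connect(host=host, user="bgt", passwd="secret", db="bgt")
        cur = conn.cursor()
        while time.time() < stop_at:
            cur.execute(
                "INSERT INTO votes (act_id, user_id, created) VALUES (%s, %s, NOW())",
                (7, "some-user-id"),
            )
            conn.commit()  # one commit per vote, much as the app would do
            with counter["lock"]:
                counter["n"] += 1
        conn.close()


    def run_test(host, threads=20, seconds=30):
        counter = {"n": 0, "lock": threading.Lock()}
        stop_at = time.time() + seconds
        pool = [threading.Thread(target=worker, args=(host, counter, stop_at))
                for _ in range(threads)]
        for t in pool:
            t.start()
        for t in pool:
            t.join()
        print("%.0f votes/s" % (counter["n"] / float(seconds)))


    if __name__ == "__main__":
        run_test("my-bgt-db.eu-west-1.rds.amazonaws.com")  # hypothetical RDS endpoint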

We knew that various NoSQL databases would be able to handle the write load, but the team had no experience in operating NoSQL clusters at scale. We had less than two weeks before the first broadcast, and all the options available were both uncertain and high-risk.

Then a mutual friend introduced us to Acunu. They not only know all about NoSQL, but have a production-grade Cassandra stack, built on their own storage engine, that works on EC2. Tom and the team at Acunu quickly did some benchmarking on EC2 to show that the write volume we were expecting could easily be handled, as well as testing out the Python bindings for Cassandra. That gave us good confidence that this could easily scale to the loads we were expecting, with plenty of headroom if things went mental.

We wired Cassandra into our stack, and started load testing against a 2-node Cassandra cluster. While we’d originally expected to need more nodes, we found that the cluster was easily able to absorb the load we were testing with, thanks to the optimisations in the Acunu stack.
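
For anyone curious what "wiring Cassandra in" looks like from Python, here's a minimal sketch using pycassa (one of the Python clients for Cassandra at the time; the keyspace, column family and node names below are illustrative, not our production schema). Each vote becomes a single column insert into a wide row keyed by the act:

    import uuid

    import pycassa

    # Connect a pool to the (illustrative) two-node cluster on the Thrift port.
    pool = pycassa.ConnectionPool('bgt', ['cass-node1:9160', 'cass-node2:9160'])
    votes = pycassa.ColumnFamily(pool, 'votes')


    def record_vote(act_id, user_id):
        # One wide row per act; each vote is appended as its own column,
        # so concurrent votes never contend on a single value.
        votes.insert(str(act_id), {str(uuid.uuid1()): str(user_id)})


    record_vote(7, 'user-1234')

Whether you key rows by act, by show, or shard them by time is a modelling choice, but the point is that each vote is a cheap, append-style write, which is exactly the kind of load Cassandra soaks up well.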

So how did it all go? Things were tense as the first show was broadcast and we saw the load starting to ramp up, but the Acunu cluster worked flawlessly. As we came towards the start of the live shows, we were totally comfortable that it was all working well.

Then AWS told us that the server hosting one of the Cassandra instances was degraded and might die at any point. Just before the first live finals. We weren’t too worried as adding a new node to a cluster is a simple operation. We duly fired up a new EC2 instance and added it to the cluster.

Then things went wrong. For some reason, the new node didn’t integrate properly into the cluster and now we had a degraded cluster that couldn’t be brought back online. And only a few hours until showtime. I love live TV!

The team at Acunu were fantastic in supporting us (including from a campsite in France!) both to set up a new cluster and to diagnose the problem with the degraded cluster. For the show, we switched over to the new cluster as we still hadn’t been able to figure out what was wrong with the old one (it turned out to be a rare bug in Cassandra).

Thankfully the shows went off without a hitch and no-one saw the interesting juggling act going on to keep the service running.

So a big thank you to the team at Acunu for their help “behind the scenes” at BGT – we couldn’t have done it without them.

Thoughts from the AWS Summit

Some thoughts from the AWS Summit today in London. The keynote presentation was by Werner Vogels (Amazon's CTO), followed by a bunch of customer stories and then workshops.

Some thoughts and impressions:

  • It’s surprisingly big for a vendor conference. I heard 1,400 people were registered, the keynote was packed, and suits seemed outnumbered by geeks.
  • A super customer-friendly strategy was outlined. Firstly, to be a cost leader – “if we make savings, we’ll turn round and lower our prices to our customers”. Secondly, “there will be no lock-in. You can use any bits you want, from any language”. A refreshing change from previous large technology platforms.
  • The sheer scale of AWS is mind-boggling. Each day, they add as much server capacity as the whole of Amazon used in 2000, and there are 339 billion objects in S3.

Some roadmap priorities were called out, but what was notable was the repeated request for customer feedback to drive roadmaps, and the separation of AWS into individual services that can innovate and launch at their own pace.

Here’s what appeared on the roadmap:

  • More geographies
  • Make it easier to build and manage applications
  • New database offerings
  • More support, billing and user management options

In conversations, the lack of a clearer roadmap was a repeated gripe. For example, one company had built and launched their own email service just two days before Amazon announced the Simple Email Service. The lack of roadmap doesn’t appear to be a secrecy thing, but simply reflects the customer-driven and independent team development.

This is a fundamental limitation of agile development practices when building platforms. As soon as customers are making substantial technology bets on stuff you deliver, you can bet they’re going to start asking for clarity on where you’re going, so they can aim to meet you there.

I don’t think there’s a good answer to this. At Symbian we had roadmaps galore, and plans out 2-3 years, which always ended up not being what the customer wanted by the time they started thinking about their products. Amazon is at the other end of the spectrum, yet that causes a different set of problems.

Britain’s Got Talent and UK.gov

Interesting perspective from Mark Baker at Canonical on how the audience stuff we do for BGT might have lessons for the UK government.

Case study with Amazon Web Services

When the nice folks at AWS heard about the Britain’s Got Talent buzzer, they thought it would be interesting as a case study on using cloud computing for social TV.

The results are here: http://aws.amazon.com/solutions/case-studies/livetalkback/

Hopefully there’s some stuff in there of interest to others thinking of using the cloud with TV shows.

The phone is the second screen

There’s lots of interest in two-screen and social TV products these days. From Million Pound Drop to Britain’s Got Talent, lots of shows now come with a play-along second screen experience.

One debate has been whether a laptop or a mobile phone is the right screen to target. At Live Talkback we placed our bet on the phone. Partly that was because we had loads of experience in the mobile world, but mainly it was because the phone feels right: it’s personal, it feels like a remote control, everyone’s got one, and it doesn’t distract from the TV the way a laptop so easily can.

With our recent experience on Britain’s Got Talent, ITV This Morning, ESPN and other shows, and with apps across iOS, BlackBerry, Android and Nokia, along with widgets for web and Facebook, we can see where people are choosing to interact with TV.

So what are the results?

First off, it’s bad news for Nokia and BlackBerry (disclosure: I used to work for Nokia). Their combined share of our user base is well under 3%. If people are buying their phones, they’re not using them for apps.

Apple is the clear winner, with iOS devices being used by over 60% of users. Not only is this 6 times more than the nearest mobile competitor (Android on ~10%), but it easily beats mainly-desktop web at about 25%.

One surprise for me was just how popular the iPod Touch is. The iOS share (60%) is split roughly equally between iPhone and iPod Touch. For all the hype about the iPad, it barely registers in our sample.

One conclusion is inescapable. Despite the discovery, sharing and ubiquity advantages of the web, nearly three-quarters of people are using their phones to interact with their TVs. The social TV battle will be won or lost on the phone.