About mattclarkeamaze

CTO at Amaze

Well Microsoft, you might just have something (but is it cool enough?)

I’ve been waiting for this one for some time.  We got a glimpse of some preview technologies earlier in the year over at Redmond, and I suspected the emergence of this little beast.  Not sure how your OEM partners are going to view it though.

It looks like a good attempt and is certainly better than most of the devices coming out of South and East Asia.  However, does it have the cool factor?  It’s a bit like businessmen in polyester shirts.  They are just not cool.  However, they are businessmen, and that may be Microsoft’s game.  I think the device will make big headway in the standard, mundane business world, away from the image-conscious brand people. The stylus will make it a great note-taking tool when combined with OneNote or even Evernote, and I think MS Office will be one of the biggest reasons for using the device.  Cool accessories, such as the keyboard approach, suggest real thought about productivity.

I think, Microsoft, you might just capture market share here.  The I.T. departments of the world will surely like to take this as a standard.

But I’m afraid it’s still not cool enough.  I’m going to keep my iPad, but I might just purchase one for those productive days when I don’t need to be as creative in my thinking.

 

Passbook cleverness

Trying wifi on American Airlines for the first time, somewhere over Arizona. Anyway, I was excited to see Apple’s Passbook launch as part of iOS 6. It’s a great little addition to the iPhone and helps our little mobile assistant become even more key to our lives. I see the convenience and simplicity approach as a great killer application for the iPhone.

I’m going to check out the API capability and see how we can start to build features directly into Passbook. It looks pretty straightforward. I love the way we can trigger our data and functionality to be accessed and pushed at the user, based on location, calendar event or time. It’s great for the obvious, such as flight bookings and coupon-based activity, but we are more excited about the possibilities for retail and event-based promotions.
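As a rough sketch of what a retail pass might look like: a pass carries a pass.json file, and its `locations` and `relevantDate` entries are what let it surface at the right place and time. The Python below only assembles an illustrative coupon pass.json. The identifiers, coordinates and offer are made up, `build_coupon_pass` is a hypothetical helper, and signing and packaging the pass bundle is left out.

```python
import json

def build_coupon_pass(serial_number, offer_text, latitude, longitude, relevant_date):
    """Return the pass.json contents for a hypothetical retail coupon."""
    return {
        "formatVersion": 1,
        # Placeholder identifiers: real ones come from your Apple developer account.
        "passTypeIdentifier": "pass.com.example.coupon",
        "teamIdentifier": "ABCDE12345",
        "organizationName": "Example Retail",
        "serialNumber": serial_number,
        "description": offer_text,
        # Surface the pass when the phone is near the shop...
        "locations": [
            {
                "latitude": latitude,
                "longitude": longitude,
                "relevantText": "You are near the store: " + offer_text,
            }
        ],
        # ...or around this date and time (ISO 8601).
        "relevantDate": relevant_date,
        "barcode": {
            "format": "PKBarcodeFormatQR",
            "message": serial_number,
            "messageEncoding": "iso-8859-1",
        },
        "coupon": {
            "primaryFields": [{"key": "offer", "label": "OFFER", "value": offer_text}]
        },
    }

# Illustrative values only.
pass_json = json.dumps(
    build_coupon_pass("0001", "20% off", 51.5074, -0.1278, "2012-12-01T10:00:00+00:00"),
    indent=2,
)
```

The interesting part for retail is that the same pass can list several store locations, so one coupon can pop up near whichever branch the customer happens to walk past.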

Going to do some digging.

Really keen to see where Apple go next, i.e. NFC integration. I love the approach with QR codes, e.g. flight boarding passes, but surely NFC in the next iPhone is going to be a real killer.

Will report back once we’ve done some prototyping.

Really excited about retail with this one.

So are you interested in how Facebook scales?

If, like me, you are interested in scaling big platforms, we’ve been doing some research into exactly how the Facebook techies scale their infrastructure. There are some useful techniques here if you are using the LAMP technology stack. Interesting reading…

Facebook’s scaling challenge

Before we get into the details, here are a few factoids to give you an idea of the scaling challenge that Facebook has to deal with:

  • Facebook serves 570 billion page views per month (according to Google Ad Planner).
  • There are more photos on Facebook than all other photo sites combined (including sites like Flickr).
  • More than 3 billion photos are uploaded every month.
  • Facebook’s systems serve 1.2 million photos per second. This doesn’t include the images served by Facebook’s CDN.
  • More than 25 billion pieces of content (status updates, comments, etc) are shared every month.
  • Facebook has more than 30,000 servers (and this number is from last year!)

Software that helps Facebook scale

In some ways Facebook is still a LAMP site (kind of), but it has had to change and extend its operation to incorporate a lot of other elements and services, and modify the approach to existing ones.

For example:

  • Facebook still uses PHP, but it has built a compiler for it so it can be turned into native code on its web servers, thus boosting performance.
  • Facebook uses Linux, but has optimized it for its own purposes (especially in terms of network throughput).
  • Facebook uses MySQL, but primarily as a key-value persistent storage, moving joins and logic onto the web servers since optimizations are easier to perform there (on the “other side” of the Memcached layer).
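To make the last point concrete, here is a minimal sketch of treating a relational database as a key-value store and doing the "join" in application code. It uses Python's built-in sqlite3 as a stand-in for MySQL, and the key naming scheme and helper functions are assumptions for illustration, not Facebook's actual schema.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; one table of opaque key/value pairs.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")
db.executemany(
    "INSERT INTO kv VALUES (?, ?)",
    [
        ("user:1:name", "Alice"),
        ("user:2:name", "Bob"),
        ("user:1:friends", "2"),  # comma-separated friend ids
    ],
)

def get(key):
    """Single primary-key lookup: the only query shape the database sees."""
    row = db.execute("SELECT v FROM kv WHERE k = ?", (key,)).fetchone()
    return row[0] if row else None

def friend_names(user_id):
    """The 'join' happens here, in application code, not in SQL."""
    raw = get(f"user:{user_id}:friends") or ""
    friend_ids = [fid for fid in raw.split(",") if fid]
    return [get(f"user:{fid}:name") for fid in friend_ids]

print(friend_names(1))
```

Because every read is a plain lookup by key, each one is also easy to put behind a cache.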

Then there are the custom-written systems, like Haystack, a highly scalable object store used to serve Facebook’s immense amount of photos, or Scribe, a logging system that can operate at the scale of Facebook (which is far from trivial).

But enough of that. Let’s present (some of) the software that Facebook uses to provide us all with the world’s largest social network site.

Memcached

Memcached is by now one of the most famous pieces of software on the internet. It’s a distributed memory caching system which Facebook (and a ton of other sites) use as a caching layer between the web servers and MySQL servers (since database access is relatively slow). Through the years, Facebook has made a ton of optimizations to Memcached and the surrounding software (like optimizing the network stack).

Facebook runs thousands of Memcached servers with tens of terabytes of cached data at any one point in time. It is likely the world’s largest Memcached installation.
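The caching-layer idea can be sketched in a few lines. This is a toy, assuming plain Python dicts in place of real Memcached servers and a fake database class; `CacheCluster`, `SlowDatabase` and `cached_lookup` are hypothetical names. Keys are spread across servers with a simple hash-modulo, which is a simplification (real clients often use consistent hashing instead).

```python
import hashlib

class CacheCluster:
    """Stand-in for a pool of Memcached servers: one dict per 'server'."""

    def __init__(self, server_count):
        self.servers = [{} for _ in range(server_count)]

    def _server_for(self, key):
        # Hash the key to pick a server, so the same key always lands in the same place.
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return self.servers[int(digest, 16) % len(self.servers)]

    def get(self, key):
        return self._server_for(key).get(key)

    def set(self, key, value):
        self._server_for(key)[key] = value

class SlowDatabase:
    """Pretend database that counts how often it is actually hit."""

    def __init__(self):
        self.reads = 0

    def lookup(self, key):
        self.reads += 1
        return f"row-for-{key}"

def cached_lookup(cache, database, key):
    """Check the cache first; on a miss, read the database and fill the cache."""
    value = cache.get(key)
    if value is None:
        value = database.lookup(key)
        cache.set(key, value)
    return value

cache = CacheCluster(server_count=4)
database = SlowDatabase()
cached_lookup(cache, database, "user:1")  # miss: goes to the database
cached_lookup(cache, database, "user:1")  # hit: served from the cache
```

After the two calls above the database has only been read once, which is the whole point of the layer.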

HipHop for PHP

PHP, being a scripting language, is relatively slow when compared to code that runs natively on a server. HipHop converts PHP into C++ code which can then be compiled for better performance. This has allowed Facebook to get much more out of its web servers since Facebook relies heavily on PHP to serve content.

A small team of engineers (initially just three of them) at Facebook spent 18 months developing HipHop, and it is now live in production.

Haystack

Haystack is Facebook’s high-performance photo storage/retrieval system (strictly speaking, Haystack is an object store, so it doesn’t necessarily have to store photos). It has a ton of work to do; there are more than 20 billion uploaded photos on Facebook, and each one is saved in four different resolutions, resulting in more than 80 billion photos.

And it’s not just about being able to handle billions of photos; performance is critical. As we mentioned previously, Facebook serves around 1.2 million photos per second, a number which doesn’t include images served by Facebook’s CDN. That’s a staggering number.

BigPipe

BigPipe is a dynamic web page serving system that Facebook has developed. Facebook uses it to serve each web page in sections (called “pagelets”) for optimal performance.

For example, the chat window is retrieved separately, the news feed is retrieved separately, and so on. These pagelets can be retrieved in parallel, which is where the performance gain comes in, and it also gives users a site that works even if some part of it would be deactivated or broken.
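The pagelet idea can be illustrated with a small sketch. This is not BigPipe itself, just the general pattern under simple assumptions: each pagelet is a hypothetical render function, they run in parallel on a thread pool, results are handed back as they finish, and a failing pagelet leaves a gap rather than breaking the page.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def render_chat():
    return "<div id='chat'>chat</div>"

def render_news_feed():
    return "<div id='feed'>news feed</div>"

def render_broken_widget():
    # Simulates a pagelet whose backend is down.
    raise RuntimeError("backend unavailable")

PAGELETS = {
    "chat": render_chat,
    "feed": render_news_feed,
    "widget": render_broken_widget,
}

def serve_page(pagelets):
    """Yield (name, html) pairs as each pagelet finishes; failures are isolated."""
    with ThreadPoolExecutor(max_workers=len(pagelets)) as pool:
        futures = {pool.submit(render): name for name, render in pagelets.items()}
        for future in as_completed(futures):
            name = futures[future]
            try:
                yield name, future.result()
            except Exception:
                # A broken pagelet renders as empty; the rest of the page still works.
                yield name, ""

page = dict(serve_page(PAGELETS))
```

In a real server each pair would be flushed to the browser as soon as it is ready instead of being collected into a dict.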

Cassandra

Cassandra is a distributed storage system with no single point of failure. It’s one of the poster children for the NoSQL movement and has been made open source (it’s even become an Apache project). Facebook uses it for its Inbox search.

Other than Facebook, a number of other services use it, for example Digg. We’re even considering some uses for it here at Pingdom.

Scribe

Scribe is a flexible logging system that Facebook uses for a multitude of purposes internally. It’s been built to be able to handle logging at the scale of Facebook, and automatically handles new logging categories as they show up (Facebook has hundreds).

Hadoop and Hive

Hadoop is an open source map-reduce implementation that makes it possible to perform calculations on massive amounts of data. Facebook uses this for data analysis (and as we all know, Facebook has massive amounts of data). Hive originated from within Facebook, and makes it possible to use SQL queries against Hadoop, making it easier for non-programmers to use.

Both Hadoop and Hive are open source (Apache projects) and are used by a number of big services, for example Yahoo and Twitter.
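For anyone who has not met map-reduce before, here is a tiny single-machine sketch of the pattern in Python. It is not Hadoop's API; the function names and sample records are made up for illustration. Hadoop's contribution is running the same three steps across many machines.

```python
from collections import defaultdict

def map_phase(record):
    """Map: emit a (word, 1) pair for every word in one input record."""
    for word in record.lower().split():
        yield word, 1

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: collapse each key's values into a single result."""
    return key, sum(values)

def map_reduce(records):
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())

counts = map_reduce(["like share like", "share comment"])
```

The appeal of something like Hive is that the same count can be expressed as a familiar GROUP BY query rather than hand-written map and reduce functions.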

Thrift

Facebook uses several different languages for its different services. PHP is used for the front-end, Erlang is used for Chat, Java and C++ are also used in several places (and perhaps other languages as well). Thrift is an internally developed cross-language framework that ties all of these different languages together, making it possible for them to talk to each other. This has made it much easier for Facebook to keep up its cross-language development.

Facebook has made Thrift open source and support for even more languages has been added.

Varnish

Varnish is an HTTP accelerator which can act as a load balancer and also cache content which can then be served lightning-fast.

Facebook uses Varnish to serve photos and profile pictures, handling billions of requests every day. Like almost everything Facebook uses, Varnish is open source.

Other things that help Facebook run smoothly

We have mentioned some of the software that makes up Facebook’s system(s) and helps the service scale properly. But handling such a large system is a complex task, so we thought we would list a few more things that Facebook does to keep its service running smoothly.

Gradual releases and dark launches

Facebook has a system called Gatekeeper that lets them run different code for different sets of users (it basically introduces different conditions in the code base). This lets Facebook do gradual releases of new features, A/B testing, activate certain features only for Facebook employees, etc.

Gatekeeper also lets Facebook do something called “dark launches”, which is to activate elements of a certain feature behind the scenes before it goes live (without users noticing since there will be no corresponding UI elements). This acts as a real-world stress test and helps expose bottlenecks and other problem areas before a feature is officially launched. Dark launches are usually done two weeks before the actual launch.
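Gatekeeper itself is internal to Facebook, so the following is only a guess at the general shape of such a system: a hypothetical `Gate` class that supports percentage rollouts and employee-only features, plus a dark launch where the new backend code runs for gated users while the UI stays hidden behind a second gate.

```python
import zlib

class Gate:
    """Hypothetical feature gate; not Facebook's actual Gatekeeper API."""

    def __init__(self, name, rollout_percent=0, employees_only=False):
        self.name = name
        self.rollout_percent = rollout_percent
        self.employees_only = employees_only

    def allows(self, user_id, is_employee=False):
        if self.employees_only and not is_employee:
            return False
        # Stable bucket in [0, 100): the same user always gets the same answer.
        bucket = zlib.crc32(f"{self.name}:{user_id}".encode("utf-8")) % 100
        return bucket < self.rollout_percent

def load_new_feed(user_id):
    # Stands in for the new backend code path being stress-tested.
    return f"new feed for {user_id}"

def render_home(user_id, ui_gate, dark_gate, is_employee=False):
    html = "<old feed>"
    if dark_gate.allows(user_id, is_employee):
        new_feed = load_new_feed(user_id)  # dark launch: exercise the backend...
        if ui_gate.allows(user_id, is_employee):
            html = f"<{new_feed}>"  # ...but only show it where the UI gate is open
    return html

# Backend runs for everyone, UI shown to nobody yet.
dark_gate = Gate("new_feed_backend", rollout_percent=100)
ui_gate = Gate("new_feed_ui", rollout_percent=0)
print(render_home(7, ui_gate, dark_gate))
```

Raising `rollout_percent` on the UI gate step by step then gives the gradual release.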

Profiling of the live system

Facebook carefully monitors its systems (something we here at Pingdom of course approve of), and interestingly enough it also monitors the performance of every single PHP function in the live production environment. This profiling of the live PHP environment is done using an open source tool called XHProf.

Gradual feature disabling for added performance

If Facebook runs into performance issues, there are a large number of levers that let them gradually disable less important features to boost performance of Facebook’s core features.
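A simple way to picture these levers, with entirely hypothetical feature names, priorities and load thresholds: each threshold crossed pulls one more lever and switches off the least important features still running, while the core feature is never touched.

```python
# (feature name, priority): lower number means more important. Illustrative only.
FEATURES = [
    ("news_feed", 0),
    ("chat", 1),
    ("photo_tag_suggestions", 2),
    ("people_you_may_know", 3),
]

def enabled_features(load, features=FEATURES, thresholds=(0.7, 0.8, 0.9)):
    """Return the features left switched on at a given load (0.0 to 1.0)."""
    levers_pulled = sum(1 for threshold in thresholds if load >= threshold)
    highest = max(priority for _, priority in features)
    # Each lever drops one more priority level; priority 0 always survives.
    cutoff = max(0, highest - levers_pulled)
    return [name for name, priority in features if priority <= cutoff]

print(enabled_features(0.5))
print(enabled_features(0.95))
```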

The things we didn’t mention

We didn’t go much into the hardware side in this article, but of course that is also an important aspect when it comes to scalability. For example, like many other big sites, Facebook uses a CDN to help serve static content. And then of course there is the huge data center Facebook is building in Oregon to help it scale out with even more servers.

And aside from what we have already mentioned, there is of course a ton of other software involved. However, we hope we were able to highlight some of the more interesting choices Facebook has made.

Facebook’s love affair with open source

We can’t complete this article without mentioning how much Facebook likes open source. Or perhaps we should say, “loves”.

Not only is Facebook using (and contributing to) open source software such as Linux, Memcached, MySQL, Hadoop, and many others, it has also made much of its internally developed software available as open source.

Examples of open source projects that originated from inside Facebook include HipHop, Cassandra, Thrift and Scribe. Facebook has also open-sourced Tornado, a high-performance web server framework developed by the team behind FriendFeed (which Facebook bought in August 2009).

(A list of open source software that Facebook is involved with can be found on Facebook’s Open Source page.)

More scaling challenges to come

Facebook has been growing at an incredible pace. Its user base is increasing almost exponentially and is now close to half a billion active users, and who knows what it will be by the end of the year. The site seems to be growing by about 100 million users every six months or so.

Facebook even has a dedicated “growth team” that constantly tries to figure out how to make people use and interact with the site even more.

This rapid growth means that Facebook will keep running into various performance bottlenecks as it’s challenged by more and more page views, searches, uploaded images, status messages, and all the other ways that Facebook users interact with the site and each other.

But this is just a fact of life for a service like Facebook. Facebook’s engineers will keep iterating and coming up with new ways to scale (it’s not just about adding more servers). For example, Facebook’s photo storage system has already been completely rewritten several times as the site has grown.

So, we’ll see what the engineers at Facebook come up with next. We bet it’s something interesting. After all, they are scaling a mountain that most of us can only dream of; a site with more users than most countries. When you do that, you better get creative.

PHP and LAMP with iOS

So I tweeted some time ago about moving away from Microsoft server technology and in particular .NET.  I’ve spent most of my professional career building server solutions or leading teams using the Microsoft stack.  I’ve loved it for so many years and have spent the same years justifying its scaling credentials. We’ve even got many proven successes out there being used in anger. One success above all is the new Life platform that we’ve developed for UniServity.  The scaling capability is proving itself over and over again, and therefore my view of Microsoft server technology remains unchanged.  However, I’ve had my head turned slightly by simplicity in the last few weeks, and also by the increasing costs of licensing.

I’ve just started building a series of iPhone and iPad apps with a server / cloud component.  I naturally turned to Microsoft for the answer; however, the thought of opening up Visual Studio and architecting a fantastic multi-layered app with an MVC architecture fills me with dread when all I need to do is build something simple that will access a database and scale. I’m also using a Mac nowadays and have been for the last 5 years, so I would need to boot up Parallels, start up Visual Studio and get coding.  All a bit much for my MacBook Air (sorry, I slimmed down the hardware due to all the traveling).

So I dreamed of the days of Perl and CGI, using BBEdit on my original Macintosh to edit simple scripts.

PHP gave me the answer, and when you start to dig a little deeper it really does provide a viable answer.  It scales well (Facebook proves that with its 570 billion page views a month) and it’s simple to get coding.  I can use my Mac as my server and I know it will easily port to a Linux or Unix platform in the future. I’ve downloaded some simple editing tools, and with MySQL I’m building my simple cloud-based service for my iOS apps.  Except it’s really not that simple any more.  It’s amazing how quickly you can progress. It’s also like riding a bike: you never forget the technology’s roots, and you find that it’s back down to natural programming techniques rather than the even more layers of abstraction you get with .NET.

AND above all, no license fees and no policy of requiring even more hardware to run your software.  It does scale very well, I have to say, and it does spark that programming geek in us all.

More to follow as I progress.

 

Product information management

I had a little bit of a light bulb moment today when discussing product content and e-commerce. We are increasingly getting drawn into conversations around e-commerce, digital asset management, SaaS-based systems, master data tool sets and content management. These are all enterprise-level platforms that require lots of thinking and organisation to get right. Most of the conversation is based around 1) product display and 2) integration.

Surely the core hub here is product information management. This is the only tool that hooks everything together. Once it’s dealt with, the rest falls into place: your e-commerce strategy, your content management strategy, your digital asset management strategy and your fulfilment and order management strategy.

Get your PIM right and the rest flows from there. Go the other way and you will be left forever integrating and making platforms work together. Of course this only applies if you have a lot of product data to deal with.

If you are thinking PIM, Hybris seems to be coming up trumps time and time again without even trying.

Near field communication, can it make life more convenient?

I’ve been looking into near field communication and possible applications for it.

It’s an interesting technology and there are certainly lots of application ideas for it. Obviously the biggest breakthrough will be a move to a cashless society, using your mobile phone to pay for things with a simple swipe of a sensor. Oyster cards and ticketing are also a logical route, especially for people like me who keep losing their Oyster cards. But there are lots of other applications which will make life simple and convenient.

I’ve been working with NFC on my new BlackBerry. I’ve just taken delivery of my first batch of sticky sensors. The first thing I wanted to do was stick one on my desk. I wanted my out of office and other services, such as my social network status, to change when I arrived at work. It proved really easy: a simple swipe and my app did the rest. It switched my out of office off and then checked me in on Facebook.

Next, the car. One sticker on the dashboard, out of sight, mind. One swipe. The phone was configured to turn its Bluetooth on and switch off the bleeps from my email, to stop them catching my attention whilst driving.

Saved one sticker for home. Obviously the house isn’t fully automated but I could update some settings. Much more work here.
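The logic behind the stickers boils down to a lookup from tag to a bundle of actions. Here is a minimal sketch of that idea in Python, not the actual BlackBerry app: the tag ids, the `Phone` class and the action functions are all invented for illustration, and reading a real tag is left out.

```python
class Phone:
    """Toy model of the phone settings the stickers change."""

    def __init__(self):
        self.out_of_office = True
        self.bluetooth = False
        self.email_alerts = True
        self.checkins = []

def arrive_at_desk(phone):
    phone.out_of_office = False
    phone.checkins.append("office")  # stands in for a social network check-in

def get_in_car(phone):
    phone.bluetooth = True
    phone.email_alerts = False  # no email bleeps whilst driving

# One entry per sticker: tag id -> what a swipe should do.
TAG_ACTIONS = {
    "tag-desk": arrive_at_desk,
    "tag-car": get_in_car,
}

def on_tag_swiped(phone, tag_id):
    """Run the action registered for a tag; unknown tags are ignored."""
    action = TAG_ACTIONS.get(tag_id)
    if action is None:
        return False
    action(phone)
    return True

phone = Phone()
on_tag_swiped(phone, "tag-car")
```

Adding the sticker for home would just be one more entry in the table.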

I guess when you start thinking about convenience and applications there are a lot of things you can use the technology for. Think about it being rolled into all your favourite shops, in the packaging of product, advertising boards, hotel checkins, door entry, flight boarding, immigration, business card exchange, time keeping systems and many more such instances.

Roll on iPhone 5; we need to start building apps now.

Building a viable e-commerce business model

For the last week I’ve spent my time helping produce a viable e-commerce business model using the latest enterprise-class commerce platforms, a well-informed SEO and media strategy and a decent runway to break-even. This is not the first time, may I add, but it does take into account some new challenges.

The SaaS world, e-commerce 2, global markets and the improved use of pay per click have made it even more of a numbers game to make the business model work. Obviously your product needs to be good, but if it is, and you have the right partners and the right technology in place, is it simply a case of playing the numbers to get a return on investment? Sounds easy? Not quite that easy. It’s a fine art, where you need to apply your skills and know-how to finely tune your system to get the most out of it. It’s only through experience and through having the right partners that you can guarantee the numbers game will work.
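To show what "playing the numbers" means in its simplest form, here is a back-of-the-envelope sketch. The model (visits times conversion rate times order value times margin, less advertising and fixed costs) is a deliberate simplification, and every figure in the example is made up for illustration.

```python
def monthly_profit(visits, conversion_rate, average_order_value,
                   gross_margin, ad_spend, fixed_costs):
    """Contribution from orders minus advertising and fixed costs."""
    orders = visits * conversion_rate
    contribution = orders * average_order_value * gross_margin
    return contribution - ad_spend - fixed_costs

def break_even_conversion_rate(visits, average_order_value,
                               gross_margin, ad_spend, fixed_costs):
    """Conversion rate at which monthly profit is exactly zero."""
    return (ad_spend + fixed_costs) / (visits * average_order_value * gross_margin)

# Illustrative figures only: 50,000 visits, 40.00 average order, 50% margin.
rate = break_even_conversion_rate(
    visits=50_000, average_order_value=40.0, gross_margin=0.5,
    ad_spend=15_000.0, fixed_costs=5_000.0,
)
profit_at_3_percent = monthly_profit(
    visits=50_000, conversion_rate=0.03, average_order_value=40.0,
    gross_margin=0.5, ad_spend=15_000.0, fixed_costs=5_000.0,
)
```

With these made-up figures the break-even conversion rate works out at 2%, and converting at 3% leaves 10,000 a month. The fine art is in everything that moves those inputs.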

To get it right depends on your conversion path, and this starts by having a coordinated approach between brand, traditional advertising, pay per click, social channels and traditional product channels. All need to be identified and a strategy developed to create a buying conveyor belt to your e-commerce cart. It doesn’t start with just your site; it starts a lot earlier.

Choosing the right platform and partner to route this conveyor belt through the buying process will lead to successful conversion.

The platform needs to be able to take the feeds of potential customers from all these conveyor belts. It then needs to showcase the product well, display options and link to other products to engage the user and stimulate their buying emotions. This is where SaaS platforms that have one model to fit all fail to deliver. They fail to capture the channels and trigger the buying responses. A wholly owned platform tailored to your product, brand and customer will win hands down when it comes to maximising conversion. It requires a multi-channel approach to commerce and there are only a few platforms out there that do this well.

So if you are looking to play the numbers conversion game you need to be in full control of the whole engine, from advertising spend, PPC and SEO to platform and design. You need to control it all in order to fine-tune the animal. This, I believe, is critical to achieving your business plan. However, think SaaS when you implement.  Build the as-a-service element for your global markets but own the technology yourself.

Can social really predict the future?

With the London mayoral contest being accurately predicted by social network sentiment analysis, should businesses really start to rely on the intelligence collected from social networks to help plan their business models?

For some time I have taken a balanced view of key trends emerging from social networks to predict what’s going to be hot in the technology space. Most of the time it’s been accurate, even though most of the time it’s been predicting the rise of social networks.

The key thing though is blending the intelligence gathered with authentic sources to get the balance right.

Analysis of intelligence from the social graph can help inform your thinking, but if you are doing something new you need to lead the pack. You could follow Steve Jobs’s mantra and tell your customers what they need. But to do this you need to be brave.

It’s important to get the right tools set up to give you the necessary intelligence-gathering capability from the social graph and the wider internet, so you can bring all the information together in one clear dashboard for you to study and make your own decisions.

Another advantage of using the right tools is that they amplify your message to back you up as an authority in your world. Amplifying the positive statistics in real time backs up your messaging. What the social graph can give us is an instant poll of a large targeted constituency without actually polling them. Doing this enables you to get real-time assurance around your decision-making process. Social graphs will never make the decision for you, but they will give confidence and backing to your ideas.

Learning Platforms Heading East (but not as we know them)

I’m heading out to Slovenia today for the Smart / Steljes launch in the Adriatic. Thanks to my friends at Steljes, we are piggybacking on their growth into this territory.

UniServity have been looking at Eastern Europe for some time.  The region seems to be experiencing the same growth in, and appetite for, education technology investment that the UK enjoyed some ten years back.  Most of it is derived from European Union investment, but there is a big appetite to invest at a national government level.

It’s giving us a real test of how learning platforms are seen in today’s climate, and we are getting some interesting feedback which helps us understand how we should target new markets.  My findings so far point to a tie-up with mobile devices and the ability to distribute content, learning activities and tasks to students on mobile / tablet devices. Lots of money is being made available for shiny new devices to replace books, but there is also recognition that books cannot be one-way any more; they need to be more interactive and community-focussed in their new digital format.  It’s virgin territory heading east; however, the learnings are helping to shape our emerging market offering and to test some of our vision before rolling out mainstream back in the UK.

Tablets and digital content are definitely driving growth as they offer the means for digital textbooks.  Learning platforms are certainly heading towards becoming the infrastructure behind this new world, but only if learning platform companies continue to invest in R&D.  Guess what we are working on behind the scenes.

 

 

Big data: can we make sense of it all?

With Big Data becoming increasingly talked about, are there really solutions out there that can maximise intelligence within this data and make it clear enough for businesses to make sense of it and act on it?

My view is that it’s not as simple as just rolling out clever analytical inference tools like Autonomy, as this just creates another layer of data. What I feel is needed is a set of communication skills that are inbuilt in most digital agencies.  Combining these skills with technology, business knowledge and your business requirements will enable us to produce clever and clear dashboards with your organisation’s data.  Whether it’s a set of analytical reports or e-commerce conversion statistics, we can design the tools and dashboard to manage your KPIs, present the information to your team, collaborate on the actions and get moving with the necessary change. Technology alone won’t solve Big Data, but technology combined with a good set of communication and design tools can produce clear answers to your company’s big data challenge. More importantly, it will help you and your colleagues make sense of the data and make better decisions.