We hate Enterprise too, but…

For 13 years I’ve been dealing with large platforms across multiple countries and territories.  At the start of each project we ask ourselves how we can do it differently, how we can be more lean, more agile, more flexible, more fluid.  I’m a coder at heart, I’ve recently fallen back in love with the open source world and love hacking away in whatever spare time I have, building apps in an agile, free-flow manner.  It keeps me up to speed.  I can still code an application (not in the most elegant way, mind) with the best of the geeks at some hackathon in Soho, or in some more exotic climate if I’m lucky.  This is my first instinct: just get shit done.  Don’t faff over architecture or spend money.  I hate the layers of bureaucracy that develop in projects.  Why can’t we just sit down and code out functionality in a permanent beta rollout? That’s surely the future. It’s surely something we all aim to achieve. So yes, I hate enterprise and everything that gets in the way to slow things down.

But, and there is a massive but, I have learnt over the years that the odd software project can come off the rails.  Why? Geeks maybe? Bad management maybe?  The reality lies in the complexity of solutions as they gather momentum, whether that’s momentum in terms of the reach of the application (such as a global platform), increased functionality, increased numbers of developers churning out code, or increased numbers of product stakeholders.  Even the very nature of the application becoming business critical, with many moving parts, needs taming. The agile, constant-beta development approach is great in startup mode, but like a new puppy it eventually needs to be tamed, otherwise developments have a tendency to become unplanned, resulting in unmanageable code and an unsustainable solution.  We need to organise the chaos into a controlled ecosystem. As the complexity increases, the modules and lines of code grow, all increasing the management burden, which in turn makes the whole approach of getting releases out more rigid as we fight to tame the beast that we have created.

Getting the architecture right at the beginning helps with this process, but, as with everything, progress and ideas keep flowing, all of which challenge that architecture immediately.  We adapt to keep up, but this adapting changes the nature of our original intentions and starts to introduce the odd little bit of chaos into the solution.  This chaos is a good thing, it challenges us, it helps progress our thinking, but it needs to be managed.

The usual explanation for why enterprise ends up coming into play is that software development, running into thousands of lines of code, is by its very nature complex.  But this is a myth: it’s actually quite simple.  Those thousands of lines of code have all been written to keep up with our fast-paced market and our insatiable demands for innovation, ideas and pace.  It’s our marketplace that is creating the requirement for enterprise; the more ideas we have, the more structure we need to keep in place to tame the beast that we could end up creating.

And at the top of the human food chain, with the ever increasing ideas, are our digital marketeers, who always want us to stay ahead of the game.  These beasts could reinvent our efforts every week. The problem is that, without enterprise thinking, the controls needed to protect the investment we have made so far never naturally fall into place.  On smaller solutions that do not require such heavy lifting at the back end, we can simply throw the thing away and start again if it gets complicated.  This leads to consumable software solutions, which have their benefits and their place.  However these are strategic decisions and must be made at the beginning.  Trying to turn a large-scale, business-critical solution carrying a strategic corporate investment into a consumable solution will lead to some very expensive software development cheques.

So what’s the answer?  I think the problem is not about whether enterprise is a bad or a good thing.  It’s about how we design the conveyor belt, how we ensure we can load the hopper of our product roadmap, and how we ensure there is a common understanding of the order of releases.  Sometimes this requires patience; sometimes things have to wait for those larger releases that will benefit someone in your organisation but not necessarily the new sexy urgent idea we may be trying to push out. Remember we are trying to bring some order to something that could quickly get so complex and out of control that the only thing we can do is throw it away and start again.  This is the real risk: every I.T. project is only a few scary ideas away from the scrap heap, and that costs a lot more money in the long run.

The challenge for us is how we maintain our competitive advantage. A solution to this is a well thought out roadmap: thinking ahead about the market, using real metrics about performance and your user base, and planning in the functionality that will keep users satisfied.  Enterprise is a game of chess; it’s hard to master and takes time.  Each move requires us to think well ahead and plan, even if that means planning in your agility. It’s not a game for short-term gain and it requires a lot of patience.

2013 Predictions

To download the podcast to go with this article click here.

The skies are getting greyer and the nights are drawing in; the year starts to draw to a close and it’s time again to stick my neck out and put some technology predictions together for 2013.

So I’m firing these out with no particular research other than gut feel and what I’m seeing in our sector.  Here goes…

Mobile will continue (again) to play a part in 2013

In the UK and much of the developed world, 4G will be upon us this year.  This means quicker speeds and more capability from our smartphones.  Mobile browsing will become even more of a requirement for most of the websites out there.  However, mobile apps need to be thought through before investing.  We are getting to the tipping point in the app stores now where your apps simply won’t get found.  Brochureware apps are no longer a novelty, so we really need to think about real application and utility. Saying all this, if you get the utility right or the entertainment factor right then consumers will make the download.  When setting out on your app journey in 2013 think about two things: how close are you to your customers, and would they value a utility or an entertaining app from you on their precious phone real estate? If you get this wrong then you are going to be throwing the investment away.

On the other hand, mobile web and mobile web browsing will be major.  If your website’s not mobile-ready for smartphones then you are going to be singled out and branded slow to adopt.  We know most smartphones can zoom-browse and therefore full-screen browsing works; however, consumers are expecting a fat-finger touch experience as the first point of entry, and they will switch to your classic site after that.

Also for 2013, here in London, the tube is starting to get wifi-ed up. Increased connectivity such as wifi on planes, the underground and trains will create a pull for mobile users to consume content, and that content will often be video. From a developer’s perspective we need to keep embracing HTML5 and progressive enhancement.  The industry will need to keep pushing the boundaries.  2012 has seen the launch of more HTML5-compliant websites with richer assets, all of which are putting pressure on current website performance, but we need to persevere because bandwidth will keep up.  It won’t be quite like Moore’s law, but it will improve, so keep innovating.

Tablet Revolution will continue

This is such an obvious one but it will continue into 2013. I will explain later how Windows 8 will help buoy this.  Apple launched the iPad mini yesterday, Amazon will follow with the Fire, and I’m left explaining to my colleagues why it’s all a good idea.  Surely two different sizes of tablet can’t be a benefit, how will they sell, and surely Apple is cannibalising its other markets? The answer is handbags: all different sizes, but they are all needed and we (well, not me) will have more than one. I think Apple have got it right and I’m determined to have all three sizes.

“Connected Big Data”

I’m going to claim this term first before the rest of the tech world gets it.  We have seen big data growing in significance; like cloud computing in 2011, it’s a buzzword that means something to tech consultancy firms, but what does it really mean to our customers?  My view is that we are really starting to see the emergence of connected big data, i.e. a lot of our customers are starting to join the dots between silos of data and systems with a single purpose: to unify around their customer or consumer at the web layer.  The silos of data may be getting joined up globally or just with the sole purpose of providing unified information. We are seeing unification of product data, consumer data, CRM data, analytical data and general content, all of which require connected specialists with the ability to consult across all levels within an organisation. We typically see four core data hubs occurring within most global companies:

1) Product data and enriched product data

Here we are seeing platforms like SAP or Oracle working in collaboration with enrichment tools such as digital asset management platforms and content management platforms.

2) Customer and CRM data

Here we are seeing a unified customer view where we connect the customer with common data sources and join the dots between data silos, starting with a single identity for a customer: i.e. how do we create an identity passport and then map the user’s CRM footprint to that passport as they interact with your brand, whether via your website or in the social and mobile worlds.

3) Transaction Data and Analytics

We are seeing even more sophisticated data mining techniques with business intelligence technology fed back to the web layer to optimise user experience and customer engagement.

4) Centralised and localised content

Traditionally the home of the enterprise content management platforms.  However we are seeing architectures and approaches that challenge the dominance of these platforms.

So connected big data, compared with just big data, sees the joining of dots at a global level between systems with big data silos such as SAP, surfacing that data to the web layer and allowing web-layer users to contribute to and participate in that data, as opposed to just surfacing it for analytical purposes.

Social

Will the growth in social continue?

Yes, most definitely, but are we seeing increased demand to play in this area?  I believe we are seeing a plateau in innovation, which is starting to slow down social network innovation, and this means there is less opportunity for growth.  I think it will still be a major part of any digital agency’s portfolio, but I think there is a level of maturity beginning to emerge.  Saying this, I think we are ahead of the curve and there is still significant motion in this sector, with lots of organisations now getting it and starting to put money into social for customer engagement, marketing and application.  So it will still be a big part of 2013 and certainly a time for agencies to capitalise on the hype.  However, innovation is required to lead the pack.  Facebook will continue to grow, but more slowly.  Verticalised social networks like Pinterest will also see growth, and common sharing standards and approaches will start to emerge, such as sign-on protocols like Facebook Connect.

The biggest advance in innovation will be context-sensitive social.  It will be used to drive likes, connect friends and customise information.  Organisations that develop good delivery platforms for this, from both a technology and a campaign perspective, will be able to take most advantage of this market as it starts to mature. Finally, watch out for revenue-driven innovation from both Twitter and Facebook; they have to do something in 2013.

Will Windows 8 make an impact?

Is Microsoft really on the decline? Is Apple’s position now dominant for the next 10 years? As we eagerly await the launch of Windows 8, are we really expecting a fundamental relaunch of good old Microsoft?  Well, Microsoft are expecting to spend big to push this one out.  I think it will start to make an impact on businesses with I.T. departments still hanging onto a level of control and still trying to hold off the day when they can no longer resist their staff enjoying a dose of Apple. However it certainly won’t be a revolution like the one we saw in ’95. We will see interest regenerated by some new sexy devices, even some nicely designed tablet hybrids that will catch on.  But don’t expect too much.

What it will do is start to standardise the tablet and touch.  I know Apple, Samsung and practically every other vendor support gesture and touch capability and have been doing so for some time, but Microsoft coming online will mean we start to see the standard fully adopted. App navigation and web browsing will need to put touch at the core of their design from now on. This will mean innovation in design, HTML5 and our favourite, JavaScript.

Open Source at the Server

Open source is continuing to grow and it’s now proving itself with organisations and in big projects.  One winner in this is Drupal, which is rapidly becoming enterprise capable.  This will continue, and Drupal could start to rival some big content management vendors. Watch this one closely for developments. We certainly are.

Content Management Platforms

This leads me on to platforms, and CMS platforms in particular.  Our favourite has been Tridion for many years, and we feel it will be a push for Tridion to be knocked from its well-earned position as enterprise content management leader. However, the ones to watch and not ignore in 2013 are:

  • Sitecore
  • Drupal
  • Adobe CQ
  • EpiServer

Amaze are developing strategies and approaches to ensure our customers can pick wisely.

E-Commerce Platforms

E-commerce continues to grow and grow.  Choice of platform remains critical, and it’s becoming an increasing requirement as businesses start to look at second-generation commerce capability.  Obviously your size will influence your choice of platform, but if you are in the enterprise bracket, i.e. you are operating a global platform or have substantial business being run through e-commerce, then we would still recommend Hybris as number one.  Why? Because it is more complete than any of the other big vendors: you get more to start with, for example a product information platform, e-commerce accelerators, customer support, content management and mobile.  This leads to less complex software integration programmes and less risk.  It also means you can get to market quicker. We feel it’s still number one and one to embrace in 2013.

Content Delivery Networks

This is a big theme for infrastructure going into 2013.  There are still a lot of companies that have not even looked into CDN, let alone deployed one.  With content increasing and globalisation becoming an ever bigger factor, content delivery networks are becoming a necessity.  There are only a few key players that do this well:

  • Akamai
  • MaxCDN
  • Edgecast
  • Amazon
  • Rackspace

So what happened to…?

Finally, I just wanted to wrap up by revisiting last year’s buzzwords and trends that I have not mentioned here.

HTML5 – adoption continues.  It is still not a ratified standard, but the industry is ploughing ahead. It must continue to be embraced.

Cloud – interest and adoption are growing more slowly than the hype; however, it’s here to stay and will continue to be a big factor in any big infrastructure project.  Microsoft Azure has been slowest to succeed whilst Amazon is trailblazing.  What we are seeing is growth in vendors starting to provide on-demand services alongside their traditional software licence models.  This is good news for the industry.

Cloud Computing

Cloud computing has certainly gained momentum over the last 12 months.  It has no doubt struck a chord with cash-strapped businesses.  But our view is that cloud computing sits too low down the software stack and is predominantly concerned with virtualising platforms.  As more and more businesses compete in this space we see the value of cloud computing moving up the stack and unleashing its service-orientated flexibility on the domain of traditional software-as-a-service vendors.  Confusing?  They both operate in the cloud, but for pure software-as-a-service vendors to add even more benefit to businesses they need to reach into organisations.  Clouds will grow tentacles into businesses as the membrane between traditional I.T. systems and the cloud gets thinner.  What does this mean in real terms?

Data will become the platform, but it will extend with its applications into the cloud.  As with the web, the data will reside in the cloud along with an app store approach to enterprise applications.  Internal private clouds and infrastructure will become agents to the cloud.  The cloud concept will move up the value chain, where cost savings are only half the reason why people will make the move.

So are you interested in how Facebook scales?

If, like me, you are interested in scaling big platforms, we’ve been doing some research into exactly how the Facebook techies scale their infrastructure. Some useful techniques if you are using the LAMP technology stack. Interesting reading…

Facebook’s scaling challenge

Before we get into the details, here are a few factoids to give you an idea of the scaling challenge that Facebook has to deal with:

  • Facebook serves 570 billion page views per month (according to Google Ad Planner).
  • There are more photos on Facebook than all other photo sites combined (including sites like Flickr).
  • More than 3 billion photos are uploaded every month.
  • Facebook’s systems serve 1.2 million photos per second. This doesn’t include the images served by Facebook’s CDN.
  • More than 25 billion pieces of content (status updates, comments, etc) are shared every month.
  • Facebook has more than 30,000 servers (and this number is from last year!)

Software that helps Facebook scale

In some ways Facebook is still a LAMP site (kind of), but it has had to change and extend its operation to incorporate a lot of other elements and services, and modify the approach to existing ones.

For example:

  • Facebook still uses PHP, but it has built a compiler for it so it can be turned into native code on its web servers, thus boosting performance.
  • Facebook uses Linux, but has optimized it for its own purposes (especially in terms of network throughput).
  • Facebook uses MySQL, but primarily as a key-value persistent storage, moving joins and logic onto the web servers since optimizations are easier to perform there (on the “other side” of the Memcached layer).
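To make that last point concrete, here is a minimal Python sketch of the key-value-plus-application-side-join pattern. It is an illustration only, with made-up data; in reality each lookup would be a primary-key fetch against MySQL, usually via the Memcached layer described below.

# Plain dictionaries stand in for the key-value store.
users = {
    1: {"id": 1, "name": "Alice", "friend_ids": [2, 3]},
    2: {"id": 2, "name": "Bob", "friend_ids": [1]},
    3: {"id": 3, "name": "Carol", "friend_ids": [1]},
}

def get_user(user_id):
    # A single key lookup is the only query shape the storage layer sees.
    return users.get(user_id)

def get_friends(user_id):
    # The "join" happens here on the web tier, not in the database.
    user = get_user(user_id)
    return [get_user(fid) for fid in user["friend_ids"]] if user else []

print([f["name"] for f in get_friends(1)])  # ['Bob', 'Carol']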

Then there are the custom-written systems, like Haystack, a highly scalable object store used to serve Facebook’s immense amount of photos, or Scribe, a logging system that can operate at the scale of Facebook (which is far from trivial).

But enough of that. Let’s present (some of) the software that Facebook uses to provide us all with the world’s largest social network site.

Memcached

Memcached is by now one of the most famous pieces of software on the internet. It’s a distributed memory caching system which Facebook (and a ton of other sites) use as a caching layer between the web servers and MySQL servers (since database access is relatively slow). Through the years, Facebook has made a ton of optimizations to Memcached and the surrounding software (like optimizing the network stack).

Facebook runs thousands of Memcached servers with tens of terabytes of cached data at any one point in time. It is likely the world’s largest Memcached installation.
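As a rough idea of what that caching layer looks like from the application side, here is a small cache-aside sketch in Python. It assumes the pymemcache client library and a Memcached instance on localhost; the user-loading function is a placeholder for the real MySQL query.

import json
from pymemcache.client.base import Client  # assumes the pymemcache package is installed

cache = Client(("localhost", 11211))       # assumes a local memcached instance

def load_user_from_db(user_id):
    # Placeholder for the slow database query the cache is protecting.
    return {"id": user_id, "name": "Alice"}

def get_user(user_id, ttl=60):
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:                 # cache hit: skip the database entirely
        return json.loads(cached)
    user = load_user_from_db(user_id)      # cache miss: fall through to the database
    cache.set(key, json.dumps(user), expire=ttl)
    return user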

HipHop for PHP

PHP, being a scripting language, is relatively slow when compared to code that runs natively on a server. HipHop converts PHP into C++ code which can then be compiled for better performance. This has allowed Facebook to get much more out of its web servers since Facebook relies heavily on PHP to serve content.

A small team of engineers (initially just three of them) at Facebook spent 18 months developing HipHop, and it is now live in production.

Haystack

Haystack is Facebook’s high-performance photo storage/retrieval system (strictly speaking, Haystack is an object store, so it doesn’t necessarily have to store photos). It has a ton of work to do; there are more than 20 billion uploaded photos on Facebook, and each one is saved in four different resolutions, resulting in more than 80 billion photos.

And it’s not just about being able to handle billions of photos, performance is critical. As we mentioned previously, Facebook serves around 1.2 million photos per second, a number which doesn’t include images served by Facebook’s CDN. That’s a staggering number.
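The core trick is worth illustrating. Rather than storing each photo as a separate file (which costs filesystem metadata lookups), Haystack appends photos to very large files and keeps a small in-memory index of where each one lives, so a read is one seek. The toy Python sketch below shows only that idea; the file name and structure are invented, and the real system adds checksums, replication and compaction.

import os

class TinyHaystack:
    def __init__(self, path="haystack.dat"):
        self.path = path
        self.index = {}                    # photo_id -> (offset, size)
        open(self.path, "ab").close()      # make sure the store file exists

    def put(self, photo_id, data):
        with open(self.path, "ab") as f:
            f.seek(0, os.SEEK_END)         # append to the end of the big file
            offset = f.tell()
            f.write(data)
        self.index[photo_id] = (offset, len(data))

    def get(self, photo_id):
        offset, size = self.index[photo_id]
        with open(self.path, "rb") as f:
            f.seek(offset)                 # one seek, no per-photo metadata lookup
            return f.read(size)

store = TinyHaystack()
store.put("p1", b"...photo bytes...")
assert store.get("p1") == b"...photo bytes..."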

BigPipe

BigPipe is a dynamic web page serving system that Facebook has developed. Facebook uses it to serve each web page in sections (called “pagelets”) for optimal performance.

For example, the chat window is retrieved separately, the news feed is retrieved separately, and so on. These pagelets can be retrieved in parallel, which is where the performance gain comes in, and it also gives users a site that works even if some part of it would be deactivated or broken.
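If you want a feel for how that works, here is a rough Python sketch of the pagelet idea: render independent sections in parallel and flush each one as soon as it is ready, rather than waiting for the slowest. The pagelet names and timings are made up.

import time
from concurrent.futures import ThreadPoolExecutor, as_completed

def render_pagelet(name, seconds):
    time.sleep(seconds)                    # stand-in for a backend call
    return name, "<div id='%s'>...</div>" % name

pagelets = {"chat": 0.3, "news_feed": 0.5, "ads": 0.1}

with ThreadPoolExecutor() as pool:
    futures = [pool.submit(render_pagelet, n, s) for n, s in pagelets.items()]
    for future in as_completed(futures):   # flush each section as it completes
        name, html = future.result()
        print("flush", name, html)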

Cassandra

Cassandra is a distributed storage system with no single point of failure. It’s one of the poster children for the NoSQL movement and has been made open source (it’s even become an Apache project). Facebook uses it for its Inbox search.

Other than Facebook, a number of other services use it, for example Digg. We’re even considering some uses for it here at Pingdom.
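For a flavour of how an application talks to Cassandra, here is a short sketch using the DataStax Python driver. It assumes a locally reachable Cassandra node, and the keyspace, table and columns are hypothetical, invented to mirror the inbox-search use case.

from cassandra.cluster import Cluster      # assumes the cassandra-driver package

cluster = Cluster(["127.0.0.1"])           # assumes a local Cassandra node
session = cluster.connect("mail")          # hypothetical keyspace

# Hypothetical inbox-search style query: messages for a user matching a term.
rows = session.execute(
    "SELECT message_id, subject FROM inbox_index WHERE user_id = %s AND term = %s",
    (42, "holiday"),
)
for row in rows:
    print(row.message_id, row.subject)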

Scribe

Scribe is a flexible logging system that Facebook uses for a multitude of purposes internally. It’s been built to be able to handle logging at the scale of Facebook, and automatically handles new logging categories as they show up (Facebook has hundreds).
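Scribe itself is a distributed service, but the category idea is easy to picture. This small Python analogue uses the standard logging module and creates a logger for each category the first time it is seen, so new categories are handled automatically; it is an illustration, not how Scribe is implemented.

import logging

_loggers = {}

def log(category, message):
    if category not in _loggers:           # a new category is created on demand
        logger = logging.getLogger(category)
        logger.setLevel(logging.INFO)
        logger.addHandler(logging.FileHandler("%s.log" % category))
        _loggers[category] = logger
    _loggers[category].info(message)

log("payments", "charge ok")               # categories are just strings
log("search", "query took 12ms")           # a brand-new category works immediately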

Hadoop and Hive

Hadoop is an open source map-reduce implementation that makes it possible to perform calculations on massive amounts of data. Facebook uses this for data analysis (and as we all know, Facebook has massive amounts of data). Hive originated from within Facebook, and makes it possible to use SQL queries against Hadoop, making it easier for non-programmers to use.
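To show the programming model Hadoop provides, here is a toy map-reduce word count in Python. It runs in a single process; Hadoop’s value is running the same shape of job across thousands of machines, and Hive generates this kind of job from SQL.

from collections import defaultdict

def map_phase(lines):
    for line in lines:
        for word in line.split():
            yield word.lower(), 1          # emit (key, value) pairs

def reduce_phase(pairs):
    counts = defaultdict(int)
    for word, n in pairs:                  # group by key, then sum the values
        counts[word] += n
    return dict(counts)

lines = ["the quick brown fox", "the lazy dog"]
print(reduce_phase(map_phase(lines)))      # {'the': 2, 'quick': 1, ...}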

Both Hadoop and Hive are open source (Apache projects) and are used by a number of big services, for example Yahoo and Twitter.

Thrift

Facebook uses several different languages for its different services. PHP is used for the front-end, Erlang is used for Chat, Java and C++ are also used in several places (and perhaps other languages as well). Thrift is an internally developed cross-language framework that ties all of these different languages together, making it possible for them to talk to each other. This has made it much easier for Facebook to keep up its cross-language development.

Facebook has made Thrift open source and support for even more languages has been added.

Varnish

Varnish is an HTTP accelerator which can act as a load balancer and also cache content which can then be served lightning-fast.

Facebook uses Varnish to serve photos and profile pictures, handling billions of requests every day. Like almost everything Facebook uses, Varnish is open source.

Other things that help Facebook run smoothly

We have mentioned some of the software that makes up Facebook’s system(s) and helps the service scale properly. But handling such a large system is a complex task, so we thought we would list a few more things that Facebook does to keep its service running smoothly.

Gradual releases and dark launches

Facebook has a system they called Gatekeeper that lets them run different code for different sets of users (it basically introduces different conditions in the code base). This lets Facebook do gradual releases of new features, A/B testing, activate certain features only for Facebook employees, etc.

Gatekeeper also lets Facebook do something called “dark launches”, which is to activate elements of a certain feature behind the scenes before it goes live (without users noticing since there will be no corresponding UI elements). This acts as a real-world stress test and helps expose bottlenecks and other problem areas before a feature is officially launched. Dark launches are usually done two weeks before the actual launch.
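The mechanics behind this kind of gating are simple to sketch. The Python below is not Facebook’s Gatekeeper, just an illustration of the idea: hash each user into a stable bucket so a feature can go to employees first and then to a growing percentage of everyone else. The IDs and thresholds are made up.

import hashlib

EMPLOYEE_IDS = {7, 42}                     # hypothetical internal accounts
ROLLOUT_PERCENT = {"new_chat": 10}         # feature -> % of users who should see it

def bucket(feature, user_id):
    # Deterministically map a (feature, user) pair to a 0-99 bucket.
    digest = hashlib.md5(("%s:%s" % (feature, user_id)).encode()).hexdigest()
    return int(digest, 16) % 100

def is_enabled(feature, user_id):
    if user_id in EMPLOYEE_IDS:            # employees always get the feature first
        return True
    return bucket(feature, user_id) < ROLLOUT_PERCENT.get(feature, 0)

print(is_enabled("new_chat", 7))           # True (employee)
print(is_enabled("new_chat", 1234))        # stable True/False for this user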

Profiling of the live system

Facebook carefully monitors its systems (something we here at Pingdom of course approve of), and interestingly enough it also monitors the performance of every single PHP function in the live production environment. This profiling of the live PHP environment is done using an open source tool called XHProf.
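XHProf is a PHP extension, so the sketch below is only a rough Python analogue of the idea: measure every call to a function in production and aggregate the numbers, rather than profiling on a developer machine.

import time
from collections import defaultdict
from functools import wraps

call_stats = defaultdict(lambda: {"calls": 0, "total_s": 0.0})

def profiled(fn):
    @wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:                           # record the call even if it raised
            stats = call_stats[fn.__name__]
            stats["calls"] += 1
            stats["total_s"] += time.perf_counter() - start
    return wrapper

@profiled
def render_profile_page():
    time.sleep(0.01)                       # stand-in for real work

render_profile_page()
print(dict(call_stats))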

Gradual feature disabling for added performance

If Facebook runs into performance issues, there are a large number of levers that let them gradually disable less important features to boost performance of Facebook’s core features.

The things we didn’t mention

We didn’t go much into the hardware side in this article, but of course that is also an important aspect when it comes to scalability. For example, like many other big sites, Facebook uses a CDN to help serve static content. And then of course there is the huge data center Facebook is building in Oregon to help it scale out with even more servers.

And aside from what we have already mentioned, there is of course a ton of other software involved. However, we hope we were able to highlight some of the more interesting choices Facebook has made.

Facebook’s love affair with open source

We can’t complete this article without mentioning how much Facebook likes open source. Or perhaps we should say, “loves”.

Not only is Facebook using (and contributing to) open source software such as Linux, Memcached, MySQL, Hadoop, and many others, it has also made much of its internally developed software available as open source.

Examples of open source projects that originated from inside Facebook include HipHop, Cassandra, Thrift and Scribe. Facebook has also open-sourced Tornado, a high-performance web server framework developed by the team behind FriendFeed (which Facebook bought in August 2009).

(A list of open source software that Facebook is involved with can be found on Facebook’s Open Source page.)

More scaling challenges to come

Facebook has been growing at an incredible pace. Its user base is increasing almost exponentially and is now close to half a billion active users, and who knows what it will be by the end of the year. The site seems to be growing with about 100 million users every six months or so.

Facebook even has a dedicated “growth team” that constantly tries to figure out how to make people use and interact with the site even more.

This rapid growth means that Facebook will keep running into various performance bottlenecks as it’s challenged by more and more page views, searches, uploaded images, status messages, and all the other ways that Facebook users interact with the site and each other.

But this is just a fact of life for a service like Facebook. Facebook’s engineers will keep iterating and coming up with new ways to scale (it’s not just about adding more servers). For example, Facebook’s photo storage system has already been completely rewritten several times as the site has grown.

So, we’ll see what the engineers at Facebook come up with next. We bet it’s something interesting. After all, they are scaling a mountain that most of us can only dream of; a site with more users than most countries. When you do that, you better get creative.

Designing a global e-commerce platform to overcome network latency

I’ve been working on an efficient network latency model for a global e-commerce solution. Having designed a few major global platforms, I know there are a number of factors that need to be considered. Misconceptions around content delivery networks, which make it easy for an organisation to say “we can just apply Akamai or Amazon Web Services to solve the problem”, often mislead potential customers. Whilst Akamai and Amazon do solve content delivery problems, particularly for services that rely on heavy media types, they do not solve the problem for transactional websites. For this we need to turn to network design, the positioning of data centres, the peering of networks and the routing of traffic through key undersea cables.

Whilst the latter can often be out of your control, choosing the right data centre partner in the right location means you can solve the problem relatively easily.

So what do we need to look out for?

Geography of the data centre

Design the solution to cater for your largest populations of users. For example, transoceanic links between Europe and the East Coast of the US are very strong and reliable, and West Coast US to Tokyo is equally reliable. So if your user populations mainly reside in these geographies then we can accurately select a good location for the data centre.
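A rough rule of thumb makes the point. Light in fibre travels at roughly 200,000 km/s, so distance alone puts a floor under the round-trip time before the server does any work. The distances in this Python sketch are approximate great-circle figures; real cable routes are longer and add routing delay on top.

FIBRE_KM_PER_S = 200_000

def min_rtt_ms(distance_km):
    # Two traversals of the link (there and back), converted to milliseconds.
    return 2 * distance_km / FIBRE_KM_PER_S * 1000

routes = {"London to New York": 5570,
          "San Francisco to Tokyo": 8280,
          "London to Sydney": 17000}
for route, km in routes.items():
    print("%s: ~%.0f ms minimum round trip" % (route, min_rtt_ms(km)))

A transactional page usually needs several round trips (connection set-up, then the requests themselves), so those minimums get multiplied; that is why the geography decision matters far more for transactions than for static content.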

Data Centre Routing and Peering

We then need to factor data centre routing and network peering into the equation. If you are a solution provider this is often out of your control; however, again, choosing the right data centre organisation with the appropriate peering and routing arrangements can shorten the number of hops your traffic will need to take between the server and your user. To do this well we can look at the financial sector, particularly real-time volume trading, where big trading centres such as London, Tokyo and New York are peered by many global hosting organisations, all of which will have the links and relationships set up ready for you to choose.

Business Model

Then there’s your plan for global domination. You need to factor your business model into this. There is no point designing the most perfect global hosting platform if you do not have the business case in each market. We all know the safest and most efficient global transaction solution is to have servers located in each of your markets; however, this is often not viable and doesn’t play to the strengths of the internet. If we are talking billions of pounds or dollars of revenue then there is naturally a business case, but if you are starting out, a well-placed set of servers with a good hosting business will serve you adequately well and will leave you with dollars to spend making the shopping experience more rewarding.

And there is a place for CDN

And yes, Akamai can still help, but only with your rich product content. It will be useless on a transactional level. But really, transactions are often the smallest factor in your overall website design.

And finally there’s Australia

This is something that will challenge every network engineer. But links are getting better.