Always New Mistakes

December 5, 2010

Localization Wars: Facebook vs Foursquare

Filed under: Technology | Alex Barrera @ 7:17 pm

Some months ago, Facebook unveiled its location strategy, aka Facebook Places. It comes as no surprise that their next move was getting into location. After buying Friendfeed and copying Twitter, it’s the turn of Foursquare and Gowalla.

Most of the discussion going around centers on the struggle between Facebook and Foursquare. Even though Facebook has been saying they’ve been working with Foursquare, it’s crystal clear that the new announcement has hurt Foursquare badly.

So, the big question: will Facebook outflank Foursquare? I’m afraid the odds of Foursquare winning this battle are rather slim. The key driver in the location space is the user base of these systems. The more people checking in and leaving geotags, the more useful the system becomes. In that respect, Facebook just dwarfs Foursquare.

Not only that, but to use those systems you need a specific app. The problem is that most people already have the Facebook app. Getting into Foursquare means going through the extra hassle of downloading and configuring a new app, while the Facebook app not only has a much larger audience, but also comes preinstalled on most new smartphones.

Now, what could Foursquare do to fight back? It’s clear they really need to differentiate themselves from Facebook. They need something that you can only get on Foursquare and not on Facebook. I’m not sure that’s something they’ll be capable of doing if Facebook plays its cards right. Any new feature that gets any traction can be easily replicated by Facebook. The only possible way is to build something so different from the Facebook genome that they won’t be able to replicate it because it goes against the Facebook strategy.

For example, even though Facebook tried to transform the news feed into the Twitter feed, the Twitter audience kept using Twitter. The main reason is how different the follow/follower dynamics are from the friend/no friend dynamics of Facebook. That simple thing is what really prevents Facebook from crushing Twitter. Facebook can’t easily change the way they deal with social relations because that would mean changing the core dynamics of the company. Something like that is what Foursquare should pull out of their sleeve if they want to stay alive.

Many people have been saying that what the location space needs is a shared geolocation database, maintained by several companies, so that each company could focus on developing cool things on top of it. With Facebook on the loose, they’re saying that Facebook could become the maintainer of that database. The problem is that, while the original hypothesis made sense, the Facebook one is extremely dangerous. The first one assumed several guardians; the second one assumes just one, Facebook. Needless to say, having Facebook control all the location data is like handing over the keys to your kingdom. Data is key here. Even though maintaining it is a tedious task, it’s critical for building the features that allow you to differentiate yourself from Facebook.

Let the geowars begin!! Any comments? Insights? Should Foursquare have planned for this attack beforehand? What do you think?

Images: cnet, jamesnorquay.

July 20, 2009

Scalability issues for dummies

Filed under: Business, Technology | Alex Barrera @ 2:34 pm

Every once in a while people ask me what’s taking me so long to open my startup, Inkzee, to the public. They also ask what exactly I have been doing, since the site looks exactly the same. I normally answer that things aren’t easy and that it takes time, especially if you are alone, like I am. After a while I end up explaining my problems with scalability, and that’s the point where people just can’t follow me. So I’m going to explain here what scalability problems are and how deep their repercussions are for a small company.

Most web applications, like Inkzee, Facebook, Twitter, … are made of two parts, which we, the tech nerds, call the frontend and the backend. The frontend is the part of the application that’s exposed to the users: the user interface (UI), the emails, the information that is shown. All that UI is a mix of different languages, be it PHP, JavaScript, HTML, etc. The frontend is in charge of drawing the UI on the user’s screen and displaying all the information the user expects from the application. But this information has to come from somewhere. Well, that’s the backend.

The backend is all the programs and software applications that run behind the scenes and are in charge of generating, maintaining and delivering the information the frontend displays to the user. The backend can be very homogeneous or very heterogeneous, but it normally consists of two parts: the database (where the information and data are stored) and the software that deals with that database, does the data crunching and connects it all to the frontend.

Now, some web applications have a bare-bones backend, very simple and lightweight: normally some software that takes what the user inputs on the interface and stores it in the database and, vice versa, retrieves it from the database and shows it to the user. Other web applications have an extremely complex backend (e.g. Twitter, Facebook, …). These not only manage data retrieval, but also have to do really complex operations on the data. And not only complex, but very expensive in terms of computational power. For example, each time a user uploads a picture to Facebook, it follows a path like this:

  • The picture is stored in a specific hard drive. The backend has to determine which hard drive corresponds to that user (yes, there are multiple hard drives and each one is assigned to a bunch of users so the load is distributed).
  • Once stored, the picture is sent to a processing queue where it will be turned into a thumbnail by image processing software. This process is expensive, as it has to analyze the picture and reduce it to a smaller representation of the image while still maintaining part of its quality.
  • After processing it, the backend stores the newly created thumbnail in the database and stores both the picture and the thumbnail in an intermediate “database” in memory for faster access (a cache). This is because it’s faster to retrieve data from memory than from a hard drive.
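To make those three steps a bit more concrete, here is a minimal sketch of that path in Python. It is only an illustration under my own assumptions, not how Facebook actually does it: the drives, the queue, the database and the cache are plain in-memory structures, `NUM_DRIVES` and all the function names are made up, and the thumbnail step is a placeholder that just keeps the first bytes instead of doing real image processing.

```python
import hashlib
from collections import deque

NUM_DRIVES = 4  # hypothetical number of storage drives

drives = [dict() for _ in range(NUM_DRIVES)]  # stand-ins for the hard drives
thumbnail_queue = deque()                     # the processing queue
database = {}                                 # where thumbnails are stored
cache = {}                                    # in-memory cache for fast reads


def drive_for_user(user_id: str) -> int:
    """Map a user to one drive so the load is spread across all drives."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_DRIVES


def make_thumbnail(picture: bytes) -> bytes:
    """Placeholder for the expensive step; a real system would resize here."""
    return picture[:16]


def upload(user_id: str, name: str, picture: bytes) -> None:
    """Step 1: store the picture on the user's drive, then queue it."""
    drives[drive_for_user(user_id)][(user_id, name)] = picture
    thumbnail_queue.append((user_id, name))


def process_queue() -> None:
    """Steps 2 and 3: build thumbnails, store them and fill the cache."""
    while thumbnail_queue:
        user_id, name = thumbnail_queue.popleft()
        picture = drives[drive_for_user(user_id)][(user_id, name)]
        thumbnail = make_thumbnail(picture)
        database[(user_id, name)] = thumbnail
        cache[(user_id, name)] = (picture, thumbnail)
```

Calling `upload("alice", "beach.jpg", data)` only stores the picture and queues the work. The expensive part happens later in `process_queue()`, which is why the queue can pile up when lots of users upload at once.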

This is an approximation of what a picture goes through when you upload it to a social network. I’m pretty sure it goes through a lot more steps, though. So, suppose 1% of a social network’s users are uploading pics at any given moment. Facebook has around 250 million users currently, so imagine 2.5 million users at the same time, each uploading ~20 photos. That’s around 50 million photos in flight. Trust me when I tell you, that’s a lot of data crunching.
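The back-of-the-envelope numbers can be checked in a couple of lines. The 1% and the ~20 photos per user are my own guesses from the paragraph above, not measured figures, and the function name is just for illustration.

```python
def photos_in_flight(total_users: int, uploading_percent: int,
                     photos_per_user: int) -> tuple[int, int]:
    """Return (users uploading at once, photos being uploaded at once)."""
    uploaders = total_users * uploading_percent // 100
    return uploaders, uploaders * photos_per_user


# Assumed inputs: ~250 million users, 1% uploading, ~20 photos each.
uploaders, photos = photos_in_flight(250_000_000, 1, 20)
print(f"{uploaders:,} users uploading {photos:,} photos")
```

Every one of those photos also needs its thumbnail generated, so the processing queue gets the same number of jobs.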

The problem

The best user interfaces (frontend) are designed so that all the complexity that goes on behind the scenes is never shown to the end user. The problem is that the frontend depends heavily on the backend. If the backend is slow, the frontend won’t have the info the user is requesting or expecting, and it will seem SLOW to the end user. Not only slow, but in many cases inefficient or just not available at all (meet the Twitter Fail Whale :P).

So, now, what will cause the backend to be slow? Ohhhhhh don’t get me started!! There are so many reasons why the backend might be slow or broken! But most of them are triggered by growth. That is, as the web application is used by more and more users, the backend starts to fall apart. That’s what is known in the tech world as scalability problems: the backend can’t scale at the same speed as users pour into the application. And it’s not only a matter of more users, but of users that interact more heavily with the site. For example, you might have 100,000 active users and never have experienced big scalability problems. Suddenly you release a feature that allows your users to share pictures more easily… BAM!! Your backend goes down in 10 minutes. Why!! Why?!! you might scream while you watch your servers go down in flames. After all, you have the same number of users, so what happened? Well, most probably the part of your backend that handles picture sharing was designed and tested with only a few users. Now it chokes on the real load.

The REAL problem

Once you have scalability problems, the next logical step is to find where the bottleneck is and why it is happening. This, which might seem very easy, isn’t at all. It’s like looking for a needle in a haystack. Big backends are normally VERY complex, with many parts coded in different programming languages by different people. Not only that, but sometimes problems arise in several parts of the backend at once. So after a couple of really stressful hours you find the bottlenecks and think of a solution to fix them. Ahh my friend, then you realize it’s not as easy to fix as you thought. First of all, you have no clue if the fixes your team has come up with are good enough. Why? Because you’re stepping into unexplored territory. Few people have had to tackle a similar problem and even fewer have dealt with your data and systems. So even if you find someone else with the same problem, the solution might be slightly different depending on what systems you use for your backend or which architecture you have. This is the point where you realize that developers aren’t engineers but craftsmen, and that fixing these problems isn’t exactly a science but black voodoo magic.

So, here you are, with a bunch of possible fixes to a problem but with no clue whether they will really work or will just be a patch that needs extra fixes in 2 weeks. Normally you try to benchmark the solutions, but that’s not an easy task, especially because you have no real load to test against except on your production servers and no, you don’t want to fuck the production servers up more than they already are.

Finally, after some black magic and some simple tests, you cross your fingers and try the fix on the production servers. After several hours of monitoring the backend for new “leaks”, you scream with happiness as the patch seems to work. Then you start to realize that the patch won’t hold forever and that you need a more radical solution to the problem.

You sit down with your tech team (or on your own, as is my case 😦 ) and you start drafting a new solution. Suddenly you realize that the best fix implies changing the way your backend works. And by change I mean you need to redevelop a big chunk of your backend to fix the problem. This implies a couple of things: you’ll need to invest a lot of time and resources, you’ll lose the stability your backend had (prior to the incident), you’ll walk into new unexplored territory for your team and, worst of all, you can’t just unplug your production servers and change the backend. You need to do it so that both backends coexist for a while, until you have switched all of your servers from the old one to the new one.
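One common way to let two backends coexist is to send a growing share of traffic to the new one. The sketch below is just that idea under my own assumptions, not a description of what I or anyone else actually runs: it routes per user rather than per server, `old_backend` and `new_backend` are hypothetical stand-ins, and `rollout_percent` is a made-up knob you would turn up slowly from 0 to 100.

```python
import hashlib


def bucket(user_id: str) -> int:
    """Give each user a stable bucket from 0 to 99."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def old_backend(user_id: str) -> str:
    """Hypothetical stand-in for the existing backend."""
    return f"old:{user_id}"


def new_backend(user_id: str) -> str:
    """Hypothetical stand-in for the redesigned backend."""
    return f"new:{user_id}"


def handle_request(user_id: str, rollout_percent: int) -> str:
    """Send users whose bucket is below the rollout share to the new backend."""
    if bucket(user_id) < rollout_percent:
        return new_backend(user_id)
    return old_backend(user_id)
```

Because the bucket depends only on the user id, a user who has been moved to the new backend stays there as the percentage grows, and setting it back to 0 sends everybody to the old backend again if things go wrong.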

Now, the REAL problem is that this change, this new redesign, grinds the whole company to a halt. All resources, be it people or money, are invested in the redesign effort, so nothing new can be done. Most outsiders just don’t understand the depth of this change and will bash the company for not doing new things, for not releasing new features, for not fixing old bugs, etc. Not only that, investors will start to get anxious and will demand that things start moving. So, the outside world only sees that you’ve stalled, while the inside teams are suffering the pressure. On top of that, developers inside the company will get extremely frustrated by the pace of things. They won’t be able to add new features, and even when fixing bugs they’ll need to fix them twice, once in the old backend and once in the new one.

So, in the end, you realize the shit hit the fan and you got all of it. It’s hard, very hard to be there. If you haven’t experienced it you have no idea how hard it is. You’ll feel the pain not only as a developer but also as a founder, CEO or executive. You won’t be able to publicize your site because more load might accelerate the old backend’s problems, you can’t give users new features because you have no resources, and you will try to explain the problem to investors but they won’t have a clue what you’re talking about… “backend what?”. Current customers will be pissed at you because the site is running slow and, as far as they can see, you are doing nothing to fix it. So, in the end, everything freezes until the new backend is in place.

How long does this take? It depends. It depends on the size of the redesign, the size of the tech team, the skills of the team and, especially, the skills of the management. During this phase, management must execute impeccably. Sadly, this is not the case in most places, and so priorities are changed, mistakes are made and the redesign gets delayed over and over again.

It takes very good leadership to make it through this period. Someone who knows where their priorities lie and who is able to foresee the future and the importance of the task ahead. Needless to say, such a figure is lacking in most companies. That’s the reason it took so long for Twitter to get their act together, for Facebook to speed up, etc.

I am there, I am suffering the redesign phase (twice now). It’s hard, it’s lonely, it’s discouraging and frustrating, but it needs to be done. I just wrote this post so that outsiders can get a glimpse of what it is like to be there and how it affects the whole company, not just the tech department. Scalability problems aren’t something you can dismiss as being ONLY technical. Their roots might be technical, but their effects will shake the whole company.

Let there be light 🙂

January 30, 2008

Because every word counts: Twitter experiences

Filed under: Business, Technology | Alex Barrera @ 3:47 pm

Recently I started using Twitter. I must confess I wasn’t very fond of it. I just didn’t understand what use I could get out of it. Even though I’m still not a great fan of the service, I have to admit that it gives me some value. Many people try to describe Twitter, and most of them end up saying that it’s like a chat (irc, icq, etc.). My own definition would be that “Twitter is a slow motion chat where you get to decide who talks in it”. The key and really interesting part is the decision of who talks in the chat. For me that’s a huge difference between irc and Twitter.

From a business perspective I use Facebook to see what key people in my industry are doing. I can monitor which events they are going to, with whom they are talking and what posted items they are sharing. Again, the good thing about Facebook is that I choose who I want to be friends with. Nevertheless, one of the differences between Facebook and Twitter is that on Facebook I always need the friendship to be approved, while on Twitter (except for protected accounts, which are rare) I can follow whoever I want.

As for the quality of the information, I must say that it’s just different. If you want to write about something and it’s long enough (say, more than 2 sentences) you’ll probably write it down in your blog. But if it’s just a link you want to share or an idea about something, you don’t have a tool to share it with a wide audience. Granted, you could write it as a blog post, but you risk burning out your readers with a high frequency of posts with very little content. So, that’s where Twitter comes into action. It allows you to post your short musings to a different kind of audience. Getting back to the quality of the information, the good part is that you get to choose high profile Twitter users that you think might say or share interesting things. Martin Varsavsky, Jeremiah Owyang or Mike Butcher are good examples of that. Again, if you don’t like someone’s content you can always “unfollow” them with no repercussions.

Finally, while reading a book by Ricardo Semler (Angel, thank you so much for the recommendation), I read a very good quote from Mark Twain: “I’m sorry I wrote a long letter, I didn’t have time to write a shorter one”. It holds an awfully big truth: it’s harder to write small but meaningful texts than big cluttered ones. So that got me thinking about Twitter and its repercussions on heavy users. How will a 140-character restriction transform their way of writing and even thinking? I suppose this is something we won’t see at first, but in the long run. I know that I’ve changed the way I listen to people. I’m so used to crawling through hundreds of blog posts a day that I look for the essence of things, and only if I like the essence will I read the whole post. This way of working is carrying over into my offline life. Now I always find myself telling people to cut the crap and get to the bottom line (I must say that people in general, and in Spain specifically, talk way too much and say way too little).

I also think that, in the same way bloggers evolve and the way they write posts changes with time (for the better, I hope), the same principle applies to Twitter. At first users just write about their lives, and then they start to shift away from that and towards more information-rich content (this doesn’t apply to everybody though).

In conclusion, Twitter covers a different niche than blogs or Facebook do and it targets a different audience. That being said, I recommend that people who consider themselves information junkies give it a try if they haven’t. You can follow me on my Twitter account, and hopefully I’ll start changing what I write there. Twitter should read: “What are you thinking?” instead of “What are you doing?”.

November 15, 2007

Are social networks putting us in danger?

Filed under: Security | Alex Barrera @ 2:51 pm

Today I would like to talk about an issue I’ve been ranting about lately. Are social networks putting our private and personal information in danger? I’ve been working for some time in the information security industry and I’ve seen many crazy things. Due to the recent popularity of social networks, we are beginning to see a shift in which information gets stolen. At first the bad guys targeted big company servers, but nowadays exploiting remote bugs in current operating systems is getting much harder (thanks to things like ASLR, non-executable stacks, grsec, etc.). That’s why the bad guys are focusing on hacking browsers and web applications. Each day we spend more and more time playing, working and using web applications, gradually increasing the time we are exposed to them. Given that the use of social networks is expanding at an incredible rate and that part of the experience consists of giving away our personal and private data, we have a ticking bomb on our hands.

So, we have motivation, we have interesting information to steal and, best of all, we have a huge community of web developers who lack the security knowledge to code secure and reliable web applications. Don’t get me wrong, it’s not that the developers don’t care. They do care, but they don’t know what to look for, they don’t know how an exploit works and, of course, they don’t have time to deal with it. For them it’s much more important to deal with scalability issues or SEO strategies. The problem is that, due to the growing popularity of social networks and things like Facebook apps or OpenSocial, these issues are gaining real weight. But you might think, why should I care? It’s not as if I’m exposing my credit card number, is it? False, you are exposing a wealth of information that is much more important: who we are, our hobbies, our sports, our political views, our friends, etc. If someone can steal that information, we could be easily impersonated at all levels. The consequences range from being victims of online scams, telephone scams and bank scams to being denied a job due to some piece of information floating around. It can even cost you business deals or strategic partners.

Now the facts. It took theharmonyguy 45 minutes to find a way to hack the RockYou OpenSocial application Emote. It took him 20 minutes to hack the iLike application on Ning. Today theharmonyguy announced that the Compare People application on Facebook leaks some private information to the AdSense network. Well, that is some scary stuff. Not only are we going to be data mined by Facebook, but we are also being targeted by AdSense at the same time. Last, but not least, we have the great MySpace hack of Alicia Keys’ profile. The exploit was rather trivial, not highly sophisticated, and quite easy to avoid in most cases. Worst of all, most social networks aren’t listening to the security experts who are pointing out other hacks, scams or flaws in their systems. On the other hand, it’s true that most application developers for social network platforms respond fast when a security flaw is found in their products. Why the actual social networks care less is beyond my understanding.

I don’t want to claim someone can eradicate all security bugs. They will always exist, for as long as we are human. What I want to point out is that most of the bugs come from lazy development. Right now there is much more at stake than there was two years ago, so guys, pull out your security hats and let’s hack some decent code, for the sake of all the “social networkers”.

UPDATE: As you can see, the guys from Compare People moved fast and responded promptly to this issue. You can see their comments below. As far as I know, Facebook applications shouldn’t be using parts of a user’s profile to feed AdSense, or at least they should alert you about it. I hope this gets straightened out pretty soon. Thanks again to Naval Ravikant for the comments and the fast response.

UPDATE2: VentureBeat has a statement from a Google spokesman: “We recently allowed some application partners to send us additional keywords to improve ad performance. A limited number of the keywords sent to Google did not comply with the developer’s agreement with Facebook. When we realized this conflict, we asked the partners to discontinue sending those keywords. We are no longer using those keywords. No personally identifiable information was exchanged between Google and the application developers”. VentureBeat does have a good point: is it going to take a whistleblowing blogger to identify security breaches? Is it going to be like this with OpenSocial? Let’s hope not. At least they’ve responded pretty fast to the issue.

November 7, 2007

Facebook’s nextgen ad platform analysis

Filed under: Business | Alex Barrera @ 2:21 am

Today, Facebook unveiled their new ad platform in New York. There is a great fuss around this and hundreds of blogs are posting about it. That’s why I won’t be talking about the actual system. For those interested in knowing how it works, I encourage you to read Owyang’s summary of it. Instead I’m going to try to analyze how, why and what can be done with the new system.

The social ad platform is structured around two ideas: brand awareness and friends’ trust. Some days ago I was discussing with a friend what this announcement really meant for Google. Would Google’s ad revenue be damaged by it? After reading today’s news I understand that Facebook is trying to build a brand awareness machine. This means that the objective of advertising on Facebook would be different from that of Google’s AdSense network (which is based on purchase intentions). Now, the question is, which one will bring more revenue to advertisers? Generally speaking, it’s harder to trace the effectiveness of brand awareness ads than that of Google ads, so will the investment pay off for marketers?

To solve this thorny problem, they are offering what they call Facebook Insights. Here is where things get funny: “Facebook Insights gives access to data on activity, fan demographics, ad performance and trends that better equip marketers to improve custom content on Facebook and adjust ad targeting”. Ok, let’s analyze each one:

Activity
At first I thought this was about ad performance, but it seems it’s different. Making a wild guess, I imagine they can track which pages you visit most (friends’ profiles, brand pages, group pages, etc.). They might also track your normal activity within your profile pages (page views, how much time you spend on each page, which sequence of pages you navigate most often, at which hours you are most active, in my case from 11am to 2pm for example, which parts of a page you give more attention to, etc.), so they know the best spots and time slots to feed you ads.

Fan demographics
Of course, data mining to the rescue. They are going to drill down into users’ profiles and retrieve all their information, including country, state, city or town, political views, relationship status, etc. Pretty scary, isn’t it? This is something I’ve been ranting about for some time now. People aren’t really aware of the value of their personal information or the wealth of information they put on the Internet. But most of the people who are screaming about this right now should read Facebook’s terms, as they clearly state that the information you pour into Facebook is theirs to use. I’m wondering what other interesting things they can retrieve from your profile. Let’s see: which networks you are linked to via your friends, which might tell them which type of friends you normally interact with. In my case, most of my friends are from Berkeley, so you can infer I get along quite well with people from Berkeley, or that I’m interested in Berkeley. My posted items can also be analyzed to see which items I like most. Of course, the Likeness application is a gold mine. They can extract (I suppose with consent from the Likeness developers) which friends are “more like me” and easily target them, or vice versa. The same can be applied to the Wall application (no consent needed here as it’s from Facebook).

Ad performance
How are they going to track this? I assume they’ll count all the hits on either a banner or the user’s news feed. Standard procedure here. It will be interesting to see which one gets a better hit ratio. Intuition tells us that the news feed will be the winner, but intuition isn’t always right. We’ll have to see some numbers. I haven’t seen any indication yet of price differences between banner and news feed ads, but I’m assuming they’ll be different.

Trends
I speculate they’ll show some nice graphs where you can see how the campaign is going. Brand tracking might fall under this category. Zuckerberg said in the press release that you would be able to track your brand through Facebook’s public forums. I wonder if this will extend in the future to personal walls or even inboxes. That’s a scary thought, even if the marketer won’t be able to track which wall or inbox the buzz came from.

Some wild guesses on the outcome of this new ad system. I think it really hits a sweet spot, but as some people have already said, it’s going to depend on the implementation and the way they roll it out. For example, they are creating a new niche for application developers that want to target business and brand profiles. I wonder if the interaction between Facebook members and business pages will make websites like getSatisfaction.com go away. I have some doubts about viral spam spreading. Facebook has been clear in their privacy policy about these new features: “Facebook users will only see Social Ads to the extent their friends are sharing information with them. […] In keeping with Facebook’s philosophy of user control, Facebook Beacon provides advanced privacy controls so Facebook users can decide whether to distribute specific actions from participating sites with their friends”. Now my question is, will I be able to change my news feed preferences to limit or filter spam noise? I currently have around 88 friends on Facebook, not too many, so I might bear with the noise, but what will Robert Scoble do with his 5000 friends (myself included)? Maybe he’ll finally thank Facebook for setting the 5000 friends cap. Another question that comes to mind is, will marketers be able to control the text that gets injected into the news feeds of someone’s friends? That could be very interesting, as personalized messages or specially crafted texts can make a big difference in marketing.

All comments are welcome. I want to know what people think about the future possibilities of the system, or even if they are thinking about using it. Any new ideas for more data mining on my Facebook profile?

UPDATE: I received the first spam message from Robert Scoble. He created a brand page for himself so you can join and be a fan of Scoble. It’s going to be very interesting to see how all this develops.

Image Credits: Wikipedia.org, Techcrunch.com, Blogpulse.com
