Always New Mistakes

June 18, 2008

Next generation search engines

Filed under: Technology. Posted by Alex Barrera @ 2:31 pm

I was reading Scoble’s post about Windows Live Search and I realized what the future of search is going to look like (or so I think). I realized that users don’t know how to express in writing what they are looking for. Most of the time, you type a couple of keywords that should, theoretically, yield some results from which you can pick the one you are looking for. Human-powered search engines like Mahalo have the same problem. They rely on human beings building pages with the most relevant information about a topic, but if you are looking for something less common you’ll run into trouble. Last but not least, semantic search engines like Powerset are closer to the goal, but there is still a big hurdle in the user’s way. How do you phrase, as a user, the information you are looking for? You need to type a phrase, but it’s not obvious what that phrase should be, which makes searching hard and slow.

Now, the big problem again is writing down what you are looking for in a way the search engine understands. How about another approach? How about a search engine that reads your mind so that it knows what you are really looking for? Most readers must have had a good laugh at that statement, but mind reading devices are a reality, with their own field of expertise called Brain-Machine Interfaces (BMI). Several gaming companies are already using these devices to let their players control virtual avatars with their minds.

And how do these devices work? Generally speaking, the device is a helmet that reads neuron impulses in several areas of your brain. In the gaming example, it reads the brain areas dedicated to movement, mapping neuron firing patterns to a specific movement in the game. This technology is still taking its first steps in the commercial arena, but I’m pretty sure we’ll see more and more devices working with it.

Now, is it a big stretch to say that we can use similar devices to read our search intentions? It is indeed; it’s something that is still out of reach. Not because of technology but because of a lack of neuroscientific data that can be used to pinpoint which brain areas we use when searching online. But it’s just a matter of time (I’m talking about 5 to 10 years here).

There are big problems with this type of search. You not only need a web index, but also a neuron firing pattern index and an engine that understands those patterns and translates them into a web search query. Another big issue is brain privacy. Your neuron firing patterns would need to be transmitted over the Internet and stored somewhere. That’s a source of major privacy concerns that should be addressed before using a search engine like this.

Nevertheless, and despite all the problems that might arise with an idea like this, I truly think we’ll someday see something like it, and I have to say it will be awesome. I don’t know if any company is currently investing in developing a mind-controlled search engine, but it would be a great project for a big company like Google, IBM or Microsoft.

Do you like the nextgen search engine? What problems do you see with it? Would you use something like that?


November 21, 2007

Powerlabs: An insight into Powerset’s technology

Filed under: Business, Natural Language Processing. Posted by Alex Barrera @ 7:42 pm

I finally received my invitation to Powerset’s Powerlabs website. I’ve been playing with it for a couple of weeks now and I’m quite impressed with some of the things they’ve accomplished. Powerlabs is an invitation-only community for beta testers, built around five demos (they just added a new one yesterday). Its main goal is to show Powerset’s technology (via the demos), and to discuss problems, questions or ideas related to either the technology or the web interface.

The web site has five main sections: Dashboard (like your home page), Demos, Discussions, Queries (wished-for queries by other members) and People (list and ranking of current members). You can basically break the website into two big parts, the demos and the discussion area (more on this later).

Dashboard
This section is the user’s homepage for Powerlabs. As you can see in the screenshot, here you can monitor your stats within the community (nº of discussions, nº of comments, global rank based on karma points, etc.), the latest Powerlabs news (“New Sports Demo” right now), a list of recently implemented ideas (with a link to the post where the idea was proposed and its author) and your news feed. The news feed is probably one of the best parts of Powerlabs. It’s quite similar to the one you find on Facebook and it basically keeps you updated on the latest activity related to your user and your submitted ideas.

[Screenshot: Dashboard]

Demos
This is one of the most important areas of the website. Here you can play with five demos that show you Powerset’s technology at work. I stress the word technology, because you won’t find a Natural Language Search demo here. So it’s not a demo of the product, it’s a demo of the algorithms they are using to build the product. The demos have two big restrictions: they use predefined queries (you can only fill in some words of a longer phrase) and they only work with the Wikipedia corpus (it seems they are trying to expand the corpus in the very near future). The demos are divided into several categories: sports, the arts, business, quotes and PowerMouse. The first four are the same demo; the only difference is in the queries you can ask.

[Screenshot: sports demo 1]

For example, for the sports demo you can ask some of the following questions:

  • What did X win?
  • What did X draft?
  • Who X (defeated or beat) X?

For the business demo, you can ask things like:

  • What does X own?
  • Who did X acquire?

[Screenshot: sports demo 2]

The PowerMouse demo is probably the most fun to play with. It lets “you examine how structured information is extracted from open text”. As they say, it’s not a search application per se, but it’s a window into how the results are obtained. When you start the demo you are asked to fill in the following structure: Something (subject) – Connection (verb) – Something (object). There are no restrictions on which three words to use. This demo will give you all the possible combinations of your query. When it can’t find the exact query it will use broader words to try to get what you were looking for. Take a look at the screenshot for more insights.

[Screenshot: PowerMouse Demo]
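To make the subject, connection, object idea more concrete, here is a minimal sketch in Python. It is not Powerset’s implementation, which I haven’t seen. The facts, the `BROADER` verb table and the `search` function are all invented for illustration, and I’m assuming that “broader words” means falling back to a more general verb when the exact one finds nothing.

```python
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    verb: str
    obj: str


# Toy facts standing in for triples extracted from text (invented for illustration).
FACTS = [
    Triple("google", "acquire", "youtube"),
    Triple("microsoft", "buy", "hotmail"),
    Triple("ebay", "own", "skype"),
]

# Hypothetical "broader word" table: each verb maps to a more general one.
BROADER = {"acquire": "obtain", "buy": "obtain", "own": "have"}


def matches(value: str, pattern: Optional[str]) -> bool:
    """A slot left as None acts as a wildcard."""
    return pattern is None or value == pattern


def search(subject: Optional[str] = None,
           verb: Optional[str] = None,
           obj: Optional[str] = None) -> list:
    """Return exact matches; if there are none, retry with the broader verb."""
    exact = [t for t in FACTS
             if matches(t.subject, subject)
             and matches(t.verb, verb)
             and matches(t.obj, obj)]
    if exact or verb is None:
        return exact
    # Broaden: compare the general form of the query verb with the
    # general form of each stored verb.
    broad = BROADER.get(verb, verb)
    return [t for t in FACTS
            if matches(t.subject, subject)
            and BROADER.get(t.verb, t.verb) == broad
            and matches(t.obj, obj)]
```

With these toy facts, `search("microsoft", "acquire")` finds nothing under the exact verb, so it falls back to the broader “obtain” and returns the “microsoft buy hotmail” triple.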

Discussions
This is the heart of the website. The discussion area is like a forum on steroids. It’s divided into various categories (wikipedia, query examples, labs, the arts, …). There is also a category where you’ll see all the posts, ordered either by date or by relevancy. Each post has its title, author, nº of views, votes and comments. It works in a similar way to Digg. Members read a post and vote if they like the idea. The more votes an idea gets, the higher it climbs on the relevancy list. For each post you can also set a flag that lets you follow its activity (you’ll get updates on your news feed). Each time a comment or an idea you’ve posted gets a vote, you’ll get a new entry on your news feed. Every time someone comments on either a comment you made or an idea you sent, you’ll get an entry on your news feed. There is even an RSS feed for the discussions, although the URL is hidden in one of the posts. One of the coolest features is that, before you post something, the system suggests similar ideas that have already been submitted. If someone has already posted a similar question or idea, you’ll know before you actually send yours, which avoids duplicate posts.

[Screenshots: Discussions, Idea 1, Idea 2]
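The two mechanisms described above, ordering by votes and suggesting similar ideas before you post, can be sketched in a few lines of Python. This is only a guess at how such a feature could work, not how Powerlabs does it. The function names, the word-overlap (Jaccard) similarity measure and the 0.4 threshold are my own assumptions.

```python
def by_relevancy(posts: list) -> list:
    """Order posts so that the most voted ideas come first."""
    return sorted(posts, key=lambda post: post["votes"], reverse=True)


def tokens(title: str) -> set:
    """Lowercased set of words in a title."""
    return set(title.lower().split())


def similar_ideas(new_title: str, existing: list, threshold: float = 0.4) -> list:
    """Return existing titles whose word overlap with the new title
    (Jaccard similarity) reaches the threshold, best match first."""
    new = tokens(new_title)
    scored = []
    for title in existing:
        old = tokens(title)
        union = new | old
        score = len(new & old) / len(union) if union else 0.0
        if score >= threshold:
            scored.append((score, title))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [title for _, title in scored]
```

A real system would probably use something smarter than raw word overlap (stemming, or Powerset’s own language technology), but the idea is the same: score the new title against the stored ones and show the close matches before the post is submitted.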

Queries
This section lets you browse some of the members’ wished-for queries. It lets you create new queries and comment on the ones already stored.

People
The site revolves around the idea of karma points. It’s similar to many other ranking/voting systems such as Slashdot, Digg or ycnews. You earn points for commenting, for using the demos, for posting an idea and for every vote your ideas or comments get. The People area lets you see which members are on the system. You can order them by karma rank, recent activity, number of ideas, etc. In the future some demos will require a certain level of karma, so it’s important to reach a good karma level.

[Screenshot: People]

My opinions? Well, I think Powerset has achieved a great goal: getting lots of testers involved. It’s true that the demos disappoint. I think most people expected a less rigid demo, but hey, at least they are showing that the technology isn’t vaporware. I love the approach of letting people participate at all stages of the product development. Few companies do that and it’s a breath of fresh air. The interface and user experience of Powerlabs are awesome, among the best I’ve seen so far. It’s easy, straightforward and useful. Importing the voting scheme from places like Digg is a very smart move; they’ve managed to engage a lot of users and that’s great. The downside is that, after playing for some weeks, the lack of more comprehensive questions and corpora leaves you not knowing what else to do. In my opinion, it lacks three important things: an RSS feed for the news feed, so you can stay updated instead of having to refresh the browser; a way to ask much more open questions; and a bigger, updated corpus (I think this might be on its way). Nevertheless, they are moving fast and adding new features each week, so I’ll keep checking and I’ll update when necessary.

Any Powerseters willing to add some comments?

Image credits: Powerset

November 5, 2007

Powerset’s internal problems

Filed under: Business. Posted by Alex Barrera @ 3:15 pm


Powerset is a San Francisco-based startup that is trying to build the next generation search engine. Co-founded by Barney Pell, Steve Newcomb and Lorenzo Thione in late 2005, it has already raised $12.5 million in a series A investment round. Powerset is attempting to develop a public and global semantic search engine. Current search engines like Google are based on keywords. You need to type the right keywords so you can find what you are looking for. Even though this approach works great (just take a look at GOOG’s soaring stock), with today’s information rivers we need smarter ways to find what we are looking for. For many, semantic search is the next logical step. Instead of searching for keywords, you ask, in plain English, what you are looking for, just the same way you would ask a human.

I don’t want to jump to any conclusions. Building a search engine is a really tough job, and applying AI and natural language processing algorithms to it is even harder. I’ve been there and I know it well. So I understand why it’s taking so long for them to release a product to the public. But creating the buzz Powerset did, and then not delivering a product within a year, is a tough call. And if that wasn’t enough ammo for critics, the recent stepping down of their CEO and the departure of one of the co-founders don’t help.

Let’s analyze the situation in detail. Why would Barney Pell step down as CEO? As he explained on his blog: “After extensive thought and reflection, the Board and management team decided that the time was right for us to bring in a new CEO to take the company to the next level and for me to transition into the role of CTO”. Well, why would that be? After all, Powerset doesn’t have a public product yet (not until Q2 2008), isn’t making any revenue, isn’t getting ready (AFAIK) for a new round of investment and much less for an IPO. What next level is that, then? Most rumors point to pressure from investors who are getting nervous. I don’t have all the facts, and as such, I won’t jump to conclusions. Either way, I do think it’s a very unwise move for a startup that is on the ropes in terms of credibility.

To make matters worse, Steve Newcomb, one of the co-founders, is leaving the company. It’s funny how all the other related posts have focused only on Pell’s stepping down, instead of the departure of a co-founder. I personally think this is much more telling of what the inside situation “might” be. It’s quite strange that, as Barney puts it, “Steve lead the company internally and brought strengths in execution on several other fronts”, and nevertheless he’s being “expelled” from the company. At least that’s the image that’s being projected. It isn’t usual for a co-founder and leader to leave the company before it has a product. I’m quite sure the work isn’t finished yet and that there are more reasons for his departure. As I’ve said before, I’m only speculating on this as I don’t have all the facts. Still waiting for Steve to post something on his blog.

For me, not only is the external image of the company being damaged by this management change, but worse than that, people inside the company might be suffering from this restructuring. I would love to hear opinions from Powerset engineers and what their views are on all this.

Just for the record, I’ve been following Powerset for some time now. I think they have brilliant technology and very good strategic partners. I don’t think they should go to the deadpool (yet), as they are still due to show their technology during 2008. As I’ve said, it’s a tough field and I’m confident they’ll produce some nice technology in the near future. Again, inside views of the matter are greatly appreciated.

 

Image credit: Powerset.com
