How
This post is about how SiftFu accomplishes the goals outlined in the manifesto. The challenge is certainly difficult at face value: it is a combination of search (finding the right content), comprehension (knowing which content is truly relevant), organization (displaying that content in a meaningful way), and connotation (making it comprehensible). This is not a technical post – it is intended to explain features that everyday users will begin to see and understand as the product comes further along.
Buried within these challenges are the inherent limitations of starting a new company with a small team. There is little opportunity to throw a sheer volume of computing power into the mix, as server costs are still significant. Developing new algorithms or techniques that could take years is also out of the question. SiftFu, rather than relying on any sort of new technological power to overcome competitors, is built to be fundamentally different from what’s out there.
Impermanence
This is the first big divergence between a sift and a search. A search is a one-time event – it takes a few terms and gives back matches right on the spot, assuming you’ll be done with it before anything new comes crashing onto the web.
SiftFu makes the opposite assumption. Terms, times, locations, and whatever else you use to look for subjects of interest all stick. They create a perennially present feed of new content that updates whenever things appear. Rather than a momentary curiosity, SiftFu is optimized for a long-held interest that remains relevant even as the available information changes.
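For the curious, the difference between a one-time search and a sift that sticks can be sketched in a few lines of Python. This is only an illustration under simple assumptions, not how SiftFu is built: the `Sift` class, its `update` method, and the example URLs are all hypothetical, and matching is plain lowercase word overlap.

```python
from dataclasses import dataclass, field


@dataclass
class Sift:
    """A hypothetical standing query: the terms persist between updates."""
    terms: set[str]                               # lowercase search terms
    seen: set[str] = field(default_factory=set)   # URLs already delivered

    def update(self, items: list[tuple[str, str]]) -> list[str]:
        """Return URLs of matching items that were not delivered before.

        Each item is a (url, text) pair.
        """
        fresh = []
        for url, text in items:
            words = set(text.lower().split())
            if url not in self.seen and self.terms & words:
                self.seen.add(url)
                fresh.append(url)
        return fresh


sift = Sift(terms={"scala"})
first = sift.update([
    ("a.example/1", "Scala 2 released"),
    ("a.example/2", "Cooking tips"),
])
second = sift.update([
    ("a.example/1", "Scala 2 released"),    # already delivered, so skipped
    ("a.example/3", "New Scala library"),
])
```

The first update delivers only the matching page, and the second delivers only the page that is new since then. A search would have returned the same old page both times.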
Volume and Specificity
What do the following have in common?
- the sun
- a strawberry
- scala (programming language)
- Barack Obama
- The Yeah Yeah Yeahs
For me (results may vary due to personalized search), all of these include Wikipedia in Google’s first page of results. For what Google is, this is the most sensible option – Wikipedia is highly regarded, has been around for a relatively long time, and appeals to the widest possible user base, as it contains content intended for those who have never before heard of the subject at hand. It fits the use case of search well – a user encounters a topic they’ve never seen before and desires to satisfy their curiosity in a single moment.
It doesn’t make sense, however, for more expert users. I know what most of these things are and probably would know to check Wikipedia on my own if I didn’t. Furthermore, once I’ve read a source, its recurring value falls substantially. Showing me the most popular, lowest-common-denominator pages for something like Scala (the programming language I use for SiftFu) or The Yeah Yeah Yeahs (a band I kind of follow) is useless to me, kind of like teaching first-grade arithmetic to a class of physics PhDs. The content I want in this case is new and specialized – it is not widely applicable to the entire human population, but instead focused on Scala programmers and Yeah Yeah Yeahs fans. It is also most valuable if I haven’t seen it before, which is especially true of new content.
This is the second way in which SiftFu diverges from typical search – rather than favoring the oldest and strongest, SiftFu looks for new content that is the most targeted. In other words, we do not seek the loudest voice in the room, but rather the whisper directed to the listener.
Disambiguation (note: this feature isn’t finished yet, which is why you might not see it in the first release)
What is a dog?
Sometimes a dog is an animal. Other times it is someone we don’t like. Then again, it could be an action that makes quite a pest out of the one who does it.
Now what do a dog, a canine, a mongrel, a German shepherd, and a mutt all have in common? All of these words relate to a single concept, one that should be well-defined and present a clear mental image to anyone with a good knowledge of the English language. There is no longer any ambiguity when all these words are present – they relate to a single thing.
Search engines run on words. SiftFu runs on concepts. It takes multiple words and from them generates a single idea, which in turn applies to many more words. In this way, it avoids getting caught up on the particulars of a term and instead can focus on those all-important nuances of meaning.
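As a rough sketch of the idea, and not SiftFu's actual method (the feature is unfinished, as noted above), a concept can be pictured as a set of related words, with the concept chosen by how many of the given words it covers. The `CONCEPTS` table, its word lists, and the `matching_concepts` function below are made up for illustration.

```python
# Hypothetical concept table: each concept is a set of related words.
CONCEPTS = {
    "dog (animal)": {"dog", "canine", "mongrel", "mutt", "shepherd"},
    "dog (to pester)": {"dog", "hound", "pester", "pursue"},
}


def matching_concepts(words, concepts=CONCEPTS):
    """Return every concept tied for the largest word overlap with `words`.

    More than one result means the words are still ambiguous;
    an empty list means no concept matched at all.
    """
    query = {w.lower() for w in words}
    scores = {name: len(query & terms) for name, terms in concepts.items()}
    top = max(scores.values(), default=0)
    if top == 0:
        return []
    return [name for name, score in scores.items() if score == top]


alone = matching_concepts(["dog"])                       # both concepts tie
together = matching_concepts(["dog", "canine", "mutt"])  # only the animal
```

On its own, "dog" matches both concepts equally and stays ambiguous. Add "canine" and "mutt" and only the animal remains, which is the point of the paragraph above: several words together pin down one idea.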
Levels of Configuration
Convenience, privacy, control. Pick one. That’s how it generally works – you can give up more information about yourself in order to gain convenience (letting the system guess better what you want), but you sacrifice privacy and can’t control what it does. Alternatively, you can take the controls and enter all the information yourself – but you’ve just lost any semblance of convenience and are still revealing quite a bit of personal stuff.
SiftFu allows you to specify how much the system guesses, knows, and learns. It is possible to start an account with no more than a throwaway username and password, and immediately start a sift over publicly available sources for a topic that is nearly impossible to trace back to its owner. At the other end of the spectrum is the browser extension – this will track and crawl the sites you regularly visit for content that is likely to update regularly and ensure that anything which appears in a feed (RSS, Atom, Facebook, Twitter, etc.) also appears in your SiftFu results, weighted by how much time you spend there.