Ramblings on Search.
A Better Search Engine.
X = (C*15) + (E*40) + ((A1*40) + (A2*30) + (A3*20) + (A4*10) + (A5*5)) + (M*20)
X = The ranking value for any given entry for a given search query.
C = The natural ranking given by a natural search engine analysis (such as provided via Google, Yahoo, Bing).
E = The ranking value given by individuals noted as experts in the field. For example, an individual “recognized” as an expert in American Civil War history who votes SiteA as highly important to queries on “Vicksburg battle” would cause SiteA to receive a ranking value boost of E*40, where E is the ranking given by the expert, inverted (1/rank) so that 1 = 1 (first result) and 10000 = 0.0001 (ten-thousandth result).
A(1,2,3,4,5) = Aggregate of users’ rankings, grouped into five levels based on “trust.” A user gains trust as they demonstrate reliability over time. This would be determined by a sub-algorithm that considers factors such as (a) how often the user agreed with experts, (b) how often the user agreed with the crawler, (c) how often the user agreed with other high-level users, etc.
M = Whether the user has verified their identity and linked a valid credit/banking account to their account. Fines would be imposed on individuals abusing the system using this linked information. Linking to valid monetary funds would not be required but would be an optional means of increasing trust.
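To make the formula concrete, here is a minimal Python sketch of it. The weights come straight from the formula above; everything else is my assumption for illustration: that C and each A value are inverted ranks on the same 0-to-1 scale as E, that M is simply 1 or 0, and all of the function and parameter names.

```python
# Weights copied from the formula above.
WEIGHT_CRAWLER = 15
WEIGHT_EXPERT = 40
WEIGHTS_AGGREGATE = (40, 30, 20, 10, 5)  # A1..A5, highest trust first
WEIGHT_MONEY = 20


def inverted_rank(position):
    """Map a result position to a score: 1 -> 1.0, 10000 -> 0.0001."""
    if position < 1:
        raise ValueError("position must be 1 or greater")
    return 1.0 / position


def ranking_value(c, e, a, m):
    """Compute X for one entry and one query.

    c -- crawler score (assumed to be an inverted rank)
    e -- expert score (inverted rank, as described for E)
    a -- five aggregate user scores, A1..A5 (assumed inverted ranks)
    m -- True if the account is verified and linked (assumed to count as 1)
    """
    if len(a) != len(WEIGHTS_AGGREGATE):
        raise ValueError("expected five aggregate scores, A1..A5")
    aggregate = sum(w * score for w, score in zip(WEIGHTS_AGGREGATE, a))
    money = WEIGHT_MONEY if m else 0
    return c * WEIGHT_CRAWLER + e * WEIGHT_EXPERT + aggregate + money


# Crawler ranks the site 2nd, an expert ranks it 1st, A1 users rank it 1st,
# A2 users rank it 2nd, nobody at A3..A5 has voted, and the account is verified.
x = ranking_value(
    c=inverted_rank(2),
    e=inverted_rank(1),
    a=(inverted_rank(1), inverted_rank(2), 0.0, 0.0, 0.0),
    m=True,
)
```

With these made-up inputs the expert vote and the top trust level dominate the result, which is the whole point of the weighting.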
Since at least 2003 I have believed there was a better way to do search – and that way is socially. I nearly launched a business at that juncture to create such a search engine. I’ve waited eagerly over the years for someone to implement what seems so common sense to me – only to repeatedly be disappointed.
Google has now killed off SearchWiki. While far from what I envision – it was still closer and a move in the right direction for the company that has always insisted that machines can do it better. I had hoped it was the beginning of a change for Google – but the reversion to stars is devastating. So, I wanted to whine….and here it is.
I’m not going to spend all night on it at this juncture…but I may add to this article on occasion. That’s all.
What Would It Take?
Some would suggest that this project would be nigh impossible to complete – certainly impossible against a behemoth like Google. I don’t think so.
- There is a free, open-source, robust web search engine currently available called Nutch. It’s actually been around for years (I was looking at it back around 2003 as well).
- There is also the option of using one of the many discarded web search engines – or getting a larger partner on-board like Yahoo! or Microsoft. Wink had something going for a while, Eurekster also looked like it had potential.
- Matt Wells has demonstrated what can be done on a low budget with web search for ten years now with Gigablast. Think <$10,000 to start for hardware.
- Hiring “experts” isn’t that hard. For initial seeding one could use educated non-experts (e.g. college students) who are willing to work for a low hourly rate ($10/hr.) but can make intelligent choices between web results.
- Wikipedia has demonstrated that it is possible to create an open eco-system which remains fairly spam free.
For both businesses and individuals there would be an incentive to play fair, to contribute content, etc.:
- Businesses would receive “cred” for good submissions/votes which they could then use to promote their own valid content (they’d lose cred quickly if they abused their cred).
- Individuals could do the same for their own websites.
- We also find the pride of ownership and accomplishment would play a significant role as seen in Wikipedia, YouTube, and DMoZ.
- It’d make sense to me to implement revenue sharing (at the “higher” levels of user trust).
For Each userx
myportion = myrelevantresults + (mytrustlevel * mytrustpoints)
We’d then divide the 25% ($250) by the sum of every userx’s myportion (sumx) to get a value per point, and then pay each user that per-point value times their own myportion (e.g. $250 / 3000 ≈ $0.083 per point, and $0.083 * 60 ≈ $5). Not a lot of cash – but that is a rough guesstimate for a single search query!
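Here is the same arithmetic as a small Python sketch. The $1,000 of revenue, the 25% share, and the 3000 / 60 point totals are the example figures used here; the function names, the user names, and the way a user arrives at 60 points are made up for illustration.

```python
def my_portion(relevant_results, trust_level, trust_points):
    """myportion = myrelevantresults + (mytrustlevel * mytrustpoints)."""
    return relevant_results + trust_level * trust_points


def share_revenue(revenue, portions, user_share=0.25):
    """Split the user share of a query's revenue in proportion to myportion.

    portions maps a user name to that user's myportion value.
    """
    pool = revenue * user_share      # e.g. $1,000 * 25% = $250
    total = sum(portions.values())   # sumx
    if total == 0:
        return {user: 0.0 for user in portions}
    per_point = pool / total         # e.g. $250 / 3000, about $0.083
    return {user: per_point * points for user, points in portions.items()}


# One user with 60 points; all other voters lumped together at 2940 points,
# so that sumx comes out to 3000 as in the example.
portions = {
    "me": my_portion(relevant_results=20, trust_level=2, trust_points=20),
    "everyone_else": 2940,
}
payouts = share_revenue(1000, portions)
```

Here `payouts["me"]` works out to roughly $5, matching the figure above, and the payouts always add up to the $250 pool.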
Why Don’t You Do It?
I’m sure some will wonder why I didn’t do it in the past and why I haven’t done it now. Ahh – that is the question. There are numerous contributing factors both past and present but the essence comes down to, I like ideas more than implementation (who doesn’t?)…and more importantly, I find myself more the aggregator of knowledge than the creator of methods. In other words, to some small extent, I’m a walking search engine – and I would love to input my knowledge into an engine like this…but I am not a skilled developer. I mean – I program, but I’m no Scott Guthrie (or…more in my realm, Corey Palmer, Ash Khan, or Kevin Clough).
- Consider: say we have 500 individuals with A1, A2, or A3 trust levels who have each voted on at least one result for a query. Over a month’s time, this query generates $1,000 in revenue, and 25% is set aside for user compensation. We’d do something like this: