Well, perhaps not *that* kind of social engineering.
When I first designed the new system, I put a few extra columns in the database for each link. One such field is called “clickthru” and another is called “impressions.” These two fields say quite a lot. “Clickthru” records how many times the given link was clicked on. “Impressions” is the number of times the given link was shown to a user.
If you divide clickthrus by impressions (and multiply by 100), you get a clickthru percentage. Ad companies use this very basic formula to determine pricing (and other things, like cost per click). Yes, this is a vast oversimplification, but just hang in there for a moment.
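As a minimal sketch of that formula, assuming the two columns are available as plain integers (the function name `clickthru_rate` is made up for illustration and is not taken from the actual system):

```python
def clickthru_rate(clickthru: int, impressions: int) -> float:
    """Return clickthrus as a percentage of impressions."""
    if impressions <= 0:
        # Assumption: a link that has never been shown has no rate yet, so report 0.
        return 0.0
    return 100.0 * clickthru / impressions


# A link shown 1000 times and clicked 25 times:
print(clickthru_rate(25, 1000))  # 2.5
```

The zero-impressions guard is a design choice for the sketch; the real system may handle brand-new links differently.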
What I want to do is “promote” popular links and, at the same time, create a “neighborhood” of common links.
There are two big concepts here, so let me take them one at a time. “Promote” refers to “pushing to the top” of search results. The idea is that a link with a high clickthru rate is one that provides a lot of relevance or value, or is just uniquely positioned within its given niche. This, of course, has the potential to be abused. Well, I’ve got that pretty well figured out. There will be fail-safe measures in place to prevent someone from “hammering” the system to artificially boost their clickthru rate. No, I’m not telling what they are, either!
The goal of this is to push good links up and poor ones down. Of course, everyone starts on an equal footing, no matter who or what you are. Everyone starts at 0, and nothing in the system will prevent a “poor” link from rising. In fact, a poor link could become a popular link within a few clicks, depending on the subject matter.
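Here is a rough sketch of what “promoting” could look like, assuming results are simply sorted by clickthru rate. The `Link` class, the `promote` function and the example URLs are all hypothetical, and the fail-safe measures mentioned above are deliberately left out:

```python
from dataclasses import dataclass


@dataclass
class Link:
    url: str
    clickthru: int = 0    # everyone starts at 0
    impressions: int = 0

    @property
    def rate(self) -> float:
        # Clicks per impression; 0.0 for a link that has never been shown.
        return self.clickthru / self.impressions if self.impressions else 0.0


def promote(links):
    """Order links so the highest clickthru rate comes first."""
    return sorted(links, key=lambda link: link.rate, reverse=True)


links = [
    Link("example.org/a", clickthru=5, impressions=200),
    Link("example.org/b"),  # brand new, still at 0
    Link("example.org/c", clickthru=3, impressions=10),
]
ranked = promote(links)
print([link.url for link in ranked])
# ['example.org/c', 'example.org/a', 'example.org/b']
```

Because the ordering depends only on the rate, a link with few impressions can jump to the top after a handful of clicks, which matches the “within a few clicks” remark above.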
The other concept, “neighborhoods,” is a bit more complex. Every link has a given set of information related to it: keywords, descriptions and an owner. The goal is to create a neighborhood “on the fly” for a given link. So, when you click on a link’s details, a neighborhood will be created that also gives you the option of looking at relevant links, not as alternates, but to fill out the subject matter you are currently interested in.
In essence, a “social network” of links is created by relevance. I do not want to get into the Slashdot/MySpace/Live Journal concept of “friends,” which to me is a very static way of establishing a network. This will be completely dynamic, and will not be based around an individual. The social network (for lack of a better term) will be based solely around subject matter.
These ideas are still being considered, and I would love to hear some feedback to further refine these concepts. It’s something I’ve been kicking around for a few years, and now that I have a platform that really enables full data integration, we can entertain concepts like this.
About clickthrough, you need to be careful of the snowball effect. Any site that gets promoted will be seen first by future users, will therefore get more clickthrough, and will get promoted again. Perhaps giving new links a little boost into the top would offset that, I don’t know.
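One possible way to give new links that boost is additive smoothing: every link gets a few imaginary clicks and impressions added before the rate is computed. This is a swapped-in technique, not something the post describes, and the function name and default pseudo-counts below are arbitrary assumptions:

```python
def smoothed_rate(clickthru, impressions, prior_clicks=2.0, prior_impressions=20.0):
    """Clickthru rate with pseudo-counts added (additive smoothing).

    A link with no data scores prior_clicks / prior_impressions, an optimistic
    starting rate that fades as real impressions accumulate.
    """
    return (clickthru + prior_clicks) / (impressions + prior_impressions)


new_link = smoothed_rate(0, 0)          # 2 / 20 = 0.1
established = smoothed_rate(50, 1000)   # 52 / 1020, about 0.051
ignored = smoothed_rate(0, 100)         # 2 / 120, about 0.017

# The new link starts above the established one, but drops below it
# if it keeps being shown without getting clicked.
print(new_link > established > ignored)  # True
```

How optimistic the starting rate should be is a tuning question; too high and new links crowd out proven ones, too low and the snowball effect returns.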
As for neighborhoods, I think you’ve got most of what you need already in your keywords. For a given link, its neighbors would be any other link that it shares a keyword with. You could make a couple of passes expanding the neighborhood by compiling a list of the most common keywords in the neighborhood, and then adding any other link to the neighborhood if it has one of those keywords.
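Roughly, in Python, and assuming the keywords can be loaded as a dict mapping each link to a set of keywords (the `neighborhood` function, its `passes` and `top_n` parameters, and the sample data are all made up for illustration):

```python
from collections import Counter


def neighborhood(target, keywords_by_link, passes=1, top_n=2):
    """Return the links related to `target` through shared keywords."""
    seed = keywords_by_link[target]
    # Start with every other link that shares at least one keyword with the target.
    members = {link for link, kws in keywords_by_link.items()
               if link != target and kws & seed}
    for _ in range(passes):
        # Count keywords across the current neighborhood (target included).
        # Sorting keeps the count order, and so any tie-breaking, deterministic.
        counts = Counter()
        for link in sorted(members | {target}):
            counts.update(sorted(keywords_by_link[link]))
        common = {kw for kw, _ in counts.most_common(top_n)}
        # Add any outside link that carries one of the most common keywords.
        added = {link for link, kws in keywords_by_link.items()
                 if link != target and link not in members and kws & common}
        if not added:
            break
        members |= added
    return members


links = {
    "guitar-tabs": {"guitar", "music"},
    "amp-repair": {"guitar", "electronics"},
    "pedal-kits": {"guitar", "electronics"},
    "solder-guide": {"electronics", "diy"},
    "knitting": {"yarn", "diy"},
}
print(sorted(neighborhood("guitar-tabs", links, passes=0)))
# ['amp-repair', 'pedal-kits']
print(sorted(neighborhood("guitar-tabs", links, passes=1)))
# ['amp-repair', 'pedal-kits', 'solder-guide']
```

Limiting the expansion to the `top_n` most common keywords is what keeps the neighborhood from swallowing every link after a few passes.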
Not sure if you want to join the two properties. You could, for example, give higher precedence in the neighborhood to keywords from sites with a high click-through.
Incidentally, click-through is not really a good indicator of the quality of the actual site’s content. It’s more a measure of the quality of the link description. It’s not likely that a large number of people will find a site, wander around enough to decide it’s a good fit, and then come back to diysearch to click it again. If it’s a good site, they’ll bookmark it, and if it’s not, they’ll use the back button and click some other link next. I’ve always wondered if there was a good way of capturing this type of data. Consider it a downvote whenever someone clicks a link and then returns to the page in less than a minute. If they don’t come back at all, consider it an upvote. That’s very simplistic, and gameable, and the data would have a lot of noise. I’m not sure if there’s a good way to clean it up.
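Something like this, as a very rough sketch. It assumes the site can record when a link was clicked and when (if ever) the same user came back to the results page; the `vote` function and its timestamp arguments are hypothetical:

```python
from typing import Optional

QUICK_RETURN_SECONDS = 60  # "in less than a minute"


def vote(click_time: float, return_time: Optional[float]) -> int:
    """Score one click: -1 for a quick bounce, +1 if the user never came back."""
    if return_time is None:
        return 1   # never returned: count it as an upvote
    if return_time - click_time < QUICK_RETURN_SECONDS:
        return -1  # back in under a minute: count it as a downvote
    return 0       # came back later: the rule above says nothing, so no vote


print(vote(0.0, 20.0))   # -1
print(vote(0.0, None))   # 1
print(vote(0.0, 300.0))  # 0
```

Treating a late return as neutral is an assumption of the sketch, and none of this addresses the noise or gaming problems.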
Hmmm no preview option for comments? OK… Hope I didn’t mess up my spelling too bad
June 26th, 2006, at 8:34 am #pythor:
first, yeah wordpress, in their ineffable wisdom, did not implement a preview function. *sigh*
second. you raise some interesting points. you are right, clickthru rate is not an indication of quality, but it’s an indication of a bit of cleverness and thought on behalf of the link owner, and therefore is *often* (not always) some indication of the *relevance* of the target site. That to me is the key. My gut tells me that this system would promote relevance, not popularity. But then again, my gut has been wrong in the past.
So, being objective here, let’s say my ideas are crap, or at least add no value to the user experience - what would?
Really, I’m quite open to all ideas, and my only desire is to make the user experience richer, more useful and more valuable.
June 26th, 2006, at 9:21 am #
I wouldn’t go so far as to say your ideas are crap. I’m just suggesting a little bit of caution, and maybe some ideas to look to for the future. I’m not even sure I’m the right audience to be asking here. I’m not a creator of anything, really, except some small python scripts. I’m not particularly interested in music or art. I try to browse the categories that look interesting, but most of what I find is either stale (not updated since ’99) or re-routed to a search page.
June 27th, 2006, at 2:53 pm #
That said, the code behind the scenes and the algorithms that get used interest me. I like to think, both to understand, and to imagine something better.