from the whirlpool.net.au discussion forums
Poll: 'We have found threads similar to yours'
User #187613   2237 posts
Whirlpool Forums Addict

Well, I've noticed the number of people who post without searching is increasing rapidly.

Why don't we have a feature like digg.com's where, when you post a thread, it says "We have found threads similar to yours, are you sure yours isn't a duplicate?"

Then it can list, say 5 or 10 threads. This way, people asking for help can see other threads that have already been answered. And much much more.

The only downside is 1 additional SQL query, but I'm sure the server will live ^^

Thanks for reading, feedback please :)

posted 2008-May-17, 12pm AEST
User #151061   5875 posts
Whirlpool Forums Addict

Whirlpool's SQL database is already stressed enough as it is without this.

Most people who don't search are normally just too dumb. If someone keeps not doing it, don't reply.

posted 2008-May-17, 1pm AEST
User #187613   2237 posts
Whirlpool Forums Addict

But this is 1 step to stopping the evolution of Whingepool!

posted 2008-May-17, 1pm AEST
User #40586   17918 posts
Senior Moderator

It is something that has been suggested before and isn't a bad idea, but it probably won't be implemented on the current hardware we have.

posted 2008-May-17, 1pm AEST
User #187613   2237 posts
Whirlpool Forums Addict

I did some math ^^

Adam - Whirlpool says:
that's a lot of queries when you add it up
Jay says:
not really
whirlpool opened when?
2005?
that's only 292,118 queries a year
only 24,343 queries a month
average of only 811 queries a day
33 queries an hour
0.55 queries a second!
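
Taking the yearly figure in the chat log at face value (where the 292,118 comes from isn't stated), the breakdown can be re-checked with a few lines of Python. The 30-day month is an assumption, chosen because it reproduces the quoted daily figure.

```python
# Re-check the per-period breakdown of the quoted yearly figure.
QUERIES_PER_YEAR = 292_118  # figure quoted in the chat log; its origin is not stated

per_month = QUERIES_PER_YEAR / 12   # 12 months per year
per_day = per_month / 30            # assumes a 30-day month
per_hour = per_day / 24
per_minute = per_hour / 60
per_second = per_minute / 60

for label, value in [
    ("month", per_month),
    ("day", per_day),
    ("hour", per_hour),
    ("minute", per_minute),
    ("second", per_second),
]:
    print(f"{value:,.4f} queries per {label}")
```

Under these assumptions the division lands at roughly 0.56 queries per minute, which is well under one query per second.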

posted 2008-May-17, 1pm AEST
User #151061   5875 posts
Whirlpool Forums Addict

Jay! writes...

Adam - Whirlpool says:

Yay! Thats me.

But yeah, Whirlpool has been open since before 2005.

posted 2008-May-17, 1pm AEST
User #187613   2237 posts
Whirlpool Forums Addict

Aww, that makes my calculations wrong :(

Well, if we can't do this, perhaps put a notice on the "make a thread" page saying something about using the search function, or put in a search box, for example.

Or maybe open a can of whoop ass on them.

posted 2008-May-17, 1pm AEST
User #40586   17918 posts
Senior Moderator

Jay! writes...

I did some math ^^

I don't know where those numbers are coming from.

The trigger would be when you attempt to make a new thread. We have ~900,000 threads.

We can have around 500-1000 made in a day, many clustered in peak periods (i.e. lunchtime, 5-10pm), which means running a search (potentially one that isn't cached) each time someone attempts to post a thread could have a significant effect on performance.
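
To put rough numbers on the peak-period point, here is a back-of-the-envelope sketch. The 500-1000 threads a day and the peak windows come from the post; the one-hour lunchtime window and the share of threads assumed to fall in peak hours are made-up illustration values, not Whirlpool statistics.

```python
def peak_searches_per_minute(threads_per_day, peak_share, peak_hours):
    """Average number of thread-creation searches per minute during peak hours."""
    peak_threads = threads_per_day * peak_share
    return peak_threads / (peak_hours * 60)

PEAK_HOURS = 1 + 5   # lunchtime (assumed to be 1 hour) plus 5-10pm
PEAK_SHARE = 0.6     # hypothetical: 60% of new threads arrive in peak hours

for threads_per_day in (500, 1000):
    rate = peak_searches_per_minute(threads_per_day, PEAK_SHARE, PEAK_HOURS)
    print(f"{threads_per_day} threads/day -> {rate:.2f} searches/minute at peak")
```

This only counts threads that actually get posted; abandoned attempts would add to it. The count is also only half the picture, since each search would run against roughly 900,000 threads.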

It is something that could reduce duplicates, yes, but I'm really not sure how many people would read it based on things we see where users ignore much more blatant indications...

posted 2008-May-17, 1pm AEST
edited 2008-May-17, 1pm AEST
User #187613   2237 posts
Whirlpool Forums Addict

Well, what if we only included this feature for people with under 20 posts? :D

Or perhaps only activate it when the server load is below a number :)
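
Either condition could be expressed as a small gate in front of the search, as in the sketch below. The function name, the load metric and the 0.7 threshold are hypothetical; only the 20-post cut-off comes from the post.

```python
def should_run_similar_search(post_count, server_load,
                              max_posts=20, max_load=0.7):
    """Decide whether to run the similar-thread search for this attempt.

    post_count  -- number of posts the user has made so far
    server_load -- current load as a fraction of capacity (hypothetical metric)
    max_posts   -- only users below this post count get the search (None = no limit)
    max_load    -- only search while load is below this value (None = no limit)
    """
    if max_posts is not None and post_count >= max_posts:
        return False
    if max_load is not None and server_load >= max_load:
        return False
    return True

print(should_run_similar_search(post_count=5, server_load=0.3))    # new user, quiet server
print(should_run_similar_search(post_count=5, server_load=0.9))    # new user, busy server
print(should_run_similar_search(post_count=500, server_load=0.3,
                                max_posts=None))                   # load check only
```

Passing `None` for either limit switches that condition off, so the two ideas can be tried separately or together.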

posted 2008-May-17, 1pm AEST
edited 2008-May-17, 1pm AEST
User #40586   17918 posts
Senior Moderator

Jay! writes...

Well, what if we only included this feature for people with under 20 posts? :D

Often it isn't new users that are the problem, but lazy longer term users

posted 2008-May-17, 1pm AEST
User #187613   2237 posts
Whirlpool Forums Addict

You're not talking about me, are you >_>

Read my edit above please :)

posted 2008-May-17, 1pm AEST
User #187613   2237 posts
Whirlpool Forums Addict

Or what if we only make it active for people with below average Auras?

posted 2008-May-17, 1pm AEST
User #40586   17918 posts
Senior Moderator

Jay! writes...

Or perhaps only activate it when the server load is below a number :)

Possible, but that could mean quite a few threads don't get the prompt, and hence the effectiveness of the measure would be reduced quite a bit.

posted 2008-May-17, 1pm AEST
edited 2008-May-17, 1pm AEST
User #187613   2237 posts
Whirlpool Forums Addict

Thor writes...

hence the effectiveness of the measure would be reduced quite a bit.

Yes, I know, but there still would be some increase of effectiveness, without a large impact on the server. *puts on party hat*

posted 2008-May-17, 1pm AEST
User #40586   17918 posts
Senior Moderator

Jay! writes...

Yes, I know, but there still would be some increase of effectiveness

Actually, neither of us could know that for certain.

We have in the past put very bright warnings on the thread creation page for certain forums when there have been new plans etc. and people have completely ignored them, so I'm far from convinced a search on thread creation would necessarily do a lot.

posted 2008-May-17, 1pm AEST
User #77517   1015 posts
Whirlpool Enthusiast

Thor writes...

We have in the past put very bright warnings on the thread creation page for certain forums when there have been new plans etc. and people have completely ignored them

Thor, what was perhaps one of the most visible examples of a complete lack of posters bothering to check pre-existing threads was back on April Fool's Day this year.
At one point in the day, the majority of threads on the Forum Feedback page were threadkills.

Jenifur Charne

posted 2008-May-17, 2pm AEST
User #7411   19843 posts
Mangy Fleabag

Great idea in theory, but I dunno if WP's backend could take the extra load right now.

That said, it could theoretically REDUCE the load on WP eventually, especially if it becomes successful in reducing dupe threads.

posted 2008-May-17, 3pm AEST
User #151652   1893 posts
Whirlpool Enthusiast

If only we had an invincible back end :(

If all the good suggestions in forum feedback were added, we would be some sort of super forum.

posted 2008-May-17, 5pm AEST
User #40586   17918 posts
Senior Moderator

Tone. writes...

That said, it could theoretically REDUCE the load on WP eventually, especially if it becomes successful in reducing dupe threads.

That makes the assumption people will actually read the results and use existing threads :-)

Bitter experience has shown us many users don't...

posted 2008-May-17, 5pm AEST
edited 2008-May-17, 5pm AEST
User #151652   1893 posts
Whirlpool Enthusiast

Tone. writes...

That said, it could theoretically REDUCE the load on WP eventually, especially if it becomes successful in reducing dupe threads.

Could it though? Doesn't making a thread have less impact? Consider this.

A. User clicks "make thread", then posts the thread. Then people post in it.

B. User goes to make a thread, sees warning and searches, then clicks on an existing thread, reads some posts, then posts something, then the thread is revived and people post in it.

posted 2008-May-17, 5pm AEST
User #40586   17918 posts
Senior Moderator

tikalal writes...

Doesn't making a thread have less impact?

If you are referring to raw performance, it isn't an easy answer.

Per thread a mandatory search will result in longer CF construction time and database query time for the use case of posting a new thread, but it has the potential (no guarantees) to reduce the number of threads made, hence potentially reducing the number of records in the threads table and perhaps a smaller number of records in the replies tables due to people not needing to post the same thing.

So longer term it may lessen performance issues by slightly reducing the growth rate of the database, assuming people read the results presented. Short term, it probably won't help performance, and would rather just impede it, as it creates more queries to run against a large and already busy database.

posted 2008-May-17, 5pm AEST
edited 2008-May-17, 5pm AEST
User #212996   1038 posts
Whirlpool Enthusiast

Hasn't this topic already been covered in another thread? ;-) lol

posted 2008-May-17, 5pm AEST
User #151652   1893 posts
Whirlpool Enthusiast

Is it better to have more queries over a longer period of time than a lot of queries in a short period of time?

For example:

1. You have 30 queries in one minute and then 10 queries per minute for 10 minutes after that.

2. You get 10 queries in one minute and then 30 queries spread over 30 minutes.

My logic is that (another example) 100 queries over 100 minutes is better than 20 queries in one minute.
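
The scenarios above can be compared by looking at both the total number of queries and the busiest single minute. This is a rough sketch that assumes queries "spread over" a period arrive evenly, with one bucket per minute.

```python
def summarise(per_minute_counts):
    """Return (total queries, queries in the busiest minute)."""
    return sum(per_minute_counts), max(per_minute_counts)

scenarios = {
    "1: 30 in one minute, then 10/min for 10 minutes": [30] + [10] * 10,
    "2: 10 in one minute, then 30 over 30 minutes": [10] + [1] * 30,
    "100 over 100 minutes": [1] * 100,
    "20 in one minute": [20],
}

for name, counts in scenarios.items():
    total, peak = summarise(counts)
    print(f"{name}: total={total}, peak per minute={peak}")
```

The totals and the peaks tell different stories: "100 over 100 minutes" is the most queries overall of the last two, but never asks for more than one query in any given minute.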

My examples are extreme, and I'm probably missing something; I don't really know what I'm talking about and have never run a server or anything. So Thor, basically what you're saying is that the difference between the two in terms of server load is negligible or too hard to calculate?

I failed maths :P

posted 2008-May-17, 7pm AEST
edited 2008-May-17, 8pm AEST
User #40586   17918 posts
Senior Moderator

tikalal writes...

But isn't it better to have more queries over a longer period of time than a lot of queries in a short period of time?

Depending on the amount of data the queries return, yes.

So Thor, basically what you're saying is that the difference between the two in terms of server load is negligible or too hard to calculate?

What I'm saying is that having a mandatory search on thread creation would definitely have a more significant impact on performance (in terms of time to execute and, potentially, data retrieved) on a day-to-day basis for users than the current situation.

Long term any positive effect of the measure could be dwarfed by growth of the site anyway.

posted 2008-May-17, 8pm AEST
User #151652   1893 posts
Whirlpool Enthusiast

This is all too confusing!

I trust you Thor.

posted 2008-May-17, 8pm AEST
User #73332   980 posts
Whirlpool Enthusiast

◄ŞKyЯЇDΣ► writes...

Most people who don't search are normally just too dumb

What an ignorant thing to say. It would help if this forum actually had a decent search function that was usable half the time. But no, constantly being disabled for "performance reasons". And no, Google does not suffice. For one thing it cannot search all the forums, and it really is lacking as a forum search tool.

posted 2008-May-17, 9pm AEST
User #10   8377 posts
Benevolent Dictator

Jay! writes...

Why don't we have a feature like digg.com's where, when you post a thread, it says "We have found threads similar to yours, are you sure yours isn't a duplicate?"

I've been wanting to implement a feature like this for some time. However it's not possible without an advanced full text index to power the suggestion algorithm. I need more server resources for that.
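
For readers wondering what a full text index buys over scanning every title, here is a toy inverted index over thread titles. It only illustrates the general technique; the thread IDs are invented, and nothing here reflects how Whirlpool's search is actually built.

```python
from collections import defaultdict

def tokenize(text):
    """Lower-case a title and split it into alphanumeric words."""
    cleaned = "".join(c.lower() if c.isalnum() else " " for c in text)
    return cleaned.split()

def build_index(threads):
    """Map each word to the set of thread IDs whose title contains it."""
    index = defaultdict(set)
    for thread_id, title in threads.items():
        for word in tokenize(title):
            index[word].add(thread_id)
    return index

def suggest(index, new_title, limit=5):
    """Rank existing threads by how many words they share with new_title."""
    scores = defaultdict(int)
    for word in set(tokenize(new_title)):
        for thread_id in index.get(word, ()):
            scores[thread_id] += 1
    # Highest score first; ties broken by thread ID for a stable order.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [thread_id for thread_id, _ in ranked[:limit]]

# Hypothetical thread IDs mapped to example titles.
threads = {
    1: "Slow torrent speeds",
    2: "What camera should I buy",
    3: "Torrent speeds slow at night",
    4: "What PC should I buy",
}
index = build_index(threads)
print(suggest(index, "Why are my torrent speeds so slow?"))  # [1, 3]
```

A lookup touches only the index entries for the words in the new title rather than every stored thread, but the index itself has to be built, stored and kept up to date, which is where the extra server resources go.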

posted 2008-May-17, 9pm AEST
User #10   8377 posts
Benevolent Dictator

tikalal writes...

Doesn't making a thread have less impact?

If the choice is between making a new thread or reviving a relevant existing thread, I'll pick the latter with few exceptions.

posted 2008-May-17, 9pm AEST
User #54877   6240 posts
Whirlpool Forums Addict

Simon Wright writes...

I need more server resources for that.

What about the new Google App Engine? Maybe you could set one up for WP and hand off the full text search to the App Engine?

posted 2008-May-17, 9pm AEST
User #10   8377 posts
Benevolent Dictator

Yansky writes...

What about the new Google App Engine

Lots of reasons why that won't work. The 500MB limit is just one.

posted 2008-May-17, 9pm AEST
User #53837   3939 posts
Whirlpool Forums Addict

tikalal writes...

B. User goes to make a thread, sees warning and searches, then clicks on an existing thread, reads some posts, then posts something, then the thread is revived and people post in it.

User makes post> sees warning> ignores> posts> flamed because there are 40,000 other threads.

We don't want to reuse threads; a "what camera should I buy" thread from 2005 is going to be pointless in just a month, imagine how far off it would be if it were still alive now.

Edit: If we let it run into past threads we would end up with only a few threads (all "part 400XX") I can imagine it, "What PC should I buy", "What ISP to go with", "Telstra sucks", "Telstra doesn't suck"

posted 2008-May-17, 9pm AEST
edited 2008-May-17, 9pm AEST
User #40586   17918 posts
Senior Moderator

Futurama writes...

We don't want to reuse threads; a "what camera should I buy" thread from 2005 is going to be pointless in just a month, imagine how far off it would be if it were still alive now.

Agreed, which is why as Simon said, to do it properly you would really need an advanced full text search, which isn't an option right now due to hardware limitations.

It wouldn't really work reliably and intelligently enough to be useful and be worth the performance hit with the current thread title search.

Sure you could in theory do simplistic restrictions like thread post date, last post date, forum it is in, views etc., but they can only do so much.
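
The kind of simplistic restrictions mentioned here might look like the filter below. The `Thread` record, its field names, the "Photography" forum name and the 90-day cut-off are all hypothetical, chosen only to illustrate filtering on forum and last post date.

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class Thread:
    title: str
    forum: str
    last_post: date

def filter_candidates(candidates, forum, today, max_age_days=90):
    """Keep only threads in the same forum with a recent enough last post."""
    cutoff = today - timedelta(days=max_age_days)
    return [t for t in candidates
            if t.forum == forum and t.last_post >= cutoff]

candidates = [
    Thread("What camera should I buy", "Photography", date(2005, 6, 1)),
    Thread("What camera should I buy", "Photography", date(2008, 5, 1)),
    Thread("What PC should I buy", "PC Hardware", date(2008, 5, 10)),
]
recent = filter_candidates(candidates, "Photography", today=date(2008, 5, 17))
print([(t.title, t.last_post.isoformat()) for t in recent])
```

The 2005 thread is dropped on age and the PC thread on forum, but the filter has no idea whether the surviving thread actually answers the new question.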

posted 2008-May-17, 9pm AEST
User #44690   9580 posts
Whirlpool Forums Addict

Thor writes...

which isn't an option right now due to hardware limitations.

What I would like to know is if there is any plan to increase Whirlpool's hardware capacity. I'm sure there would be a lot of Whirlpool users, myself included, who would be willing to chip in a small donation if we knew it was going to a specific, planned hardware upgrade.

posted 2008-May-17, 10pm AEST
User #151061   5875 posts
Whirlpool Forums Addict

I love Kevin Rudd writes...

What an ignorant thing to say.

I'm not even talking about searching. Over in the PC Hardware forum, there can sometimes be like 7 threads on the first page called "PC problem" or something to that effect. And 3 of them can have virtually identical problems.

I should probably have said lazy. Not dumb. The search tool isn't bad. Not good, but it isn't that bad. The Google one is fine.

posted 2008-May-17, 10pm AEST
User #40586   17918 posts
Senior Moderator

Foonly writes...

What I would like to know

/forum-replies.cfm?t=972274#r4 basically answers most of your queries as best as I know for the moment

posted 2008-May-17, 10pm AEST
User #21450   3943 posts
Whirlpool Forums Addict

I don't get it...

We get duplicate threads all the time in the Broadband forum. I just herring them and MM deletes them, leaving a pointer to the relevant topic.

Would you also want to search the wiki at the same time?

Cheers WTW

posted 2008-May-17, 11pm AEST
User #40586   17918 posts
Senior Moderator

WTW writes...

Would you also want to search the wiki at the same time?

I can't see why not and I'd like it to be able to.

I find a lot of the problem with the wiki is that a lot of users tend to forget there is a lot of very good information in there, but since it doesn't appear in search results or along a string of links, the information is lost or forgotten.

posted 2008-May-17, 11pm AEST
User #128620   114 posts
Forum Regular

@ the OP
Most of my forum research/participation is in help-style web forums put up by retailers.
Much longer-standing are a few Usenet muso and comp lists.

This grumble about duplication runs through them all and yet most users prefer to start a fresh thread - despite all kinds of flashing stickies, auto-search offers and threats of the cold shoulder.
Who knows why?

I have noticed that the most populous forums - ones where an answer can be anticipated immediately - have the most duplications. I'm guessing that people just like to be part of the real-time help feeling. And the rubbishing about being too lazy, dumb etc is also part of the scene - I like it.

I kind of like this flooding of advice, links to the original thread, and all the rest of the sidebar pointing that you get in dupe threads; it makes this alzheimer-ish www access feel just a little more layered, if you get what I'm saying.

Most importantly, when I go for info in a search - - Google is very consistent for me with WP, btw - - - it's got a better chance to have been picked up if there is more than a single thread.
I think.

posted 2008-May-18, 1am AEST
User #83911   577 posts
Whirlpool Enthusiast

◄ŞKyЯЇDΣ► writes...

Most people who don't search are normally just too dumb. If someone keeps not doing it, don't reply.

Yeah, that is true, but their posts make it a lot harder for people who do search! When I search there are just so many threads that are useless and/or are repeats, which makes the search impossible, and therefore I have to make a new thread.

I have seen this feature on another site, I thought it was a very good idea and I think it would be useful here as well if there was extra funding/servers available.

posted 2008-May-18, 1am AEST
User #187613   2237 posts
Whirlpool Forums Addict

And no, Google does not suffice.

Rated you a :D for that. Totally agree.

Google is designed for searching the web, not a specific website.

posted 2008-May-18, 2am AEST
User #187613   2237 posts
Whirlpool Forums Addict

WTW writes...

Would you also want to search the wiki at the same time?

Now I feel that is a waste of resources. Most duplicate threads are asking for help, e.g. "FireFox wont start!!!one1!"; the wiki does not cover anything of this type.

posted 2008-May-18, 3am AEST
User #54136   1365 posts
Whirlpool Enthusiast

◄ŞKyЯЇDΣ► writes...

Most people who don't search are normally just too dumb.

Bad choice of words there??... maybe "unfamiliar" would be better do you think??

Just because someone is using the forum for the first time doesn't mean they are "dumb".

I am no expert, but a "search" button on any forum is a must if you want to reduce duplicated posts... surely??

posted 2008-May-18, 3am AEST
User #151652   1893 posts
Whirlpool Enthusiast

Futurama writes...

User makes post> sees warning> ignores> posts> flamed because there are 40,000 other threads.

Seeing the warning and ignoring it does not take up a database query.

More like:

Click "New Thread", posts thread, people post, maybe a couple of flames.

vs

Click "New Thread", search because you saw the warning, bring up an old thread, read a few pages of the thread, post, the thread is revived so people go into it and start reading lots of posts. Maybe people even start to reply to posts on the first page and re-open old arguments.

Now my example wouldn't happen every time, but it's an example of how just making a new thread can have less impact.

posted 2008-May-18, 9am AEST
User #185880   1326 posts
Whirlpool Enthusiast

I think this feature should be added, but it shouldn't stop you from posting the thread. Sometimes the thread search may give you several results, except all of them are outdated.

posted 2008-May-18, 9am AEST
User #121498   625 posts
Whirlpool Enthusiast

Thor writes...

I'm really not sure how many people would read it based on things we see where users ignore much more blatant indications...

We have in the past put very bright warnings on the thread creation page for certain forums when there have been new plans etc. and people have completely ignored them,

Sadly, every time we attempt to make something idiot-proof, the world just invents a better idiot !

posted 2008-May-18, 9am AEST
User #40586   17918 posts
Senior Moderator

tikalal writes...

impact

I think people are confusing performance impact (which is what I was referring to originally) and the impact of having duplicate threads.

We don't want unnecessary duplication of threads and hence will close or delete duplicate threads and refer them to existing ones where appropriate.

posted 2008-May-18, 10am AEST
User #150148   2619 posts
Whirlpool Forums Addict

Jay! writes...

33 queries an hour
0.55 queries a second!


Do you mean 33 queries a minute?

posted 2008-May-18, 12pm AEST
User #187613   2237 posts
Whirlpool Forums Addict

☢Anonymous.☢ writes...

Do you mean 33 queries a minute?

Err.. Wait, I don't know. It doesn't matter though.

posted 2008-May-18, 2pm AEST
User #150148   2619 posts
Whirlpool Forums Addict

Jay! writes...

Err.. Wait, I don't know. It doesn't matter though.

I was just going to say.... :D

posted 2008-May-18, 2pm AEST
User #10408   1655 posts
Whirlpool Enthusiast

◄ŞKyЯЇDΣ► writes...

Most people who don't search are normally just too dumb.

I'm sure WP policy is that dumb $%^#s are allowed. It's up to the group to teach them WP rules.

I think I am in the "need to be educated" group at times :-)

posted 2008-May-18, 7pm AEST
edited 2008-May-18, 7pm AEST
User #3103   2062 posts
Whirlpool Forums Addict

Simon Wright writes...

I've been wanting to implement a feature like this for some time. However it's not possible without an advanced full text index to power the suggestion algorithm. I need more server resources for that.

Have you considered the Lucene engine? I think I've seen it in use for the forum search before, but perhaps even that is not up to scratch to cope with the load.

I know it's not appropriate for all applications, but I have used it before and have seen it perform very well in some cases.

posted 2008-May-19, 1am AEST
User #151652   1893 posts
Whirlpool Enthusiast

Thor writes...

I think people are confusing performance impact (which is what I was referring to originally) and the impact of having duplicate threads.

Yeah, I was. Thanks.

posted 2008-May-19, 7am AEST
User #187613   2237 posts
Whirlpool Forums Addict

What about caching? Simon could always cache the first page thread list for every forum, keep the cache updated every 30 seconds or something, that would surely reduce load?
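
A time-based cache along these lines could be sketched as follows. The class name and the loader callback are hypothetical stand-ins for whatever actually builds a forum's first page; the 30-second lifetime is the figure from the post.

```python
import time

class ThreadListCache:
    """Cache each forum's first-page thread list for a fixed number of seconds."""

    def __init__(self, loader, ttl_seconds=30, clock=time.monotonic):
        self._loader = loader      # hypothetical function: forum_id -> thread list
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so the cache can be tested
        self._entries = {}         # forum_id -> (expiry_time, thread_list)

    def get(self, forum_id):
        now = self._clock()
        entry = self._entries.get(forum_id)
        if entry is not None and now < entry[0]:
            return entry[1]        # still fresh: no database query needed
        threads = self._loader(forum_id)   # expired or missing: load once
        self._entries[forum_id] = (now + self._ttl, threads)
        return threads

# Usage with a stand-in loader that records how often it is called.
calls = []

def fake_loader(forum_id):
    calls.append(forum_id)
    return [f"thread list for {forum_id}"]

cache = ThreadListCache(fake_loader)
cache.get("broadband")
cache.get("broadband")
print(len(calls))  # the loader ran once; the second request came from the cache
```

The trade-off is that a listing can be up to 30 seconds stale, and only the listing pages are covered: the cache does nothing for any other query.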

posted 2008-May-19, 11am AEST
User #85070   10818 posts
Whirlpool Forums Addict

If people are too rtarded to search for a thread which may already have their answer, then they're probably going to post anyway, even if there are a million threads before it.

Should ban titles like "slow torrent speeds"; that would actually be useful.

posted 2008-May-19, 2pm AEST
User #79501   3486 posts
Whirlpool Forums Addict

If there was server capacity this would be great, just like posting a question to support, e.g. netlogistics: it shows you similar questions before you ask yours.

posted 2008-May-19, 5pm AEST
User #40586   17918 posts
Senior Moderator

Jay! writes...

Simon could always cache the first page thread list for every forum

But that doesn't help with searches...

posted 2008-May-19, 5pm AEST
User #4771   1505 posts
Whirlpool Enthusiast

Jay! writes...

Why don't we have a feature like digg.com's where, when you post a thread, it says "We have found threads similar to yours, are you sure yours isn't a duplicate?"

Then it can list, say 5 or 10 threads. This way, people asking for help can see other threads that have already been answered. And much much more.


This idea would also rely on "replying to archived threads" in order to be successful:
/forum-replies.cfm?t=967362

posted 2008-May-19, 7pm AEST
User #4771   1505 posts
Whirlpool Enthusiast

Double posted by Seamonkey.

posted 2008-May-19, 7pm AEST
edited 2008-May-19, 7pm AEST
User #40586   17918 posts
Senior Moderator

Wilber Washbucket writes...

This idea would also rely on "replying to archived threads" in order to be successful:

No. I'd say sanely archiving threads would be critical for the idea to work so people don't bump a 5 year old thread on a topic, but can still read it.

posted 2008-May-19, 8pm AEST
User #187613   2237 posts
Whirlpool Forums Addict

Thor writes...

But that doesn't help with searches...

Average load drops 20%, allows 20% of load to be used on searches. Dur!

posted 2008-May-19, 11pm AEST
User #187613   2237 posts
Whirlpool Forums Addict

Wilber Washbucket writes...

Double posted by Seamonkey.

What?

posted 2008-May-19, 11pm AEST
User #151652   1893 posts
Whirlpool Enthusiast

I think Seamonkey is some kind of program or script.

posted 2008-May-20, 8pm AEST
User #73862   667 posts
Whirlpool Enthusiast

Search is very punishing on database servers, especially with the amount of data involved in forums.

I usually read the last few pages of threads before posting a new one, and all that's after trying to bludgeon information out of google.

How about re-locating the 'new thread' button to the 3rd page of the threads listing. So that at least you have to read the last 90 thread titles before posting yours.

posted 2008-May-24, 10pm AEST
User #40586   17918 posts
Senior Moderator

tikalal writes...

I think Seamonkey is some kind of program or script.

It is a Web Browser.

posted 2008-May-24, 10pm AEST
User #92566   19387 posts
Whirlpool Forums Addict

I agree with the suggestion but, like Thor, concede it will make only a minor difference to serial offenders. Heck, these are the people who not only don't bother to search, they typically remake their thread when it gets herringed/deleted and just try again.

(yes I'm just as worked up about this as excessive threads, lol).

On the issue of load, given how often the Whirlpool search is disabled now, I'm not sure it'll work that well either (unless you can improve the accuracy or sorting of the Google implementation).

posted 2008-May-29, 12pm AEST
User #124700   4581 posts
Whirlpool Forums Addict

The search engine needs to be improved, that's all. I try to search and unless I type EXACTLY the phrase I'm looking for, it won't find it. Having to do 8 searches before posting is just ridiculous.

posted 2008-May-30, 10am AEST
 
© Whirlpool Broadband Multimedia