[lug] Dealing with wiki spam: the BLUG wiki, moinmoin and mediawiki

Chris chris-blug at syntacticsugar.org
Thu Jan 13 14:35:14 MST 2005


On Thu, 13 Jan 2005 13:30:00 -0700 Neal McBurnett <neal at bcn.boulder.co.us> wrote: 
> The BLUG wiki is completely broken at the moment - even "recent
> changes" is nothing but spam.

Quite true.  Sadly, I didn't notice this until just the other day.

> It seems to me that it is hard for ordinary users to roll back spam on
> moinmoin-based wikis - they have to actually edit out the changes by
> hand.

Thus my suggestion to scrap MoinMoin altogether.

<snip>
> It is based on a community of wikis that maintain a list of
> disreputable urls that are linked to by wiki spammers, and then "Any
> save with links that match one of those regular expressions will be
> denied".
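
The rule quoted above (deny any save whose links match a shared list
of regular expressions) might look roughly like the sketch below.  It
is only an illustration: the patterns, the URL-matching regex, and the
function names are made up here and are not taken from MoinMoin's or
MediaWiki's actual code.

```python
import re

# Hypothetical shared blocklist: one regular expression per entry.
# These patterns are invented for illustration only.
BLOCKLIST = [
    r"cheap-pills\.example\.com",
    r"casino[0-9]*\.example\.net",
]

# Rough URL matcher (an assumption; real wiki markup needs more care).
URL_RE = re.compile(r"https?://[^\s<>\]]+", re.IGNORECASE)


def compile_blocklist(patterns):
    """Compile the raw pattern strings once, case-insensitively."""
    return [re.compile(p, re.IGNORECASE) for p in patterns]


def blocked_links(page_text, compiled):
    """Return the links in page_text that match any blocklist pattern."""
    links = URL_RE.findall(page_text)
    return [url for url in links if any(rx.search(url) for rx in compiled)]


def allow_save(page_text, compiled):
    """A save is allowed only if none of its links are on the blocklist."""
    return not blocked_links(page_text, compiled)
```

Every new spam domain needs a new pattern before it is caught, which
is where the maintenance burden of this approach comes from.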

That seems like a good idea, except that then we're playing
whack-a-mole against the spammers based on regexes.  I'd have more
faith in a Wiki that incorporates crm114 or some other trainable text
classification tool to detect spam vs. nonspam.  I wouldn't want to
use spamassassin - Wikispam looks very different from email spam, and
you don't even have email headers to help you.  From the list of IPs
responsible for the changes, I think they've got a botnet of
compromised Windows systems they're using to make the changes.
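
To make the "trainable" idea concrete, here is a minimal sketch of a
classifier that learns from edits a moderator has already labeled.  It
uses plain naive Bayes over word tokens as a stand-in; this is not how
crm114 works internally, and the class name, labels, and tokenizer are
assumptions made for the example.

```python
import math
import re
from collections import Counter

# Tokens are runs of at least two characters, so hostnames such as
# "pills.example.com" stay in one piece.
TOKEN_RE = re.compile(r"[a-z0-9][a-z0-9.\-]+")


def tokenize(text):
    return TOKEN_RE.findall(text.lower())


class EditClassifier:
    """Tiny naive Bayes text classifier with Laplace smoothing."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.docs = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record one labeled edit; label is 'spam' or 'ham'."""
        self.counts[label].update(tokenize(text))
        self.docs[label] += 1

    def spam_probability(self, text):
        """Estimate P(spam | text) from the edits seen so far."""
        vocab = set(self.counts["spam"]) | set(self.counts["ham"])
        total_docs = sum(self.docs.values())
        tokens = tokenize(text)
        log_score = {}
        for label in ("spam", "ham"):
            # Smoothed class prior plus smoothed per-token likelihoods.
            score = math.log((self.docs[label] + 1) / (total_docs + 2))
            denom = sum(self.counts[label].values()) + len(vocab) + 1
            for tok in tokens:
                score += math.log((self.counts[label][tok] + 1) / denom)
            log_score[label] = score
        # Turn the two log scores into a probability; clamp the
        # difference so math.exp cannot overflow on long texts.
        diff = log_score["ham"] - log_score["spam"]
        diff = max(min(diff, 700.0), -700.0)
        return 1.0 / (1.0 + math.exp(diff))
```

Unlike a regex list, this needs no new rule per spammer, only more
labeled examples, but it will be wrong some of the time and so still
wants a human looking at the borderline cases.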

> Mediawiki is better at dealing with spam.  I've been running a
> mediawiki-based wiki for 6 months, and have found it easy to explain
> to people how to help deal with spam - just a matter of four clicks.

Well, that's an improvement... but it's still four clicks, and
constant checking for spam.

In my email, things looking spammy are set aside into a folder of
their own.  I'll go through them once every few days to make sure
nothing legit slipped by.

I'd be impressed by a Wiki that has two queues for updates.  If a
classifier says something *might* be spam, it goes into a queue where
a moderator looks at it before it's published; everything else goes
straight to the page.  On the other
hand, that's not horribly different from what Mailman's been doing for
the BLUG lists.  I'd have to go to a webpage once every few days to
flush the spam out of the moderator queue, often having to click,
scroll, click, scroll, etc. for almost half an hour to just remove the
spam.  It was very bad for patience and wrists.
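
The two-queue idea could be sketched as below.  This is a hypothetical
design, not a feature of any existing wiki: the Edit and EditRouter
names, the 0.5 threshold, and the scoring function are all assumptions.
The bulk discard is there to avoid the click-and-scroll routine
described above.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Edit:
    page: str
    text: str
    author_ip: str


@dataclass
class EditRouter:
    """Route each edit to either the published or the held queue."""
    score: Callable[[str], float]   # returns a spam score in [0, 1]
    threshold: float = 0.5          # assumed cut-off for "might be spam"
    published: deque = field(default_factory=deque)
    held: deque = field(default_factory=deque)

    def submit(self, edit):
        """Hold suspicious edits for a moderator; publish the rest."""
        if self.score(edit.text) >= self.threshold:
            self.held.append(edit)
            return "held"
        self.published.append(edit)
        return "published"

    def approve(self, edit):
        """Moderator rescues a legitimate edit from the held queue."""
        self.held.remove(edit)
        self.published.append(edit)

    def discard_all_held(self):
        """Flush the whole held queue in one step; returns the count."""
        count = len(self.held)
        self.held.clear()
        return count
```

A moderator would skim the held queue, approve the few legitimate
edits, and then call discard_all_held() once for everything left.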

I eventually changed the Mailman settings so that now, unless
someone's subscribed to the list, their posts drop into the void.
Sorry if that's bitten any of you, I just don't have the time to spend my
life moderating away spam.
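
For anyone wanting the same behaviour on their own list: the message
above doesn't say exactly which settings were changed, but assuming
Mailman 2.1, one way to get it is the fragment below, which is an
input file for bin/config_list (those files are plain Python
assignments).  The file name and list name in the usage line are
placeholders.

```python
# Input for Mailman 2.1's bin/config_list (assumed version).
# Action for postings from non-members that match no other filter:
# 0 = Accept, 1 = Hold, 2 = Reject, 3 = Discard
generic_nonmember_action = 3

# Do not forward automatically discarded non-member posts to the
# list moderator (0 = No, 1 = Yes).
forward_auto_discards = 0
```

It would be applied with something like
"bin/config_list -i nonmember.cfg yourlist".  The same options are on
the list admin web pages under the privacy options for sender filters.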

Thanks for the links.  I appreciate the references and suggestions;
it's useful reading.

I'll take a straw poll at BLUG tonight, but since we're really only
using it for book reviews, it'll probably just be easier to ask you
all to email me your reviews directly.  The mail does get through, if
you've been worrying.

--
    Chris Riddoch
epistemological humility


