
1 Sep 08, 2005 18:47    

It appears many users suffer from spam. There is a need for an "ultimate" antispam plug-in or hack for [url=http://b2evolution.net]b2evolution[/url]. That plug-in would include:
[list]

  • referrer filtering including:[list][*]a local and shared dynamic referrer blacklist of banned URLs (base domains, pages and partial keywords);

  • a local and shared dynamic referrer whitelist of trusted sites (base domains, base domains with path or exact pages only, no partial keywords);

  • a local referrer dynamic greylist of unknown referrers to ban (once or add to the blacklist) or to authorize (once or add to the whitelist) with optional alert system;

  • a double check (load the page, check whether it does link to the page it claims to) of the greylist entries, on a scheduled basis or when the server has a lot of unused CPU (taking into account that some pages may link but not be accessible: online e-mail services, etc.), with optional alert system;

  • anti-flood feature limiting the number of logged hits over a period of time from the same source base domain, from the same source page on the whole destination blog, or on the same destination URL (avoiding logging and sending a "server busy" message of any kind);

  • an optional manual validation of greylisted hits before using them in the publicly available site stats (once authorized, those hits should automatically be included in the stats);

  • possible .htaccess update (real-time, scheduled or manual);

  • easy access summary of latest referrers with admin actions;[/list:u]

  • IP filtering including:[list][*]a local and shared dynamic IP blacklist of banned hosts (single IPs or IP ranges);

  • a local and shared dynamic IP whitelist of trusted hosts to authorize;

  • a local dynamic IP greylist of unknown hosts to ban (once or add to the blacklist) or to authorize (once or add to the whitelist) with optional alert system;

  • an optional manual validation of greylisted hits before using them in the publicly available site stats (once authorized, those hits should automatically be included in the stats);

  • anti-flood feature (preventing someone from reloading the page so often that server performance suffers; a 304 HTTP status or a "server busy" response such as 503 might be sent then);

  • possible .htaccess update (real-time, scheduled or manual);

  • easy access summary of latest IPs with admin actions;[/list:u]

  • user agent filtering:[list][*]a shared and local dynamic blacklist of banned user agents;

  • a shared and local dynamic whitelist of authorized user agents;

  • a local greylist of unknown user agents to ban (once or add to the blacklist) or to authorize (once or add to the whitelist);

  • possible robots.txt update;

  • possible .htaccess update (real-time, scheduled or manual);

  • easy access summary of latest user agents with admin actions;[/list:u]

  • comment filtering including:[list][*]a local and shared dynamic e-mail blacklist of banned e-mails;

  • a local and shared dynamic e-mail whitelist of authorized e-mails (requires some kind of identity authentication);

  • a local and shared dynamic e-mail greylist of unknown e-mails to ban or to authorize;

  • a comment validation system sending an e-mail to the claimed comment author in order to validate it (taking care not to become a spamming service ourselves this way!);

  • fake comment forms before and after the real one (see [url=http://www.simong.org/index.php?p=739]A short monograph on the theme of blog comment spam[/url]);

  • CAPTCHA systems (see [url=http://www.w3.org/TR/turingtest/]Inaccessibility of Visually-Oriented Anti-Robot Tests (W3C)[/url], [url=http://www.w3.org/2004/Talks/0319-csun-m3m/Overview.html]Escape from CAPTCHA (W3C)[/url], [url=http://sam.zoy.org/pwntcha/]PWNtcha - captcha decoder[/url]):[list][*]visual (see [url=http://www.village-idiot.org/archives/2005/01/28/b2evo-captcha-explained/]Captcha for b2evolution, explained[/url]);

  • audio;

  • logic (see [url=http://wp-plugins.net/index.php?id=533]WP-SpamQuiz[/url]);[/list:u][*]accept comments from JavaScript-enabled browsers only, adding a client-side:[list][*]"keypressed" checker (see [url=http://www.simong.org/index.php?p=739]A short monograph on the theme of blog comment spam[/url]);

  • hash computation code (see [url=http://elliottback.com/wp/archives/2005/05/11/wordpress-hashcash-20/]Wordpress Hashcash 2.3[/url]);[/list:u][*]a [url=http://www.sixapart.com/pronet/plugins/plugin/bayesian.html]bayesian filter[/url];

  • a link filtering:[list][*]usage of the referrer filtering blacklist and whitelist to remove spamming URLs;[*]usage of the IP filtering blacklist and whitelist to remove spamming servers (convert the domains of the given URLs into IPs, then test the IP);[*]removing links from displayed comments;[*]limit the maximum number of words per link;[*]limit the maximum number of links per comment;[/list:u][*]an optional manual validation for both white- and greylists;

  • anti-flood feature limiting the number of comments added over a period of time globally, on the same post, from the same IP or with the same content;

  • [url=http://www.sixapart.com/pronet/comment_spam#obscurity]Security by Obscurity[/url]:[list][*][url=http://www.sixapart.com/pronet/comment_spam#commentscript]Changing the Name of Your Comment Script[/url];[*][url=http://www.sixapart.com/pronet/comment_spam#formvalues]Adding Additional Form Values[/url];[*][url=http://www.sixapart.com/pronet/comment_spam#obfuscation]Form Obfuscation[/url].[/list:u][/list:u]

  • trackback filtering:[list][*]usage of the previously defined referrer filtering (take into account an acceptable trackback may not link back to your post);

  • usage of the previously defined IP filtering;

  • an optional manual validation for both white- and greylists;

  • anti-flood feature limiting the number of comments added over a period of time globally, on the same post, from the same IP or with the same content;

  • hide trackbacks' URL (from RDF metadata and (X)HTML code) and display it (in the (X)HTML) using JavaScript (see [url=http://mt-hacks.com/20041203-mtdisguisetrackbackurl-v05-beta.html]MTDisguiseTrackbackURL[/url], [url=http://underscorebleach.net/jotsheet/2005/08/trackback-spam]TrackBack spam[/url]);

  • auto-generate trackback URLs with expiration delay (see [url=http://www.paulstimesink.com/index.php?op=ViewArticle&articleId=381&blogId=2]Other possible ways to fight trackback spam[/url]);[/list:u]

  • visitors behavior filtering (see [url=http://www.ioerror.us/software/bad-behavior/]Bad Behavior / Bad Behaviour[/url]);

  • external services usage:[list][*]use external DNSBL services for checking and reporting spamming IPs ([url=http://dsbl.org/]Distributed Sender Blackhole List[/url], [url=http://opm.blitzed.org/]Blitzed Open Proxy Monitor List[/url], [url=http://www.spamhaus.org/]SpamHaus[/url], [url=http://www.spamcop.net/bl.shtml]SpamCop[/url], [url=http://bsb.empty.us/]Blog Spam Blocklist[/url], etc.) and URIs ([url=http://bsb.empty.us/]Blog Spam Blocklist[/url], [url=http://www.surbl.org/]Spam URI Realtime Blocklists[/url], [url=http://www.antisplog.net/]Antisplog.net[/url],etc.), see [url=http://weblog.sinteur.com/index.php?p=8106]Yet another anti-spam measure[/url] for an example;

  • use other blogging software antispam blacklists ([url=http://www.jayallen.org/comment_spam/]MT-Blacklist[/url], etc.) for both checking and reporting spamming IPs and URLs;

  • report spamming sites to [url=http://www.google.com/contact/spamreport.html]Google[/url] (must be fully documented, including any objective and useful evidence that the spammer and the promoted site are related; see [url=http://spamhuntress.com/]Spam Huntress[/url]);

  • external authentication/identity validation:[list][*][url=http://www.sixapart.com/typekey/]Six Apart TypeKey[/url];

  • [url=http://www.microsoft.com/net/services/passport/developer.asp]Microsoft .NET Passport[/url];

  • [url=http://www.gravatar.com]Gravatar[/url] (this is not an authentication system, but might be used as a poorly-protected user identification system using the [url=http://www.gravatar.com/blog/archives/2005/03/22/xml-info-via-rest-api/]Gravatar REST API[/url]);

  • [url=http://avatars.yahoo.com]Yahoo! Avatars[/url] (this is not an authentication system, but might be used to check [url=http://www.yahoo.com]Yahoo![/url] e-mails);

  • etc.[/list:u][/list:u]

  • a pink list, maybe implemented as a separate plug-in, as a plug-in based on this antispam plug-in, or as a second install of the same plug-in (two installs of the same plug-in with different parameters?), might avoid cluttering the spam lists; that list would contain adult-oriented checkers (base domains, base domains with path, page URLs, keywords, [url=http://www.icra.org]ICRA[/url] label checking with sex and violence level evaluation).[/list:u]

  • All the above features should be optional and chosen by the user. The order in which the filtering is performed should also be chosen by the user (with automatic suggestion based on server performance).
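
To illustrate the point about optional filters applied in a user-chosen order, here is a minimal Python sketch. It is not b2evolution code: the `FilterSlot` class, the `run_pipeline` function, the two sample filters and the verdict strings are all hypothetical names invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# A filter looks at a hit and answers "allow", "block", or None (no opinion).
Verdict = Optional[str]
Filter = Callable[[dict], Verdict]


@dataclass
class FilterSlot:
    name: str
    check: Filter
    enabled: bool = True  # every feature can be switched off by the admin


def run_pipeline(hit: dict, slots: List[FilterSlot]) -> str:
    """Apply the enabled filters in the admin-chosen order; first verdict wins."""
    for slot in slots:
        if not slot.enabled:
            continue
        verdict = slot.check(hit)
        if verdict is not None:
            return verdict
    return "greylist"  # no filter had an opinion: keep the hit for review


# Hypothetical filters, for illustration only.
def ip_blacklist(hit: dict) -> Verdict:
    return "block" if hit["ip"] in {"203.0.113.7"} else None


def referrer_whitelist(hit: dict) -> Verdict:
    return "allow" if hit["referrer"].startswith("http://b2evolution.net") else None


# The list order is the filtering order; reordering it changes the behaviour.
slots = [
    FilterSlot("ip_blacklist", ip_blacklist),
    FilterSlot("referrer_whitelist", referrer_whitelist),
]
```

Putting the cheapest checks first would be one way to act on the "suggestion based on server performance" idea, though that is only an assumption about how such a suggestion might work.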

    In the above list, "dynamic lists" means those lists can both grow and shrink in order to keep them efficient but as small as possible, since more data processing requires more CPU resources and many [url=http://b2evolution.net]b2evolution[/url] users complain about [url=http://b2evolution.net]b2evolution[/url] CPU usage. Each entry should have a lifetime, after which it must be confirmed to stay on the list or else disappear from it.
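
A "dynamic list" in that sense could look roughly like the Python sketch below. The `ExpiringList` class and its method names are made up for the example, and a real plug-in would keep the entries in the database rather than in memory.

```python
import time
from typing import Dict, Optional


class ExpiringList:
    """Entries disappear unless they are confirmed again before they expire."""

    def __init__(self, lifetime_seconds: float):
        self.lifetime = lifetime_seconds
        self._expires_at: Dict[str, float] = {}

    def confirm(self, entry: str, now: Optional[float] = None) -> None:
        """Add an entry, or renew its lifetime if it is already listed."""
        now = time.time() if now is None else now
        self._expires_at[entry] = now + self.lifetime

    def contains(self, entry: str, now: Optional[float] = None) -> bool:
        """An expired entry no longer counts, even before it is purged."""
        now = time.time() if now is None else now
        expiry = self._expires_at.get(entry)
        return expiry is not None and expiry > now

    def purge(self, now: Optional[float] = None) -> int:
        """Drop expired entries so the list stays small; return how many went."""
        now = time.time() if now is None else now
        stale = [e for e, t in self._expires_at.items() if t <= now]
        for entry in stale:
            del self._expires_at[entry]
        return len(stale)

    def __len__(self) -> int:
        return len(self._expires_at)
```

Calling `confirm()` each time a listed spammer shows up again keeps active entries alive, while a scheduled `purge()` removes the ones nobody has seen for a while.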

    An "alert system" is a summary of miscellaneous events sent to the blog admin. This summary should be available in the backoffice, with optional e-mail (and maybe other ways to immediately alert the admin) on a real-time or scheduled basis.

    Finally, a solid help system should be included, with automatic suggestions, in order to help beginner bloggers fight spam efficiently.
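
The anti-flood items in the feature list (limiting hits or comments from the same source over a period of time) boil down to a sliding-window counter. A minimal Python sketch follows; the `FloodGuard` name and its parameters are hypothetical, and the key could be an IP, a base domain, a post ID or a hash of the comment text.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class FloodGuard:
    """Allow at most `max_hits` per key within `window` seconds."""

    def __init__(self, max_hits: int, window: float):
        self.max_hits = max_hits
        self.window = window
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, key: str, now: float) -> bool:
        hits = self._hits[key]
        # Forget the hits that have slid out of the time window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.max_hits:
            return False  # flooding: the caller skips logging or answers "busy"
        hits.append(now)
        return True
```

For instance, `FloodGuard(max_hits=2, window=10)` lets two hits per key through in any ten-second span and refuses the third.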

    2 Sep 08, 2005 19:07

    Your suggestions are all awesome... but haven't a good deal of them already been included/discussed by you here? And you mention this being covered in a plugin, but you have it in feature requests. That's important only because:

    I preface this by saying I'm not a dev, BUT:

    Adding all of those features to the core of b2evo would seem, to me at least, an attempt to make b2evo an anti-spam application, as opposed to a blogging package. Know what I mean?

    In other words, plugins yes, all of that in the core, no.

    That's my vote at least. Except for the moderation of comments -- which is a default feature in nearly every other blog package I've looked at.

    I've always been one to stress the development of the core features, not begin adding extras.

    That's the joy of plugins. No bloat, and the user can pick and choose.

    3 Sep 08, 2005 20:11

    You're right, whoo, most of those suggestions have already been discussed in several threads in the forums, including these:

    4 Sep 09, 2005 07:51

    There is one point on which I can't agree with you, however. Some b2evolution users are so harassed by spam that they have serious difficulty using the blogging system. Making that plug-in or hack part of b2evolution would significantly reduce spam. It appears spammers look specifically for b2evolution blogs to spam them (see How to control referer spam hits from search engines). Making it more difficult for spammers to spam b2evolution-based blogs would make the platform more efficient and useful. When you claim to craft the best car in the world, you also ship it with a good (if not the best) locking system, don't you? The current antispam system is fine, but not perfect (okay, none is).

    I defer to Wordpress.

    A MUCH larger target for spam, I assure you -- and very little in the way of built-in defenses..

    There is a proxy block for comments. Last I checked, it was still susceptible to blocking regular/non-proxy IPs. (It might be fixed -- I haven't looked at it lately.)

    There is comment moderation.

    There is a VERY easy one line hack for adding trackback moderation.

    The rest : ALL plugins.

    The logic is simple:
    1. An intuitive plugin interface allows for easy plugins to be made.
    2. That (#1) allows users to pick and choose without the bloat that built in stuff causes.
    3. Devs are able to work on making it a better blog.

    Yes, it is arguable that "less spam" makes for a better blog, I realize that.

    ----

    MT: same thing, plugins

    phpBB: same thing: mods.

    CAPTCHA was a phpBB addon longgggg before it was actually included in a default install of phpBB. And that's pretty much all the phpBB devs plan on adding. Again, they want to focus on the core.

    ---

    Perhaps a blog that came standard with a gazillion ways to block spam would garner more attention in the long run, but that's an if, for sure -- and if the blog itself were substandard at the expense of fighting spam -- that's surely a bad bet.

    5 Sep 12, 2005 01:31

    whoo wrote:

    The logic is simple:
    1. An intuitive plugin interface allows for easy plugins to be made.
    2. That (#1) allows users to pick and choose without the bloat that built in stuff causes.
    3. Devs are able to work on making it a better blog.

    You are probably right on these points. Moreover, competition is supposed to lead to better products. Having several teams working on several antispam plug-ins probably makes all these teams work better.

    There is still one condition to that: there must be several teams! For now, I see about half a dozen people working on the core and another half a dozen publishing some hacks and plug-ins. Apparently, there are not enough developers involved in [url=http://b2evolution.net]b2evolution[/url] to make competition really efficient here...

    I really hope the next major version of [url=http://b2evolution.net]b2evolution[/url] will make it possible to write easily installable plug-ins, so that anybody can install them.

    6 Sep 12, 2005 02:01

    Looks that way to me. A bit of skin editing for the ones that make sidebar stuff, but basically drop the file in the folder and turn it on in your back office.

    7 Sep 12, 2005 17:28

    The previously defined external services features:

    • external services usage:
      • an external services referrer and IP DNSBL;
      • external authentication/identity validation (Six Apart TypeKey, etc.);

    appear to be incomplete. Maybe there should be something like:

    • external services usage:
      • use external DNSBL services for checking and reporting spamming IPs and URLs;
      • use other blogging software antispam blacklists (WP, MT, etc.) for both checking and reporting spamming IPs and URLs;
      • report spamming sites to [url=http://www.google.com/contact/spamreport.html]Google[/url] (must be fully documented, including any objective and useful evidence that the spammer and the promoted site are related; see [url=http://spamhuntress.com/]Spam Huntress[/url]);
      • external authentication/identity validation (Six Apart TypeKey, etc.);

    I've just updated the above list to make it up to date.

    8 Sep 12, 2005 17:58

    The previously defined comment filtering features:

    • comment filtering including:
      • a local and shared dynamic e-mail blacklist of banned e-mails;
      • a local and shared dynamic e-mail whitelist of authorized e-mails (requires some kind of identity authentication);
      • a local and shared dynamic e-mail greylist of unknown e-mails to ban or to authorize;
      • a comment validation system sending an e-mail to the claimed comment author in order to validate it (taking care not to become a spamming service ourselves this way!);
      • a CAPTCHA system for comments;
      • an optional manual validation for both white- and greylists;
      • anti-flood feature limiting the number of comments added over a period of time globally, on the same post, from the same IP or with the same content;

    might be extended to include some additional features:

    • comment filtering including:
      • a local and shared dynamic e-mail blacklist of banned e-mails;
      • a local and shared dynamic e-mail whitelist of authorized e-mails (requires some kind of identity authentication);
      • a local and shared dynamic e-mail greylist of unknown e-mails to ban or to authorize;
      • a comment validation system sending an e-mail to the claimed comment author in order to validate it (taking care not to become a spamming service ourselves this way!);
      • a [url=http://www.village-idiot.org/archives/2005/01/28/b2evo-captcha-explained/]visual[/url] and/or audio CAPTCHA system for comments;
      • a client-side "keypressed" checker;
      • accept comments from JavaScript-enabled browsers only;
      • a [url=http://www.sixapart.com/pronet/plugins/plugin/bayesian.html]bayesian filter[/url];
      • a link filtering:
        • removing links from displayed comments;
        • limit the maximum number of words per link;
        • limit the maximum number of links per comment;
      • an optional manual validation for both white- and greylists;
      • anti-flood feature limiting the number of comments added over a period of time globally, on the same post, from the same IP or with the same content;

    I've just updated the original list of features.
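
The two link limits (maximum number of links per comment, maximum number of words per link) are simple to check once the comment HTML is parsed. Here is a rough Python sketch using the standard `html.parser` module; the `LinkCounter` class, the `comment_passes` function and the default limits are assumptions chosen for the example. It only sees `<a href>` links, not bare URLs typed as text.

```python
from html.parser import HTMLParser
from typing import List


class LinkCounter(HTMLParser):
    """Collect the anchor text of every <a href=...> found in a comment."""

    def __init__(self):
        super().__init__()
        self.link_texts: List[str] = []
        self._inside_link = False

    def handle_starttag(self, tag, attrs):
        if tag == "a" and any(name == "href" for name, _ in attrs):
            self._inside_link = True
            self.link_texts.append("")

    def handle_endtag(self, tag):
        if tag == "a":
            self._inside_link = False

    def handle_data(self, data):
        if self._inside_link:
            self.link_texts[-1] += data


def comment_passes(html: str, max_links: int = 2, max_words_per_link: int = 5) -> bool:
    """Reject comments with too many links or with overly long link text."""
    parser = LinkCounter()
    parser.feed(html)
    parser.close()
    if len(parser.link_texts) > max_links:
        return False
    return all(len(text.split()) <= max_words_per_link for text in parser.link_texts)
```

A comment that fails could be sent to the greylist for manual validation rather than dropped outright.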

    9 Sep 17, 2005 00:36

    There is an interesting article about comment spam:

    10 Sep 24, 2005 01:59

    I've just found a very interesting piece of software:

    11 Sep 30, 2005 00:13

    I've updated the features list with the following positions:

    12 Sep 30, 2005 01:29

    kwa, great posts and links and appreciated.

    I think there needs to be a focus on plugin solutions for any spam fixes to B2 because, as a user and not a coder, I am reluctant to install any of the myriad of excellent hacks and solutions floating around this forum, knowing that a major version update is just around the corner.

    I would appreciate some comment from Francois on the matter and what direction he sees Anti Spam taking with B2

    13 Sep 30, 2005 03:32

    John wrote:

    I think there needs to be a focus on plugin solutions for any spam fixes to B2 because, as a user and not a coder, I am reluctant to install any of the myriad of excellent hacks and solutions floating around this forum, knowing that a major version update is just around the corner.

    For now, I am trying to understand what spammers do to spam blogs and which solutions are already applied in other blogging systems and web applications to reduce the effects of spam at minimum cost.

    Some people are publishing statistics on their filters' efficiency. The [url=http://www.simong.org/index.php?p=739]A short monograph on the theme of blog comment spam[/url] article appears to be very interesting from that point of view. Renaming the comment submission script and using hidden forms is cheap, but it works well. Sending the position of the mouse on the submission button when the user clicks is useless and creates false positives. It's interesting to know where to spend time!
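
As an illustration of how cheap the hidden-form family of tricks is, here is a Python sketch of a decoy-field check. This is a closely related variant and not necessarily what the article does: the comment form carries an extra field that is hidden from human visitors with CSS, and anything that fills it in is treated as a bot. The field name `website_url` and the function name are hypothetical.

```python
from typing import Mapping

# Hypothetical name of a decoy field that human visitors never see (hidden with CSS).
DECOY_FIELD = "website_url"


def looks_like_bot(form: Mapping[str, str]) -> bool:
    """A human leaves the decoy field empty, so any value in it is suspicious."""
    return bool(form.get(DECOY_FIELD, "").strip())
```

The check costs one dictionary lookup per comment, which fits the "minimum cost" goal, but a spammer who targets the form by hand will get past it.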

    As a developer, I might be interested in writing an antispam plug-in for [url=http://b2evolution.net]b2evolution[/url], but I probably don't have enough experience in web application development, PHP and antispam techniques. However, since a lot of techniques have already been successfully tried in plug-ins for other blogging systems, it should be easier to make similar things for [url=http://b2evolution.net]b2evolution[/url]. If the software licences allow it, it would even be possible to use the original code as a starting point. (I don't like to reinvent the wheel and I don't claim I would do better than others.)

    Moreover, I noticed that very few blogging systems have a centralized blacklist as [url=http://b2evolution.net]b2evolution[/url] has. In fact, I know only of [url=http://b2evolution.net]b2evolution[/url] and [url=http://www.jayallen.org/comment_spam/]MT-Blacklist[/url]. [url=http://www.wordpress.org]WordPress[/url] plug-ins mainly use the [url=http://www.jayallen.org/comment_spam/]MT-Blacklist[/url] blacklist. Other plug-ins use real-time blacklists to check spamming IPs and domains.
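
For the real-time blacklists, the usual DNSBL convention is to reverse the octets of the IPv4 address, append the list's zone and do an ordinary DNS lookup: a listed address resolves, an unlisted one does not. A minimal Python sketch follows. The function names are made up, the zone `dnsbl.example.org` used below is a placeholder, and each real list has its own zone and usage policy that should be checked first.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Reverse the IPv4 octets and append the blacklist zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")  # raises ValueError if invalid
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip: str, zone: str) -> bool:
    """True if the DNSBL answers for this address (needs network access)."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except OSError:
        # No answer or resolver trouble: treated as "not listed" in this sketch.
        return False
```

For example, `dnsbl_query_name("192.0.2.99", "dnsbl.example.org")` gives `99.2.0.192.dnsbl.example.org`. Caching the answers locally would keep the extra DNS traffic down.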

    Finally, after reading the list of features to have in that "ultimate" antispam tool, it appears that it cannot be a one-developer project.

    14 Sep 30, 2005 03:44

    I've just updated the initial features list request with the following changes:

    • trackback filtering:

    15 Oct 05, 2005 14:18

    I've just discovered a new word: [url=http://en.wikipedia.org/wiki/Splog]Splog[/url]. [url=http://b2evolution.net]Wikipedia[/url] defines it as:

      Splog
      From Wikipedia, the free encyclopedia.

      Spam blogs, sometimes referred to by the Neologism splogs, are Web Log (or "blog") sites which the author uses only for promoting affiliated websites. The purpose is to increase the PageRank of the affiliated sites and/or get ad impressions from visitors. Content is often nonsense or text stolen from other websites with an unusually high number of links to sites associated with the splog creator which are often disreputable or otherwise useless Web sites. Splogs have become a major problem on free blog hosts such as Google's Blogger service. These fake blogs waste valuable disk space, bandwidth, and pollute search engine results. The term splog was popularized around mid August 2005 when it was first used by some high profile bloggers but appears to have been used a few times before for describing spam blogs going back to at least 2003.

    I've been spammed by some comment spammers for the past few days. They appear to promote splogs. [url=http://www.antisplog.net/]Antisplog.net[/url] identifies splogs. It claims to reference 2 million of them, including those that are spamming me right now. I've added that service to the above list; other similar services exist. If you know of any, I would be interested to hear about them.

    16 Oct 23, 2005 21:47

    I just had a referer antispam idea that involves b2 grabbing a copy of the referer page and parsing it to make sure there actually is a link to your site from that page. Of course this does require some processing power and additional bandwidth so having the ability to turn this feature on and off would be required but it could theoretically eliminate the vast majority of referer spam.
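
A rough Python sketch of the parsing half of that idea, using only the standard library: given the HTML of the referring page (fetching it is left out here), check whether any link on it points at your own host. The `HrefCollector` class and the `links_back` function are hypothetical names, and the sketch compares host names only, not the exact page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class HrefCollector(HTMLParser):
    """Collect the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def links_back(referrer_url: str, referrer_html: str, my_host: str) -> bool:
    """True if the referring page contains a link whose host is `my_host`."""
    collector = HrefCollector()
    collector.feed(referrer_html)
    collector.close()
    for href in collector.hrefs:
        absolute = urljoin(referrer_url, href)  # resolve relative links first
        if urlparse(absolute).hostname == my_host.lower():
            return True
    return False
```

A referrer that fails the check, or whose page cannot be fetched at all, would probably be better left for manual review than banned automatically.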

    17 Oct 24, 2005 06:40

    BenFranske wrote:

    I just had a referer antispam idea that involves b2 grabbing a copy of the referer page and parsing it to make sure there actually is a link to your site from that page. Of course this does require some processing power and additional bandwidth so having the ability to turn this feature on and off would be required but it could theoretically eliminate the vast majority of referer spam.

    Some might argue that some pages are legit referrers but cannot be accessed. Those include e-mails (from online services), private pages (like forums you have to log in to before accessing) and other cases (including Flash and JavaScript-enabled pages requiring browser capabilities). However, those pages don't need to appear as referrers to visitors.

    That idea has already been presented above:

      • referrer filtering including:
        • a double check (load the page, check whether it does link to the page it claims to) of the greylist entries, on a scheduled basis or when the server has a lot of unused CPU (taking into account that some pages may link but not be accessible: online e-mail services, etc.), with optional alert system;

