{"id":2743,"date":"2017-01-10T22:13:01","date_gmt":"2017-01-10T21:13:01","guid":{"rendered":"http:\/\/olivier.hoarau.org\/?p=2743"},"modified":"2017-01-10T22:14:38","modified_gmt":"2017-01-10T21:14:38","slug":"lutter-contre-le-spam-referrer-avec-awstats","status":"publish","type":"post","link":"https:\/\/olivier.hoarau.org\/?p=2743","title":{"rendered":"Fighting referrer spam with awstats"},"content":{"rendered":"<p style=\"text-align: justify;\">I use <strong>awstats<\/strong> to analyze the logs of my domains <a href=\"http:\/\/www.hoarau.org\">hoarau.org<\/a> and <a href=\"http:\/\/www.funix.org\">funix.org<\/a>, which are hosted at <a href=\"http:\/\/www.online.net\">online<\/a> (shared hosting). Every night, <strong>cron<\/strong> fetches the <strong>Apache<\/strong> log files from an <strong>ftp<\/strong> server and I run the analysis with <a href=\"http:\/\/funix.org\/fr\/linux\/index.php?ref=logapache#Analyser%20les%20logs%20d%27Apache%20avec%20webalyser\">webalizer<\/a> and <a href=\"http:\/\/funix.org\/fr\/linux\/index.php?ref=logapache#Analyser%20les%20stats%20avec%20awstat\">awstats<\/a>, as explained <a href=\"http:\/\/funix.org\/fr\/linux\/index.php?ref=logapache#Une_application_pratique\">here<\/a>.<\/p>\n<p style=\"text-align: justify;\">The referrers page is polluted by spam, which makes it unusable and is rather annoying. This is in fact a spammer technique: they send requests so that the site being promoted appears in the referrer list, which is meant to improve its search engine ranking by multiplying the links pointing to it. 
That only works if the statistics page is publicly visible on the internet; they may also be hoping that an administrator clicks on one of the links.<\/p>\n<p>There are several ways to fight this. One of them is to block their access to the site with a good old <strong>.htaccess<\/strong> at the root. That is not necessarily ideal, because it adds processing time and can slow down access to the site. I prefer to clean up after the fact with <strong>awstats<\/strong>. To do so, enable the following variable<\/p>\n<p><strong>SkipReferrersBlackList=\"\/etc\/awstats\/blacklist.txt\"<\/strong><\/p>\n<p style=\"text-align: justify;\">with a <strong>blacklist.txt<\/strong> file, which can be found in the <strong>awstats<\/strong> tree but is somewhat dated. A much more recent blacklist can be found <a href=\"https:\/\/perishablepress.com\/blacklist\/ultimate-referrer-blacklist.txt\">here<\/a>. In this file, according to my tests, the first part, which starts with <strong>RewriteCond<\/strong> lines, seems to be of no use to <strong>awstats<\/strong>; it is only useful if you filter referrer spam with an <strong>.htaccess<\/strong>. 
Only the second part is actually useful and works with <strong>awstats<\/strong>.<\/p>\n<p style=\"text-align: justify;\"><!--more--><\/p>\n<p style=\"text-align: justify;\">It starts with:<\/p>\n<p><strong># This is the URL blacklist from the chongqed.org database<\/strong><br \/>\n<strong> # it is available from http:\/\/blacklist.chongqed.org\/<\/strong><br \/>\n<strong> # You can use each line below as a regular expression<\/strong><br \/>\n<strong> # that can be tested against URLs on your wiki.<\/strong><br \/>\n<strong> # The last spammer was added on 2008-09-11 10:14:51<\/strong><br \/>\n<strong> # Check http:\/\/blacklist.chongqed.org\/ for updates<\/strong><\/p>\n<p>I started extending the list with the following lines<\/p>\n<p>[pastacode lang=\"markup\" manual=\"https%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Fproxtrail%5C.ru%0Ahttps%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Fdenterum%5C.ru%0Ahttps%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Fprofeservice%5C.ru%0Ahttps%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Fbazakanstovarov%5C.com%0Ahttps%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Fbalkanfarma%5C.ru%0Ahttps%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Fsobervoditel%5C.ru%0Ahttps%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Favtokor-23%5C.ru%0Ahttps%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Favtokor23%5C.ru%0Ahttps%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Fxn--j1at1a%5C.xn--p1ai%0Ahttps%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Frupolitshow%5C.ru%0Ahttps%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Fvyezd-viyezd%5C.ru%0Ahttps%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Figru-2015%5C.ru%0Ahttps%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Fjeribetejewu%5C.c0%5C.pl%0Ahttps%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Fcreditservise%5C.ru%0Ahttps%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Fmegamashiny%5C.com%0Ahttps%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Fseoxbeep%5C.ru%0Ahttps%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Fwoman3050%5C.ru%0Ahttps%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Fwww%5C.vselgoty%5C.ru%0Ahttps%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Fplaypokeronline%5C.dk%0Ahttps%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Fmedical%5C.in%5C.ua%0Ahttps%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Fviagralevitradzheneriki%5C.ru%0Ahttps%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Fcom%5C.ua%0Ahttps%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Ftasgroup%5C.it%0Ahttps%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Fco%5C.ua%0Ahttps%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Fkiev%5C.ua%0Ahttps%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Fbringtwo%5C.net%0Ahttps%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Fcleaningservices%5C.kiev%5C.ua%0Ahttps%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Fstudio-topkapi%5C.eu%0Ahttps%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Fkruchen%5C.com%5C.ua%0Ahttps%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Ffreedom%5C.co%5C.ua%0Ahttps%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Fbringtwo%5C.net%0Ahttps%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Fstudio-topkapi%5C.eu%0Ahttps%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Fc0%5C.pl%0Ahttps%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Fbazakanstovarov%5C.com%0Ahttps%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Fxikiz%5C.com%0Ahttps%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Flesbianmilf%5C.xblog%5C.in%0Ahttps%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Fpamyatniki-in-kiev%5C.com%5C.ua%0Ahttps%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Fcarivka%5C.com%5C.ua%0Ahttps%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Fagent-05%5C.su%0Ahttps%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Fwebsolution%5C.com%5C.ua%0Ahttps%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Fpamjatnik%5C.com%5C.ua%0Ahttps%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Fstartimes%5C.com%0Ahttps%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Farktech%5C.co%0Ahttps%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Fgoohey%5C.com%0Ahttps%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Fhimalayan-imports%5C.com%0Ahttps%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Fpizza-imperia%5C.com%0Ahttps%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Fnowellgroup%5C.com%0Ahttps%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Fddrgame%5C.com%0Ahttps%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Ftorrinomedica%5C.it%0Ahttps%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Foliveriobalcells%5C.com%0Ahttps%3F%3A%5C%2F%5C%2F(%5B%5E%5C%2F%5D*%5C.)%3Fgiocagiocagioca%5C.com\" message=\"\" highlight=\"\" provider=\"manual\"\/]<\/p>\n<p style=\"text-align: justify;\">I quickly got fed up with this, because new sites show up every day. The simplest approach is to block certain top-level domains outright!<\/p>\n<p style=\"text-align: justify;\"><strong>^https?:\/\/[^\/]+\\.ru([:\/?]|$)<\/strong><br \/>\n<strong> ^https?:\/\/[^\/]+\\.ua([:\/?]|$)<\/strong><br \/>\n<strong> ^https?:\/\/[^\/]+\\.su([:\/?]|$)<\/strong><br \/>\n<strong> ^https?:\/\/[^\/]+\\.link([:\/?]|$)<\/strong><br \/>\n<strong> ^https?:\/\/[^\/]+\\.cc([:\/?]|$)<\/strong><br \/>\n<strong> ^https?:\/\/[^\/]+\\.in([:\/?]|$)<\/strong><\/p>\n<p style=\"text-align: justify;\">This blocks all sites under <strong>.ru<\/strong>, <strong>.ua<\/strong>, and so on. The trailing <strong>([:\/?]|$)<\/strong> ties each pattern to the end of the host name, so that the <strong>.in<\/strong> rule, for example, does not also catch a host such as www.instagram.com. It is fairly drastic, but as these top-level domains are 99.9% linked to spammers among my referrers, the risk of a false positive is close to zero. Since it is nearly impossible to keep a list of individual sites up to date, this is probably the most effective method. 
While we are at it, since seeing referrers from your own domain is not particularly interesting, we block those as well:<\/p>\n<p><strong>https?:\\\/\\\/([^\\\/]*\\.)?hoarau\\.org<\/strong><br \/>\n<strong> https?:\\\/\\\/([^\\\/]*\\.)?funix\\.org<\/strong><\/p>\n<p style=\"text-align: justify;\">and with that, I can assure you that the referrer list is cleaned up and finally usable.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I use awstats to analyze the logs of my domains hoarau.org and funix.org, which are hosted at online (shared hosting). Every night, cron fetches the Apache log files from an ftp server and I run the analysis with webalizer and awstats, as explained here. The referrers page is polluted by spam, which makes it unusable and &hellip; <a href=\"https:\/\/olivier.hoarau.org\/?p=2743\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">Fighting referrer spam with awstats<\/span>  <span 
class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":4,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"ngg_post_thumbnail":0,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":true,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_post_was_ever_published":false,"_share_on_mastodon":"0"},"categories":[5,12,10],"tags":[],"class_list":["post-2743","post","type-post","status-publish","format-standard","hentry","category-logiciels-libres","category-vie-de-funix","category-vie-de-mes-sites"],"share_on_mastodon":{"url":"","error":""},"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/peOjJ-If","jetpack_likes_enabled":true,"_links":{"self":[{"href":"https:\/\/olivier.hoarau.org\/index.php?rest_route=\/wp\/v2\/posts\/2743","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/olivier.hoarau.org\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/olivier.hoarau.org\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/olivier.hoarau.org\/index.php?rest_route=\/wp\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"https:\/\/olivier.hoarau.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=2743"}],"version-history":[{"count":2,"href":"https:\/\/olivier.hoarau.org\/index.php?rest_route=\/wp\/v2\/posts\/2743\/revisions"}],"predecessor-version":[{"id":2745,"href":"https:\/\/olivier.hoarau.org\/index.php?rest_route=\/wp\/v2\/posts\/2743\/revisions\/2745"}],"wp:attachment":[{"href":"https:\/\/olivier.hoarau.org\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=2743"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/olivier.hoarau.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=2743"},{"taxonomy":"post_tag","embe
ddable":true,"href":"https:\/\/olivier.hoarau.org\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=2743"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}