
Thread: How to redirect/block bad bots/crawlers

  1. #1
    Fli
    Administrator
    Join Date: Mar 2013
    Posts: 2,587

    I found this htaccess code, which should redirect some bad bots and crawlers away from the site:

    Code:
    RewriteEngine On
    RewriteCond %{REQUEST_URI} !/robots.txt$
    RewriteCond %{HTTP_USER_AGENT} ^.*BLEXBot.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*BlackWidow.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*Nutch.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*Jetbot.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*WebVac.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*Stanford.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*scooter.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*naver.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*dumbot.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*Hatena\ Antenna.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*grub.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*looksmart.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*WebZip.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*larbin.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*b2w/0.1.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*Copernic.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*psbot.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*Python-urllib.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*NetMechanic.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*URL_Spider_Pro.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*CherryPicker.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*EmailCollector.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*EmailSiphon.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*WebBandit.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*EmailWolf.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*Email.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*ExtractorPro.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*CopyRightCheck.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*Crescent.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*SiteSnagger.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*ProWebWalker.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*CheeseBot.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*LNSpiderguy.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*ia_archiver.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*Alexibot.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*Teleport.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*MIIxpc.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*Telesoft.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*Website\ Quester.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*moget.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*WebStripper.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*WebSauger.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*WebCopier.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*NetAnts.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*Mister\ PiX.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*WebAuto.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*TheNomad.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*WWW-Collector-E.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*RMA.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*libWeb/clsHTTP.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*asterias.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*httplib.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*turingos.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*spanner.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*Harvest.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*InfoNaviRobot.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*Bullseye.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*WebBandit.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*NICErsPRO.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*Microsoft\ URL\ Control.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*DittoSpyder.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*Foobot.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*WebmasterWorldForumBot.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*SpankBot.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*BotALot.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*lwp-trivial.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*WebmasterWorld.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*BunnySlippers.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*URLy\ Warning.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*Wget.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*LinkWalker.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*cosmos.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*hloader.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*humanlinks.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*LinkextractorPro.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*Offline\ Explorer.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*Mata\ Hari.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*LexiBot.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*Web\ Image\ Collector.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*The\ Intraformant.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*True_Robot.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*BlowFish.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*SearchEngineWorld.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*JennyBot.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*MIIxpc.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*BuiltBotTough.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*ProPowerBot.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*BackDoorBot.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*toCrawl/UrlDispatcher.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*WebEnhancer.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*suzuran.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*WebViewer.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*VCI.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*Szukacz.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*QueryN.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*Openfind.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*Openbot.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*Webster.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*EroCrawler.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*LinkScan.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*Keyword.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*Kenjin.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*Iron33.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*Bookmark\ search\ tool.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*GetRight.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*FairAd\ Client.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*Gaisbot.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*Aqua_Products.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*Radiation\ Retriever\ 1.1.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*Flaming\ AttackBot.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*Oracle\ Ultra\ Search.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*MSIECrawler.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*PerMan.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*searchpreview.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*sootle.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*Enterprise_Search.*$ [NC,OR]
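    # Disabled: the e-mail address in this pattern is masked, and the unescaped space in the placeholder would break the directive.
    # RewriteCond %{HTTP_USER_AGENT} ^.*Bot\ mailto:[email protected]*$ [NC,OR]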
    RewriteCond %{HTTP_USER_AGENT} ^.*ChinaClaw.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*Custo.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*DISCo.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*Download\ Demon.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*eCatch.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*EirGrabber.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*EmailSiphon.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*EmailWolf.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*Express\ WebPictures.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*ExtractorPro.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*EyeNetIE.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*FlashGet.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*GetRight.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*GetWeb!.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*Go!Zilla.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*Go-Ahead-Got-It.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*GrabNet.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*Grafula.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*HMView.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*HTTrack.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*Image\ Stripper.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*Image\ Sucker.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*Indy\ Library.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*InterGET.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*Internet\ Ninja.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*JetCar.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*JOC\ Web\ Spider.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*larbin.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*LeechFTP.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*Mass\ Downloader.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*MIDown\ tool.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*Mister\ PiX.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*Navroad.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*NearSite.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*NetAnts.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*NetSpider.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*Net\ Vampire.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*NetZIP.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*Octopus.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*Offline\ Explorer.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*Offline\ Navigator.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*PageGrabber.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*Papa\ Foto.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*pavuk.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*pcBrowser.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*RealDownload.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*ReGet.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*SiteSnagger.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*SmartDownload.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*SuperBot.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*SuperHTTP.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*Surfbot.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*tAkeOut.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*Teleport\ Pro.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*VoidEYE.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*Web\ Image\ Collector.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*Web\ Sucker.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*WebAuto.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*WebCopier.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*WebFetch.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*WebGo\ IS.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*WebLeacher.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*WebReaper.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*WebSauger.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*Website\ eXtractor.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*Website\ Quester.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*WebStripper.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*WebWhacker.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*WebZIP.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*Wget.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*Widow.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*WWWOFFLE.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*Xaldon\ WebSpider.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*Zeus.*$ [NC,OR] 
    RewriteCond %{HTTP_USER_AGENT} ^.*Semrush.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*BecomeBot.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*AhrefsBot.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*MJ12bot.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*rogerbot.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*exabot.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*Xenu.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*dotbot.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*gigabot.*$ [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^.*BlekkoBot.*$ [NC,OR]
    RewriteCond %{REMOTE_ADDR} ^45\.117\.156\.154$ [OR]
    RewriteCond %{REMOTE_ADDR} ^164\.132\.192\.23$ [OR]
    RewriteCond %{REMOTE_ADDR} ^62\.210\.148\.119$ [OR]
    RewriteCond %{REMOTE_ADDR} ^150\.107\.4\.13$
    RewriteCond %{REQUEST_URI} !^/allowedbyallips\.html$
    RewriteRule ^(.*)$ http://pastebin.com/9dPXX0D9 [R=307,L]
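    Replace http://pastebin.com/9dPXX0D9 with whatever URL the bad bots should be redirected to. The IP addresses at the bottom of the list are examples of unwelcome visitors; requests from them are redirected as well. Requests for robots.txt and /allowedbyallips.html are never redirected.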
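
    If you would rather block than redirect, the same idea can be written more compactly. This is only a sketch, not the original author's rule set: it assumes mod_rewrite is enabled, the six bot names are an arbitrary pick from the list above, and the [F] flag (403 Forbidden) is swapped in for the redirect.

    ```apache
    # Sketch only: trimmed bot list, 403 response instead of a redirect.
    RewriteEngine On
    # Always let robots.txt and the allowed page through.
    RewriteCond %{REQUEST_URI} !^/robots\.txt$
    RewriteCond %{REQUEST_URI} !^/allowedbyallips\.html$
    # One alternation replaces many "^.*Name.*$" lines; a bare substring match is equivalent.
    RewriteCond %{HTTP_USER_AGENT} (BLEXBot|AhrefsBot|MJ12bot|Semrush|HTTrack|WebZip) [NC,OR]
    # Unwelcome IP addresses; the trailing $ keeps 150.107.4.13 from also matching 150.107.4.130.
    RewriteCond %{REMOTE_ADDR} ^45\.117\.156\.154$ [OR]
    RewriteCond %{REMOTE_ADDR} ^150\.107\.4\.13$
    # "-" means no substitution; [F] answers with 403 Forbidden.
    RewriteRule ^ - [F]
    ```

    Conditions joined with [OR] are grouped together, so this reads as: not robots.txt AND not the allowed page AND (bad user agent OR bad IP). Short patterns deserve a check before use: with [NC], DISCo also matches "Discordbot", for example.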

    A more effective way to cut the traffic wasted by bad bots is to use a tool like zBBlock or CIDRAM. The downside is that these tools occasionally block legitimate visitors too. (A blocked visitor can be given the chance to contact the admin by e-mail or, in the case of CIDRAM, to solve a captcha.)

    Here is another set of htaccess rules that can block more dangerous bots and people (hackers, exploiters): https://github.com/alidbg/htaccess_firewall

  2. #2
    kumkum
    Junior Member
    Join Date: Apr 2018
    Posts: 14

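    We can also manage bots with a robots.txt file: by creating this file you can block some bots and allow others.
    Put the code below in the robots.txt file to tell every bot to stay out of the whole site:
    User-agent: *
    Disallow: /

    Keep in mind that only bots that obey robots.txt will respect this, and that it shuts out search engine crawlers as well. You can read up on the robots.txt file for more details.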
