How to disallow/ban Google bot from viewing a certain webpage



Fli
01-13-2016, 09:57 AM
How can I stop Google bot from viewing a webpage?

Here is one idea:

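<?php
// Send a 404 and stop if the user agent or the reverse-DNS host name contains "google" (case-insensitive).
$ua = $_SERVER['HTTP_USER_AGENT'] ?? '';                    // header may be missing
$host = (string) gethostbyaddr($_SERVER['REMOTE_ADDR']);     // false on lookup failure becomes ''
if (stripos($ua, 'google') !== false || stripos($host, 'google') !== false) { header('HTTP/1.0 404 Not Found'); exit(); }
?>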

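If the user agent contains "google", a 404 (Not Found) header is sent and the script stops.
If the host name resolved from the visitor's IP address contains "google", the same thing happens.
The check should run before any page output, since headers cannot be sent once output has started.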
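
Keep in mind that any client can send a user agent containing "google", and anyone can register a host name that merely contains that word, so the two checks above are only a rough filter. A stricter variant is to reverse-resolve the IP, require the host name to end in a Google domain, and then forward-resolve that host name and confirm it maps back to the same IP. Below is a minimal sketch of that idea in Python, not a definitive implementation. The function name, the two lookup parameters, and the suffix list are my own assumptions for illustration (the suffixes follow Google's published verification guidance as I understand it), and the default lookups only handle IPv4.

```python
import socket

# Suffixes assumed from Google's verification guidance; adjust as needed.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_verified_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Return True if `ip` reverse-resolves to a Google host that resolves back to `ip`.

    The two lookup parameters are hypothetical hooks so the logic can be
    tried without network access; by default they use the socket module.
    """
    if reverse_lookup is None:
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        # gethostbyname_ex returns (hostname, aliases, ipv4_addresses)
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]
    try:
        host = reverse_lookup(ip).rstrip(".").lower()
        if not host.endswith(GOOGLE_SUFFIXES):
            return False
        # Forward-confirm: the host name must map back to the same IP.
        return ip in forward_lookup(host)
    except OSError:
        # Lookup failed (no PTR record, DNS error): treat as unverified.
        return False
```

A call such as is_verified_googlebot(visitor_ip) could then replace the plain substring test. Since DNS lookups are slow, caching the result per IP is probably worth it.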

---------
Similar topic, how to hide part of webpage from bots: https://internetlifeforum.com/html-css-forum/1743-how-hide-link-other-part-webpage-bots-like-googlebot/

ELyon01
06-01-2017, 12:02 PM
Hey there! Thanks for that code. It helped a lot.

virtueltime
05-07-2020, 12:33 PM
Another way is to add <meta name="googlebot" content="noindex" />, as suggested here: developers.google.com/search/reference/robots_meta_tag

vijaykhatri96
05-11-2020, 09:44 AM
I totally agree with this option. If you want to handle it with on-page markup, add a tag like <meta name="googlebot" content="noindex" /> to the page's <head>. Note that this keeps the page out of Google's index; it does not stop crawling, and Googlebot has to be able to crawl the page in order to see the tag.
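
To confirm that the tag really ends up in the HTML your server sends, a small check like the one below may help. It is only a sketch in Python using the standard html.parser module; the class name RobotsMetaFinder and the function has_googlebot_noindex are hypothetical names I made up for this example.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect the content of <meta name="robots"> and <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = {}

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <meta ... />.
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        name = attr.get("name", "").lower()
        if name in ("robots", "googlebot"):
            self.directives[name] = attr.get("content", "").lower()


def has_googlebot_noindex(html):
    """Return True if a robots or googlebot meta tag in `html` contains noindex."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return any("noindex" in value for value in finder.directives.values())


page = '<html><head><meta name="googlebot" content="noindex" /></head><body></body></html>'
print(has_googlebot_noindex(page))  # True
```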

Leasedlayer
08-13-2020, 10:47 AM
Here are a few examples

Robots.txt file URL: www.example.com/robots.txt

Blocking all web crawlers from all content:
User-agent: *
Disallow: /

Blocking a specific web crawler from a specific folder:
User-agent: Googlebot
Disallow: /example-subfolder/
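
If you want to check what a set of rules like these actually permits before uploading the file, Python's standard urllib.robotparser module can evaluate them locally. The sketch below is just one way to do it; the helper name allowed and the example URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

RULES = """\
User-agent: Googlebot
Disallow: /example-subfolder/
"""


def allowed(rules, user_agent, url):
    """Return True if `user_agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(user_agent, url)


# Googlebot is blocked from the subfolder only.
print(allowed(RULES, "Googlebot", "https://www.example.com/example-subfolder/page.html"))  # False
print(allowed(RULES, "Googlebot", "https://www.example.com/other/page.html"))              # True
# Other crawlers are not affected by a Googlebot-specific rule.
print(allowed(RULES, "Bingbot", "https://www.example.com/example-subfolder/page.html"))    # True
```

Remember that robots.txt is a request that well-behaved crawlers follow, not an access control, so it will not stop a bot that ignores it.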

kumkum
07-20-2021, 04:42 PM
You can disallow Googlebot in the robots.txt file. You can also list any other bots in the same file.
You can find more details about the robots.txt file here (https://hoststud.com/resources/how-to-create-robots-txt-file-to-block-bots.522/).