Richard

Getting WP Site Indexed/Crawled

For many, getting a site indexed/crawled by Google and the other search engines is a pain and time-consuming. You will hear yourself (between the screaming, swearing and hair pulling) saying 'What more can I do to get indexed?!' Using WordPress can be a tough haul getting your message out there, that's for sure.

What many do not know, as they scratch their heads over why their site links will not get indexed, is that the cause is, in the main, a small file called 'robots.txt'. It's a clever little bugger: it can be your best friend or your worst enemy! If you don't get this right, Google will soon tell you (if you have an account) that it has a problem with your site. Typically it will say:

"Google systems have recently detected an issue with your homepage that affects how well our algorithms render and index your content."

"Specifically, Googlebot cannot access your JavaScript and/or CSS files because of restrictions in your robots.txt file. These files help Google understand that your website works properly, so blocking access to these assets can result in sub-optimal rankings."

Houston, we have a problem!... If you do not have this file, or it is not set up right, you'll have problems. So to get the maximum exposure for your site and make Google happy, you will need to generate an 'optimal' WordPress 'robots.txt' file.

I have included the following:

<!-- START ROBOTS -->
User-agent: *
Disallow: /cgi-bin
Disallow: /wp-admin
Disallow: /wp-includes
Disallow: /wp-content/plugins
Disallow: /wp-content/cache
Disallow: /wp-content/themes
Disallow: /category
Disallow: /tag
Disallow: /author
Disallow: /trackback
Disallow: /*trackback
Disallow: /*trackback*
Disallow: /*/trackback
Disallow: /*?*
Disallow: /*.html/$
Disallow: /*feed*

# Google Image
User-agent: Googlebot-Image
Disallow:
Allow: /*

# Google AdSense
User-agent: Mediapartners-Google*
Disallow:
Allow: /*

Sitemap: https://www.yoursite.com/sitemap.xml
<!-- END ROBOTS -->

Of course you can further 'tweak' this to suit your needs, but if you're relatively new to all this, just stick to the basics for now.

Not everyone is a 'techy', so here's how:

1) Open WordPad or some other text editor,
2) Open a new blank page,
3) Copy and paste the code above into it (the lines between the START ROBOTS and END ROBOTS markers),
4) Change the 'sitemap' URL to reflect your domain, so that it points to your 'sitemap.xml' (and yes, you need to create one of those too!)
5) Save this file as plain text, named (lowercase) robots.txt
6) If you know what you're doing, go ahead and upload this file to the 'main root' of your website. If you don't, send the file to your host and ask them to do it for you.

OK, let's check all is well:

Go to https://www.google.com/webmasters/tools/ (if you don't have an account, set one up, it's free). In there, you can get Google to 'look' at this new file (self-explanatory). It will, if you followed the procedure, tell you all is well and you're done!

You should see a significant change to how your site pages are indexed, usually within 24 hours. But remember, this is just the 'tip of the iceberg' in getting your message out there, but getting this part right will be a major step forward! Enjoy! ...
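
If you would rather test the rules on your own machine before uploading, here is a minimal sketch using Python's standard `urllib.robotparser`. The assumptions are mine, not part of the original post: the `is_allowed` helper, the shortened `ROBOTS_TXT` copy, and the example URLs under `www.yoursite.com` are all hypothetical. Only the plain-prefix rules are copied in, because this standard-library parser does simple prefix matching and does not implement Google's `*` and `$` wildcards, so treat the result as a rough check only.

```python
from urllib.robotparser import RobotFileParser

# Shortened copy of the rules from the post (assumption: wildcard rules are
# left out, because urllib.robotparser only does plain prefix matching).
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin
Disallow: /wp-admin
Disallow: /wp-includes
Disallow: /wp-content/plugins
Disallow: /wp-content/cache
Disallow: /wp-content/themes

Sitemap: https://www.yoursite.com/sitemap.xml
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if these robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical URLs; swap in real pages and assets from your own site.
    urls = [
        "https://www.yoursite.com/",
        "https://www.yoursite.com/wp-content/themes/mytheme/style.css",
        "https://www.yoursite.com/wp-includes/js/jquery/jquery.js",
    ]
    for url in urls:
        verdict = "allowed" if is_allowed(ROBOTS_TXT, "Googlebot", url) else "blocked"
        print(f"{verdict}: {url}")
```

Worth noting: WordPress themes and plugins keep their CSS and JavaScript under /wp-content/themes, /wp-content/plugins and /wp-includes, so rules like these report those files as blocked. That may be the very thing the Google message quoted above complains about, so check your own results in Google's tool as well.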

Created: August 8, 2015 at 4:55 am
  • In: Directory Theme
  • Started by: Richard
  • 3 members left 3 comments
  • Last reply from: Jason

  • Mark
    August 8, 2015 at 8:23 am

    Actually Richard, you should just have the following in your robots.txt:

    [code title=””]User-Agent: *
    Disallow: [/code]

    As explained in this article:

    If you have the WordPress SEO by Yoast plugin installed, you can edit the file from within the WP admin area by going to admin > SEO > Tools > File Editor – the top text box should be the robots.txt file.

  • Richard
    August 8, 2015 at 3:31 pm

    Hi

    According to Google, it has no issues with this code.

    I am not keen to leave the whole site open when only parts of it need to be indexed/accessed.

  • Jason
    December 16, 2017 at 7:48 pm

    Hi Richard.

    Can you upload one here that has all this stuff, one where we can just alter our domain name on it, please buddy?

