Why won't Site Auditor crawl my website?


Site Auditor can crawl most websites, but in a few situations its page crawler is stopped before it can collect data on your pages:

  • Links on your website use JavaScript. Site Auditor only follows "a href" links and does not currently follow JavaScript links. This includes websites built with Wix, which Site Auditor does not support.

  • You’re blocking IP addresses. Raven uses Amazon Web Services (AWS), so if you're blocking the AWS range of IP addresses, Site Auditor won’t be able to crawl your site. In that case, you can write an exception that allows access for our specific user agent, which is: Mozilla/5.0 (compatible; RavenCrawler/2.0; +https://raventools.com/seo-website-auditor/). The same exception should also work if your site is in development and you only give access to specific user agents.

  • You're using security software to prevent unwanted traffic. Tools like Cloudflare and Incapsula are valuable for preventing DDoS attacks against your website, but they also prevent tools like Site Auditor from accessing and crawling it. You'll need to whitelist our user agent so that Site Auditor can crawl your website.

  • Your robots.txt file is blocking crawlers. If your robots.txt file is set to disallow page crawls (see robotstxt.org for details), our crawler will not be able to access your site. To allow Raven to crawl your site, add the following lines to your robots.txt file:

    User-agent: RavenCrawler
    Allow: /
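
If you want to confirm that your robots.txt rules let the crawler in before starting a new crawl, the following is a minimal sketch using Python's standard urllib.robotparser module. It only approximates how a crawler reads robots.txt; Raven's own matching logic is not documented here and may differ. The function name is_allowed, the sample rules, and the example.com URL are illustrative assumptions.

```python
from urllib.robotparser import RobotFileParser

# Product token from the robots.txt snippet above.
CRAWLER_TOKEN = "RavenCrawler"


def is_allowed(robots_txt, url, user_agent=CRAWLER_TOKEN):
    """Return True if the given robots.txt text permits user_agent to fetch url.

    Hypothetical helper: it uses Python's standard parser, which may not
    match Raven's crawler rule for rule.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Sample rules (an assumption for illustration): block every crawler
# except RavenCrawler.
SAMPLE_RULES = """\
User-agent: *
Disallow: /

User-agent: RavenCrawler
Allow: /
"""

if __name__ == "__main__":
    print(is_allowed(SAMPLE_RULES, "https://example.com/page"))
    print(is_allowed(SAMPLE_RULES, "https://example.com/page", "OtherBot"))
```

To test your live file, paste the contents of your own robots.txt in place of SAMPLE_RULES.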

Because security setups differ from website to website, we aren't able to assist with creating security exceptions. We can only provide information about how the crawler accesses your website. Additionally, please note that Site Auditor only crawls public pages. Anything behind a login screen or password will not be crawled.
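
To get a rough idea of how your server responds to the crawler's user agent, you can send a test request yourself. The sketch below uses Python's standard urllib.request module and the user agent string quoted above. The helper names build_request and fetch_status are hypothetical. Because the request comes from your own machine rather than from AWS, it can reveal user-agent blocking but will not reproduce blocking based on IP address.

```python
import urllib.error
import urllib.request

# User agent string quoted earlier in this article.
RAVEN_USER_AGENT = (
    "Mozilla/5.0 (compatible; RavenCrawler/2.0; "
    "+https://raventools.com/seo-website-auditor/)"
)


def build_request(url):
    """Build a GET request that identifies itself with the crawler's user agent."""
    return urllib.request.Request(url, headers={"User-Agent": RAVEN_USER_AGENT})


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered with an error status (for example 401 or 403).
        return err.code
    except urllib.error.URLError:
        # DNS failure, refused connection, timeout, and similar problems.
        return None


if __name__ == "__main__":
    # Replace example.com with a public page on your own site.
    print(fetch_status("https://example.com/"))
```

A 401 or 403 response to this request, when the same page loads normally in a browser, suggests that a firewall or security rule is rejecting the user agent.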
