Thread: robots.txt wildcard question

  1. #1
    Registered User

    Status
    Offline
    Join Date
    Mar 2006
    Posts
    48
    Thanks
    0
    Thanked 0 Times in 0 Posts


    I need all search engines to exclude any files that end in .axd on my website.


    User-agent: Mediapartners-Google*
    Disallow: *.axd

    User-agent: *
    Disallow: *.axd

    User-agent: Googlebot*
    Disallow: *.axd


    Will this work? I have heard that the wildcard only works with Google?

  2. #2
    Negative SEO is fun!

    Status
    Offline
    Join Date
    Sep 2003
    Posts
    1,389
    Thanks
    0
    Thanked 39 Times in 35 Posts
    All you need is this section:

    User-agent: *
    Disallow: *.axd

    Any "good" bot should see that, and obey it. Bad bots you'll have to deal with yourself (exclusion by IP and / or UA)

    Note that browsers are never affected, so there will be no user impact.

    >> I have heard that the wildcard only works with Google?

    All bots that obey robots.txt should be fine with wildcards. Try to make sure you use Unix line endings though; some bots need them to parse the file correctly. If you don't know how, get a free text editor that supports them (I like NoteTab, personally), create your file as normal, and save it using the "Unix line endings" option.
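
If you would rather not depend on an editor option, the line-ending conversion can be scripted. This is only a minimal sketch: the function names and the sample rules are made up for illustration, and it assumes the file is plain ASCII.

```python
def to_unix_line_endings(text: str) -> str:
    """Convert Windows (CRLF) and old Mac (CR) line endings to Unix (LF)."""
    return text.replace("\r\n", "\n").replace("\r", "\n")


def save_robots(path: str, text: str) -> None:
    """Write the file as bytes so "\n" is not translated back to CRLF on Windows."""
    with open(path, "wb") as fh:
        fh.write(to_unix_line_endings(text).encode("ascii"))


# Hypothetical sample content saved by a Windows editor.
windows_robots = "User-agent: *\r\nDisallow: /axd/\r\n"
unix_robots = to_unix_line_endings(windows_robots)
```

Calling save_robots("robots.txt", windows_robots) would then write the LF-only version to disk.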

  3. #3
    Registered User

    Status
    Offline
    Join Date
    Mar 2006
    Posts
    48
    Thanks
    0
    Thanked 0 Times in 0 Posts
    I don't think that is correct - the wildcard (*) is not part of the standard robots.txt protocol.

    This is from Matt Cutts's blog:

    "Finally, each search engine has slightly different extra options that they support. For example, Google permits wildcards (*) and the “Allow:” directive. MSN and Yahoo do not support wildcards - it's not in the official robots.txt protocol"


    Can anyone help me with this, or do I need to exclude all the files manually?
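
To make the difference concrete, here is a rough sketch of the two ways a crawler might read a Disallow value. It is not any search engine's actual code: prefix_match follows the original protocol, where a rule is a literal path prefix, and wildcard_match assumes the Google-style extension, where "*" matches any run of characters and a trailing "$" anchors the rule to the end of the URL. The .axd paths are example URLs only.

```python
import re


def prefix_match(rule: str, path: str) -> bool:
    """Original robots.txt behaviour: the rule is a literal path prefix."""
    return path.startswith(rule)


def wildcard_match(rule: str, path: str) -> bool:
    """Assumed Google-style extension: '*' is a wildcard, trailing '$' means end of URL."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape the literal pieces and join them with ".*" where the rule had "*".
    pattern = ".*".join(re.escape(part) for part in rule.split("*"))
    if anchored:
        pattern += "$"
    # re.match anchors at the start of the path, like a prefix rule would.
    return re.match(pattern, path) is not None


# A strict prefix parser sees "*.axd" as a literal string that no URL path starts with.
strict_blocks = prefix_match("*.axd", "/ScriptResource.axd")
# A wildcard-aware parser treats the same rule as "anything followed by .axd".
wildcard_blocks = wildcard_match("*.axd", "/ScriptResource.axd")
```

Under these assumptions strict_blocks is False and wildcard_blocks is True, which is why a "Disallow: *.axd" line can be honoured by one crawler and silently ignored by another.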

  4. #4
    Negative SEO is fun!

    Status
    Offline
    Join Date
    Sep 2003
    Posts
    1,389
    Thanks
    0
    Thanked 39 Times in 35 Posts
    Well, if you're going to get technical, robots.txt isn't a standard at all...

    User-agent: * will work fine though; the * is permitted in the User-agent field, where it simply means "any robot". To stay compliant without relying on path wildcards, you would need to move all your .axd files to a specific folder and simply disallow that folder, so:

    User-agent: *
    Disallow: /axd

    where /axd contains all your .axd files would do just fine, and will be compliant. Otherwise you could simply list every specific .axd file thus:

    User-agent: *
    Disallow: /foo.axd
    Disallow: /bar.axd
    Disallow: /snafu.axd
    etc...
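
If the list of .axd URLs is long, the explicit rules could be generated rather than typed by hand. The following is a sketch only: build_robots_txt is a made-up helper, and it assumes you already have the list of paths from somewhere (server logs, for instance).

```python
def build_robots_txt(paths, user_agent="*"):
    """Return robots.txt text with one Disallow line per unique path."""
    # Disallow values are URL paths, so make sure each one starts with "/".
    normalised = {p if p.startswith("/") else "/" + p for p in paths}
    lines = [f"User-agent: {user_agent}"]
    lines.extend(f"Disallow: {path}" for path in sorted(normalised))
    # Join with "\n" so the output has Unix line endings.
    return "\n".join(lines) + "\n"


# Placeholder file names, same as in the example above.
robots = build_robots_txt(["foo.axd", "/bar.axd", "foo.axd"])
```

Because compliant bots treat each rule as a prefix, a line such as "Disallow: /foo.axd" should also cover the same URL with a query string appended.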

  5. #5
    Registered User

    Status
    Offline
    Join Date
    Mar 2006
    Posts
    48
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Unfortunately that is not possible. They are virtual files created by the site... guess I'll have to exclude them manually...

    robots.txt isn't a standard? well we'll call it a pseudo-standard as all major SEs use it lol...

  6. #6
    Negative SEO is fun!

    Status
    Offline
    Join Date
    Sep 2003
    Posts
    1,389
    Thanks
    0
    Thanked 39 Times in 35 Posts
    >> they are virtual files created by the site

    Couldn't you jig the site to assign a URL to the files of <root>/axd/filename.axd? Use a JS document.write statement to do the same? Or just use JS links to the files anyway, so SEs won't "see" the links in the first place?

    >> robots.txt isn't a standard? well we'll call it a pseudo-standard as all major SEs use it lol...

    Technically, no, it isn't a standard. It is similar to the more recent "nofollow" attribute in that the major SEs have agreed to use / support it, as it helps solve a problem for everybody. Consider how many bots you see that plainly DON'T respect robots.txt; there's frickin' thousands of 'em.... so, no, it's not a standard.


