Page 2 of 2
Results 16 to 26 of 26

 

Thread: Legalities of scraping a site

  1. #16
    Registered User

    Status
    Offline
    Join Date
    Jan 2009
    Posts
    126
    Thanks
    0
    Thanked 1 Time in 1 Post


    Quote Originally Posted by Colin@DVDTimes View Post
    What about Google's cache - that's full copies of the data and content from many many sites.
    True. They don't even have the decency to wrangle it first. Blatant copyright theft. They should lock Serge up with the Pirate Bay guys.

  2. #17
    90% of all sites are crap

    Status
    Offline
    Join Date
    Nov 2003
    Location
    the moon
    Posts
    1,704
    Thanks
    89
    Thanked 68 Times in 44 Posts
    If they ran their index thru your wrangler it'd look like the Yahoo serps LMAO
    Tokyo::Paris::New York::Bromley

  3. #18
    Registered User

    Status
    Offline
    Join Date
    Jan 2009
    Posts
    126
    Thanks
    0
    Thanked 1 Time in 1 Post
    Quote Originally Posted by tomj View Post
    If they ran their index thru your wrangler it'd look like the Yahoo serps LMAO
    :tup

    fortunately, young jedi, you know not of what you speak

    Yoda day approaches

  4. #19
    90% of all sites are crap

    Status
    Offline
    Join Date
    Nov 2003
    Location
    the moon
    Posts
    1,704
    Thanks
    89
    Thanked 68 Times in 44 Posts
    lol.........
    Tokyo::Paris::New York::Bromley

  5. #20
    Registered User

    Status
    Offline
    Join Date
    Jan 2009
    Posts
    126
    Thanks
    0
    Thanked 1 Time in 1 Post
    yoda for president!

  6. #21
    Registered User

    Status
    Offline
    Join Date
    Apr 2009
    Posts
    47
    Thanks
    0
    Thanked 7 Times in 6 Posts
    Hi,

    I thought I would suggest a couple of steps you could follow:

    1. Is the automated collection of data prohibited in the website's terms of use? If so, you would be in breach of contract and could face a claim for diminution of performance (i.e. the loss of business caused by your scraping activities affecting the performance of the site).

    (You could also be threatened with a complaint under the Computer Misuse Act, although I have not yet heard of this being done).

    2. In addition, it is likely that unauthorised scraping of content will constitute infringement of intellectual property rights including copyright (such as the whole or parts of blog posts) and/or database rights (if you are extracting chunks of the database). Wikipedia is quite good for an explanation of these rights.

    3. As with everything, the most important consideration is who you are dealing with. Is the "data" the owner's key asset? Will they notice the scraping in their server logs? My only experience of taking coordinated action against screen scrapers was where it was having a significant impact on server load and customers' experience of the site. The threat of action was used to "encourage" the scrapers to utilise the owner's API rather than to prevent use of the data. As the scrapers were all affiliates, it was not in the owner's interest to prevent reuse.

    Finally, I find Copyscape a useful tool for checking whether people are reusing your content. You can also enter the sites you are scraping to see whether your site is a blatant copy.

    Hope that is helpful.

  7. The Following User Says Thank You to LawyerAffiliate For This Useful Post:

    tomj (20-04-09)

  8. #22
    Registered User

    Status
    Offline
    Join Date
    Jan 2009
    Posts
    126
    Thanks
    0
    Thanked 1 Time in 1 Post
    Quote Originally Posted by LawyerAffiliate View Post

    1. Is the automated collection of data prohibited in the website's terms of use? If so, you would be in breach of contract and could face a claim for diminution of performance (i.e. the loss of business caused by your scraping activities affecting the performance of the site).
    I find that interesting. As a lawyer, do you think that a company that publicly presents something has any right to impose terms and conditions on those who view it? I can see how USING the scraped content could be dangerous, but I can't quite get my head around the idea that you can impose T&Cs on the audience visiting a publicly accessible page.

    If that IS a firm legal precedent though, then, lawyer affiliate, I hate to tell you this but my T&Cs for this thread require you to pay me £1 for loading it in your browser (£1 per load). PM me for payment details.


  9. #23
    Registered User

    Status
    Offline
    Join Date
    Apr 2009
    Posts
    47
    Thanks
    0
    Thanked 7 Times in 6 Posts
    Hi ContentBoss,

    It is an interesting point you raise. If you look at the Terms and Conditions at the very foot of this page, para 1.1 seeks to do that - "By using the Website/Services you are fully accepting the terms, conditions and disclaimers contained in this notice. If you do not accept these Terms and Conditions you must immediately stop using the Website/Services."

    Whether this is enforceable against the user is a difficult question: for terms to be binding, you would need to show that the user has read and accepted the terms beforehand. With clickwrap software this is easy, which is why you have the "I accept" button, but with website terms and conditions it is a little hazy.

    In practice, if you wanted to enforce the terms against a user of your content, you would allow them to scrape for a month or so and then send a letter, preferably by recorded delivery, putting them on notice of the relevant terms. If they continued to use/scrape the content, they would be on notice of the terms and in breach of contract. They do not need to respond to your letter to have accepted the terms, but obtaining a recorded delivery signature provides evidence that they were notified of the terms of their continuing use.

    Unfortunately, your request for £1 should have been included at the start of the post: contractual terms should be incorporated prior to performance of a contract. But I will knock it off my bill.

    Please let me know if you have any other questions.

  10. #24
    Registered User

    Status
    Offline
    Join Date
    Jan 2009
    Posts
    126
    Thanks
    0
    Thanked 1 Time in 1 Post
    Quote Originally Posted by LawyerAffiliate View Post
    In practice, if you wanted to enforce the terms against a user of your content, you would allow them to scrape for a month or so and then send a letter, preferably by recorded delivery, putting them on notice of the relevant terms. If they continued to use/scrape the content, they would be on notice of the terms and in breach of contract.
    interesting. So I can put up a big poster on junction 8 of the M4, and in small print at the bottom of the poster write 'by looking at this poster you agree to our terms and conditions and will send £1 to lawyeraffiliates.co.uk immediately'. And then photograph cars going past it every day for a month. And then look them up via their number plates and invoice them. Doesn't sound very 'contractually convincing' to me.

    Flippancy aside, you seem to be assuming that 'scraping' is in some way different from 'surfing'. Most scrapers go to great lengths to make the target site believe it's just a regular Firefox browser request or something. Which brings us back to the real point, which is that unless the scraper is incredibly incompetent, you can only even START your process once you have somehow proven they have 'scraped' your site. And that would normally rely on them republishing it. Which is a different issue altogether - copyright infringement.

    I still don't think you can unilaterally enforce a contract on someone who happens to view something that you have made publicly available. But I could be wrong.

    After all, *I'm no lawyer* :sneaky

  11. #25
    Dynamoo
    Mooooo

    Status
    Offline
    Join Date
    Dec 2003
    Location
    Somewhere in Bedfordshire
    Posts
    1,908
    Thanks
    5
    Thanked 60 Times in 43 Posts
    On one of my websites I have a schedule of charges for syndication of material, coming in at £1,000 per page if paid within 14 days or £5,000 per page after 14 days.

    On just one occasion I have needed to send out an invoice for that. When the Finance Director of a company gets slapped with a £25,000 demand then it tends to concentrate their minds.

    Another thing: if your crawling of someone else's server interferes with their business, then they could sue you for tortious conduct which can cover many things that criminal law does not.

    Here's another grey area: RSS feeds. If you were to take someone's RSS feed and re-publish it, then I think this would be a little different from standard scraping. You could argue that, as the second "S" stands for "Syndication", publishing an RSS feed gives an implied right of re-use (unless they specify otherwise). I have dozens of sites carrying my RSS feeds and it does no harm, although I only syndicate a summary and not the whole content.
    Never email donotemail@WeAreSpammers.com
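
To make the "summary only" idea concrete, here is a minimal sketch of pulling just the title, link and a truncated description out of an RSS 2.0 feed. The feed contents, the `summarise_items` name and the 60-character cut-off are all made up for illustration; a real feed may use different elements, and whether you may republish it is the legal question discussed above, not something the code settles.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed used only for illustration.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>First post</title>
      <link>http://example.com/first</link>
      <description>A short summary of the first post.</description>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.com/second</link>
      <description>Brief.</description>
    </item>
  </channel>
</rss>"""


def summarise_items(feed_xml, max_chars=60):
    """Return title, link and a truncated description for each <item>."""
    root = ET.fromstring(feed_xml)
    entries = []
    for item in root.iter("item"):
        summary = item.findtext("description", default="").strip()
        # Keep only a short excerpt rather than the whole text.
        if len(summary) > max_chars:
            summary = summary[:max_chars].rstrip() + "..."
        entries.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "summary": summary,
        })
    return entries


for entry in summarise_items(SAMPLE_FEED, max_chars=20):
    print(entry["title"], "|", entry["summary"], "|", entry["link"])
```

Linking back to the original item alongside the excerpt is the usual way a summary feed gets re-used.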

  12. #26
    Registered User

    Status
    Offline
    Join Date
    Apr 2009
    Posts
    47
    Thanks
    0
    Thanked 7 Times in 6 Posts
    Quote Originally Posted by ContentBoss View Post
    Flippancy aside, you seem to be assuming that 'scraping' is in some way different from 'surfing'. Most scrapers go to great lengths to make the target site believe it's just a regular Firefox browser request or something. Which brings us back to the real point, which is that unless the scraper is incredibly incompetent, you can only even START your process once you have somehow proven they have 'scraped' your site. And that would normally rely on them republishing it. Which is a different issue altogether - copyright infringement.
    I think this is the important part. "Screen scraping" is generally used to cover two slightly different activities: 1. the constant interrogation of a site's database (e.g. to build a flight comparison engine), in which case it is, in my experience, difficult for the scraper not to appear in the server logs unless they go to significant effort to hide their activities; or 2. the gathering of sections of text or other material on a much smaller scale, in which case I agree that it is unlikely ever to be noticed from the server logs.
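
As a rough illustration of how the first kind of scraping shows up in server logs, the sketch below counts requests per client address and flags the heavy ones. It assumes access-log lines in Common Log Format; the sample lines, the function names and the threshold are hypothetical, and a scraper that rotates addresses would not be caught this simply.

```python
import re
from collections import Counter

# Matches the start of a Common Log Format line and captures the client address.
# Assumption: the server writes logs in this format; adjust for other layouts.
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} \S+')


def requests_per_client(lines):
    """Count requests per client address, skipping lines that do not parse."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def heavy_clients(lines, threshold):
    """Return the addresses that made at least `threshold` requests."""
    counts = requests_per_client(lines)
    return sorted(ip for ip, n in counts.items() if n >= threshold)


# Hypothetical log: one client hammering a search page, one ordinary visitor.
sample_log = [
    f'203.0.113.5 - - [20/Apr/2009:10:00:{i:02d} +0000] '
    f'"GET /flights?page={i} HTTP/1.1" 200 512'
    for i in range(50)
] + [
    '198.51.100.7 - - [20/Apr/2009:10:01:00 +0000] "GET /index.html HTTP/1.1" 200 1024',
    'not a log line',
]

print(heavy_clients(sample_log, threshold=20))
```

The small-scale case in point 2 would sit well below any sensible threshold, which is why it tends to go unnoticed.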

    As you state (and ducking the issue of the enforceability of the standard terms), taking "material" from another site is likely to constitute infringement of copyright and/or database right. The latter is useful where the information being scraped is not sufficient to attract copyright in itself (e.g. business contact details) but is part of a database.

    Thanks


