Q. Is data scraping legal?
A. Data scraping from public data repositories is very common and in most cases legal. However, copyright infringement is a major concern for us. If your purpose is to steal site Y's content so you can put it on your site and try to earn some lunch money off pay-per-click ads, don't ask us to help you.
The goal of data scraping is not to get something for free or at little cost, but to save you lots of money in the long term by automating essential business processes that are currently being done manually or not at all due to overwhelming cost or time constraints. If you are trying to scrape an online directory that you could buy for $400, but just don't feel like coughing up the cash, our services are not for you. (And if they were, we'd still charge you more than the $400 price tag of the directory.) That said, it is up to you to determine the legality of the way you plan to use our services.
ScrapeGoat operates under the assumption that the data you ask us to extract will be used legally and ethically and that you have obtained all necessary permission from the targeted data source. We reserve the right to refuse service to anyone wishing to use our services in an illegal manner or in any way we deem inappropriate.
Q. What kinds of things can you do with data scraping?
A. There are virtually unlimited things you can do with data scraping. Think market research, business intelligence, market positioning. All of these areas are using data scraping to gain the edge over the competition. Imagine legally data mining your competitor's website to compare prices, products offered, business partners acquired and other critical data. We have well-known corporations employing our services for these kinds of tasks right now!
Reputation or perception management! What if you were alerted to every good or bad comment made about your company or product on a blog, forum or website and could respond with corrections or enhancements before misinformation was spread around the Internet? Companies have gone out of business, products have been discontinued and reputations have been ruined, all from blog attacks! Prevent this from happening to your company by using automated data scraping!
Legitimate uses for data scraping abound: content management, data entry, data analytics, stock market analytics, data verification, data updates, quality assurance, market intelligence, automated web searches, dynamic content for wireless devices, and so many other valuable and clever uses.
Take a few minutes and just think about how you will save time, money and frustration by automating business processes that are currently being done manually. Give us a call to discuss your business processes and see how we can help!
Q. How much do you charge?
A. This varies as much as the projects we receive. Generally we estimate our bids at $85 an hour. Please describe your project to us in as much detail as possible and we will calculate the time it will take us to complete the project and provide you with an estimate. Of course, your next question is probably something like, "Can you just give me a ballpark figure?" OK... We generally don't accept any projects below about $750. (Well, we did once, but that was a very noble cause.)
Most of our projects seem to range between $2,500 and $10,000. Of course, we have a handful of corporate and government clients who spend significantly more than that. :)
Q. Do you get the data for us, or do we have to get it ourselves using your tools?
A. We act as a service provider, extracting data on your behalf. We have a custom-built, in-house Data Management Center. Our data center is built using a very creative, patent-pending combination of hardware and custom scraping software that allows us to extract data from a myriad of sources very quickly and anonymously.
If you absolutely must have a stand-alone tool to run from your location, we can build one for you, but there are licensing premiums and non-disclosure, non-distribution and non-compete contracts that must be signed before we can do so.
We are currently working on an extraction appliance server that can run from your location and interface with our Data Management Center to provide on-site service, yet still have all the benefits of our technology and data center. The appliance server will be specifically targeted to corporate and government clients.
Q. What is a screen scraper?
A. You probably already have some idea of what it is or you wouldn't be reading this. A screen scraper is simply a programming script, service or other automated method that collects information from a webpage or other data source and returns it in an appropriate format for different needs. Once the data is collected or harvested, a screen scraper may include additional code for analyzing the data.
Screen scrapers may also be known as data scrapers, web scrapers, page scrapers, content scrapers, data harvesters, bots, agents, crawlers, spiders, indexers and probably two dozen other names.
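To make the definition above a little more concrete, here is a minimal sketch of a screen scraper in Python, using only the standard library. It is an illustration, not our production code: the sample page, the LinkScraper class and the scrape_links function are all hypothetical names invented for this example. It collects every link on a page and returns the results as a list of records that could then be saved as JSON, CSV or whatever format is needed.

```python
import json
from html.parser import HTMLParser


class LinkScraper(HTMLParser):
    """Collects the href and visible text of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []        # finished records
        self._current = None   # link currently being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._current = {"url": href, "text": ""}

    def handle_data(self, data):
        # Only keep text that appears inside a link we are tracking.
        if self._current is not None:
            self._current["text"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self._current["text"] = self._current["text"].strip()
            self.links.append(self._current)
            self._current = None


def scrape_links(html):
    """Return the links found in an HTML string as a list of dicts."""
    parser = LinkScraper()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical sample page; a real scraper would download this first.
SAMPLE_PAGE = """
<html><body>
  <a href="/widgets">Widgets</a>
  <a href="/gadgets"> Gadgets </a>
</body></html>
"""

# Return the harvested data in a format suited to the next step (JSON here).
print(json.dumps(scrape_links(SAMPLE_PAGE), indent=2))
```

A real project adds the hard parts on top of this: fetching the pages, handling logins and coping with sites that change their layout.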
Q. What is anonymous scraping?
A. For various legitimate reasons, a client may want to remain anonymous to the targeted website or data repository being scraped. To accommodate those requiring anonymity, we offer proxy domain registration and anonymous hosting. The only circumstance in which your private information may be released is with a court order. In over 10 years of being in this business, we have never yet received a court order. (Knock on wood.)
Q. Can a website or other data source block a scraper?
A. There are many things website owners do to reduce automated activity on their sites. This is done to prevent bots from using up limited bandwidth needed for actual clients surfing the site or otherwise using the data. With our scraping methods, we attempt to leave as small a footprint as possible. However, if a site owner asks us to stop scraping their website or data source, we will immediately comply.
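As an illustration of what "a small footprint" can mean in practice, here is a short Python sketch of two common courtesies: checking a site's robots.txt before fetching a page, and pausing between requests. This is a general technique and not a description of our own methods. The robots.txt content, the "example-bot" user-agent and the function names are assumptions made up for the example.

```python
import time
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real scraper would fetch it from the site.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""

USER_AGENT = "example-bot"  # hypothetical user-agent name


def load_rules(robots_txt):
    """Parse robots.txt text into a rule set we can query."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules


def polite_urls(rules, urls, default_delay=1.0, sleep=time.sleep):
    """Yield only the URLs robots.txt allows, pausing after each one."""
    # Use the site's requested delay if it gave one, else our own default.
    delay = rules.crawl_delay(USER_AGENT) or default_delay
    for url in urls:
        if not rules.can_fetch(USER_AGENT, url):
            continue  # the site owner asked bots to stay out of this path
        yield url
        sleep(delay)  # spread requests out so real visitors keep their bandwidth
```

Calling polite_urls(load_rules(SAMPLE_ROBOTS_TXT), urls) with a list of page addresses hands back only the pages the site allows, one at a time, with a two-second pause between them.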
Q. Another programming company told me that scraping siteabc.com is impossible because of ____, can you do it?
A. We'd sure like to try. Much of our business comes from clients that have been turned away by other programming companies for various reasons. If we can't scrape it, there is never a charge to you. It is a rare occasion that we have not been able to capture data from a targeted data source.
Q. I found a scraping software program on the web for $39.95. Can't I just use that?
A. Sure. Call us when you're done pulling your hair out.
We've tried every off-the-shelf scraping program we can find. The vast majority are not worth the time it takes to download them. There are a couple of nice tools out there, but the license costs for the best ones are staggering. (Quite often more than you'll pay us over the lifetime of your project.)
When buying a software tool, you must learn its proprietary programming language, or hire a programmer to learn it for you. Then once you've figured out the tool or paid a programmer's salary while they figure it out, you need to learn the do's and don'ts of data scraping to avoid getting your business's reputation tarnished or your IP address blocked. Then you get to keep paying a programmer's salary for the life of the project.
You'll also need to learn how to deal with cookies, forms, HTTP authentication, sessions, frames, JavaScript, redirects, and a hundred other little obstacles inherent in data mining that your tool likely doesn't handle on its own.
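To give a feel for just a few of those obstacles, here is a small Python sketch that handles cookies, sessions, form submissions and redirects with the standard library. It is a simplified example under stated assumptions, not a finished tool: the function names, the login URL and the field names are hypothetical, and the actual network call is left commented out.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def build_session():
    """Create an opener that remembers cookies across requests.

    urllib's default handlers already follow HTTP redirects, so adding a
    cookie processor is enough to carry a login session from page to page.
    """
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    return opener, jar


def form_request(url, fields):
    """Build a POST request that submits fields the way an HTML form would."""
    body = urllib.parse.urlencode(fields).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


opener, jar = build_session()
# Hypothetical login form; the URL and field names are placeholders.
login = form_request(
    "https://example.com/login",
    {"username": "alice", "password": "s3cret"},
)
# response = opener.open(login, timeout=30)  # network call, left commented out
# Any cookies the site sets would land in `jar` and be sent on later requests.
```

Frames and JavaScript are a different story: they usually need more than an HTTP client, which is part of why cheap tools struggle with them.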
Q. What language will my scraper be written in?
A. We use many different languages including PHP, Python, Ruby, C++, C#, Java, Perl, ASP/ASP.NET, etc. We will do the project in the language most suitable for the task. If you require a specific language and are licensing the source code, we will discuss your specific language requirements on a case-by-case basis.
Q. How fast can you have my data delivered?
A. That depends on your scraping project, the language it is built in, and the type/quality of data you wish to acquire. For some projects we start delivering data in minutes or hours. Others take months to build. We will give you an estimated time of delivery with our bid.
Q. Will I own the source code to the scraper?
A. No. We do not provide you with the source code unless there are incredibly compelling reasons for us to do so. In the past we have readily done so, only to see our code show up in public places or being used to start competing businesses. In cases where your project does warrant licensing the code to you, we will set up a very strict contract agreement between us on a case-by-case basis. For most projects we will extract the data for you and deliver it to you in your required format.
Q. Do you keep or resell the data you collect for us?
A. No. We never keep data once it has been delivered to you or past the time you have asked us to store the data for analytic purposes. We are not a data broker or data storage service. If we kept all of the data we collected, we would need a warehouse of equipment and space. For most projects we simply collect the data, format it for your needs and pass it on to you. You can then store it for whatever your purposes are. We purge all data after the project is completed. For ongoing projects we purge the data as it becomes obsolete or otherwise unneeded.
Q. If the scraper breaks will you fix it for free?
A. If we are extracting data for you as a service, we will automatically fix any errors as they occur as part of your service contract. Ninety-nine percent of the time, a disruption of service occurs because the website or data source from which the scraper was extracting data has changed. Most of the time we catch these errors and fix them before you are even aware a problem occurred. If ScrapeGoat provided you with a stand-alone tool, you were also offered a maintenance option that would cover any bugs or disruption of service. If you elected not to have a maintenance agreement, we will fix any errors at our standard rate of $85/hr.
Q. The data I need is behind a secure login/.htaccess. Can I still get this data?
A. Yes. Our services handle secure logins, cookies, .htaccess files, etc., so long as you provide us with a correct login/password.
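For the curious: a directory protected through .htaccess typically asks for HTTP Basic authentication, and a scraper can answer that challenge with the login you supply. The Python sketch below shows one way to do this with the standard library. It is an illustrative example only; the function name, URL and credentials are placeholders we made up, and the network call is left commented out.

```python
import urllib.request


def build_authenticated_opener(base_url, username, password):
    """Return an opener that answers HTTP Basic auth challenges for base_url."""
    password_manager = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    # realm=None means "use these credentials for any realm under base_url".
    password_manager.add_password(None, base_url, username, password)
    handler = urllib.request.HTTPBasicAuthHandler(password_manager)
    return urllib.request.build_opener(handler), password_manager


# Hypothetical protected area and client-supplied credentials.
opener, password_manager = build_authenticated_opener(
    "https://example.com/members/", "alice", "s3cret"
)
# page = opener.open("https://example.com/members/report.html", timeout=30)
```

Sites that use a login form instead of a browser password prompt need a different approach, usually a form submission plus cookies.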
Q. Can you scrape files located on FTP, Email and other types of servers?
A. Yes. So long as you provide us with the source's login/password.
Q. I got a bid from China/India/Pakistan for only 50 cents for my two month project. Will you match their bid?
A. Yeah, good luck with that. We just had a client come to us whose $4000 login credentials showed up in a list of data repositories being sold for $49 out of India. Four weeks previously they'd hired an Indian company to code a project using that repository. Real work is required for a successful scraping project and you will likely get what you pay for.
There probably are some legit overseas companies that can handle your project, but all too often that low bid you got is there so that a hacker or foreign company can get your credit card and steal any data from you, your server or your data source that can be resold later for a profit. Reputable overseas companies have prices similar to ours. ScrapeGoat and all of its employees are based in the U.S.A.! (And we speak understandable English.)
If you do decide to use an overseas company, please be very, very careful. Never give them direct access to your server. (Think client list theft, credit card numbers, spam, etc.) If you must give them username/password access to a data repository, change your password immediately after the project is complete. If things go well for you, great. Send us an email and let us know. If things don't go so well, we'll be here for you.
Q. How do you handle payment?
A. Once you accept our bid, a deposit of 1/2 the total estimated price is usually required to begin the project. Our preferred payment method is Visa or Mastercard. We can also accept American Express or Discover Card through PayPal. If you wish to pay by business check, work will commence after your payment clears. Ongoing services will be invoiced to you on a monthly basis. You may leave a credit card on file with us for monthly or other recurring fees.
Q. Why does my credit card statement show eDream Design?
A. Although our website is ScrapeGoat.com, we are officially owned and operated by eDream Design, LLC. See the About Us page for more details.