Smartproxy

Table of contents

  • What is CAPTCHA?
  • Why am I getting CAPTCHAs?
  • Types of CAPTCHA
  • How to avoid CAPTCHA?
November 16, 2021
12 minutes read

Passport Control On The Website, Or What Is CAPTCHA?

We all know that moment when we set off on a cross-continental journey: long-haul flights, visa requirements, passport checks… But did you know that some websites do something similar? Apart from collecting data about you by reading your IP (e.g. location, internet service provider), some websites add an extra layer of identity checks. Yup, that additional identity check is a CAPTCHA.

You’d most probably agree that CAPTCHAs are tricky. They ruin a lot of automated work and slow down research. So if CAPTCHAs have been ticking you off, you’re in the right place as you’re about to learn how to thrash them.

What is CAPTCHA?

CAPTCHA is an acronym for the Completely Automated Public Turing test to tell Computers and Humans Apart. It’s a test that checks whether a request to access a website is coming from a robot or a human being. The idea is that such a test is supposed to be relatively easy for human beings but pretty tough for computers.

These tests are intended to shield websites from unwanted traffic, spam, and abuse by checking whether the user trying to access a website is a real person or a bot. E-commerce sites like eBay and Shopify can use CAPTCHAs to prevent bots from buying a stack of limited edition items that could later be resold for a higher price.

The first CAPTCHAs for commercial purposes emerged in 2000. And guess what – web admins became obsessed with them. It didn’t take long for the sneaky eye of Google to notice the growing popularity of CAPTCHAs. In 2009, Google acquired reCAPTCHA, a CAPTCHA service that started out as a Carnegie Mellon University project.

What is reCAPTCHA?

ReCAPTCHA works much like any other CAPTCHA: both aim to tell humans and bots apart and to protect websites from bots. The difference is that reCAPTCHA is Google’s improved version of the standard CAPTCHA. The cherry on the cake is that most websites can use reCAPTCHA free of charge.

Why am I getting CAPTCHAs?

So CAPTCHAs are like passport control officers at the airport – they validate your identity (request) before letting you go through. So what triggers CAPTCHA requests when searching the web?

The source of your IP

First and foremost, it’s the health of your IP address. It might be that your internet service provider has given you an IP that was recently used by hackers. You might be mistaken for a hacker and, trust us, you don’t want that.

Besides, beware of fresh IPs. As the name suggests, these are IP addresses that have never been used before. Fresh IPs have no history with any website, and for Google, this tabula rasa looks suspicious.

Last but not least, if you’re sharing your IP address with other internet users, it means that you’re not the only person sending connection requests to Google. Many users are doing the same simultaneously! For Google, all of them look like a single computer is sending gazillions of queries. That’ll look like spam.

Browser fingerprinting

Device or browser fingerprinting is a way to identify internet users and track their activity online. Your fingerprint reveals details about your device as well as your user-agent, which indicates your browser type, operating system, and much more. If any of these details looks suspicious, has been seen too many times, or doesn’t have a clean history, you’re likely to get a CAPTCHA request.

Computer issues

Don’t brush off all that talk about keeping your antivirus software updated. If your machine isn’t malware-free, it might be used to attack other computers and websites. And the worst thing is that you may have no clue that this is happening.

Add-ons

SEO ranking apps, ad blockers, and security add-ons change your browser’s behavior. And if bots are getting smarter, so is Google. So too many add-ons may trigger CAPTCHA responses.

Types of CAPTCHA

By and large, you might run into text-, picture-, and sound-based CAPTCHAs. But those three break down into many more types.

Word problems

Human check with a word problem

These are tasks where users have to decipher some text. You might be asked to write a word in capital letters, retype it, or, if there are a few words in a row, you might need to write the last one only. The downside of word problems is that these days, bots are as smart as a whip. They’re intelligent enough to crack most such tasks.

Math problems

Human check with a math problem

How good were you at maths at school? If it was way above your head, this type of CAPTCHA would put you in a pickle. We’re just kidding. Who can’t solve something like “3+1” or “5+2”? Surprisingly enough, simple bots that fill out forms blindly tend to trip over math problems, although more advanced ones can parse and solve them. So this type of CAPTCHA is simple (for people), quick, and reasonably secure.

Time-based

This one is a stopwatch! It records how much time a user needs to fill out a form. Humans will undoubtedly need more time to fill out the info than bots, which can do that in the blink of an eye. Although this is a neat type of CAPTCHA, it also diminishes user experience. Having to fill out a form every time you want to comment or write a message can really get on your wick.
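To make the stopwatch idea concrete, here’s a minimal sketch of how a site might run such a check on its side. The function name and the three-second threshold are our own assumptions for illustration; real websites pick their own limits and usually combine this with other signals.

```python
MIN_FILL_SECONDS = 3.0  # assumed threshold, not a standard value


def looks_automated(rendered_at: float, submitted_at: float,
                    min_seconds: float = MIN_FILL_SECONDS) -> bool:
    """Return True when the form came back faster than a human could fill it.

    Both arguments are timestamps in seconds: when the form was served
    and when the filled-out form was received.
    """
    elapsed = submitted_at - rendered_at
    return elapsed < min_seconds
```

A form that comes back 0.2 seconds after it was served would be flagged, while one submitted after 12 seconds would pass.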

Social media login

This type of test asks you to sign in or sign up using your Facebook, Instagram, Google, or other social media accounts. It’s time-saving and user-friendly as you don’t have to input all of the information manually. Yet, this type of CAPTCHA means linking your social media account to the website you’re trying to access. Hence, some folks are in a tizzy over the security of personal information.

Confident CAPTCHA

Human check with a confident CAPTCHA - pick right object(s)

Confident CAPTCHA is based on images. You might be given a puzzle of pictures and asked to click on each image that shows a plane, a dog, a flower, or whatnot. The test has a considerable success rate (hence the name “confident”), but it can be maddening. Even the slightest mistake means starting the task from scratch.

Sweet CAPTCHA

Human check with a sweet CAPTCHA - drag the right object to the right place in a row

This is a sibling of confident CAPTCHA. This time, you’ll be introduced to some cute (hence, “sweet”) pictures and asked to move or match items. Say, you have an image of a basket. Next to it, there are four different images, from which you are asked to pick the ball and drag it to the basket. Like with confident CAPTCHA, it’s hard to crack for a bot. Unfortunately, it might interfere with user experience. Every mistake will result in performing the task once again.

Honeypot

A honeypot tricks bots into filling out many hidden fields that humans can’t even see. There’s a 99.9% chance that you’ve encountered a honeypot CAPTCHA without being aware of it. 

This CAPTCHA is very easy to set up. All the developer needs to do when creating a website is add a hidden field, give it a random name, and hide it with the CSS rule “display:none”. That hides the field from the human eye, but bots will still be tempted to fill it out.
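On the server, the check for such a hidden field can be tiny. Below is a rough sketch of what it might look like; the field name “contact_me_by_fax” is a made-up example, and the form data is assumed to arrive as a plain dictionary of strings.

```python
HONEYPOT_FIELD = "contact_me_by_fax"  # hypothetical name of the hidden field


def is_probably_bot(form_data: dict) -> bool:
    """A human never sees the hidden field, so any value in it is suspicious."""
    value = form_data.get(HONEYPOT_FIELD, "")
    return bool(value.strip())
```

A submission that leaves the hidden field empty passes the check, while one that fills it in gets flagged.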

ReCAPTCHA v2 (no CAPTCHA reCAPTCHA)

It’s a masterpiece of Google that validates the user with a single checkbox. All you have to do is click on the box saying “I’m not a robot”. Bots are methodical, so they usually click on the center of the box. Humans, on the other hand, are most likely to click in some other area of the box, not directly in the middle.

reCAPTCHA example

Invisible reCAPTCHA

Introduced a few years ago by Google, this type of test monitors users’ behavior, e.g. mouse movements, while they’re on a website. Google did a great job keeping its recipe of invisible reCAPTCHA tightly under wraps because no one really knows how it actually works.

It’s called “invisible” because the user doesn’t see it (at first). There’s no text to enter or images to match. However, if the website thinks that something fishy is going on with your actions online, it’ll ask you to solve a challenge.

Although it’s supposed to be invisible, it’s not fully unseeable. Google collects user information through the test, and users must be informed about that. So websites that use invisible reCAPTCHA have to include the image below somewhere on their pages:

Invisible reCAPTCHA widget

How to avoid CAPTCHA

With such a wide assortment of CAPTCHAs out there, you might feel as if internet sites are trying to box you into a corner. The good news is that there are ways to crush the army of CAPTCHAs and access the content you want. Sure, tests to tell computers and humans apart are getting smarter too so you might need to follow more than one tip below. Yet, these will help you feel much more confident when browsing the web.

To avoid CAPTCHAs, do the following:

#1 Change your IP address

#2 Get a unique static IP

#3 Ditch unreliable proxy services

#4 Mind your limits

#5 Take care of your browser

#6 If using a bot, take some extra steps

#1 Change your IP address

As mentioned before, it might be the case that your IP address is marked for spam because of suspicious activity. Luckily, there’s an easy way to get a new IP address. As internet service providers normally use dynamic IP addresses, all you need to do is reset your modem or router connection to receive a new IP address.

#2 Get a unique static IP

Remember we also said that you might be sharing your IP address with other internet users? If someone sharing your IP is sending too much automated traffic, that IP, or even a whole range of addresses used by your ISP, might get blocked. Ask your ISP for a unique static IP to avoid blocks.

#3 Ditch unreliable proxy services

Proxies hide your real IP address, routing your traffic from a different location. If the proxy server that routes your traffic is iffy, your connection request will also smell fishy. Don’t fall for free or super-cheap services as they get the money you don’t pay by collecting your private data and selling it to third parties. No wonder you might come across such proxies in blacklists. Always make sure that your proxy service provider is reliable, has reviews, and provides unwavering customer support.

#4 Mind your limits

We know how tempted you might feel to press that enter button when searching for something specific. However, entering keywords and hitting the enter key nonstop will make you look like a bot. 

With any automation tool that you might be using, it’s recommended to slow down your clicks and imitate human behavior. How? Randomize your request times on the automation application. For example, some tools offer custom delays on certain actions that’ll make your traffic look more genuine. The golden rule here – limit your requests and don’t cause damage to a website by bombarding it with millions of connection requests.
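If your tool doesn’t offer custom delays, randomizing request times yourself takes only a few lines. The sketch below is one possible way to do it; the helper names and the 2 to 6 second range are assumptions for illustration, so tune them to the website you’re working with.

```python
import random
import time


def human_delay(min_seconds: float = 2.0, max_seconds: float = 6.0) -> float:
    """Pick a random pause length so requests don't arrive on a fixed beat."""
    return random.uniform(min_seconds, max_seconds)


def run_with_delays(tasks, sleep=time.sleep):
    """Call each task in turn, pausing a random amount between them.

    `tasks` is a list of zero-argument callables (e.g. one per page to fetch).
    `sleep` can be swapped out, which is handy for testing without waiting.
    """
    results = []
    for index, task in enumerate(tasks):
        if index > 0:  # no need to wait before the very first task
            sleep(human_delay())
        results.append(task())
    return results
```

Each task here would wrap a single request, so the pauses land between requests rather than inside them.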

#5 Take care of your browser

Baby steps first: scan your browser for malware, clear your browser’s cache, sign in from a different browser, use a private (incognito) mode. Then, check your extensions, plugins, and additional software – all of them could be sending automated traffic. If that’s the case, remove or disable them.

When talking about browsers, we recommend getting familiar with two kinds: anti-detection and headless browsers. The good news is that Smartproxy has already designed one of them! Our X-Browser is an anti-detection browser that protects your privacy because it lets you stay undetected with multiple online identities. With this tool, your fingerprint will remain private, unique, and in good shape. This will turn security tools away from you, reducing the possibility of getting a CAPTCHA.

#6 If using a bot, take some extra steps

  • Try different endpoints or rotating ports.

    The rotating port will change an IP with every request. This means that every time you load the same page or go to a different page, you’ll be assigned a new IP.
  • Gather as many user agents as possible.

If you’re using a scraper or crawler for web scraping, make sure that you have a huge list of different user agents when writing custom code for it. User agents are like ID cards: they carry all the virtual details about you. Have a look at this user-agent:

Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.54 Safari/537.36

The string above tells us the name and version of a browser, the operating system that browser’s running on, and the rendering engine it’s using. A user agent is essentially a header attached to your request that identifies you when you visit a website. So rotate user agents in your custom code to cover your tracks and decrease the chances of encountering CAPTCHAs.

  • Don’t use direct links.

Avoid direct links in your bots that are not publicly available on the website’s page. To do that, check the source code of a website and make sure that your bot can render all the necessary elements, including those written in JavaScript code. Follow the paths provided by the website itself and don’t just blindly go with a direct link.
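As a rough illustration of the user-agent tip, here’s how a custom scraper might pick a different user agent for each request using only Python’s standard library. The list is deliberately short, and apart from the first entry (the string shown earlier), the user agents are illustrative examples we’ve added. The request is only built here, not sent.

```python
import random
import urllib.request

# Illustrative pool; a real scraper would keep a much longer, up-to-date list.
USER_AGENTS = [
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/95.0.4638.54 Safari/537.36",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/95.0.4638.54 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64; rv:94.0) Gecko/20100101 Firefox/94.0",
]


def build_request(url: str) -> urllib.request.Request:
    """Create a request (not sent yet) with a randomly chosen user agent."""
    headers = {"User-Agent": random.choice(USER_AGENTS)}
    return urllib.request.Request(url, headers=headers)
```

Passing the result to urllib.request.urlopen would send it. Pair this with a rotating proxy endpoint and randomized delays for the best effect.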

To cut a long story short

The health of your IP, browser fingerprint, computer issues, and add-ons – all contribute to exposing you to all sorts of CAPTCHAs. If those tests were humans, they’d certainly have a punchable face… All we can do now is follow those steps that we discussed above to reduce the chance of coming across CAPTCHAs.

For more information about how to avoid CAPTCHAs when using proxies, see our documentation or contact our unbeatable customer support.

Ella Moore

Ella’s here to help you untangle the anonymous world of residential proxies to make your virtual life make sense. She believes there’s nothing better than taking some time to share knowledge in this crazy fast-paced world.

Frequently asked questions

Why am I getting a lot of CAPTCHAs?

In most cases, there’s an issue with your IP or browser. You might be using an IP address that is too fresh, shared with others, or has been associated with malicious activities. Alternatively, your browser fingerprint has been seen too many times, doesn’t have a clean history, or simply looks suspicious. On top of all that, you might be using too many add-ons, or it might be a good time to scan your computer for viruses.

Can proxies help avoid CAPTCHAs?

Yes, if you choose them wisely! For example, our dedicated datacenter proxies are great when seeking to bypass CAPTCHA. Note that if you opt for free bad-quality proxies, they won’t help you avoid CAPTCHAs. Quite the opposite: they’ll drive CAPTCHAs straight to you! Always make sure that your proxy service provider is reliable, has reviews, and provides customer support!

How do I know from my code or bot logs that I’m receiving CAPTCHAs?

There are quite a lot of ways to identify whether you’re getting CAPTCHAs because of some automation tool. Here are some common signs:

- You’re not getting back the requested content, or you see a very small portion of it.

- Your scraper’s returning a response that includes CAPTCHA.

- Your requests are timing out.

- Instead of 200 HTTP response codes, you’re getting 4xx or 5xx codes (e.g. 403, 429, or 503).
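The signs above can be turned into a rough automatic check inside your scraper. The sketch below is only one way to do it: the marker phrases and the 500-character cut-off are our own assumptions, and timeouts aren’t covered because they usually surface as an exception rather than a response.

```python
# Assumed phrases that often show up on CAPTCHA or block pages.
CAPTCHA_MARKERS = ("captcha", "i'm not a robot", "unusual traffic")


def diagnose_response(status_code: int, body: str) -> str:
    """Label a response using the warning signs listed above."""
    if status_code >= 400:
        return "error_status"
    lowered = body.lower()
    if any(marker in lowered for marker in CAPTCHA_MARKERS):
        return "captcha_page"
    if len(body) < 500:  # assumed cut-off for suspiciously little content
        return "truncated_content"
    return "ok"
```

Logging the label for every request makes it much easier to spot the moment a website starts pushing back.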

What is browser fingerprinting?

Browser or device fingerprinting is a technique that identifies internet users by gathering information about their activity online. This fingerprint includes such information as the type of your browser and device, language settings, screen resolution, operating system, and much more.
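To get a feel for how a handful of attributes can add up to an identifier, here’s a toy sketch that hashes them into a single string. It’s a simplification we’ve made up for illustration: real fingerprinting scripts collect far more signals, and nothing here reflects how any particular website actually does it.

```python
import hashlib
import json


def fingerprint(attributes: dict) -> str:
    """Hash browser attributes into one stable identifier.

    Keys are sorted first so the same attributes always give the same
    result, no matter what order they were collected in.
    """
    canonical = json.dumps(attributes, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Change a single attribute, say the screen resolution, and the resulting hash is completely different. That’s why spoofing attributes gives every profile its own fingerprint.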

Can I stop browser fingerprinting?

Well, however harsh that may sound, you can’t really stop browser fingerprinting. Yet, the good thing is that you can spoof your browser attributes or use an anti-detect browser to get unique fingerprints for every profile, thus elevating your security and privacy.
