Smartproxy


December 23, 2021
10-minute read

Alternative Google SERP Scraping Techniques - Terminal and cURL [VIDEO]

Google has become a gateway to easily-accessible information. And one of the best ways to make use of Google’s limitless knowledge is web scraping. We’ve just released a detailed blog post about scraping Google SERPs with Python, where we cover lots of useful info, including the technical part. So before you dive into this tutorial – check it out.

But what if Python is not exactly your forte? This blog post will show you how to scrape SERPs using a simpler method. One that doesn't require much tech knowledge or downloading loads of applications and software. So, what do you know about web scraping with cURL and Terminal?


Defining cURL and Terminal

cURL

cURL or client URL (also written as curl) is a neat piece of free and open-source software that allows you to collect and transfer data by using the URL syntax. cURL is a widely used tool, popular for its flexibility and adaptability – it can run on almost all platforms. 

This software also supports more than 20 protocols (HTTP, HTTP(S), etc.), proxies, SSL connections, FTP uploads – you name it. You can even use cURL to download images from the internet or authenticate users. It’s a pretty versatile tool. 

If you’re looking to get knee-deep in cURL knowledge, we have a great blog post that covers it all. From history to usage and installation (only if needed, all new Mac OS, Linux, and Windows already have it installed), everything is there –  check it out. 


Terminal

You probably already know a thing or two about it or have heard about it, at least. But to define it – Terminal (a.k.a. console and command line) is an interface in Mac OS and Linux that lets us automate various tasks without using a GUI (graphic user interface). 

In the most basic sense, you can give Terminal text-based commands to accomplish tasks, initiate programs, open documents, manipulate and download files. The main benefit of using a Terminal is that it’s much faster to do certain tasks than using a GUI.
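To make that concrete, here are a few basic text-based commands you could type into Terminal (the directory and file names are just examples):

```shell
# Create a directory, put a file in it, then inspect both -- no GUI needed.
mkdir -p demo                  # make a directory called "demo"
echo "hello" > demo/note.txt   # create a file containing the text "hello"
ls demo                        # list the directory's contents
cat demo/note.txt              # print the file to the screen
```

Each line does one small job, and you can chain them together far faster than clicking through windows.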

Why cURL and Terminal?

For starters, it doesn’t require much tech knowledge. Also, both cURL and Terminal are already on your Mac and Linux, so you don’t have to worry about downloading additional tools. If, however, you have an older version that doesn’t have cURL – you can easily download it at https://curl.se/download.html.


Scraping Google search results with cURL in terminal: tutorial

Finally, action time! In this tutorial, we’ll go step-by-step, guiding you throughout the process. You’ll not only learn how to use cURL, but you’ll also have a better understanding of its versatility and simplicity when dealing with smaller tasks. Oh, right – you’ll also learn a bit about web scraping!

What you’ll need to get started with this tutorial

Before we jump into the tutorial, let’s go over some of the most important things you’ll need for this tutorial. First things first: here are some basic terms and input elements for a web scrape request with cURL: 

  • POST. The HTTP method used to send your request to the server.
  • Target. Indicates which source the information will be retrieved from (here, Google search).
  • Query. Specifies what kind of information you’re after – your search term.
  • Locale. The interface language of the search results page.
  • Results language. Just what it says – the language in which your query results will be returned.
  • Geographical location. Specifies the location you want to target. 
  • Header. Contains information about the request: user agent, content type, etc. 

You’ll also need a SERP API. Scraping Google search engine results pages can be a challenge without a proper API. Smartproxy has SERP Scraping API – it’s a great set of magical tools (a proxy network, scraper, and parser) that will help you kick off your web scraping project. 


With our software, you’ll not only have access to a user-friendly API, but you’ll also get reliable proxies that will keep those pesky CAPTCHAs at bay. It might seem a little overwhelming at first; that’s why we also have detailed documentation to help you better understand and use our product. But as far as this tutorial is concerned, you won’t need to use our SERP Scraping API that much.


Step 1: Getting started

First things first, you’ll want to sign up to our dashboard. It’s super easy and takes no time at all. Once you’ve gone through the whole process of creating an account and verifying it, go ahead and log in.  

In the menu on your left, click on “SERP” (under the “Scraping APIs” section) and subscribe to a plan of your choosing. Then, choose the user:pass authentication method and create a user. After that, you’re pretty much all set!

Oh and if you’re having trouble or you’d simply like to learn more about our API, check out our documentation. There you’ll find detailed information that almost literally covers it all. 

Step 2: Accessing Terminal

Now that you have access to Smartproxy’s SERP API, it’s finally time to start scrapin’. If you’re using Mac OS, open Spotlight or head over to Finder’s Applications > Utilities folder and launch Terminal. For Linux users, you can either press Ctrl+Alt+T or click on “Dash” and type in Terminal.

Step 3.1: Let’s start cURL’ing

It’s time to enter your first line of code! Don’t worry, there’s actually not a lot of it. And to make it as clear as possible, we’ll go line by line, so you know exactly what’s going on. The line you’re about to enter will create a request to send information:

curl --request POST \

Press Enter – the trailing backslash tells Terminal that the command continues on the next line – and then add another line that specifies the endpoint. The URL used here will always be the same no matter the scraping request. Just go ahead and re-use it for future requests.

--url https://scrape.smartproxy.com/v1/tasks \ 

Add another header line – it will specify the format of received information after this request: 

--header 'Accept: application/json' \ 

It’s time to show some credentials – the username and password you created for your proxy user. But you can’t just write them in as they are; you’ll have to Base64-encode them first. 

You can do that straight from your Terminal. The command below pipes your credentials through openssl, which prints back the encoded version – no external website involved. Run it with your own login information before you continue with the request.

For this tutorial, we encoded “user:pass”, which looks like this: dXNlcjpwYXNz. Yours, of course, will be different.

echo -n 'user:pass' | openssl base64
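If you want to double-check the result, you can decode it right back in Terminal. The snippet below encodes the “user:pass” placeholder from this tutorial and then decodes it again:

```shell
# Encode the placeholder credentials, then decode them to verify round-trip.
encoded=$(echo -n 'user:pass' | openssl base64)
echo "$encoded"                       # dXNlcjpwYXNz
echo "$encoded" | openssl base64 -d   # user:pass
```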

Now that you have your credentials, let’s add another header that will let you access our API.

--header 'Authorization: Basic dXNlcjpwYXNz' \

Next up – we’ll add a header that tells the server the format of the data we’re about to send in the request body.

--header 'Content-Type: application/json' \

And finally, with the code line below, we’ll indicate that we’re about to add the details of our request.

--data ' 
{

Step 3.2: Getting what you need

You’re halfway there! Now comes the fun part. We’re going to specify what kind of information we want, where we want to get it from, what language and locale to scour. We’ll also make sure to have the information parsed. Otherwise, you’ll get a block of messy text filled with letters, numbers, and symbols. 

For the purpose of this tutorial let’s say we’re looking for the best pizza in London. Go ahead and type in the rest of the code for this request: 

"target": "google_search",
"query": "best pizza",
"parse": true,
"locale": "en-GB",
"google_results_language": "en",
"geo": "London,England,United Kingdom"
}
'

It doesn’t seem that overwhelming now that we got it all broken down line by line, right? Now that we have our whole request, it should look like this (you can totally copy-paste this straight to your Terminal. Just don't forget to change the authorization header.):

curl --request POST \
--url https://scrape.smartproxy.com/v1/tasks \
--header 'Accept: application/json' \
--header 'Authorization: Basic dXNlcjpwYXNz' \
--header 'Content-Type: application/json' \
--data '
{
"target": "google_search",
"query": "best pizza",
"parse": true,
"locale": "en-GB",
"google_results_language": "en",
"geo": "London,England,United Kingdom"
}
'
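If you plan to run several searches, it can be handy to wrap the same request in a small shell script so the query and location live in variables. This is just a sketch – “user:pass” is a placeholder for your real proxy user, and the curl call is commented out so nothing gets sent until you fill in actual credentials:

```shell
#!/bin/sh
# Build the same request body as above from variables, so it's easy
# to change the search without retyping the whole command.
AUTH=$(echo -n 'user:pass' | openssl base64)   # replace with your proxy user
QUERY="best pizza"
GEO="London,England,United Kingdom"

PAYLOAD=$(printf '{
  "target": "google_search",
  "query": "%s",
  "parse": true,
  "locale": "en-GB",
  "google_results_language": "en",
  "geo": "%s"
}' "$QUERY" "$GEO")

echo "$PAYLOAD"   # inspect the body before sending it

# Uncomment to actually send the request:
# curl --request POST \
#   --url https://scrape.smartproxy.com/v1/tasks \
#   --header 'Accept: application/json' \
#   --header "Authorization: Basic $AUTH" \
#   --header 'Content-Type: application/json' \
#   --data "$PAYLOAD"
```

Change QUERY and GEO, re-run the script, and you have a brand-new search without touching the rest of the request.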

There, your request is complete! After a couple of seconds, you should get the information you requested in a neatly parsed format. All that’s left to do is go through the provided links and find out where that one and only pizza is at.

{
  "results": [
    {
      "content": {
        "url": "https://www.google.com/search?q=best+pizza&lr=lang_en&nfpr=1&hl=en",
        "page": 1,
        "results": {
          "paid": [],
          "organic": [
            {
              "pos": 1,
              "url": "https://www.cntraveller.com/gallery/best-pizza-london",
              "desc": "15 Jul 2021 \u2014 The best pizza in London 2021 \u00b7 Voodoo Rays. Voodoo Ray's. DALSTON. It isn't just the pizzas that are New York-style here: the neon signage,\u00a0...",
              "title": "The best pizza in London 2021 | CN Traveller",
              "url_shown": "https://www.cntraveller.com\u203a ... \u203a Eating & Drinking",
              "pos_overall": 7
            },
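Once the response arrives, you don’t have to read the JSON by eye. Assuming you’ve saved it to a file called response.json (the file name is arbitrary), a quick grep/sed pipeline pulls out just the result URLs – a proper JSON tool like jq is nicer if you have it, but this works with what ships with Terminal:

```shell
# A tiny sample of the response structure shown above, saved to a file:
cat > response.json <<'EOF'
{"results":[{"content":{"results":{"organic":[
  {"pos":1,"url":"https://www.cntraveller.com/gallery/best-pizza-london"}
]}}}]}
EOF

# Pull every "url" value out of the JSON:
grep -o '"url": *"[^"]*"' response.json | sed 's/.*"url": *"//; s/"$//'
```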

Wrapping up

And there we have it – you’ve just successfully web scraped! Granted, it might still seem a little much, but we all have to start somewhere. With a bit of patience and practice, you’ll soon be taking on big web scraping projects. 

Hopefully, now you have a better understanding of how web scraping works and how beneficial it is to gather data, compare and analyze it. Try out this Google SERP scraping technique with other requests and start getting results!


Mariam Nakani

Say hello to Mariam! She is very tech savvy - and wants you to be too. She has a lot of intel on residential proxy providers, and uses this knowledge to help you have a clear view of what is really worth your attention.

Frequently asked questions

Can I use cURL with Windows Command Prompt?

Absolutely. Just bear in mind that it won’t exactly be your standard copy-paste situation. cURL itself works the same way on Windows, but Command Prompt’s syntax differs: line continuations use the caret (^) instead of the backslash (\), and the JSON payload needs to be wrapped in double quotes with the inner quotes escaped. Other than that, the basic principle is similar. 

Is it “okay” to scrape Google SERP?

While there are no laws that explicitly forbid it, Google takes scraping pretty seriously. That’s why they’ve put up defensive mechanisms to counter scraping attempts, making it much more challenging. Sure, with proxies and APIs you can get around them, but you should always be aware of what information you’re scraping and make sure you’re not violating anything. 

I’m having trouble following this tutorial. What should I do?

It’s totally fine if you’ve hit a roadblock. If you’re stuck, just drop a line to our customer support – no matter what happens, we’re here to help you 24/7. Don’t forget our help documentation as well; it’s packed with useful information to get you back on track. 

What proxies are best for web scraping?

Web scraping is definitely on the rise. Individuals from different industries and professions benefit from it. However, to successfully web scrape without interruptions, the best option would be to get rotating residential proxies. With these, you won’t have to worry about being blocked or fighting off myriads of CAPTCHAs.