

If you’re serious about doing business on Amazon, you should monitor your competitors. Our eCommerce Scraping API lets you gather structured, ready-to-digest data from Amazon so you can stand out from the competition. Learn more handy tips & tricks on Amazon data gathering in our latest webinar.



Hello, and welcome to the first-ever Smartproxy webinar. My name is Nathan, and I'm a sales team lead here at Smartproxy. Every day I get to consult numerous businesses, and during my time here, I've learned that collecting eCommerce data can provide great insights for growing businesses.


However, it may bring some challenges, especially when it comes to scraping giants such as Amazon. That's why today I'm joined by Paulius. Hello, Nathan. He's our research and development manager, and he will be joining me in a discussion about eCommerce scraping, especially Amazon scraping, and how to overcome the challenges associated with it. I even heard he will be showing one of our scraping tools in action. Is that right? Yes, and that's not all. I will also try to demonstrate not only how you can get the data but how you can use it for business insights. Oh, so stay tuned for that.


But first, Paulius, I would like to ask you about the tools which are used to gather eCommerce data. As far as I know, there are two worthy options: proxies and scraping APIs. So, how do they differ exactly? Yeah, well, if you're just starting out, it can be a hard question to answer.


But to make it easier for you, you can ask yourself three questions. What is your budget? What kind of locations do you need data from? And how big is your engineering team, and do they have knowledge in scraping or parsing? If you're just starting out and you have a small budget, then I would suggest going for the datacenter proxies simply because this is the greatest value for the price.


Once you start scraping, you may notice that you occasionally get CAPTCHAs and blocks. This is because datacenter proxies are very easy to detect. If you need more locations, you may also run into trouble because datacenter proxies usually offer only a limited number of endpoints to choose from. If budget is not a problem, then I would suggest mixing datacenter proxies with residential proxies to get the best of both worlds.


So, with residential proxies, you're getting a real network of human devices that offers locations from all over the world, along with the ability to bypass CAPTCHAs and blocks, as these proxies are very hard to detect. And you can mix them with datacenter proxies.


So, for example, you first call the website with a datacenter endpoint, and only if you get blocked or run into issues do you switch to the residential network. Of course, that adds complexity to your project and can be challenging for some people. If you only want to use residential networks to scale your business, you can also do that, as residential proxies come with a pay-as-you-go option, which allows you to pay only for each gigabyte that you use.
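The datacenter-first, residential-fallback pattern Paulius describes can be sketched in a few lines of Python. This is a minimal illustration, not Smartproxy's actual client code: the gateway addresses, ports, and credentials below are placeholders, and the block-detection heuristic is our own simplistic assumption.

```python
import urllib.error
import urllib.request

# Placeholder gateways -- substitute the real proxy endpoints and the
# credentials from your dashboard.
DATACENTER_PROXY = "http://user:pass@dc.gateway.example:10000"
RESIDENTIAL_PROXY = "http://user:pass@residential.gateway.example:7000"

BLOCK_STATUSES = {403, 429, 503}  # status codes that commonly signal a block


def is_blocked(status_code, body=""):
    """Heuristic block detector: typical anti-bot status codes or a CAPTCHA page."""
    return status_code in BLOCK_STATUSES or "captcha" in body.lower()


def fetch(url):
    """Try the cheaper datacenter route first; retry through the residential
    network only if the response looks like a block."""
    for proxy in (DATACENTER_PROXY, RESIDENTIAL_PROXY):
        opener = urllib.request.build_opener(
            urllib.request.ProxyHandler({"http": proxy, "https": proxy})
        )
        try:
            with opener.open(url, timeout=15) as resp:
                body = resp.read().decode("utf-8", errors="replace")
                if not is_blocked(resp.status, body):
                    return body
        except urllib.error.HTTPError as err:
            if not is_blocked(err.code):
                raise
    raise RuntimeError("Blocked on both datacenter and residential routes")
```

In a real project you would refine `is_blocked` for the specific site's anti-bot responses, but the escalation logic stays the same.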


So, you can start with just one gigabyte and then scale up for as long as you need. And if you don't have a big team of engineers or any engineering knowledge about scraping or parsing, then my suggestion is to use the eCommerce Scraping API, simply because it's a solution built for everyone: a tool that delivers data from giants such as Amazon in a clearly readable format, such as JSON. It also comes with built-in proxy rotation and fingerprint security, so you don't have to worry about any of these things at all. It looks really interesting.


So, when it comes to eCommerce, what type of data is the most valuable then? I would say there are two types of data that we should be looking at from the very beginning. Of course, the first and most obvious option is the pricing of your product. How is your product priced compared to your competition? Do you offer the same value as your competition for the same price? All of these things you can check by looking at competitor prices, competitor descriptions, the features they offer with their products, and more.


And the second option, which is a very valuable insight for your business, is reviews. So, not only your reviews but also reviews of your competition. You should always be looking at how other customers are rating your competitors, what kind of ratings they are getting, and which kinds of reviews are being overlooked, so you can use those potential customer complaints as data insights to improve your own business.


So, to put this into a real example, let me show you how it's done. We will start by opening Smartproxy's GitHub page and going to the eCommerce Scraping API repository, where we will find examples of how our eCommerce scraper can be used. Here you can also see that you can claim this scraper as a three-day free trial and get 3,000 requests to test with. Now I'll go to the Postman collection, click Run in Postman, and then choose Postman for Mac or Postman for Windows, depending on your platform.


And also, Postman is a free application, so anyone can do this. I will now import this collection, and once it is imported, I will go to variables and set up my username and password that I will use to scrape that collection. You can get your username and password in our dashboard under Authentication.


Once I save my credentials, I will go to Amazon, look for Amazon Search, and then go to the body. This allows me to enter any query or, in other words, a search keyword that will be looked up on Amazon. For this particular case, let's say I'm trying to build a business in the wireless charger niche. So, I'll write 'wireless charger,' click Send, and in just a few seconds, I will see my response. Here I'm returned all kinds of information about the different products in the category, and as you can see, they have different numbers of reviews. So what I can do is take the product ID of this wireless charger, go to the Amazon Reviews Scraper, and put it into the query.
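Outside of Postman, the same search step comes down to sending a small task payload and picking the product IDs out of the parsed response. The sketch below is illustrative only: the endpoint URL and the field names (`target`, `query`, `parse`, `asin`) are assumptions stand-ins for whatever the Postman collection and the API docs actually specify.

```python
# Hypothetical endpoint -- use the one from the Postman collection / API docs.
API_URL = "https://scrape.example-api.com/v1/task"


def build_search_payload(query):
    """Task body equivalent to typing a keyword into the Postman request body."""
    return {"target": "amazon_search", "query": query, "parse": True}


def extract_product_ids(items):
    """Pull product IDs (ASINs) out of a parsed search result list, so they
    can be fed into the reviews scraper as the next query."""
    return [item["asin"] for item in items if item.get("asin")]
```

The actual POST (with your username and password as Basic Auth) would go to `API_URL` with `build_search_payload("wireless charger")` as the JSON body.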


What this will do is return all the reviews written under that product. As you can see, we get the title, the author, the rating, and the content of each review. So, if we sort these reviews by the lowest ratings, we can use them to get some insights about what went wrong. Now you may be asking, 'Should I go through all these reviews myself?' The answer is no. We are living in an age where AI does much of this work for us, and this is no exception: the Google Cloud Natural Language API can actually read a review for us, analyze it, and give us its sentiment.
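The "sort by the lowest ratings" step is easy to do locally once the reviews come back as JSON. A small helper, assuming each review dict carries a numeric `rating` field as in the demo response:

```python
def lowest_rated(reviews, max_rating=3):
    """Keep reviews at or below `max_rating` stars and sort them worst-first,
    so the most critical feedback surfaces at the top of the list."""
    critical = [r for r in reviews if r["rating"] <= max_rating]
    return sorted(critical, key=lambda r: r["rating"])
```

The 3-star cutoff is an arbitrary choice; adjust it to however strictly you want to define a "negative" review.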


So, here we see the Google Cloud Natural Language processor analyzing this data for us. If we click on Sentiment, we will find the score, which tells us whether the review was written in a positive or a negative way. We can then automatically pull the negative ones into some channel where we store and analyze them one by one. So that's it for the demo. I hope you enjoyed it, and of course, in production, this would all be done through code by connecting different APIs to get the business insights that you need. Thank you for this interesting demo, Paulius.
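In code, the sentiment step uses the `google-cloud-language` client library; the API returns a document sentiment score between -1.0 (very negative) and 1.0 (very positive). The sketch below assumes you have the library installed and application credentials configured, and the -0.25 cutoff for "negative" is our own arbitrary threshold, not something the API prescribes.

```python
def is_negative(score, threshold=-0.25):
    """Treat a document sentiment score below the threshold as negative.
    Scores from the API range from -1.0 to 1.0; tune the cutoff for your data."""
    return score < threshold


def review_sentiment(text):
    """Score one review with Google Cloud Natural Language.
    Requires `pip install google-cloud-language` and application credentials."""
    from google.cloud import language_v1  # imported lazily: optional dependency

    client = language_v1.LanguageServiceClient()
    document = language_v1.Document(
        content=text, type_=language_v1.Document.Type.PLAIN_TEXT
    )
    response = client.analyze_sentiment(request={"document": document})
    return response.document_sentiment.score
```

Reviews where `is_negative(review_sentiment(text))` holds are the ones worth routing into your complaints channel.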


So, now that we know how to collect the data, one question crosses my mind. Is scraping Amazon fully legal? Amazon is a fully public website, so any data you can see there is accessible to everyone. Of course, when scraping, you should take note and understand that some laws may apply, such as data protection laws, contract laws, and others, so you should always consult with a professional. But to give you a simple example: when you are gathering data from a competitor, are you using any parts of their descriptions, photos, or anything else that could be copyrighted? If not, then in most cases, everything should be okay.


Interesting, so what type of challenges do our users face while scraping Amazon? And also, what type of challenges are we facing on our end? Well, as I already mentioned, the main challenges that users face are blocks and CAPTCHAs. If you are doing something wrong, for example, not managing your fingerprint correctly, then the chances that you will get blocked are very high, especially if you're doing it at scale.


And the second challenge, which also applies to customers building their own infrastructure, is monitoring for page changes. Amazon is a big company; they have a lot of employees, and their pages are constantly changing. They're constantly innovating how they display products, which product is placed under which category, and so on. For us, it's a pain because we have to monitor these changes and then update our code so that users gathering eCommerce data don't run into errors in their process.


Of course, if you're doing this on your own, it's also a big challenge because you're not only maintaining the parsing, but you also have to build the whole monitoring infrastructure so you can see and understand when the page changes its layout. I see, so it's really important to maintain and update the tool, right? So, to finish, Paulius, I wanted to ask how our users can integrate our Scraping APIs into the automation tools they're already using. Well, that's a good question, and the first tool that comes to my mind is Zapier, as it's been growing in popularity over the years.


So, in this particular example, what you could do is create a step that calls our eCommerce Scraping API and then acts based on the result. One way to use it is to create a step that calls the API and, when it returns data, puts that data into a .csv format, which you can then export into Google Sheets. Now you have a Google Sheet with all sorts of information, and for it to be insightful, you need to scrape periodically.
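The "put the data into a .csv" step can be done with Python's standard `csv` module. This is a generic sketch: the field names below are example column choices, not a schema the API guarantees.

```python
import csv
import io


def products_to_csv(products, fieldnames=("asin", "title", "price", "rating")):
    """Serialize scraped product dicts into CSV text ready for a Google Sheets
    import (or a Zapier 'create spreadsheet row' step). Extra keys in the
    dicts are ignored; the column order follows `fieldnames`."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(fieldnames), extrasaction="ignore")
    writer.writeheader()
    writer.writerows(products)
    return buf.getvalue()
```

Feeding the parsed JSON results through `products_to_csv` gives you one snapshot file per scheduled run.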


So, you set up the scraping process every day or every week, depending on your needs, and then, once you have these Google Sheets delivered to you, you start comparing them. Now you can see the price movements, changes in descriptions, or changes in reviews. You can see if one competitor is growing, particularly in the number of reviews. What are they doing? Maybe they're sending out custom gift cards or something else in exchange for those reviews? You can really build up that insight for your business. Wow, it's very versatile, which I think is needed when you have a tool that collects data at such a high scale, right?
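Comparing two periodic snapshots is a simple diff once each one is keyed by product ID. A minimal sketch, assuming each snapshot is a `{asin: price}` mapping built from the exported sheets:

```python
def price_changes(old_prices, new_prices):
    """Compare two snapshots taken on different days and report the price
    movement for every product present in both (new minus old)."""
    changes = {}
    for asin, new in new_prices.items():
        old = old_prices.get(asin)
        if old is not None and old != new:
            changes[asin] = new - old
    return changes
```

The same pattern works for review counts or any other numeric column you track between runs.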


So, as we mentioned before, our eCommerce Scraping API isn't the only beneficial tool for your scraping setup. If our residential solutions piqued your interest, you can always choose our Pay As You Go plan. It's $12.50 per gigabyte with no limitations and no commitments. And if you have specific inquiries, you can always talk to our experts. Paulius, thank you for this discussion. It was super insightful. No problem, it was fun. And I will see you guys next time, and remember, scrape smart, not hard. Bye!
