
How to set up a proxy in Octoparse

Octoparse is a web scraping service that collects information from across the web. The tool automatically extracts data from web resources and structures it into spreadsheets, which is much more convenient than gathering the information manually. Without a proxy, however, Octoparse is likely to be of limited use. Let's figure out why.

Is a proxy needed for Octoparse?

Web scraping is not considered illegal, but many sites do not allow it and try to protect themselves against services like Octoparse. If you scrape large volumes of data, your IP address may end up blocked by Google and other web resources.

That's why you need a proxy server: it helps you stay unnoticed on the network by hiding your IP address and replacing it with another. With several proxies, the actions of Octoparse look like visits from several people in different cities and countries at once.

Proxy settings in Octoparse

Octoparse supports proxies natively, so you don't need any additional services or programs to set one up. You can do this in two ways: when you log in or when you create a task. Let's take a look at each of these options below.

How to set up a proxy at login:

  1. Open Octoparse.
  2. In the login window, click the gear icon at the top.
  3. Select the type of your proxy server.
  4. Enter the required data: IP address, port, username, and password.
  5. To check that the proxy works, click the "Test" button.
  6. If the proxy is verified successfully, click "Confirm". Then enter the username and password of your Octoparse account and log in.
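
If you want to check a proxy before entering it into Octoparse, a rough equivalent of the "Test" button can be sketched in Python. This is only an illustration under stated assumptions: the function names `build_proxy_url` and `check_proxy`, the test address `https://example.com`, and the 10-second timeout are hypothetical choices, the proxy is assumed to be an HTTP proxy, and nothing here reflects how Octoparse performs its own test.

```python
import http.client
import urllib.request
from urllib.parse import quote


def build_proxy_url(host, port, username="", password="", scheme="http"):
    """Assemble a proxy URL such as http://user:pass@203.0.113.10:8080."""
    auth = ""
    if username:
        # Percent-encode credentials so characters like '@' or ':' stay valid.
        auth = f"{quote(username, safe='')}:{quote(password, safe='')}@"
    return f"{scheme}://{auth}{host}:{int(port)}"


def check_proxy(proxy_url, test_url="https://example.com", timeout=10):
    """Return True if test_url can be fetched through the proxy."""
    # Route both plain and TLS requests through the same proxy.
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    opener = urllib.request.build_opener(handler)
    try:
        with opener.open(test_url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        # Covers refused connections, timeouts, and HTTP-level failures.
        return False
```

For example, `check_proxy(build_proxy_url("203.0.113.10", 8080, "user", "secret"))` returns `True` only if the request through that (placeholder) proxy succeeds. A proxy that fails this kind of check is unlikely to pass the test in Octoparse either.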

How to set up a proxy when creating a task:

  1. Open the program and log in to your account.
  2. In the sidebar, click the "Task" icon, then select "Advanced Mode".
  3. Enter a link to the site you want to scrape.
  4. Click the "Save URL" button.
  5. In the task window, click the "Settings" button (or the gear icon, depending on the version of the program).
  6. In the "Anti-blocking settings" section, check the box next to "Use IP proxies".
  7. Click "Settings".
  8. Enter the proxy data in the format IP-address:Port:Username:Password, or copy and paste it from a file. You can enter as many proxy servers as you want.
  9. Click the "OK" button, save the task, and exit the task settings. Done!
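
Since the proxy list is pasted as plain text, it can help to validate the entries before copying them into the task settings. The following is a minimal sketch, assuming one IP-address:Port:Username:Password entry per line and IPv4 addresses only; the names `Proxy`, `parse_proxy_line`, and `load_proxies` are hypothetical helpers, not part of Octoparse.

```python
import ipaddress
from typing import NamedTuple


class Proxy(NamedTuple):
    host: str
    port: int
    username: str
    password: str


def parse_proxy_line(line):
    """Parse one 'IP:Port:Username:Password' entry; raise ValueError if malformed."""
    # maxsplit=3 keeps any ':' inside the password intact.
    parts = line.strip().split(":", 3)
    if len(parts) != 4:
        raise ValueError(f"expected 4 colon-separated fields: {line!r}")
    host, port_text, username, password = parts
    ipaddress.IPv4Address(host)  # raises ValueError for a malformed address
    port = int(port_text)        # raises ValueError for a non-numeric port
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return Proxy(host, port, username, password)


def load_proxies(text):
    """Split pasted text into valid proxies and (line number, error) pairs."""
    valid, errors = [], []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines between entries
        try:
            valid.append(parse_proxy_line(line))
        except ValueError as exc:
            errors.append((number, str(exc)))
    return valid, errors
```

Calling `load_proxies` on the contents of your proxy file returns the entries that parsed cleanly together with the line numbers of those that did not, so typos can be fixed before the list goes into Octoparse.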

Now you know how to set up a proxy server for Octoparse. To scrape large amounts of information, use high-quality private proxies. This reduces the risk of being blocked and lets you collect as much data as you need.