Instagram Scraping Setup

puplesk8er
Posts: 11
Joined: Thu Mar 26, 2020 10:56 pm

Instagram Scraping Setup

Post by puplesk8er » Thu Mar 26, 2020 11:12 pm

Hi,

I'm kinda new to these kinds of programs, but I'm looking to set up scraping of maybe 2-5k accounts a day for emails. I was wondering if anyone could help me with the setup.

Does scraping have different thresholds if I'm trying to scrape email addresses vs. other actions on Instagram? If so, any estimate?

How would you guys try to achieve that kind of volume?
--Would you only run 2-3 accounts per proxy?
--Do you need aged accounts?
--Maybe something else that would be good to know?

Thanks!

martin@rootjazz
Site Admin
Posts: 24255
Joined: Fri Jan 25, 2013 10:06 pm
Location: The Funk

Re: Instagram Scraping Setup

Post by martin@rootjazz » Fri Mar 27, 2020 4:44 pm

puplesk8er wrote:
Thu Mar 26, 2020 11:12 pm
Hi,

I'm kinda new to these kinds of programs, but I'm looking to set up scraping of maybe 2-5k accounts a day for emails. I was wondering if anyone could help me with the setup.

There are two ways to scrape emails. Both have their pros and cons.

1) You provide a list of profile URLs; the program then scrapes the IG website and checks for emails in the bio. Note that IG's limits mean you need about 1 proxy per 150 pages you want to scrape, so big scrapes need a lot of proxies. The benefit is that you don't need IG accounts or IG-quality proxies. To run this scrape, go to SCRAPE TAB > scrape emails function (top).

2) You run a normal instadub search and the program pulls the user_details endpoints and checks for emails in the bio and contact emails (business accounts only). This requires IG accounts, and big scrapes require lots of IG accounts on different IPs. To run this scrape, go to the SCRAPER tab, set your search to run and select output EMAILS.
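
As a rough sketch of the proxy arithmetic in option 1: this assumes the "about 1 proxy per 150 pages" figure holds and ignores any time window, which isn't stated here. The function name and constant are made up for illustration and are not part of the program.

```python
import math

# Approximate ratio quoted above: about 1 proxy per 150 profile pages.
# This is an estimate, not a guaranteed limit.
PAGES_PER_PROXY = 150


def proxies_needed(pages: int, pages_per_proxy: int = PAGES_PER_PROXY) -> int:
    """Return the number of proxies needed to cover `pages` profile pages."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    # Round up: a partly used proxy still has to exist.
    return math.ceil(pages / pages_per_proxy)


for target in (2000, 5000):
    print(target, "pages ->", proxies_needed(target), "proxies")
```

So a 2-5k profile target works out to roughly 14 to 34 proxies under that assumption.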

puplesk8er wrote:
Does scraping have different thresholds if I'm trying to scrape email addresses vs. other actions on Instagram? If so, any estimate?

There are limits, but a lot depends on the accounts and the volume. For big scrapes, you need to provide multiple accounts to divide up the requests.

puplesk8er wrote:
How would you guys try to achieve that kind of volume?

Define volume: 10k, 100k, a million?

puplesk8er wrote:
--Would you only run 2-3 accounts per proxy?

Ideally, one account per IP. It costs more but is safer. More accounts per IP is cheaper but riskier. The decision is yours. Test, test, test and find out what works for you.

puplesk8er wrote:
--Do you need aged accounts?

You don't NEED them, but they tend to be better. Note that an account created 2 years ago and never used is NOT the same as an account created 2 years ago and used.

puplesk8er wrote:
--Maybe something else that would be good to know?

Test, review, test, improve. Don't try to get all the answers before you start. People often get analysis paralysis, trying to come up with the perfect plan without any actual knowledge; then, when they finally start, they throw out the plan and learn by actually doing.




Regards,
Martin

puplesk8er
Posts: 11
Joined: Thu Mar 26, 2020 10:56 pm

Re: Instagram Scraping Setup

Post by puplesk8er » Sun Mar 29, 2020 7:03 pm

Thanks for the quick reply.
Last edited by puplesk8er on Sun Mar 29, 2020 7:22 pm, edited 1 time in total.

puplesk8er
Posts: 11
Joined: Thu Mar 26, 2020 10:56 pm

Re: Instagram Scraping Setup

Post by puplesk8er » Sun Mar 29, 2020 7:18 pm

Also I'm getting a lot of scrape failed messages.

FAILED: scrape failed: Status:-1

Do you know what this could be from? Even my first pull failed, so I don't think I'm hitting the threshold limit. Could it be the proxy or something else?

martin@rootjazz
Site Admin
Posts: 24255
Joined: Fri Jan 25, 2013 10:06 pm
Location: The Funk

Re: Instagram Scraping Setup

Post by martin@rootjazz » Sun Mar 29, 2020 9:54 pm

puplesk8er wrote:
Sun Mar 29, 2020 7:18 pm
Also I'm getting a lot of scrape failed messages.

FAILED: scrape failed: Status:-1

Do you know what this could be from? Even my first pull failed, so I don't think I'm hitting the threshold limit. Could it be the proxy or something else?

-1 / empty / <null> means no response was received.

This means either your network was down, the proxy wasn't working, something on your machine blocked the request, or the site's server timed out and didn't respond in time (unlikely, but it does happen).

If you are using a proxy, please confirm it is working. More info on how to do that here:

viewtopic.php?f=15&t=3453
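
If you would rather check a proxy outside the program, a minimal sketch using only Python's standard library might look like the following. The function name, the test URL and the timeout are assumptions for illustration; this is not how the program itself tests proxies, and a proxy that passes here can still be blocked by IG.

```python
import http.client
import urllib.request


def check_proxy(proxy_url: str,
                test_url: str = "https://example.com/",
                timeout: float = 10.0) -> bool:
    """Return True if `test_url` can be fetched through `proxy_url`.

    `proxy_url` looks like "http://user:pass@host:port". The default
    `test_url` is only a placeholder; any page you trust will do.
    """
    # Route both plain HTTP and HTTPS requests through the same proxy.
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    opener = urllib.request.build_opener(handler)
    try:
        with opener.open(test_url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (OSError, http.client.HTTPException):
        # Covers refused connections, timeouts, DNS failures and HTTP errors,
        # i.e. the "no response received" cases described above.
        return False


# Example (hypothetical address):
# print(check_proxy("http://127.0.0.1:8080"))
```

A result of False means no usable response came back through that proxy, which matches the Status:-1 situation.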


If the proxy appears to be working fine, please run the action again as it could have been a network "blip".


If the issue appears to be persistent, please check your antivirus / security software, as it *may* block the app from accessing the network. The two most common are:

* Windows Defender
* SmartScreen

Make sure the application's program files folder:

c:\program files (x86)\instadub

is whitelisted, and / or all the .exe and .dll files in that folder.



Please let me know what operating system you are using and how you run the program: At home / on VPS / In VM?



Also, users in some countries (e.g. China) cannot access Instagram directly and need to use a VPN to access the service. If this is not configured correctly, the program will not route through the VPN and all connections will be blocked.

You can test which IP is being used:
go to HELP > CHECK IP > CHECK 1 and CHECK 2

CHECK 1 should be your VPN IP if everything is correct.




Regards,
Martin

puplesk8er
Posts: 11
Joined: Thu Mar 26, 2020 10:56 pm

Re: Instagram Scraping Setup

Post by puplesk8er » Sun Apr 05, 2020 8:57 pm

I've been trying to figure out a way to scrape an account's large following. Is there a way to break up the request so it runs over a few days, limiting the maximum actions daily (using a few IG accounts)?

martin@rootjazz
Site Admin
Posts: 24255
Joined: Fri Jan 25, 2013 10:06 pm
Location: The Funk

Re: Instagram Scraping Setup

Post by martin@rootjazz » Mon Apr 06, 2020 3:07 pm

You can extract the cursor ID from the logs and provide it when running a new search. The program will use that ID in its requests and IG will carry on "roughly" from where you were.
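
To illustrate the general idea of resuming from a cursor, here is a small self-contained sketch of cursor-based paging with a per-run page budget. Everything in it is hypothetical: the fake data source, the function names and the cursor format are invented for the example and say nothing about how the program or IG actually encode cursors.

```python
from typing import Callable, List, Optional, Tuple

# A page is (items, next_cursor); next_cursor is None when nothing is left.
Page = Tuple[List[str], Optional[str]]


def scrape_with_budget(fetch_page: Callable[[Optional[str]], Page],
                       start_cursor: Optional[str],
                       max_pages: int) -> Tuple[List[str], Optional[str]]:
    """Fetch at most `max_pages` pages, starting from `start_cursor`.

    Returns the collected items plus the cursor to pass in on the next run.
    """
    items: List[str] = []
    cursor = start_cursor
    for _ in range(max_pages):
        page_items, cursor = fetch_page(cursor)
        items.extend(page_items)
        if cursor is None:  # reached the end of the list
            break
    return items, cursor


def make_fake_source(names: List[str], page_size: int) -> Callable[[Optional[str]], Page]:
    """Build a stand-in data source whose cursor is just a list offset."""
    def fetch_page(cursor: Optional[str]) -> Page:
        start = int(cursor) if cursor is not None else 0
        end = start + page_size
        next_cursor = str(end) if end < len(names) else None
        return names[start:end], next_cursor
    return fetch_page


names = [f"user{i}" for i in range(10)]
fetch = make_fake_source(names, page_size=3)

day1, saved_cursor = scrape_with_budget(fetch, None, max_pages=2)
day2, final_cursor = scrape_with_budget(fetch, saved_cursor, max_pages=2)
print(len(day1), saved_cursor)   # 6 6
print(len(day2), final_cursor)   # 4 None
```

In this toy version the second run picks up exactly where the first stopped; with a real service the restart point is only approximate, as noted above.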

puplesk8er
Posts: 11
Joined: Thu Mar 26, 2020 10:56 pm

Re: Instagram Scraping Setup

Post by puplesk8er » Wed Apr 08, 2020 8:54 pm

Is this just called "ID:" at the end of the logs? I see it's very long.

martin@rootjazz
Site Admin
Posts: 24255
Joined: Fri Jan 25, 2013 10:06 pm
Location: The Funk

Re: Instagram Scraping Setup

Post by martin@rootjazz » Wed Apr 08, 2020 10:39 pm

puplesk8er wrote:
Wed Apr 08, 2020 8:54 pm
Is this just called "ID:" at the end of the logs? I see it's very long.

Paste the logs and I will confirm.

puplesk8er
Posts: 11
Joined: Thu Mar 26, 2020 10:56 pm

Re: Instagram Scraping Setup

Post by puplesk8er » Wed Apr 08, 2020 11:30 pm

Just messaged you the logs.
