Best Practices for Scraping

Support / help / discussion forum for twitter bot
MattGreene
Posts: 16
Joined: Mon Jun 26, 2023 7:24 am

Post by MattGreene

I know things are a bit chaotic right now after Musk's announcement yesterday about the post limits, but once the dust settles, I want to check whether I am scraping the right way, or at least learn how to make it look more organic.

Let's say I have a small list of 5 accounts whose followers I want to scrape, and I will use 10 of my own accounts to do the job. I go to the Scrape section, load the list of those source accounts in the "Scrape Profiles" tab, and in Custom Search I select "Followers of" with "Details" as the output option. In "Select accounts" I select 10 of my accounts.

Then I move to the Processing tab, and here is where the tricky part begins, as there are 2 things I don't understand:

1) Randomizing the action. As I understand it, by default all of my accounts share the same task and run concurrently. How can I adjust variables such as delays, the batch size per scrape, etc.? I presume I need to right-click the action and select "Edit Action", and then probably enter values for "WaitAfterPreRunCmd" and "WaitAfterCompletionSeconds"? I used to use FollowLiker and MassPlanner for this. There I would set how many seconds to wait before each operation (scrape) repeats, plus a deviation of a few seconds up or down. I would also specify how many followers to extract as a range (for example 100-200), so each operation pulled a slightly different number. I presume Twitterdub can do that too; I just need to figure out how it works.
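To make the kind of randomization I mean concrete, here is a rough Python sketch of what I used to configure in the other tools: a base delay with a deviation of a few seconds either way, and a batch size drawn from a range. This is not Twitterdub's API or its settings; the function names and the 60-second base delay are made up purely for illustration.

```python
import random

def jittered_delay(base_seconds, deviation_seconds, rng=random):
    """Return a delay within +/- deviation_seconds of base_seconds, never negative."""
    delay = base_seconds + rng.uniform(-deviation_seconds, deviation_seconds)
    return max(0.0, delay)

def random_batch_size(low=100, high=200, rng=random):
    """Return how many followers to extract in one operation (inclusive range)."""
    return rng.randint(low, high)

if __name__ == "__main__":
    rng = random.Random(42)  # seeded only so the demo is repeatable
    for run in range(3):
        # Hypothetical values: 60 s base delay, 5 s deviation, 100-200 per batch.
        print(run, round(jittered_delay(60, 5, rng), 1), random_batch_size(100, 200, rng))
```

Each run prints a different wait time and batch size, which is the behaviour I would like to reproduce in the action settings.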

2) Safely scraping more than the default limit of 5,000. Raising the number itself is straightforward, but I believe my accounts may get banned (or my proxies blacklisted, although they are residential and I use a separate proxy for each account). So what would be the best way to scrape the entire follower list of each source account (let's say 100,000 followers), using multiple accounts that take turns, so that no single account performs too many actions?
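The rotation I have in mind looks roughly like the Python sketch below: the 100,000 followers are cut into random batches of 100-200 and the batches are handed to the accounts in turn, so each of the 10 accounts ends up doing about a tenth of the work. This is only the arithmetic of the plan, not anything Twitterdub exposes; `plan_rotation` and the account names are hypothetical.

```python
import random
from collections import defaultdict
from itertools import cycle

def plan_rotation(total, accounts, low=100, high=200, rng=random):
    """Split `total` items into random-sized batches assigned round-robin to accounts."""
    plan = []
    remaining = total
    for account in cycle(accounts):
        if remaining <= 0:
            break
        # The final batch is trimmed so the sizes add up to exactly `total`.
        size = min(rng.randint(low, high), remaining)
        plan.append((account, size))
        remaining -= size
    return plan

if __name__ == "__main__":
    accounts = [f"account_{i}" for i in range(1, 11)]  # hypothetical names
    plan = plan_rotation(100_000, accounts, rng=random.Random(7))
    per_account = defaultdict(int)
    for account, size in plan:
        per_account[account] += size
    for account in accounts:
        print(account, per_account[account])
```

With 10 accounts each one ends up with roughly 10,000 followers in total, spread over many small batches, instead of a single account pulling all 100,000.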

Sorry for the long question(s), I hope it's clear enough.

Thanks in advance, Martin!