Scraping Issues

Onebytewonder
Posts: 21
Joined: Tue Sep 11, 2018 11:58 am

Scraping Issues

Post by Onebytewonder »

Hi there,

I've been attempting to use functions within the scrape module of TwitterDub but keep getting failures and errors.
If anybody can help interpret the error I'm getting it would be massively appreciated. :D

Below is an example:

1. Navigated to the scrape module
2. Scrolled down to the 'Scrape Tweets' section
3. Set up a 'custom search' to scrape recent tweets containing the term 'youtube music'
4. Attempted to scrape tweets

Result

20:59:16: Starting: 19/09/2019 20:59 PM
20:59:19: Perform custom search
20:59:19: Max Results Wanted: 899934
20:59:19: Setup custom search controller
20:59:19: One search stage detected, setting default per item per stage value to max: 899934
20:59:19: Custom search run: search: youtube music
20:59:19: Perform custom search: #chain/total: 1/1 Tweet Search (popular) using: xxxxxxx
20:59:19: Start search: Tweet Search (popular) with: youtube music(899934) using: xxxxxxx
20:59:19: loaded: dbsearchfiltertweet 0
20:59:19: PreTokens: youtube music
20:59:19: UserDetails: https://mobile.twitter.com/xxxxxxx with: xxxxxxx No Proxy
20:59:19: Scrape from users: 1
20:59:19: Scrape details from: 1 inputs
20:59:19: Set headers
20:59:19: Set CsrfToken
20:59:19: Set AuthorisationBearer
20:59:19: Getting user details for :1 ids
20:59:19: GET: https://api.twitter.com/1.1/users/looku ... me=xxxxxxx
20:59:20: GET: 404
https://api.twitter.com/1.1/users/looku ... me=xxxxxxx
20:59:20: FAILED GET: (659446) 404
https://api.twitter.com/1.1/users/looku ... me=xxxxxxx
20:59:20: * WARNING: Request failed with Twitter response: No user matches for specified terms.
20:59:20: No users found: xxxxxxx
20:59:20: error: deserialising: LBtQjTNf2 type: List`1
20:59:20: {"errors":[{"code":17,"message":"No user matches for specified terms."}]}
20:59:20: ERROR: scrape users from ID :Cannot deserialize the current JSON object (e.g. {"name":"value"}) into type 'System.Collections.Generic.List`1[LibTwitterJsonData.JsonUser]' because the type requires a JSON array (e.g. [1,2,3]) to deserialize correctly.
To fix this error either change the JSON to a JSON array (e.g. [1,2,3]) or change the deserialized type so that it is a normal .NET type (e.g. not a primitive type like integer, not a collection type like an array or List<T>) that can be deserialized from a JSON object. JsonObjectAttribute can also be added to the type to force it to deserialize from a JSON object.
Path 'errors', line 1, position 10.
20:59:20: No users found for: https://mobile.twitter.com/xxxxxxx
20:59:20: * FAILED: UserDetails: https://mobile.twitter.com/xxxxxxx with: xxxxxxx
20:59:20: 0
<null>
<null>
20:59:20: * ERROR: Object reference not set to an instance of an object.
20:59:20: Results: 0
20:59:20: Saving to: C:\Users\xxxxxx\AppData\Roaming\rootjazz\Twitterdub\saved_data\search_youtubemusic_data_csv_2019-09-19.txt
20:59:20: Started: 19/09/2019 20:59 PM
Finished: 19/09/2019 20:59 PM
ID: ed2ce113-4c40-4030-a56d-bd8702d85439
20:59:20: Action ran for: 0hr:0min:3s
20:59:20: finished processThread: dbsearchfiltertweet ed2ce113-4c40-4030-a56d-bd8702d85439
20:59:20: Action completed as Run in Additional Thread - removing thread
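For anyone decoding the log above: the "deserialising" error appears because the user lookup returned a JSON object with an `errors` key (code 17, "No user matches for specified terms.") while the program expected a JSON array of users. The Python sketch below only illustrates checking the response shape before parsing. It is not TwitterDub's code (the log suggests that is .NET), and `parse_user_lookup` is a hypothetical name.

```python
import json


def parse_user_lookup(body: str) -> list:
    """Return the list of user objects, or raise with the API's own message."""
    data = json.loads(body)

    # An error response is a JSON object carrying an "errors" list,
    # e.g. {"errors":[{"code":17,"message":"..."}]} as seen in the log.
    if isinstance(data, dict) and "errors" in data:
        details = "; ".join(
            f'{err.get("code")}: {err.get("message")}' for err in data["errors"]
        )
        raise LookupError(f"user lookup failed: {details}")

    # Assumption: a successful lookup is a JSON array of user objects.
    if not isinstance(data, list):
        raise ValueError("unexpected response shape")
    return data
```

Handled this way, the failure surfaces as a single message naming the real cause (no matching user) instead of a deserialisation error followed by a null reference.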
martin@rootjazz
Site Admin
Posts: 34360
Joined: Fri Jan 25, 2013 10:06 pm
Location: The Funk

Re: Scraping Issues

Post by martin@rootjazz »

Hard to say, as you have removed some details from the logs, but I think you are trying to run a search and have specified that the profile username is

youtube music

which is not valid

https://twitter.com/youtube music

is not a valid URL, so the program reports that there is no user called "youtube music".


Actually, I think you have specified that YOUR profile is "youtube music", which has the same problem: it is not a valid username / permalink.

Re: Scraping Issues

Post by Onebytewonder »

Hi Martin,

I've just submitted logs via Twitterdub - Log 58509

Would be great if you could guide me towards resolving this.

Thanks again

https://ibb.co/SfCQ8Nh


Starting: 20/09/2019 14:57 PM
Perform custom search
Max Results Wanted: 899934
Setup custom search controller
One search stage detected, setting default per item per stage value to max: 899934
Custom search run: search: youtube music
Perform custom search: #chain/total: 1/1 Tweet Search (recent) using: @mr_00000
Start search: Tweet Search (recent) with: youtube music(899934) using: @mr_00000
UserDetails: https://mobile.twitter.com/@mr_00000 with: @mr_00000 No Proxy
Scrape from users: 1
Scrape details from: 1 inputs
Getting user details for :1 ids
FAILED GET: (660740) 404
https://api.twitter.com/1.1/users/looku ... =@mr_00000
* WARNING: Request failed with Twitter response: No user matches for specified terms.
No users found: @mr_00000
error: deserialising: qXYclExQW type: List`1
ERROR: scrape users from ID :Cannot deserialize the current JSON object (e.g. {"name":"value"}) into type 'System.Collections.Generic.List`1[LibTwitterJsonData.JsonUser]' because the type requires a JSON array (e.g. [1,2,3]) to deserialize correctly.
To fix this error either change the JSON to a JSON array (e.g. [1,2,3]) or change the deserialized type so that it is a normal .NET type (e.g. not a primitive type like integer, not a collection type like an array or List<T>) that can be deserialized from a JSON object. JsonObjectAttribute can also be added to the type to force it to deserialize from a JSON object.
Path 'errors', line 1, position 10.
No users found for: https://mobile.twitter.com/@mr_00000
* FAILED: UserDetails: https://mobile.twitter.com/@mr_00000 with: @mr_00000
* ERROR: Object reference not set to an instance of an object.
Results: 0
Saving to: C:\Users\opal0000\AppData\Roaming\rootjazz\Twitterdub\saved_data\search_youtubemusic_ids_2019-09-20_4.txt
Started: 20/09/2019 14:57 PM
Finished: 20/09/2019 14:57 PM
ID: 737d6cef-87f8-40e5-9448-9961b4e6c626
Action ran for: 0hr:0min:4s

Re: Scraping Issues

Post by Onebytewonder »

Hey Martin,

So I've just tried the same custom search query with the term 'Youtube Music' with a different Twitter account and all seems to be working.

Wasn't obvious to me from looking at the logs that there was anything wrong with the previous accounts that I attempted.

Anyway, all good now.

Thanks for your time as always.

Re: Scraping Issues

Post by Onebytewonder »

Hi Martin,

Has the option to scrape emails been removed from the latest version of Twitterdub?

Just can't seem to find it within the scraping tab.


Also, is there a way to scrape the content of tweets (with hashtags & text etc)?
I tried the 'Scrape body' details option but only the Tweet URLs were returned in the saved data.


Thanks

Re: Scraping Issues

Post by martin@rootjazz »

Onebytewonder wrote: Fri Sep 20, 2019 2:02 pm
Starting: 20/09/2019 14:57 PM
Perform custom search
Max Results Wanted: 899934
Setup custom search controller
One search stage detected, setting default per item per stage value to max: 899934
Custom search run: search: youtube music
Perform custom search: #chain/total: 1/1 Tweet Search (recent) using: @mr_00000
Start search: Tweet Search (recent) with: youtube music(899934) using: @mr_00000
UserDetails: https://mobile.twitter.com/@mr_00000 with: @mr_00000 No Proxy

Again, you are not providing valid inputs:

https://mobile.twitter.com/@mr_00000

that is not a valid URL

Your username is NOT: @mr_00000
When you log into twitter, you don't enter
username: @mr_00000
password: xxxxxxx

So the program cannot use your account as you specified it incorrectly.
Username is: mr_00000
WITHOUT @ sign

When performing searches you can use that notation, but when entering your profile username, do not include it.
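
The kind of input check described here can be sketched in a few lines of Python. This is an illustration only, not how Twitterdub handles account details. `normalize_username` is a hypothetical helper, and the rule it enforces (1 to 15 characters, letters, digits and underscores) is an assumption about Twitter usernames.

```python
import re

# Assumed rule: 1 to 15 characters, letters, digits and underscores only.
_USERNAME_RE = re.compile(r"[A-Za-z0-9_]{1,15}")


def normalize_username(raw: str) -> str:
    """Strip a leading '@' and surrounding spaces, then validate what is left."""
    name = raw.strip()
    if name.startswith("@"):
        name = name[1:]
    if not _USERNAME_RE.fullmatch(name):
        raise ValueError(f"not a valid username: {raw!r}")
    return name
```

With this sketch, "@mr_00000" is accepted and stored as "mr_00000", while "youtube music" is rejected because of the space. Whether a form should silently strip the "@" or reject the input outright is a design choice; stripping is just the more forgiving option.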

Re: Scraping Issues

Post by martin@rootjazz »

Onebytewonder wrote: Fri Sep 20, 2019 3:29 pm
Hey Martin,

So I've just tried the same custom search query with the term 'Youtube Music' with a different Twitter account and all seems to be working.

Wasn't obvious to me from looking at the logs that there was anything wrong with the previous accounts that I attempted.

Anyway, all good now.

Thanks for your time as always.

I guess the other account had the username / account details specified correctly.

This isn't the first time this has happened. I should update the account details form to check for these typical issues in the next update.

Re: Scraping Issues

Post by martin@rootjazz »

Onebytewonder wrote: Fri Sep 20, 2019 4:57 pm
Hi Martin,

Has the option to scrape emails been removed from the latest version of Twitterdub?

Should be at the top of the scraper tab, the first option (or so).

Also, is there a way to scrape the content of tweets (with hashtags & text etc)?
I tried the 'Scrape body' details option but only the Tweet URLs were returned in the saved data.

That doesn't sound right, let me test and get back to you.

Re: Scraping Issues

Post by Onebytewonder »

Hey Martin,

So the scrape functions are in fact working as designed in the latest version of Twitterdub.

I was checking the 'Saved Data' txt files whilst the processor actions were still in progress.
Doing so showed a list of tweet urls within the generated file, which led me to believe that it wasn't scraping the specified 'Tweet Body' data.

However, once the processor action completes the correct data is shown in the txt file.

Hope this makes sense.

All is good now.

Thanks :)

Re: Scraping Issues

Post by martin@rootjazz »

Thanks for updating. I was seeing it save correctly, as you said, and couldn't figure out how you were only seeing IDs. I will check why it saves IDs only during the action, as that still doesn't make sense.