Scrape Twitter - suggestion

Discussions to do with Soundcloud Manager. Do not use for support, use the dedicated support forum for help requests
martin@rootjazz
Site Admin
Posts: 34591
Joined: Fri Jan 25, 2013 10:06 pm
Location: The Funk

Re: Scrape Twitter - suggestion

Post by martin@rootjazz »

Bartekef
Posts: 677
Joined: Thu Sep 22, 2016 12:24 pm

Re: Scrape Twitter - suggestion

Post by Bartekef »

Great, thank you! Gonna check it soon and let you know.

Anyway, SCM is running slow again. It doesn't eat my memory like it used to, but it has been performing the follow action for almost 3 hours now and has followed only 18 profiles from the .txt so far, which is not many for 3 hours. (I haven't updated SCM to the version from the link I quoted yet.)
martin@rootjazz
Site Admin
Posts: 34591
Joined: Fri Jan 25, 2013 10:06 pm
Location: The Funk

Re: Scrape Twitter - suggestion

Post by martin@rootjazz »

Soundcloud broke something in the bot, which might be your issue. I've just released an update:

https://soundcloudmanager.com/updatetesting.html
Bartekef
Posts: 677
Joined: Thu Sep 22, 2016 12:24 pm

Re: Scrape Twitter - suggestion

Post by Bartekef »

martin@rootjazz wrote:Soundcloud broke something in the bot, just updated, might be your issue

https://soundcloudmanager.com/updatetesting.html:
Hey, everything works well. The scraper works fine, too.

Could you please give me a tip on how to set up the 'regex' you mentioned to extract Twitter URLs? Ideally the easiest way.

Cheers,
Bartekef
martin@rootjazz
Site Admin
Posts: 34591
Joined: Fri Jan 25, 2013 10:06 pm
Location: The Funk

Re: Scrape Twitter - suggestion

Post by martin@rootjazz »

I think the quickest and easiest way would be to use a URL extractor: take all the extracted URLs, put them in a file, sort the file alphabetically, then just pull out the Twitter ones manually.

Search Google for: extract urls from text

The first result:
https://pgl.yoyo.org/urlex/

will extract your URLs.

Then you can use Notepad++ with the TextFX plugin to sort the lines alphabetically.
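
For anyone who would rather script it, the same extract-and-sort steps can be sketched in Python. This is only a rough sketch, not how the online extractor or Notepad++ work internally: the pattern `URL_PATTERN` and the function `extract_sorted_urls` are names made up for illustration, and the regex is deliberately simple (it only catches links that start with http:// or https://, and may keep trailing punctuation such as a full stop).

```python
import re

# Hypothetical, deliberately simple pattern: "http://" or "https://"
# followed by a run of characters that are not whitespace, quotes,
# angle brackets or semicolons.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>;]+")


def extract_sorted_urls(text):
    """Return every URL found in text, de-duplicated and sorted alphabetically."""
    urls = set(URL_PATTERN.findall(text))
    # Case-insensitive sort, so upper- and lower-case links group together.
    return sorted(urls, key=str.lower)


if __name__ == "__main__":
    sample = (
        "Follow me: https://twitter.com/xxx; https://youtube.com/xxx\n"
        "More at https://instagram.com/xxx"
    )
    for url in extract_sorted_urls(sample):
        print(url)
```

Once sorted, all the twitter.com links sit next to each other, which makes the manual step much quicker.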

8-)
Bartekef
Posts: 677
Joined: Thu Sep 22, 2016 12:24 pm

Re: Scrape Twitter - suggestion

Post by Bartekef »

Wonderful! Thank you

Now I have a txt full of URLs, but I have another obstacle in my way...

Sometimes it returns 3-4 different URLs on one line, e.g. youtube.com/xxx; twitter.com/xxx; instagram.com/xxx

instead of:

youtube.com/xxx
twitter.com/xxx
instagram.com/xxx

So even if I sort them alphabetically, I still don't have a text document with each Twitter URL on its own line and only Twitter URLs on those lines (they also contain YouTube etc.).

Do you know what I mean?

I know this isn't actually related to Soundcloud Manager, but as you can see I'm not an expert, and I would be really grateful if you could help me with this!

Cheers,
Bartekef
Bartekef
Posts: 677
Joined: Thu Sep 22, 2016 12:24 pm

Re: Scrape Twitter - suggestion

Post by Bartekef »

martin@rootjazz wrote:I think the quickest and easiest way would be to use a URL extractor, then take all extracted URLs, put them in a file, order the file alphabetically, then just extract the twitter ones manually



Search google: extract urls from text

first result:
https://pgl.yoyo.org/urlex/

will extract your URLs


then you can use notepad++ with textfx plugin to order lines alphabetically

8-)
This is what it gives me:
[img]http://fotowrzut.pl/tmp/upload/QQLK6EKOZH/1.jpg[/img]

I tried other extractors, but none met my expectations; every result has different links on one line.
martin@rootjazz
Site Admin
Posts: 34591
Joined: Fri Jan 25, 2013 10:06 pm
Location: The Funk

Re: Scrape Twitter - suggestion

Post by martin@rootjazz »

lol, that is terrible.

Let me see what I can do. Don't forget to remind / hassle me until I sort it, though.
martin@rootjazz
Site Admin
Posts: 34591
Joined: Fri Jan 25, 2013 10:06 pm
Location: The Funk

Re: Scrape Twitter - suggestion

Post by martin@rootjazz »

https://rootjazz.com/listjazz/

It now includes "scrape all URLs" and "scrape all URLs containing x".
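
For reference, the "containing x" idea can be approximated offline as well. The sketch below is an assumption about the general technique, not the code behind the listjazz page: it splits each line on semicolons and whitespace, so a line holding several links yields one link per entry, then keeps only the entries containing a given substring. The function name `urls_containing` and the sample data are hypothetical.

```python
import re


def urls_containing(lines, needle):
    """Split lines that hold several URLs and keep those containing needle."""
    matches = []
    for line in lines:
        # Separators assumed here: semicolons and any whitespace.
        for candidate in re.split(r"[;\s]+", line):
            # Skip empty pieces; compare case-insensitively.
            if candidate and needle.lower() in candidate.lower():
                matches.append(candidate)
    return matches


if __name__ == "__main__":
    extracted = [
        "youtube.com/xxx; twitter.com/xxx; instagram.com/xxx",
        "twitter.com/yyy",
    ]
    # Prints one Twitter link per line.
    for url in urls_containing(extracted, "twitter.com"):
        print(url)
```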