RSS feed help

TumblingJazz Tumblr bot discussion
martin@rootjazz
Site Admin
Posts: 34369
Joined: Fri Jan 25, 2013 10:06 pm
Location: The Funk

Post by martin@rootjazz »

Someone wanted to set up posting images from a WordPress feed, but the image URL wasn't included in the feed, only on the WordPress page.

This is how to do it:

Go to POST IMAGES FROM RSS FEED

Click ASSIGN META

Click CUSTOM

We can leave
ITEM
TITLE
etc. at their defaults, though you may want to change the description; your call.

But we do need to look at
IMAGE URL

The WordPress feed uses the <link> element to specify the page URL.

So we want to go to that URL and then scrape the image URL from that page.

So we enter:

<link>[scrapexpath=xpathtocontent]

This means:
go to the URL given by <link>,
then scrape that page and use XPath to pull out the image URL.

e.g.

<link>[scrapexpath=//meta[@property='og:image']/@content]

where your XPath to the image is

//meta[@property='og:image']/@content
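
For anyone who wants to see what that XPath is asking for, here is a rough Python sketch that does the equivalent lookup with the standard library only. It is not how the program is implemented; the sample page, the OgImageFinder class and the find_og_image helper are all made up for illustration, and the real program fetches the page behind <link> itself.

```python
from html.parser import HTMLParser


class OgImageFinder(HTMLParser):
    """Remembers the content of the first <meta property="og:image"> tag."""

    def __init__(self):
        super().__init__()
        self.image_url = None

    def handle_starttag(self, tag, attrs):
        # Only <meta> tags matter, and only the first match is kept.
        if tag != "meta" or self.image_url is not None:
            return
        attributes = dict(attrs)
        if attributes.get("property") == "og:image":
            self.image_url = attributes.get("content")


def find_og_image(page_html):
    """Return the og:image URL found in the page HTML, or None if absent."""
    finder = OgImageFinder()
    finder.feed(page_html)
    return finder.image_url


# Hypothetical page source, standing in for the page behind <link>.
sample_page = """
<html><head>
<meta property="og:title" content="Example post">
<meta property="og:image" content="https://example.com/images/post.jpg">
</head><body><p>Post text</p></body></html>
"""

print(find_og_image(sample_page))
```

The XPath reads the same way: find a meta element whose property attribute is og:image, then take its content attribute.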

:-)
perhenon
Posts: 1
Joined: Wed Nov 19, 2014 9:13 pm

Re: RSS feed help

Post by perhenon »

I don't understand this. I keep getting:

Create postimage obj
* PROBLEM: Create postimage: './/http://www.imgur.com/r/wet/rss' has an invalid qualified name.

I'm trying to use an imgur feed to post images to my Tumblr blog using TumblingJazz, but it will not do what I want.

Why is it so complicated, remove all this <linkxscrapepath@;}}{_£path£$content% nonsense and speak in english,

I want to post all the images from an imgur feed to my (queued) Tumblr blog. I also want any new images posted to the imgur RSS feed to be added to the end of that queue.

Can you explain how to do it like I'm a 5 year old please.
martin@rootjazz

Re: RSS feed help

Post by martin@rootjazz »

"Why is it so complicated, remove all this <linkxscrapepath@;}}{_£path£$content% nonsense and speak in english,"

lol, wish I could. Unfortunately that gibberish is essential. It is a form of XPath, which allows the program to find what it wants on the page. The RSS module is actually powerful enough to work with any RSS feed, including scraping from a secondary URL.

So, to explain: if you want to post images from imgur, go to the RSS POST IMAGES tab
Enter your imgur RSS URL
Click ASSIGN META
Click the IMGUR option
Click Save

Then select your account/s, other parameters as you please and click POST IMAGES. That's it.

To help debug it further, I would need:
1) the RSS URL you are posting
2) screenshots of the RSS META popup form and the main RSS POST IMAGES tab, to see what you are setting

These can be posted here, or emailed if you prefer:

support[at]rootjazz[dot]com
martin@rootjazz

Re: RSS feed help

Post by martin@rootjazz »

In order to set up custom RSS posting, you need to understand

XML
RSS
XPath

Not a lot, but a little. When setting up your feed meta data in the program, you can see some standard values already set. These should work for most feeds, but let's look at them to understand what they do.

So find your feed and view the source. You will see it is probably structured like this:

<item>
....details of item1
</item>
<item>
....details of item2
</item>
etc.

So we know each item in the feed is wrapped in the element

<item>

It may also be

<entry>

but usually it is <item>.

Then we have

<title>
<link>
<description>

All self-explanatory. These can be used for the title / body / media link, depending on whether you are posting a tweet or a media link.
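
To make the structure concrete, here is a rough Python sketch that walks a feed of that shape with the standard library. The feed text and the read_items helper are invented for illustration; it only shows what "item, title, link, description" refer to, not how the program reads feeds.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-item feed with the usual RSS layout.
sample_feed = """<rss version="2.0"><channel>
<title>Example feed</title>
<item>
<title>First post</title>
<link>https://example.com/first</link>
<description>Details of item 1</description>
</item>
<item>
<title>Second post</title>
<link>https://example.com/second</link>
<description>Details of item 2</description>
</item>
</channel></rss>"""


def read_items(feed_xml, item_tag="item"):
    """Return one dict per feed item with its title, link and description."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter(item_tag):
        items.append({
            "title": item.findtext("title", default="").strip(),
            "link": item.findtext("link", default="").strip(),
            "description": item.findtext("description", default="").strip(),
        })
    return items


for entry in read_items(sample_feed):
    print(entry["title"], entry["link"])
```

The channel-level <title> is skipped because only elements inside each <item> are read, which is the same reason feed-level tags outside <item> can be ignored when filling in the meta form.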


Now images are a bit more complex, because as well as the above information you also need the image URL. This is unlikely to sit neatly in an element of its own, so we need to scrape it.
There are a few ways to do this:

1) Using XPath with an element (note you cannot use XPath in an :encoded element).

2) Using the token [srclike=XXX]. This means that if the element contains an image tag with a src attribute, we can get it without worrying about a complex XPath statement.

So given

<description>
...blah html blah <img src="http://twitterdub.wordpress.com/images/ ... 7490u3.jpg"> blah html
</description>

we can use the token along with a pattern match like
[srclike=twitterdub.wordpress.com/images]

so our meta value is
<description>[srclike=twitterdub.wordpress.com/images]

Note this token does work with :encoded elements, so if your image is within an encoded tag this works fine:

<content:encoded>[srclike=XXX]
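
As an illustration of the idea behind srclike, here is a small Python sketch: look through the img tags in an element's text and return the first src that contains the pattern. The src_like function, the regular expression and the sample description are assumptions made for this example; the program's real matching rules may differ.

```python
import re

# Captures the src value of each <img ...> tag (single or double quoted).
IMG_SRC = re.compile(r"""<img\b[^>]*?\bsrc\s*=\s*["']([^"']+)["']""", re.IGNORECASE)


def src_like(element_text, pattern):
    """Return the first <img> src containing pattern, or None if none match."""
    for src in IMG_SRC.findall(element_text):
        if pattern in src:
            return src
    return None


# Hypothetical <description> content with two images.
sample_description = (
    '<p>Intro <img src="https://ads.example.net/banner.gif"> text '
    '<img class="main" src="https://example.com/images/photo-1.jpg"> more</p>'
)

print(src_like(sample_description, "example.com/images"))
```

The pattern is what lets you skip unrelated images (ads, tracking pixels) that appear earlier in the same element.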




Similar helper functions are

<item>[json=XXX]
This will find the JSON value within the text of the XML element <item>, where the JSON format is
"name":"value"
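
A rough Python sketch of that kind of lookup, assuming XXX is the JSON name and the token returns the matching value (that reading is an assumption, as are the json_value helper and the sample text):

```python
import re


def json_value(element_text, name):
    """Return the value of the first "name":"value" pair, or None if absent."""
    pair = re.compile(r'"' + re.escape(name) + r'"\s*:\s*"([^"]*)"')
    match = pair.search(element_text)
    return match.group(1) if match else None


# Hypothetical element text with some JSON embedded in it.
sample_text = 'data = {"id":"42","image":"https://example.com/images/a.jpg"};'

print(json_value(sample_text, "image"))
```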


<item>[urllike=XXX]
Similar to srclike, this finds the first URL that contains the pattern XXX, so you can specify

<item>[urllike=rootjazz.com/videos]

and it will return the first URL within the text of the <item> tag that contains rootjazz.com/videos, i.e.
https://rootjazz.com/videos/a-video-file.mp4
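
The same idea as a Python sketch: collect the URLs in the element text and return the first one that contains the pattern. The url_like function, the URL regular expression and the sample text are assumptions for illustration only.

```python
import re

# A deliberately simple URL matcher: http(s):// up to whitespace, quote or bracket.
URL = re.compile(r"""https?://[^\s"'<>]+""")


def url_like(element_text, pattern):
    """Return the first URL in the text that contains pattern, or None."""
    for url in URL.findall(element_text):
        if pattern in url:
            return url
    return None


# Hypothetical text of an <item> element.
sample_item_text = (
    'See https://rootjazz.com/forum first, then watch '
    '<a href="https://rootjazz.com/videos/a-video-file.mp4">the video</a>'
)

print(url_like(sample_item_text, "rootjazz.com/videos"))
```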





Another way to get at the image is to scrape a page that is linked from the RSS feed, using the token:
[scrapexpath=xpathtocontent]

Someone wanted to set up posting images from a WordPress feed, but the image URL wasn't included in the feed, only on the WordPress page.

This is how to do it:

Go to POST IMAGES FROM RSS FEED

Click ASSIGN META

Click CUSTOM

We can leave
ITEM
TITLE
etc. at their defaults, though you may want to change the description; your call.

But we do need to look at
IMAGE URL

The WordPress feed uses the <link> element to specify the page URL.

So we want to go to that URL and then scrape the image URL from that page.

So we enter:

<link>[scrapexpath=xpathtocontent]

This means:
go to the URL given by <link>,
then scrape that page and use XPath to pull out the image URL.

e.g.

<link>[scrapexpath=//meta[@property='og:image']/@content]

where your XPath to the image is

//meta[@property='og:image']/@content

:-)
bitcoin
Posts: 924
Joined: Tue Jul 04, 2017 1:25 am

Re: RSS feed help: how does it all work settings tags elements

Post by bitcoin »

MARTIN SKIP TO RED TEXT BELOW: REST is MOSTLY TO HELP ME AND OTHERS
My questions are for TwitterDub but I think it's best posted as a reply here so others have a central thread to study...
----

Are you kidding me???

:( I always avoided assigning elements from RSS feeds if a tool required it. I don't get it. It's messy and usually a user friendlier solution is available.

So I used FL (another bot) to post those RSS feeds for me. That was super easy to set up, though there are some bugs hidden within. Now they've decided to charge / month besides the software you have to buy at a premium. So time to move on, and TD should do it.

I'm looking at above article and I'll study it once more, but... Don't you need to be a programmer to understand this?

For example:
In FL, I add the RSS feeds I have and don't change the default "Crawl content of these tags" (p, td, li, span). (MARTIN: WHAT IS THAT??) If the RSS item has a picture, it's tweeted alongside with it. No problems there either.

How would this go with TD? I'm gonna read some more, maybe watch another YouTube video and try to figure it out. I hope it will be rather straightforward, including the image tags. I'll be focusing on the "Tweet RSS" tab.

Edit 1: https://youtu.be/Z0p7lM5O5GQ does not mention images
Edit 2: https://youtu.be/fxErgxDHIl8 is what I need to use. It will include both the text and the image.

Question: what will happen with an RSS feed that
A. is missing an image reference altogether, or
B. has an empty image reference?
C. See next reply please! xx
Last edited by bitcoin on Mon Aug 07, 2017 11:03 am, edited 6 times in total.
bitcoin

Re: RSS feed help: post RSS items and their thumbnail images? RSS <selectors> what do they do with TD? Where to use? :)

Post by bitcoin »

Example RSS feed and questions I have because of it.

For the first RSS link, I find the feed contains both a <url> and a <link>. This confused me at first, but it can safely be ignored: it does not belong to the actual news feed. The news feed only starts after <item>, which leaves the following structure:
<title> <link> <description> and no images anywhere. So this one should be easy. (testing later)

I went to a second feed I knew had images: https://cointelegraph.com/rss
This is the first item:

<title>
<![CDATA[ Ethereum Weekly Price Analysis: July 30 - August 6 ]]>
</title>
<link>
https://cointelegraph.com/news/ethereum-weekly-price-analysis-july-30-august-6
</link>
<media:content url="https://cointelegraph.com/images/528_Ly9jb2ludGVsZWdyYXBoLmNvbS9zdG9yYWdlL3VwbG9hZHMvdmlldy8xZjY5MDA4N2U5NmNhMzNkODNjY2M3MTFiY2YzMTQzNC5qcGc=.jpg" medium="image"/>
<enclosure url="http://cointelegraph.com/images/528_Ly9jb2ludGVsZWdyYXBoLmNvbS9zdG9yYWdlL3VwbG9hZHMvdmlldy8xZjY5MDA4N2U5NmNhMzNkODNjY2M3MTFiY2YzMTQzNC5qcGc=.jpg" length="528" type="image/jpg"/>
<pubDate>Sun, 06 Aug 2017 18:31:56 +0000</pubDate>
<dc:creator>
<![CDATA[ CoinTelegraph By Denis Harrison ]]>
</dc:creator>
<category>
<![CDATA[ Ethereum ]]>
</category>
<category>
<![CDATA[ Ethereum Price ]]>
</category>
<guid isPermaLink="false">
https://cointelegraph.com/news/ethereum-weekly-price-analysis-july-30-august-6
</guid>
<description>
<![CDATA[
<p style="float:right; margin:0 0 10px 15px; width:240px;"> <img src="https://cointelegraph.com/images/528_Ly9jb2ludGVsZWdyYXBoLmNvbS9zdG9yYWdlL3VwbG9hZHMvdmlldy8xZjY5MDA4N2U5NmNhMzNkODNjY2M3MTFiY2YzMTQzNC5qcGc=.jpg"> <p>Ethereum Weekly Price Overview</p>
]]>
</description>
</item>
<item>
So now we find:
<title><link><media:content URL=".."><pubdate><dc:creator><category><guid><description>
What do we do with media:content?
We can ignore <pubdate> (nobody needs that tweeted)
We can sadly also ignore <dc:creator> (not enough room in a tweet)
What do we do with <category>?
What is <guid>?
How does a <description> with an image work? Isn't this the same as just text with an extra URL? Won't Twitter recognize that URL as a picture and preview it? Or even better: does TD recognize that the URL is a link to an image and upload the image to the tweet, so it takes up no extra characters in the tweet?
And a follow-up question: what if one of the items has an empty / missing <media:content URL> or a missing link to an image in the <description>?

Thank you Martin :)

Will take another look at the videos and explanations and reply or edit my findings...

Edit: I think you are answering all my questions in viewtopic.php?f=8&t=1021&p=27758#p12565. I just don't get it. I get lost about a third of the way in :)
martin@rootjazz

Re: RSS feed help

Post by martin@rootjazz »

The RSS feed module isn't meant for me to set up for everyone (which is kind of what it has become). You have to try it first, then I can help you with any issues, but I just cannot provide the formatting for everyone.

Read the guides / forum posts that specify how to do it, then show me what you have and I can give pointers.
bitcoin

Re: RSS feed help

Post by bitcoin »

I gave you all I could muster :/ I'm just no coder but I'll see if I can find someone to do it for me.

Thanks!
martin@rootjazz

Re: RSS feed help

Post by martin@rootjazz »

Basically you probably want:

meta image url:

<media:content>/@url

I think. I'd have to test to see if that is right, but it is what I would start with.
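
For reference, this is roughly what "the url attribute of media:content" means, as a Python sketch using the standard library. The sample feed, the example.com URLs and the image_urls helper are made up for illustration; the namespace URI is the standard Media RSS one. Returning None for items with a missing or empty url is a choice made in this sketch only, since how the program handles those items is not covered above.

```python
import xml.etree.ElementTree as ET

# Standard Media RSS namespace, needed to resolve the media: prefix.
MEDIA_NS = {"media": "http://search.yahoo.com/mrss/"}

# Hypothetical feed: one item with an image, one without, one with an empty url.
sample_feed = """<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/"><channel>
<item>
<title>With image</title>
<link>https://example.com/news/with-image</link>
<media:content url="https://example.com/images/528.jpg" medium="image"/>
</item>
<item>
<title>Without image</title>
<link>https://example.com/news/without-image</link>
</item>
<item>
<title>Empty image</title>
<link>https://example.com/news/empty-image</link>
<media:content url="" medium="image"/>
</item>
</channel></rss>"""


def image_urls(feed_xml):
    """Return (link, image_url) per item; image_url is None if missing or empty."""
    root = ET.fromstring(feed_xml)
    results = []
    for item in root.iter("item"):
        media = item.find("media:content", MEDIA_NS)
        url = media.get("url") if media is not None else None
        link = item.findtext("link", default="").strip()
        results.append((link, url or None))
    return results


for link, image in image_urls(sample_feed):
    print(link, image)
```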
bitcoin

Re: RSS feed help

Post by bitcoin »

Found a coder! He'll be looking at it now.

Maybe he'll even explain it to me and the rest of us a bit more ;-)