RSS post helper - tumblingjazz

TumblingJazz Tumblr bot discussion
martin@rootjazz
Site Admin
Posts: 34375
Joined: Fri Jan 25, 2013 10:06 pm
Location: The Funk

RSS post helper - tumblingjazz

Post by martin@rootjazz »

To set up custom RSS posting, you need to understand

XML
RSS
XPATH

Not a lot, but a little. When you set up your feed metadata in the program, you will see some standard values already set. These should work for most feeds, but let's look at them to understand what they do.

So find your feed and view its source. You will see it is probably structured like this:

Code: Select all

<item>
....details of item1
</item>
<item>
....details of item2
</item>
etc
So we know that each item in the feed is wrapped in the element

Code: Select all

<item>

In some feeds (Atom, for example) it may instead be

Code: Select all

<entry>
but usually <item>

Then we have

Code: Select all

<title>
<link>
<description>
All self-explanatory. These can be used for the title / body / media link, depending on whether you are posting a tweet or a media link.
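
To see the structure described above outside the program, here is a minimal Python sketch that reads the item, title, link and description elements from a feed. The sample feed and the function name `read_items` are made up for illustration. This is not how the bot itself reads feeds, just a way to check what your feed contains.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed, made up for illustration.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>First post</title>
      <link>https://example.com/first</link>
      <description>Details of item1</description>
    </item>
    <item>
      <title>Second post</title>
      <link>https://example.com/second</link>
      <description>Details of item2</description>
    </item>
  </channel>
</rss>"""


def read_items(feed_xml, item_tag="item"):
    """Return a list of dicts with the title, link and description of each item."""
    root = ET.fromstring(feed_xml)
    items = []
    for node in root.iter(item_tag):
        items.append({
            "title": node.findtext("title", default=""),
            "link": node.findtext("link", default=""),
            "description": node.findtext("description", default=""),
        })
    return items


for item in read_items(SAMPLE_FEED):
    print(item["title"], "->", item["link"])
```

If your feed uses a different wrapper element, pass its name as `item_tag`. Note that Atom feeds declare an XML namespace, so the tag names would need the namespace prefix there.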


Now images are a bit more complex, because as well as the above information you also want the image URL. This is unlikely to be contained neatly in an element of its own, so we need to scrape it.
There are a few ways to do this:

1) Using XPATH with an element (note you cannot use XPATH in an :encoded element)

Code: Select all

<description/img[@id='image_id_1379182']>
Another possibility is that the value you want is an attribute of the RSS element. This can be accessed by placing the attribute name in brackets within the element specification:

Code: Select all

<description[@src]>
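
To show roughly what these two notations select, here is a sketch using Python's standard XPath subset. The item XML is hypothetical, and it assumes the `<img>` is a real XML child of `<description>` rather than escaped HTML text. This thread does not say what the program reads from the matched element, so taking its `src` attribute is an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical item: the <img> is a real XML child of <description>,
# and <description> carries a made-up src attribute.
ITEM_XML = """<item>
  <title>Example</title>
  <description src="https://example.com/thumb.jpg">
    <img id="image_id_1379182" src="https://example.com/full.jpg"/>
  </description>
</item>"""

item = ET.fromstring(ITEM_XML)

# Rough equivalent of <description/img[@id='image_id_1379182']>
# (assumption: we want the src attribute of the matched element).
img = item.find("description/img[@id='image_id_1379182']")
image_url = img.get("src") if img is not None else None

# Rough equivalent of <description[@src]>: read an attribute of the element.
description = item.find("description")
attr_url = description.get("src") if description is not None else None

print(image_url)
print(attr_url)
```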
2) Using the token [srclike=XXX]. If the element contains an image tag with a src attribute, this token gets the URL without needing a complex XPath statement.

For example, given

<description>
...blah html blah <img src="http://twitterdub.wordpress.com/images/ ... 7490u3.jpg"> blah html
</description>

we can use the token with a pattern match like
[srclike=twitterdub.wordpress.com/images]

so our meta value is
<description>[srclike=twitterdub.wordpress.com/images]

Note that this token does work with :encoded elements. If your image is within an encoded tag, this works fine:

<content:encoded>[srclike=XXX]
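
If you want to test a pattern before entering it in the program, the matching described above can be sketched in Python as follows. The function name `srclike` and the sample HTML are only illustrative, and the image file name is made up. The program's real matching may differ in details such as case sensitivity.

```python
from html.parser import HTMLParser


class _ImgSrcCollector(HTMLParser):
    """Collects the src attribute of every <img> tag, in document order."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def srclike(element_text, pattern):
    """Return the first <img> src that contains pattern, or None if none match."""
    collector = _ImgSrcCollector()
    collector.feed(element_text)
    collector.close()
    for src in collector.sources:
        if pattern in src:
            return src
    return None


# Hypothetical element text; the image file name is made up.
html_text = (
    'blah html <img src="https://example.com/logo.png"> blah '
    '<img src="http://twitterdub.wordpress.com/images/example.jpg"> blah html'
)
print(srclike(html_text, "twitterdub.wordpress.com/images"))
```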




Similar helper functions are

<item>[json=XXX]
This will find the JSON value named XXX within the text of the XML element <item>, where the JSON format is
"name":"value"


<item>[urllike=XXX]
Similar to srclike, this will find the first URL that contains the pattern XXX, so you can specify

<item>[urllike=rootjazz.com/videos]

and it will return the first URL within the text of the <item> tag that contains rootjazz.com/videos, e.g.
https://rootjazz.com/videos/a-video-file.mp4
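
The two helper tokens above can be approximated with regular expressions. This is only a sketch of the behaviour as described in this post. The function names `json_value` and `urllike` and the sample text are assumptions, and the program's own rules for what counts as a URL or a JSON value may be different.

```python
import re

# Assumption: a URL starts with http:// or https:// and runs until
# whitespace, a quote or an angle bracket.
URL_RE = re.compile(r"https?://[^\s\"'<>]+")


def json_value(element_text, name):
    """Return the raw string value of the first "name":"value" pair, or None."""
    pattern = r'"' + re.escape(name) + r'"\s*:\s*"((?:[^"\\]|\\.)*)"'
    match = re.search(pattern, element_text)
    # The value is returned as written in the text (escapes are not decoded).
    return match.group(1) if match else None


def urllike(element_text, pattern):
    """Return the first URL in element_text that contains pattern, or None."""
    for url in URL_RE.findall(element_text):
        if pattern in url:
            return url
    return None


# Hypothetical element text, made up for illustration.
item_text = (
    'See <a href="https://example.com/about">about</a> and '
    "https://rootjazz.com/videos/a-video-file.mp4 "
    '{"id":"42","image":"https://example.com/a.jpg"}'
)
print(json_value(item_text, "image"))
print(urllike(item_text, "rootjazz.com/videos"))
```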





Another way to get at the image is to scrape a page that is linked from the RSS feed, using the token:
[scrapexpath=xpathtocontent]

Someone wanted to set up posting images from a WordPress feed, but the image URL wasn't included in the feed, only on the WordPress page.

This is how to do it:

Go to POST IMAGES FROM RSS FEED

Click ASSIGN META

Click CUSTOM

Now we can leave
ITEM
TITLE
etc. at their defaults, though you may want to change the description. Your call.

But we will need to look at
IMAGE URL

The WordPress feed uses the element <link> to specify the page URL.

So we want to go to that URL and then scrape the image URL from the page,

so we enter

Code: Select all

<link>[scrapexpath=xpathtocontent]
This means:
go to the URL given by the <link> element,
then scrape that page and use XPath to pull out the image URL.

e.g.

Code: Select all

<link>[scrapexpath=//meta[@property='og:image']/@content] 
where your xpath to the image is

Code: Select all

//meta[@property='og:image']/@content
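
To check whether a page actually has the og:image meta tag before pointing the program at it, something like the Python sketch below can help. Python's standard library has no full XPath engine for HTML, so the lookup here is hard-coded to the equivalent of the XPath above. The names `extract_og_image` and `scrape_image_url` and the sample page are hypothetical, and this is not the program's own scraper.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class _OgImageFinder(HTMLParser):
    """Remembers the content of the first <meta property="og:image"> tag."""

    def __init__(self):
        super().__init__()
        self.content = None

    def handle_starttag(self, tag, attrs):
        if tag == "meta" and self.content is None:
            attributes = dict(attrs)
            if attributes.get("property") == "og:image":
                self.content = attributes.get("content")


def extract_og_image(page_html):
    """Hard-coded stand-in for //meta[@property='og:image']/@content."""
    finder = _OgImageFinder()
    finder.feed(page_html)
    finder.close()
    return finder.content


def scrape_image_url(link_url):
    """Download the page behind an item's <link> and return its og:image URL."""
    with urlopen(link_url, timeout=30) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        page_html = response.read().decode(charset, errors="replace")
    return extract_og_image(page_html)


# Hypothetical page source, made up for illustration (no network needed).
SAMPLE_PAGE = (
    "<html><head>"
    '<meta property="og:title" content="A post">'
    '<meta property="og:image" content="https://example.com/images/post.jpg">'
    "</head><body>text</body></html>"
)
print(extract_og_image(SAMPLE_PAGE))
```

If this returns None for your page source, the page has no og:image tag and that XPath will find nothing there either.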

Note that we can still use the RSS element attribute notation if required (perhaps the page URL is not the element text but an attribute):

Code: Select all

<link[@href]>[scrapexpath=//meta[@property='og:image']/@content] 
:-)
nitram
Posts: 41
Joined: Thu Dec 24, 2015 4:15 am

Re: RSS post helper - tumblingjazz

Post by nitram »

Create postimage obj
Rss item: <link>[scrapexpath=//meta[@property='og:image']/@content]
* FAILED: Scraped image URL failed:
* PROBLEM: Create postimage


I can't get this to work and I'm not sure what I'm doing. Please help.

Re: RSS post helper - tumblingjazz

Post by martin@rootjazz »

<link>[scrapexpath=//meta[@property='og:image']/@content]
this is wrong. Can you not pull it from the feed with a srclike token?

Re: RSS post helper - tumblingjazz

Post by nitram »

Sorry, I don't know what that means. I just want to post my WordPress website to Tumblr. I don't understand the rest. Can you please explain it without computer code? Cheers.

Re: RSS post helper - tumblingjazz

Post by martin@rootjazz »

Unfortunately, you need to know some computer code to use this function. You have to describe the path to the images on your blog, because every site can be different.


Look at your feed, is the image you want in the feed source? Or do you need to scrape it from the web page itself?

If it is in the feed, things are easier. You can just use the srclike token to extract the path to the file.



Send me the feed URL and I will take a look. If you don't want to post it here, send it via email

Code: Select all

support[at]rootjazz[dot]com

and please warn me if it is NSFW.

Re: RSS post helper - tumblingjazz

Post by nitram »

Thank you, and sorry, it's NSFW. I forgot to put that in the email,
but you should get that it's NSFW by the name, lol. Cheers mate.

Re: RSS post helper - tumblingjazz

Post by martin@rootjazz »

As per email, you didn't explain which image you wanted.

But if you view the source of the feed, you will see the <description> element contains the HTML <img src="...">

so you can use the srclike token as described in this thread:


<description>[srclike=pinimg.com]

which will scan the description element for an <img> tag where the src="..." attribute contains the string "pinimg.com".