Xpath or RSS element of a Tumblr feed's image url

TumblingJazz Tumblr bot discussion
Openkore
Posts: 45
Joined: Tue Dec 17, 2013 2:06 am

Xpath or RSS element of a Tumblr feed's image url

Post by Openkore »

Hi,

I'm trying to post from a Tumblr feed, and so far I've got the other elements in place (tags, post link, etc.), but I'm still missing the image URL. What should I put there?

Image

Pic related, that's what I'm trying to find. Someone told me to put this:

Code: Select all

//div[@class="body hasMarkup"]/p/img
but it didn't work.

Thanks!
martin@rootjazz
Site Admin
Posts: 34712
Joined: Fri Jan 25, 2013 10:06 pm
Location: The Funk

Re: Xpath or RSS element of a Tumblr feed's image url

Post by martin@rootjazz »

Send me the Feed and I'll tell you.
Openkore
Posts: 45
Joined: Tue Dec 17, 2013 2:06 am

Re: Xpath or RSS element of a Tumblr feed's image url

Post by Openkore »

Here it is:

Code: Select all

http://mahblog202.tumblr.com/rss
Is this standard for every Tumblr feed, or does it need to be modified for every Tumblr account I want to post from?
martin@rootjazz
Site Admin
Posts: 34712
Joined: Fri Jan 25, 2013 10:06 pm
Location: The Funk

Re: Xpath or RSS element of a Tumblr feed's image url

Post by martin@rootjazz »

Ah, sorry, you did say Tumblr feed; they are standard. It was early this morning when I replied and my brain wasn't working.
martin@rootjazz
Site Admin
Posts: 34712
Joined: Fri Jan 25, 2013 10:06 pm
Location: The Funk

Re: Xpath or RSS element of a Tumblr feed's image url

Post by martin@rootjazz »

This is a difficult one, as the image doesn't exist as an RSS tag; it comes from the encoded HTML in the description.

This means it will have to be custom for each blog, as we will have to use notation to:

1) select the post URL
2) fire a scrape routine to scrape the post URL
3) extract the image using Xpath from the contents of the scraped post URL

If what you are trying to do is repost all images posted to another blog, it would be easier to use the IMAGE SCRAPER module to download all images from that blog to a folder, then use that folder as a watch folder.
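
For readers who want to see the three steps outside TumblingJazz, here is a minimal Python sketch of the same idea. It is not how TJ implements it. The feed, the page markup, and the names `post_links`, `first_image_src`, `images_for_feed` and `FAKE_PAGES` are all made up for illustration, and the fetch step is replaced by a dictionary lookup so the example runs without a network. The standard library's ElementTree only handles well-formed markup and a subset of XPath (no `/@src`), so the attribute is read with `.get("src")`; real Tumblr pages would likely need a proper HTML parser.

```python
import xml.etree.ElementTree as ET
from typing import Callable, List, Optional

# Hypothetical feed with two posts; real feeds carry more fields.
SAMPLE_FEED = """<rss version="2.0"><channel>
<item><title>one</title><link>http://example.tumblr.com/post/1</link></item>
<item><title>two</title><link>http://example.tumblr.com/post/2</link></item>
</channel></rss>"""

# Stand-in for the scrape step: post URL -> page markup (well-formed on purpose).
FAKE_PAGES = {
    "http://example.tumblr.com/post/1":
        '<html><body><div class="post-view"><img alt="" src="http://example.com/a.jpg"></img></div></body></html>',
    "http://example.tumblr.com/post/2":
        '<html><body><div class="post-view"><img alt="" src="http://example.com/b.jpg"></img></div></body></html>',
}

def post_links(rss_xml: str) -> List[str]:
    """Step 1: collect the <link> of every <item> in the feed."""
    root = ET.fromstring(rss_xml)
    return [item.findtext("link") for item in root.iter("item") if item.findtext("link")]

def first_image_src(page_markup: str, img_path: str) -> Optional[str]:
    """Step 3: find the <img> element with a path expression and read its src."""
    img = ET.fromstring(page_markup).find(img_path)
    return img.get("src") if img is not None else None

def images_for_feed(rss_xml: str, fetch: Callable[[str], str], img_path: str) -> List[Optional[str]]:
    """Steps 1-3 chained; fetch(url) plays the role of step 2."""
    return [first_image_src(fetch(url), img_path) for url in post_links(rss_xml)]

# The path is theme dependent; this one matches the made-up pages above.
IMG_PATH = ".//div[@class='post-view']/img"
print(images_for_feed(SAMPLE_FEED, FAKE_PAGES.__getitem__, IMG_PATH))
```

Passing the fetch function in as a parameter keeps the theme-specific part (the path) separate from the download part, which is the piece that has to change per blog.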
Openkore
Posts: 45
Joined: Tue Dec 17, 2013 2:06 am

Re: Xpath or RSS element of a Tumblr feed's image url

Post by Openkore »

May we try with this one: http://mahblog202.tumblr.com/rss to see if it works? I'm trying to pull the tags and the description (<title>) too.
martin@rootjazz
Site Admin
Posts: 34712
Joined: Fri Jan 25, 2013 10:06 pm
Location: The Funk

Re: Xpath or RSS element of a Tumblr feed's image url

Post by martin@rootjazz »

The RSS feed module ideally wants the image to be a tag:

Code: Select all

<item>
<title>blah</title>
<img>THIS IS WHAT WE WANT</img>
</item>
But Tumblr doesn't do that for its feed. Instead, the image is specified in encoded HTML inside the description, and the TJ RSS extractor cannot pull the image from an encoded HTML description.

So you would need to pull it from the Tumblr post page. As this is THEME DEPENDENT, you would need to specify code for each blog.
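
To make "encoded HTML in the description" concrete, here is a small Python sketch of what such an item looks like and how a general-purpose script (not the TJ RSS extractor) could decode it. The sample item, the class `ImgSrcCollector` and the function `description_images` are hypothetical; the exact markup in a real Tumblr description may differ.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser
from typing import List

# Hypothetical item: the <img> is not a tag of its own, it is escaped
# HTML (&lt; ... &gt;) inside <description>.
SAMPLE_ITEM_FEED = """<rss version="2.0"><channel>
<item>
<title>blah</title>
<description>&lt;img src="http://example.com/a.jpg"/&gt;&lt;br/&gt;some caption</description>
<link>http://example.tumblr.com/post/1</link>
</item>
</channel></rss>"""

class ImgSrcCollector(HTMLParser):
    """Collects the src attribute of every <img> tag fed to it."""

    def __init__(self) -> None:
        super().__init__()
        self.sources: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)

def description_images(rss_xml: str) -> List[List[str]]:
    """Return, per <item>, the image URLs found in its decoded description."""
    result = []
    for item in ET.fromstring(rss_xml).iter("item"):
        collector = ImgSrcCollector()
        # The XML parser has already turned &lt;img ...&gt; back into <img ...>.
        collector.feed(item.findtext("description") or "")
        collector.close()
        result.append(collector.sources)
    return result

print(description_images(SAMPLE_ITEM_FEED))
```

The point of the sketch is only that the image URL sits one decoding step away from the feed's tags, which is why a tag-based extractor does not see it.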
martin@rootjazz
Site Admin
Posts: 34712
Joined: Fri Jan 25, 2013 10:06 pm
Location: The Funk

Re: Xpath or RSS element of a Tumblr feed's image url

Post by martin@rootjazz »

So, if you look at the Pinterest image script:

Code: Select all

<link>[scrapexpath=//img[@class='pinImage']/@src]

Code: Select all

<link>
This means: use the <link> RSS item.


Code: Select all

[scrapexpath=//img[@class='pinImage']/@src]
This means: scrape the URL returned from the <link> tag.

Code: Select all

//img[@class='pinImage']/@src
This bit is the XPath used on the scraped page to get the image.
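
The breakdown above can be mirrored in a few lines of Python. This only illustrates how the script string decomposes into its two parts; it is not how TumblingJazz parses it, and `ScrapeRule`, `split_rule` and the regular expression are assumptions made for the example.

```python
import re
from typing import NamedTuple, Optional

class ScrapeRule(NamedTuple):
    rss_tag: str   # which RSS item supplies the URL to scrape, e.g. "link"
    xpath: str     # XPath to run on the scraped page

# <tag>[scrapexpath=...]; the greedy match keeps the inner [@class=...] intact
# and only the final closing bracket is treated as the end of the rule.
_RULE = re.compile(r"^<(?P<tag>\w+)>\[scrapexpath=(?P<xpath>.+)\]$")

def split_rule(rule: str) -> Optional[ScrapeRule]:
    """Split a script string into (rss_tag, xpath), or None if it does not fit."""
    match = _RULE.match(rule.strip())
    if match is None:
        return None
    return ScrapeRule(match.group("tag"), match.group("xpath"))

rule = split_rule("<link>[scrapexpath=//img[@class='pinImage']/@src]")
print(rule)
```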
martin@rootjazz
Site Admin
Posts: 34712
Joined: Fri Jan 25, 2013 10:06 pm
Location: The Funk

Re: Xpath or RSS element of a Tumblr feed's image url

Post by martin@rootjazz »

Looking at one of your posts, we have this code containing the full image:

Code: Select all

<div class="post-view">

    <img alt="" src="http://41.media.tumblr.com/056a708c4fb07688b37c7ad6f1641aa2/tumblr_n8n5ddT8471tpwflxo1_400.jpg"></img>

</div>
So your XPath would be:

Code: Select all

//div[@class='post-view']/img/@src
That makes the RSS SCRIPT CODE to use for this blog / theme:

Code: Select all

<link>[scrapexpath=//div[@class='post-view']/img/@src]
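
As a quick sanity check outside TJ, the XPath can be tried against the snippet quoted above with Python's standard library. This is a sketch under assumptions: the snippet is wrapped in a made-up `<html><body>` shell, and because ElementTree supports only a subset of XPath (no `/@src` step), the element is selected first and the attribute is read with `.get("src")`. The second lookup uses an invented class name, `photo`, to show what happens when a theme uses different markup.

```python
import xml.etree.ElementTree as ET

# The markup quoted from the post page, wrapped in a minimal document.
SNIPPET = """<div class="post-view">

    <img alt="" src="http://41.media.tumblr.com/056a708c4fb07688b37c7ad6f1641aa2/tumblr_n8n5ddT8471tpwflxo1_400.jpg"></img>

</div>"""

page = ET.fromstring("<html><body>" + SNIPPET + "</body></html>")

# Equivalent of //div[@class='post-view']/img/@src for this parser.
img = page.find(".//div[@class='post-view']/img")
image_url = img.get("src") if img is not None else None
print(image_url)

# A theme with a different wrapper class (hypothetical) finds nothing.
other_theme_img = page.find(".//div[@class='photo']/img")
print(other_theme_img)
```

The second lookup returning nothing is the practical meaning of "theme dependent": the same rule stops matching as soon as the wrapper markup changes.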
Openkore
Posts: 45
Joined: Tue Dec 17, 2013 2:06 am

Re: Xpath or RSS element of a Tumblr feed's image url

Post by Openkore »

I see it'd be difficult to configure that for every blog, and I guess the function wouldn't work anymore if the theme gets changed.

So, can you please add the "Remove links in description" and "Remove original description" options to the Mass Reblog function? I think I can use that.

Thanks again...