Can someone help with XPath?

Here is my simplified workflow:

I want to retrieve the column of symbols (tickers). I suppose something has changed on the page; I have tried a lot, but the ticker column stays empty.

I'm sure I'm making a stupid mistake, but I don’t see it :wink:

Many thanks in advance

Xpath_1.knwf (10.7 KB)

Hey there,

I think there are two issues at play: a minor one that is probably an oversight, and a bigger one…

The small one: as far as I can tell from your workflow, you are currently getting the google.com homepage as the response. You probably want to select your URL column rather than scraping the default google.com address :slight_smile:

That said, it looks like Yahoo does not like to be requested this way: the node responds with a 503 error.
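Once the right page does come back, the XPath part itself is usually short. Here is a minimal sketch in Python of pulling the first column out of a table. The markup is a made-up, well-formed stand-in (not Yahoo's real page structure), and `first_column` is a hypothetical helper name; the same path expression idea should carry over to the XPath node, but the actual path depends on the page you receive.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for a quotes table.
# This is NOT Yahoo's real markup; adapt the path to the page you actually get.
SAMPLE = """
<html><body>
<table>
  <thead><tr><th>Symbol</th><th>Name</th></tr></thead>
  <tbody>
    <tr><td>AAPL</td><td>Apple Inc.</td></tr>
    <tr><td>MSFT</td><td>Microsoft Corporation</td></tr>
  </tbody>
</table>
</body></html>
"""


def first_column(markup: str) -> list[str]:
    """Return the text of the first <td> of every body row."""
    root = ET.fromstring(markup)
    # td[1] selects the first cell in each row (XPath positions start at 1).
    cells = root.findall(".//tbody/tr/td[1]")
    return [(cell.text or "").strip() for cell in cells]


tickers = first_column(SAMPLE)
print(tickers)
```

If a query like this returns nothing, it is worth checking first whether the response body contains the table at all, since an error page or a consent page will leave the column empty no matter how the path is written.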

My gut feeling is that you may have to use the KNIME Web Interaction Extension to have KNIME open the website in a browser and then grab the data. There was a Just KNIME It! challenge to extract economic news from Yahoo Finance using exactly this extension.

Here’s the solution thread with plenty of options to pick from to see how it can work:

https://forum.knime.com/t/solutions-to-just-knime-it-challenge-9-season-3/81017/30

Here is my solution:

Hi MartinDDDD,

Many thanks for your very quick response.

Regarding your first remark, you're absolutely right . . . sorry for wasting your time . . . I was careless when building my basic example flow . . . mea culpa.

I tried your suggestion and entered this URL in the second node, Navigator (Labs): "Yahoo ist Teil der Yahoo Markenfamilie"

That produces the following error:
ERROR Navigator (Labs) 5:2 Execute failed: HTTPConnectionPool(host='localhost', port=30459): Max retries exceeded with url: /session/a72344c9fef2b061b15aa84e3c85a12f/url (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x0000023659746AA0>: Failed to establish a new connection: [WinError 10061] No connection could be made because the target machine actively refused it'))

That is strange, because the URL points to an existing page. I also tried “Refresh”.
Any idea why it produces this error?

When I run your unchanged example (with the URL "Yahoo ist Teil der Yahoo Markenfamilie"), the results are missing values.

I hope you (or someone else) can help.

Thanks in advance
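A note on the error text above: the refused connection is to `localhost` on port 30459, not to the Yahoo page. WinError 10061 means that no process was accepting connections on that local port, so one plausible reading (an assumption, not confirmed in this thread) is that the browser session started by the upstream node was no longer running when the Navigator node executed. A minimal Python sketch for checking whether anything is listening on a local port; `port_is_open` is a hypothetical helper name and the port number is simply the one from the log:

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused" (WinError 10061 on Windows) and timeouts.
        return False


# Port taken from the error message above; it changes with every session.
print(port_is_open("localhost", 30459))
```

If this prints `False`, resetting and re-executing the node that starts the browser before running the Navigator node may be worth a try.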

Several years ago I developed a Yahoo Finance URL to use in a GET request. It no longer works because of a variety of changes Yahoo has made. It may still be possible, but frankly, after reading a variety of posts on the subject, it's beyond me. I was able to develop a Python script using the yfinance package, which seems to work fine. I’ve wrapped the workflow in a component so it has interactive inputs. You’ll need a Python environment with the packages highlighted below. You can add a Table write… I’ve modified the original to include conda propagation for the required Python environment as well as writing an output. You’ll need to change the location of the Excel Writer in the String Manipulation node.
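For readers without the component at hand, here is a rough sketch of what a yfinance-based script can look like. It is not the script from the component: `parse_tickers`, `fetch_history`, and `main` are hypothetical names, and the one-month period is an arbitrary assumption. It needs the third-party `yfinance` package and network access.

```python
def parse_tickers(raw: str) -> list[str]:
    """Split user input such as 'aapl, msft' into clean upper-case symbols."""
    parts = raw.replace(";", ",").split(",")
    return [p.strip().upper() for p in parts if p.strip()]


def fetch_history(tickers: list[str], period: str = "1mo") -> dict:
    """Download price history per ticker as pandas DataFrames."""
    # Imported here so the parsing helper works without yfinance installed.
    import yfinance as yf  # third-party: pip install yfinance

    return {t: yf.Ticker(t).history(period=period) for t in tickers}


def main() -> None:
    # Example input; in a KNIME component this would come from the
    # interactive inputs instead of a hard-coded string.
    symbols = parse_tickers("aapl, msft")
    for symbol, frame in fetch_history(symbols).items():
        print(symbol, len(frame), "rows")
```

Calling `main()` runs the download. Inside a KNIME Python Script node the resulting DataFrames would still have to be handed to the node's output table, which is left out here.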