Visual Studio 2010 - Parsing financial information from HTML
This is my first attempt at learning to work with HTML in Visual Studio and C#. I am using the HTML Agility Pack library to do the parsing.
From this page I am attempting to pull out the numbers in the "Net Income" row for each quarter.
Here is my current progress (but I am uncertain how to proceed further):
    using System.Linq;
    using HtmlAgilityPack;

    string url = "http://www.google.com/finance?q=nasdaq:txn&fstype=ii";
    var webGet = new HtmlWeb();
    var document = webGet.Load(url);

    // Locate the <body> element of the downloaded page.
    var body = document.DocumentNode.Descendants()
        .Where(n => n.Name == "body")
        .FirstOrDefault();
    if (body != null)
    {
    }
Well, first of all, there's no need to get the body first; you can query the document directly for what you want. As for finding the value you're looking for, this is how you could do it:
    // Find the first <td> whose trimmed text is the row label.
    HtmlNode tdNode = document.DocumentNode.DescendantNodes()
        .FirstOrDefault(n => n.Name == "td" && n.InnerText.Trim() == "Net Income");
    if (tdNode != null)
    {
        // The parent <tr> holds the label cell and the quarterly values.
        HtmlNode trNode = tdNode.ParentNode;
        foreach (HtmlNode node in trNode.DescendantNodes().Where(n => n.NodeType == HtmlNodeType.Element))
        {
            Console.WriteLine(node.InnerText.Trim());
            // Output:
            // Net Income
            // 265.00
            // 298.00
            // 601.00
            // 672.00
            // 666.00
        }
    }

Also note the Trim calls: they are needed because there are newlines in the InnerText of some elements.
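If you want the quarterly figures as numbers rather than strings, one possible follow-up is to run each trimmed cell text through decimal.TryParse and keep only the cells that parse. The sketch below is not part of the answer above: RowValues and ParseNumbers are hypothetical names, and it assumes the page formats numbers in the invariant (US-style) way, with "." as the decimal separator and "," for thousands. It sticks to C# 4 syntax so it compiles in Visual Studio 2010.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class RowValues
{
    // Hypothetical helper: returns every cell text that parses as a number
    // and silently skips the rest (such as the "Net Income" label cell).
    // Assumption: invariant-culture number formatting, e.g. "1,234.50".
    public static List<decimal> ParseNumbers(IEnumerable<string> cellTexts)
    {
        var values = new List<decimal>();
        foreach (string text in cellTexts)
        {
            decimal value;
            if (decimal.TryParse(text.Trim(), NumberStyles.Number,
                                 CultureInfo.InvariantCulture, out value))
            {
                values.Add(value);
            }
        }
        return values;
    }

    public static void Main()
    {
        // Sample cell texts copied from the output shown above; in real use
        // these would be the InnerText values of the row's elements.
        var cells = new[] { "Net Income", "265.00", "298.00", "601.00", "672.00", "666.00" };
        List<decimal> quarters = ParseNumbers(cells);
        Console.WriteLine(string.Join(", ",
            quarters.Select(v => v.ToString(CultureInfo.InvariantCulture))));
    }
}
```

TryParse is used instead of Parse so that a label or an empty cell is skipped rather than throwing. If the page marks negative values with parentheses, NumberStyles.AllowParentheses would need to be added to the style flags.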