The Stock Statistics Parser

15 years 7 months ago

Download www.jot.fm

This paper describes how to use the HTMLEditorKit to perform web data mining on stock statistics for listed firms. Our focus is on using the web to get information about companies, given their stock symbols and Yahoo Finance. We show how to map a stock ticker symbol into a company name, gather statistics, and derive new information. Our example shows how we extract the number of shares outstanding and the total volume over a given time period, and then compute the turnover for the shares. The methodology is based on using a parser-call-back facility to build up a data structure. Screen scraping is a popular means of data entry, but the unstructured nature of HTML pages makes this a challenge.

1 THE PROBLEM

Publicly traded companies have statistical data that is typically available on the web (using a browser to format the HTML data). Given an HTML data source, we would like to find a way to create an underlying data structure that is type-safe and well formulated. We are motivated to study thes...
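As an illustration of the parser-call-back approach named in the abstract, the following is a minimal sketch and not the paper's own code. It assumes a hypothetical page layout in which each statistic's label is immediately followed by its value, and it assumes turnover is computed as total volume divided by shares outstanding. The class name `TurnoverSketch`, its methods, and the label strings are invented for this example; only `HTMLEditorKit.ParserCallback` and `ParserDelegator` are standard Swing classes.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

public class TurnoverSketch {

    /** Collects the non-blank text fragments of an HTML page in document order. */
    static List<String> extractText(String html) {
        final List<String> cells = new ArrayList<>();
        // The callback is invoked by the parser for every run of text it meets.
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleText(char[] data, int pos) {
                String s = new String(data).trim();
                if (!s.isEmpty()) {
                    cells.add(s);
                }
            }
        };
        try {
            new ParserDelegator().parse(new StringReader(html), callback, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return cells;
    }

    /** Returns the number that follows a label (assumed layout: label, then value). */
    static long valueAfter(List<String> cells, String label) {
        int i = cells.indexOf(label);
        if (i < 0 || i + 1 >= cells.size()) {
            throw new IllegalArgumentException("No value found for label: " + label);
        }
        return Long.parseLong(cells.get(i + 1).replace(",", ""));
    }

    /** Turnover, assumed here to be total volume divided by shares outstanding. */
    static double turnover(long totalVolume, long sharesOutstanding) {
        return (double) totalVolume / sharesOutstanding;
    }

    public static void main(String[] args) {
        // Hypothetical sample page; a real page would be fetched from the web.
        String html = "<html><body><table>"
                + "<tr><td>Shares Outstanding</td><td>1,000,000</td></tr>"
                + "<tr><td>Total Volume</td><td>250,000</td></tr>"
                + "</table></body></html>";
        List<String> cells = extractText(html);
        long shares = valueAfter(cells, "Shares Outstanding");
        long volume = valueAfter(cells, "Total Volume");
        System.out.println("Turnover: " + turnover(volume, shares));
    }
}
```

Copying the text into typed `long` fields as soon as it is found is one way to move from unstructured HTML toward a type-safe data structure. A production scraper would also need to handle layout changes and missing fields.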

Douglas Lyon
