Sciweavers

JOT
2008

The Stock Statistics Parser

This paper describes how to use the HTMLEditorKit to perform web data mining on stock statistics for listed firms. Our focus is on using the web to get information about companies, given their stock symbols and YAHOO finance. We show how to map a stock ticker symbol to a company name, gather statistics, and derive new information. Our example shows how we extract the number of shares outstanding and the total volume over a given time period, and then compute the turnover for the shares. The methodology is based on using a parser-callback facility to build up a data structure. Screen scraping is a popular means of data entry, but the unstructured nature of HTML pages makes this a challenge.

1 THE PROBLEM

Publicly traded companies have statistical data that is typically available on the web (using a browser to format the HTML data). Given an HTML data source, we would like to find a way to create an underlying data structure that is type-safe and well formulated. We are motivated to study thes...
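The parser-callback approach mentioned in the abstract can be illustrated with a minimal Java sketch. It is not the paper's code. The class name StockStatsSketch, the helper methods, the row labels, and the sample markup are all hypothetical, and the sample table does not reflect the layout of any real finance page. Turnover is assumed here to mean total volume divided by shares outstanding. The only library pieces used are the standard Swing classes HTMLEditorKit.ParserCallback and ParserDelegator.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

final class StockStatsSketch {

    // Hypothetical markup for illustration only; a real page will differ.
    static final String SAMPLE_HTML =
        "<html><body><table>"
        + "<tr><td>Shares Outstanding</td><td>1,000,000</td></tr>"
        + "<tr><td>Total Volume</td><td>250,000</td></tr>"
        + "</table></body></html>";

    /** Callback that pairs the cells of each table row as label -> value. */
    static final class StatsCallback extends HTMLEditorKit.ParserCallback {
        final Map<String, String> stats = new LinkedHashMap<>();
        private final StringBuilder cell = new StringBuilder();
        private boolean inCell;
        private String label; // first cell of the current row, if seen

        @Override
        public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
            if (tag == HTML.Tag.TR) {
                label = null;          // a new row starts a new label/value pair
            } else if (tag == HTML.Tag.TD) {
                inCell = true;
                cell.setLength(0);     // start collecting this cell's text
            }
        }

        @Override
        public void handleText(char[] data, int pos) {
            if (inCell) {
                cell.append(data);     // text may arrive in several pieces
            }
        }

        @Override
        public void handleEndTag(HTML.Tag tag, int pos) {
            if (tag != HTML.Tag.TD) {
                return;
            }
            inCell = false;
            String text = cell.toString().trim();
            if (label == null) {
                label = text;          // first cell: remember it as the label
            } else {
                stats.put(label, text); // second cell: store the value
                label = null;
            }
        }
    }

    /** Runs the Swing HTML parser over the text and returns label -> value. */
    static Map<String, String> parseStats(String html) throws IOException {
        StatsCallback callback = new StatsCallback();
        try (Reader reader = new StringReader(html)) {
            new ParserDelegator().parse(reader, callback, true);
        }
        return callback.stats;
    }

    /** Converts a count such as "1,000,000" into a long. */
    static long parseCount(String text) {
        return Long.parseLong(text.replace(",", "").trim());
    }

    /** Assumed definition: turnover = total volume / shares outstanding. */
    static double turnover(long totalVolume, long sharesOutstanding) {
        if (sharesOutstanding <= 0) {
            throw new IllegalArgumentException("sharesOutstanding must be positive");
        }
        return (double) totalVolume / sharesOutstanding;
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> stats = parseStats(SAMPLE_HTML);
        long shares = parseCount(stats.get("Shares Outstanding"));
        long volume = parseCount(stats.get("Total Volume"));
        System.out.println("turnover = " + turnover(volume, shares));
    }
}
```

Converting the scraped strings to long and double before any arithmetic is one way to move from unstructured HTML text toward the type-safe data structure the problem statement asks for. A real page would need its own rules for locating the relevant cells.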
Douglas Lyon
Added 13 Dec 2010
Updated 13 Dec 2010
Type Journal
Year 2008
Where JOT
Authors Douglas Lyon