dc.contributor.advisor | Agah, Arvin | |
dc.contributor.author | Bari, Omar Abdul | |
dc.date.accessioned | 2017-05-15T01:37:48Z | |
dc.date.available | 2017-05-15T01:37:48Z | |
dc.date.issued | 2016-12-31 | |
dc.date.submitted | 2016 | |
dc.identifier.other | http://dissertations.umi.com/ku:14959 | |
dc.identifier.uri | http://hdl.handle.net/1808/24161 | |
dc.description.abstract | Event Studies in Finance have focused on traditional news headlines to assess the impact an event has on a traded company. The increased proliferation of news and information produced by social media content has disrupted this trend. Although researchers have begun to identify trading opportunities from social media platforms, such as Twitter, almost all techniques use a general sentiment from large collections of tweets. Though useful, general sentiment does not identify specific events capable of affecting stock prices. This work presents an event clustering algorithm, utilizing natural language processing techniques to generate newsworthy events from Twitter, which have the potential to influence stock prices in the same manner as traditional news headlines. The event clustering method addresses the effects of pre-news and lagged-news, two peculiarities that appear when connecting trading and news, regardless of the medium. Pre-news signifies a finding where stock prices move in advance of a news release. Lagged-news refers to follow-up or late-arriving news, adding redundancy in making trading decisions. For events generated by the proposed clustering algorithm, we have designed and implemented novel language and time-series techniques, incorporating Event Studies and Machine Learning, to produce an actionable system that can guide trading decisions. Of the various methods considered, particular emphasis was placed on comparing established state-of-the-art methods with modern Deep Learning techniques. The recommended prediction algorithms provide investing strategies with profitable risk-adjusted returns. The suggested language models present Annualized Sharpe Ratios (risk-adjusted returns) in the 5 to 11 range, while time-series models produce ratios in the 2 to 3 range (without transaction costs). A close investigation of the distribution of returns confirms the encouraging Sharpe Ratios by identifying most outliers as significant positive gains. Additionally, Machine Learning metrics of precision, recall, and accuracy are discussed alongside financial metrics in hopes of bridging the gap between academia and industry in the field of Computational Finance. | |
dc.format.extent | 129 pages | |
dc.language.iso | en | |
dc.publisher | University of Kansas | |
dc.rights | Copyright held by the author. | |
dc.subject | Computer science | |
dc.subject | Artificial intelligence | |
dc.subject | Finance | |
dc.subject | Algorithmic Trading | |
dc.subject | Computational Finance | |
dc.subject | Machine Learning | |
dc.subject | Natural Language Processing | |
dc.subject | Sentiment Analysis | |
dc.subject | Time-Series Classification | |
dc.title | Ensembles of Text and Time-Series Models for Automatic Generation of Financial Trading Signals | |
dc.type | Dissertation | |
dc.contributor.cmtemember | Evans, Joseph | |
dc.contributor.cmtemember | Gill, Andrew | |
dc.contributor.cmtemember | Grzymala-Busse, Jerzy | |
dc.contributor.cmtemember | Wilson, Sara | |
dc.thesis.degreeDiscipline | Electrical Engineering & Computer Science | |
dc.thesis.degreeLevel | Ph.D. | |
dc.identifier.orcid | | |
dc.rights.accessrights | openAccess | |