Predicting The Price Of A Stock


An Interactive Qualifying Project Report
submitted to the Faculty of
WORCESTER POLYTECHNIC INSTITUTE
In partial fulfillment of the requirements for the
Degree of Bachelor of Science
by
Andrew P. Murdza

Date: August 17, 2018
Project Advisor: Mayer Humi, PhD

Abstract

Many have tried to master the inner workings of the American stock market to reap great profits. In this project, we modeled stock prices to make short-term forecasts. Another model was developed to measure stock volatility. This model can be used to estimate the risk of model predictions for different stocks. We hope that this model for stock prices and volatility index will help investors make profitable business decisions.

Contents

Abstract
Executive Summary
Introduction
Research: Modeling Stock Prices
Notation
The Autocorrelation Coefficient
The Trend Line
Fourier Series
Using Fourier Series to Improve the Trend Line Model
The Margin of Error
A Summary of the Algorithm Used to Construct the Prototype Model
Improvements on the Prototype Model
Increasing Model Precision with Moving Averages
Increasing Model Accuracy Using Correlation with Market Indices
A Summary of the Algorithm Used to Construct the Index Model
Research: Measuring Stock Volatility
Dynamical Systems and State Space
Chaos and Lyapunov Exponents
The Lyapunov Spectrum
Embedding a Time Series
The Method of False Nearest Neighbors
Computing the Maximal Lyapunov Exponent from a Time Series
Results: Stock Models
Results: Measuring Volatility
Conclusion
References
Appendix: Individual Stock Graphs
Interxion Holdings (INXN)
58.com (WUBA)
Progress Software (PRGS)
Black Baud (BLKB)
Commvault (CVLT)
Changyou (CYOU)
Imperva (IMPV)
Guidewire Software (GWRE)
Paycom (PAYC)
Talend (TLND)
Appendix: Matlab Code
DataGeneratorNew
DataGeneratorAvgs
get yahoo stockdata3
DataCollector
DataCollectorAvgs
PrototypeModelDataNew
IndexModelDataNew
embed
falsenearest
lyapunovnew
Lyapunov2
DataGeneratorLyapunov

Executive Summary

This Interactive Qualifying Project (IQP) focused on modeling stock prices using signal analysis and measuring stock volatility with Lyapunov exponents. The goals of this project were to create a model to predict short-term stock prices and a way to determine which stocks the model would predict most accurately. A more accurate stock price model would help investors choose which stocks to invest in and when to sell their stocks to achieve the greatest profit. A way to accurately measure stock volatility would help investors avoid high-risk stocks that could lead to significant losses in their portfolios.

During the first week of the project, 10 software stocks were selected for model testing. During the next few weeks, a prototype model for stock prices was implemented in MATLAB. Moving averages were later introduced to decrease the noise in the model input data. The use of moving averages decreased the width of the prototype model prediction band by 40.4%, on average, for the 20 stocks tested. To account for factors which affected the entire stock subsector, the prototype model was replaced with a weighted average of the models for the stock prices and the S&P 500 index. On average, the index model accurately predicted the stock price for 37.5% more days than the prototype model. The last few weeks were devoted to measuring stock volatility with Lyapunov exponents. The TISEAN package was used to compute the largest Lyapunov exponent of each stock over various time intervals and embedding dimensions. It was hypothesized that stocks with larger Lyapunov exponents would be more volatile. However, we did not find a high correlation between a stock's maximal Lyapunov exponent and its volatility.

Introduction

In recent times, it has been difficult for casual investors to compete with professionals. While big investors have access to expensive analysis packages, advanced stock data, and insider information, smaller investors rely only on general recommendations and their gut feeling. The goal of this Interactive Qualifying Project is to develop tools that casual investors can use to select profitable, low-risk stocks. The two tools that this project focused on were a model to predict short-term stock prices and a new measure of stock volatility. Accurate predictions of future stock prices can enable inexperienced investors to make more profitable investment decisions than choices based on intuition and expert recommendations alone. Small investors lack the capital to suffer major losses. For this reason, an effective indicator of stock volatility which could identify high-risk stocks would be of great help to amateur investors.

The model used to predict short-term stock prices was a further development of a model formulated by a previous IQP team. The TISEAN package was then applied to quantify the risk level of a stock. Small investors could use the TISEAN package to identify low-risk stocks and then apply the model to decide which stocks to invest in and when to buy and sell them.

Research: Modeling Stock Prices

We will model stock prices using time series analysis. Time series, which include stock prices, are sets of data ordered from least recent to most recent. In the first part of this section, we present several tools from time series analysis and describe how they can be applied to predict stock prices. In the second part of this section, we will use chaos theory to model the volatility of a stock. We start by introducing some notation.

Notation

Throughout this paper, we will represent a time series of $N$ data points by the sequence $X_1, \ldots, X_N$. We call each term in the sequence an observation. We denote the times at which the observations $X_1, \ldots, X_N$ occur by $t_1, \ldots, t_N$. Using this notation, we may write a time series in the form $(t_1, X_1), \ldots, (t_N, X_N)$. We define the mean of the observations as

$$\bar{X} = \frac{1}{N} \sum_{i=1}^{N} X_i. \tag{1}$$

Similarly, we define the mean of the observation times by

$$\bar{t} = \frac{1}{N} \sum_{i=1}^{N} t_i. \tag{2}$$

As previously mentioned, we plan to model stock prices to make future predictions. To construct these models, we will fit a curve to the closing prices of the stock for some number of days in the past. In the following section we discuss how we will use the autocorrelation coefficient to determine the ideal number of days to use to generate our model.

The Autocorrelation Coefficient

The autocorrelation coefficient is a statistic that measures the similarity of the current value of a time series to previous observations in the series. The autocorrelation coefficient is computed at a specified number of lags. The autocorrelation at lag $k$ quantifies the strength and direction of the relationship between observations that are $k$ steps apart: it is the correlation between the time series and a copy of the time series lagged $k$ times. It is calculated with the equation

$$r_k = \frac{\sum_{i=1}^{N-k} (X_i - \bar{X})(X_{i+k} - \bar{X})}{\sum_{i=1}^{N} (X_i - \bar{X})^2},$$

where $\bar{X}$ is given by (1).

If the autocorrelations at lags $1, 2, \ldots, k$ are all positive, then there is some relationship between the current observation and the $k$ observations that precede it. From this fact we obtain a way to determine the number of points we should use to model our time series of stock prices. To find the optimal number of stock prices to be used in our models, we first compute the autocorrelation of the stock prices at lags 1 to 251, the number of business days in the past year. We then find the largest number $k$ such that the autocorrelations at lags 1 to $k$ are all positive. The value of $k$ represents the maximum number of business days back from the present within which every stock price is relevant to the current stock price. Therefore, we use the $k$ most recent stock prices for our models.
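The window-selection rule described above can be sketched in a few lines. This is a minimal illustration in Python using only the standard library, not the report's MATLAB implementation; the function names `autocorrelation` and `positive_lag_window` are hypothetical, and the input is assumed to be a list of closing prices ordered from least recent to most recent.

```python
def autocorrelation(x, k):
    """Lag-k autocorrelation r_k, using the mean of the full series."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[i] - mean) * (x[i + k] - mean) for i in range(n - k))
    return num / denom


def positive_lag_window(x, max_lag=251):
    """Largest k such that r_1, ..., r_k are all strictly positive.

    max_lag defaults to 251, the number of business days in a year
    used in the text; it is capped at len(x) - 1.
    """
    max_lag = min(max_lag, len(x) - 1)
    k = 0
    for lag in range(1, max_lag + 1):
        if autocorrelation(x, lag) <= 0:
            break  # first non-positive lag ends the window
        k = lag
    return k


# Toy series (assumed data, not stock prices from the report).
prices = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
window = positive_lag_window(prices)
recent = prices[-window:]  # the k most recent observations
```

The returned value would then be used to slice off the most recent observations before fitting any model.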

Figure: Graph of Autocorrelation for the Paycom (PAYC) Stock

The above figure displays the autocorrelation of the Paycom stock prices for lags 1-251. Note that lag 1 corresponds to 6/1/18 and lag 251 corresponds to the date 251 business days before 6/1/18, which is 6/2/17. The lag where the autocorrelation becomes negative for the first time, lag 84, corresponds to the date 84 business days before 6/1/18, 1/31/18. Stock data between 1/31/18 and 6/1/18 was used to generate the models for the Paycom stock price.

We will next present our first approach to modeling stock prices: the trend line.

The Trend Line

The trend line fit to a set of data is the line whose slope and y-intercept are chosen to minimize the average squared distance between the line and the data. It has the general form

$$X_L(t) = d_0 + d_1 t.$$
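A line of this form that minimizes the average squared distance has a standard closed-form least-squares solution. The following is a minimal Python sketch, assuming the times and prices are given as equal-length lists; the function name `fit_trend_line` is hypothetical and is not taken from the report's MATLAB code.

```python
def fit_trend_line(t, x):
    """Return (d0, d1) for the least-squares line X_L(t) = d0 + d1 * t."""
    n = len(t)
    t_bar = sum(t) / n
    x_bar = sum(x) / n
    # Slope: covariance of (t, x) divided by the variance of t.
    s_tx = sum((xi - x_bar) * (ti - t_bar) for ti, xi in zip(t, x))
    s_tt = sum((ti - t_bar) ** 2 for ti in t)
    d1 = s_tx / s_tt
    # Intercept: forces the line through the point (t_bar, x_bar).
    d0 = x_bar - d1 * t_bar
    return d0, d1


# Toy data (assumed, not from the report): points lying exactly on x = 1 + 2t.
times = [0.0, 1.0, 2.0, 3.0]
prices = [1.0, 3.0, 5.0, 7.0]
d0, d1 = fit_trend_line(times, prices)
residuals = [xi - (d0 + d1 * ti) for ti, xi in zip(times, prices)]
```

The `residuals` list holds the differences between the observed values and the line, which is the quantity a later refinement of the model would work with.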

The slope of the trend line, $d_1$, is calculated as

$$d_1 = \frac{\sum_{i=1}^{N} (X_i - \bar{X})(t_i - \bar{t})}{\sum_{i=1}^{N} (t_i - \bar{t})^2},$$

where $\bar{X}$ and $\bar{t}$ are given by (1)-(2). The y-intercept of the line, $d_0$, is determined by

$$d_0 = \bar{X} - d_1 \bar{t}.$$

The above figure displays the (red) past and (blue) future stock prices of Paycom and the (black) trend line fit to the past prices. A green dashed line separates the past from the future.

A trend line generated from pairs of dates and stock prices gives a rough estimate of the future direction of a stock. We want more accurate predictions than those offered by the trend line alone. To decrease the model error, we introduce Fourier series.

Fourier Series

A Fourier series is a sum of sine and cosine functions. Such series are useful for modeling periodic and oscillatory data. An $n$th order Fourier series has the form

$$X_f(t) = \frac{1}{2} a_0 + \sum_{k=1}^{n} a_k \cos\left(\frac{2\pi k t}{T}\right) + \sum_{k=1}^{n} b_k \sin\left(\frac{2\pi k t}{T}\right).$$

When an $n$th order Fourier series is fit to a time series $(t_1, e_1), \ldots, (t_N, e_N)$, the coefficients are computed with the equations

$$a_0 = \frac{1}{N} \sum_{i=1}^{N} e_i,$$

$$a_k = \sum_{i=1}^{N} X_i \cos\left(\frac{2\pi k t_i}{T}\right) e_i,$$

$$b_k = \sum_{i=1}^{N} X_i \sin\left(\frac{2\pi k t_i}{T}\right) e_i.$$

The variable $T$ is the period of the Fourier series. We set it equal to the time spanned by all the stock prices, $t_N - t_1$. The ideal number of terms of the Fourier series we use depends on the amount of data we generate it from. We are now ready to describe how we can improve our trend line model with Fourier series.

Using Fourier Series to Improve the Trend Line Model

In this section we will incorporate Fourier series into the trend line to increase the accuracy of the model predictions. We compute the difference between the actual value of various stocks and the trend line, which we call the trend line residuals. We see that the trend line residuals oscillate with time. It follows that we should model these residuals with Fourier series. If there are at least 60 data points, we use a third order Fourier series to model the differences. If 40 or fewer points are used, then we apply a second order Fourier series. If we have between 40 and 60 data points, then we fit both a second order and a third order Fourier series to the differences and choose the series with the higher $R^2$ value for our model.

The above figure displays the past (red) and future (blue) trend line residuals, and the (black) Fourier model fit to the past differences. A green dashed line separates the past from the future.
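The residual fit and the order rule described above can be illustrated as follows. This is a sketch under stated assumptions, not the report's MATLAB code: it fits the coefficients by ordinary linear least squares with the period fixed at $T = t_N - t_1$, which is a swapped-in technique rather than the coefficient equations given in the text, and the names `fourier_row`, `solve_linear`, `fit_fourier`, and `candidate_orders` are hypothetical.

```python
import math


def fourier_row(t, order, period):
    """Basis values [1/2, cos(wt), sin(wt), ..., cos(n wt), sin(n wt)]."""
    row = [0.5]  # multiplies a_0, matching the a_0 / 2 term in the series
    for k in range(1, order + 1):
        angle = 2.0 * math.pi * k * t / period
        row.extend([math.cos(angle), math.sin(angle)])
    return row


def solve_linear(a, b):
    """Solve a square system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_fourier(t, e, order):
    """Least-squares fit; returns ([a0, a1, b1, ..., an, bn], R^2)."""
    period = t[-1] - t[0]  # T = t_N - t_1, as in the text
    rows = [fourier_row(ti, order, period) for ti in t]
    p = 2 * order + 1
    # Normal equations (A^T A) c = A^T e.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    ate = [sum(r[i] * ei for r, ei in zip(rows, e)) for i in range(p)]
    coeffs = solve_linear(ata, ate)
    fitted = [sum(c * v for c, v in zip(coeffs, r)) for r in rows]
    mean_e = sum(e) / len(e)
    ss_res = sum((ei - fi) ** 2 for ei, fi in zip(e, fitted))
    ss_tot = sum((ei - mean_e) ** 2 for ei in e)
    return coeffs, 1.0 - ss_res / ss_tot


def candidate_orders(n_points):
    """Orders to try under the point-count rule in the text."""
    if n_points >= 60:
        return [3]
    if n_points <= 40:
        return [2]
    return [2, 3]  # fit both and compare R^2
```

One caveat about this particular sketch: because the period is fixed and the fit is linear least squares, the third order fit can never have a lower $R^2$ than the second order fit on the same data, so the $R^2$ comparison is only informative when the fitting procedure differs from this one, for example when the frequency is fitted as well.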

Combining the trend line with the Fourier series produces an overall model of the form

$$X_p(t) = \frac{1}{2} a_0 + d_0 + d_1 t + \sum_{k=1}^{n} a_k \cos\left(\frac{2\pi k t}{T}\right) + \sum_{k=1}^{n} b_k \sin\left(\frac{2\pi k t}{T}\right).$$

We refer to this overall model as the prototype model. We call the errors in the predictions of the prototype model the data noise, which represents fluctuations in prices due to factors we have not accounted for in our model. The function $X_p(t)$ gives a single value for the price of the stock at a given time $t$, which we call a point estimate. Although a point estimate for the price of a stock at a given time is useful, we also want a range of values in which the stock price is most likely to be found. In the next section we will develop lower and upper bounds for the stock price at a given time. We will also describe how to measure the precision of our models with the margin of error.

The Margin of Error

After we use our stock price data to construct our prototype model, $X_p(t)$, we will add upper and lower bounds to the model. We set the model upper bound equal to the sum of the prototype model and the maximum noise. Similarly, we add the minimum noise to the prototype model to obtain the model lower bound. This ensures that the entire time series used to generate the model is between the upper and lower bounds. We call the region between the upper and lower bounds the prediction band of the model.

The precision of the model predictions is measured by the model's margin of error. A low margin corresponds to a narrow prediction band and a precise model. The margin of error can be computed as half of the difference of the maximum and minimum noise or as the maximum of the absolute value of the noise. We use the former method when we compute the margin of error, although we would obtain similar results if we used the latter.

The above graph includes the (red) past and (blue) future stock prices of Paycom, and the (black) prototype model fit to the past prices. A green dashed line separates the past from the future and a magenta dashed line indicates where the model fails.

We next summarize how we determine the prototype model from a set of data.

A Summary of the Algorithm Used to Construct the Prototype Model

1. Compute the autocorrelation of the stock prices from the past year for lags 1 to 251.
2. Find the largest number $k$ such that the autocorrelations corresponding to lags 1 to $k$ are all positive.
3. Fit a trend line to the $k$ most recent stock prices.
4. Compute the difference between the $k$ most recent stock prices and the trend line predictions.
5. Fit a Fourier series to the differences.
6. Compute the predictions of the prototype model by summing the predictions of the trend line and the Fourier series.
7. Calculate the data noise by subtracting the model predictions from the $k$ most recent stock prices.
8. Compute the model upper bound by adding the maximum noise to the model predictions.
9. Calculate the model lower bound by adding the minimum noise to the model predictions.

Improvements on the Prototype Model

Two measures of the effectiveness of a model are its accuracy and its precision. A model is accurate if
