Convolutional-Neural-Networks-for-Financial-Text-Regression

Sinovation Ventures finance project: Financial volatility forecasting from annual reports

Reproduced from the ACL paper Convolutional Neural Networks for Financial Text Regression.

Data

The dataset from Tsai et al. (2016) includes the Extended 10-K Corpus, compiled from reports available on the U.S. Securities and Exchange Commission (SEC) Electronic Data Gathering, Analysis and Retrieval (EDGAR) website. Following previous work (Kogan et al., 2009; Wang et al., 2013; Tsai and Wang, 2014; Tsai et al., 2016), only Section 7, Management’s Discussion and Analysis (MD&A), is used instead of the complete 10-K report.

The dataset also includes, for each report, a volatility value covering the 12 months after the report is published. This value is the natural logarithm of stock return volatility and is used as the prediction target.
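As a rough illustration of the prediction target, the sketch below computes a log volatility from a series of closing prices. It assumes simple daily returns and the population standard deviation; the exact return definition, window, and normalization used to build the dataset are not stated here, and the function name `log_volatility` is hypothetical.

```python
import math
from statistics import pstdev


def log_volatility(prices):
    """Natural log of the standard deviation of simple returns.

    Assumption: returns are r_t = P_t / P_{t-1} - 1 and volatility is
    their population standard deviation over the given window.
    """
    if len(prices) < 3:
        raise ValueError("need at least three prices (two returns)")
    # Simple returns between consecutive prices.
    returns = [cur / prev - 1.0 for prev, cur in zip(prices, prices[1:])]
    sigma = pstdev(returns)
    if sigma == 0.0:
        raise ValueError("volatility is zero, so its logarithm is undefined")
    return math.log(sigma)


# Toy price series (made up for illustration, not taken from the dataset).
example_target = log_volatility([100.0, 110.0, 99.0, 108.9])
```

Because daily return volatility is usually well below 1, the resulting targets are typically negative numbers.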

Model

CNN-baseline
CNN-STC
CNN-NTC
CNN-STC-multichannel
CNN-NTC-multichannel

The final hyperparameters are: mini-batch size 10, fixed text length 20000, convolution kernel sizes 3, 4 and 5 with 100 output features, dropout probability 0.5, and learning rate 0.001.
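The hyperparameters above can be collected in a small configuration object. The sketch below is illustrative only: the class name `HyperParams` and its helper methods are hypothetical, and it assumes 100 output features per kernel size, stride-1 convolutions without padding, and max-over-time pooling, none of which is confirmed by this README.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class HyperParams:
    batch_size: int = 10
    text_length: int = 20000            # documents are padded/truncated to this length
    kernel_sizes: Tuple[int, ...] = (3, 4, 5)
    features_per_kernel: int = 100      # assumption: 100 features for each kernel size
    dropout: float = 0.5
    learning_rate: float = 0.001

    def conv_output_lengths(self) -> List[int]:
        # Assumption: stride 1, no padding, so each output has L - k + 1 positions.
        return [self.text_length - k + 1 for k in self.kernel_sizes]

    def pooled_feature_size(self) -> int:
        # Assumption: max-over-time pooling keeps one value per feature map,
        # and the pooled vectors of all kernel sizes are concatenated.
        return self.features_per_kernel * len(self.kernel_sizes)


hp = HyperParams()
print(hp.conv_output_lengths())   # [19998, 19997, 19996]
print(hp.pooled_feature_size())   # 300
```

Under these assumptions, the regression layer would receive a 300-dimensional vector per document.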

Results

Model	2008	2009	2010	2011	2012	2013	Avg
CNN-simple (baseline)	0.3716	0.4708	0.1471	0.1312	0.2412	0.2871	0.2748
CNN-STC	0.5358	0.3575	0.3001	0.1215	0.2164	0.1497	0.2801
CNN-NTC-multichannel	0.5077	0.4353	0.1892	0.1605	0.2116	0.1268	0.2718
CNN-STC-multichannel	0.4121	0.4040	0.2428	0.1574	0.2082	0.1676	0.2653
CNN-NTC	0.4672	0.3169	0.2156	0.1154	0.1944	0.1238	0.2388

Performance of the different models (from the paper), measured by mean squared error (MSE). Lower is better. Among the models shown, CNN-NTC has the lowest average MSE (0.2388).
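For reference, the metric in the table can be sketched in a few lines. The function name `mean_squared_error` is an illustrative choice, not code from this repository; the `AVERAGE_MSE` values are copied from the Avg column of the table.

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared differences between targets and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("inputs must not be empty")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Average MSE per model, copied from the "Avg" column of the results table.
AVERAGE_MSE = {
    "CNN-simple (baseline)": 0.2748,
    "CNN-STC": 0.2801,
    "CNN-NTC-multichannel": 0.2718,
    "CNN-STC-multichannel": 0.2653,
    "CNN-NTC": 0.2388,
}

# Lower is better, so the best model is the one with the smallest average.
best_model = min(AVERAGE_MSE, key=AVERAGE_MSE.get)
print(best_model)  # CNN-NTC
```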
