Published January 19, 2022 | Version v1
Dataset Open

Aggregated US Bank Dataset

  • 1. Rensselaer Polytechnic Institute

Description

These two .csv files make up the US bank dataset for FETILDA. They contain sections of 10-K reports filed by US banks from 2006 to 2016 and are used directly by the Python scripts for training, validation, and testing. One file covers Item 1A of the 10-K reports; the other covers Item 7/7A.
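Because each file is large and the column layout is not stated in this record, it can help to inspect the header and a few rows before handing a file to any training script. The following is a minimal sketch, not part of the FETILDA code: the helper name `peek_csv` is hypothetical, UTF-8 encoding is an assumption, and the raised field-size limit is a precaution for long report sections.

```python
import csv
import itertools
import sys


def peek_csv(path, n_rows=3, encoding="utf-8"):
    """Return (header, first n_rows data rows) without reading the whole file.

    `encoding` is an assumption; the record does not state the file encoding.
    """
    # Report sections can be very long, so raise the csv module's per-field
    # size limit. The cap at 2**31 - 1 avoids an OverflowError on platforms
    # where a C long is 32 bits.
    csv.field_size_limit(min(sys.maxsize, 2**31 - 1))
    with open(path, newline="", encoding=encoding) as f:
        reader = csv.reader(f)
        header = next(reader, [])  # empty list if the file has no rows
        rows = list(itertools.islice(reader, n_rows))
    return header, rows
```

For example, `header, rows = peek_csv("sorted_sec1A.csv")` would show the column names of the Item 1A file, assuming it has been downloaded to the working directory under that name.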

Files

sorted_sec1A.csv

Files (269.4 MB)

MD5 checksum                              Size
md5:3436a5450243082968398c16407923d6      71.6 MB
md5:33e3d2f653b9d744f28d2cba39be70bf      197.7 MB
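The MD5 checksums listed above can be used to confirm that a download completed without corruption. The sketch below uses only Python's standard `hashlib` module; the function names and the `EXPECTED_MD5` set are illustrative assumptions, and the two hash values are copied from the listing. Since the listing does not pair each checksum with a file name, the check only tests membership in the set of published checksums.

```python
import hashlib

# Checksums copied from the file listing. The listing does not say which
# checksum belongs to which file, so they are kept as an unordered set.
EXPECTED_MD5 = {
    "3436a5450243082968398c16407923d6",
    "33e3d2f653b9d744f28d2cba39be70bf",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path):
    """Return True if the file's MD5 equals one of the published checksums."""
    return md5_of_file(path) in EXPECTED_MD5
```

Reading in chunks keeps memory use flat, which matters for files in the 70 to 200 MB range. A `False` result from `matches_listed_checksum` suggests the file should be downloaded again.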

Additional details

Funding

III: EAGER: Knowledge Graph Mining for Financial Risk Analytics (award no. 1738895)
National Science Foundation