The finance industry relies heavily on data analysis to make informed investment decisions and manage risk. As technology has advanced, the volume of publicly available financial data has grown rapidly. In this article, we explore the 10 best datasets for finance in 2023.

Dataset Name | Size | Download Link | Description |
---|---|---|---|
S&P 500 Stock Prices | 2,807 records | https://www.kaggle.com/dgawlik/nyse#prices.csv | This dataset includes information on S&P 500 stock prices from 2010 to 2020. |
Global Financial Crisis | 3,656 records | https://www.kaggle.com/wesseb/global-financial-crisis-2008-to-2009 | This dataset includes information on the global financial crisis of 2008 and 2009, including stock prices and macroeconomic indicators. |
Housing Prices | 1,460 records | https://www.kaggle.com/c/home-data-for-ml-course | This dataset includes information on housing prices in Ames, Iowa, from 2006 to 2010. |
Crypto Market | 8,000 records | https://www.kaggle.com/sudalairajkumar/cryptocurrencypricehistory | This dataset includes information on cryptocurrency prices from 2013 to 2018. |
Bank Marketing | 45,211 records | https://archive.ics.uci.edu/ml/datasets/bank+marketing | This dataset includes information on a bank marketing campaign, including customer demographics and response to marketing efforts. |
Stock Market | 1,600 records | https://www.kaggle.com/szrlee/stock-time-series-20050101-to-20171231 | This dataset includes information on stock market prices and volume from 2005 to 2017. |
Financial Distress | 1,167 records | https://www.kaggle.com/shebrahimi/financial-distress | This dataset includes information on financial distress in US companies from 1996 to 2016. |
Credit Card Fraud Detection | 284,807 records | https://www.kaggle.com/mlg-ulb/creditcardfraud | This dataset includes information on credit card transactions, of which only a very small share (492, about 0.17%) are fraudulent. |
Santander Customer Transaction Prediction | 200,000 records | https://www.kaggle.com/c/santander-customer-transaction-prediction | This dataset includes information on customer transactions, with a minority of positive classifications (i.e. customer will make a transaction). |
Loan Default | 887 records | https://www.kaggle.com/kashnitsky/topic-4-linear-models-and-sgdr-practice-time | This dataset includes information on loan defaults, with a strong imbalance between the positive and negative classes (i.e., loans that defaulted and loans that did not). |
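
Several of the datasets above are described as imbalanced, so a sensible first step after downloading one is to check how its labels are distributed. The following is a minimal sketch using only the Python standard library. The helper name `class_balance` is hypothetical, the in-memory sample rows are invented for illustration, and the label column name `Class` is an assumption that you should replace with whatever your downloaded file actually uses.

```python
import csv
import io
from collections import Counter


def class_balance(csv_file, label_column):
    """Return {label: fraction of rows} for one column of an open CSV file."""
    reader = csv.DictReader(csv_file)
    counts = Counter(row[label_column] for row in reader)
    total = sum(counts.values())
    if total == 0:
        # Header only, no data rows.
        return {}
    return {label: n / total for label, n in counts.items()}


# Tiny in-memory stand-in for a downloaded file; these values are made up.
sample = io.StringIO(
    "Amount,Class\n"
    "12.50,0\n"
    "3.99,0\n"
    "250.00,1\n"
    "41.10,0\n"
)

balance = class_balance(sample, "Class")
for label, share in sorted(balance.items()):
    print(f"class {label}: {share:.1%}")
```

To run the same check on a real download, open the file with `open(path, newline="")` and pass the file object in place of `sample`. Labels are returned as strings because the `csv` module does not convert types.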