A 45-page survey paper on recent developments in high-frequency financial data analysis. Enjoy:
https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4834362
Call For Papers
Dear Colleagues,
I am serving as the guest editor for a special issue of the journal Risks on modern statistical and machine learning / deep learning methods for financial data. Some relevant topics are discussed below, but you are welcome to submit anything you think is related. Please do not hesitate to contact me if you have any questions; emailing me directly is fine.
Discovering Intraday Tail Dependence Patterns via a Full-Range Tail Dependence Copula
My new paper, Discovering Intraday Tail Dependence Patterns via a Full-Range Tail Dependence Copula, has just been published and highlighted as the Cover Story in the journal Risks. The paper promotes the use of full-range tail dependence copulas in real-world applications and, in particular, reports some interesting intraday tail dependence patterns discovered in the U.S. financial markets.
Risks is a peer-reviewed journal, with all publications freely available to all readers. Here is a link to the full text of the paper:
Pricing Cyber Insurance for a Large-Scale Network
A new paper has just been published in the Variance Journal: Pricing Cyber Insurance for a Large-Scale Network. The paper provides a framework for pricing cyber insurance for a large-scale network. Here is the link to the full text:
https://variancejournal.org/article/29160-pricing-cyber-insurance-for-a-large-scale-network
Variance is a peer-reviewed, open-access journal published by the Casualty Actuarial Society to disseminate original practical and theoretical research in casualty actuarial science.
So grateful
I am so grateful that the Editorial Board of The North American Actuarial Journal (NAAJ) has awarded the Annual Best Paper to me and my collaborator Dr. Xu at ISU. The commemorative plaque (see below) is beautiful, and it is a rare piece of good news during such a difficult time.
By the way, a new paper with an improved algorithm will be published in the CAS's Variance Journal; please stay tuned if you are interested.
Useful Links
Datasets
Kaggle Datasets
UC Irvine Machine Learning Repository
LOBSTER limit order book samples
Cryptocurrency Futures Trading Data at BitMEX
IEX TOPS and DEEP
R Cheatsheets
There are quite a few excellent cheatsheets provided by RStudio and other contributors; listed here are the ones most relevant to financial machine learning with R. Please refer to RStudio Cheatsheets for many more.
Is The Classifier Even Better Than My Guess?
This question might sound a little silly: how can a sophisticated, advanced, well-built statistical or machine learning model be even worse than my guess? Well, it can happen, and it happens quite often, especially with financial data. You must have heard of "garbage in, garbage out". Financial data are no different. What if you have been trying to find gold in pure sand? Because the signal-to-noise ratio of financial data is almost always very low, chances are that you try very hard but end up with a model that is no better, or even worse, than your guess.
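One simple way to make "my guess" concrete is to always predict the majority class seen in the training data and compare the model's out-of-sample accuracy with that. The following is only a minimal sketch in base R under that assumption: the helper `compare_to_guess` and the simulated up/down labels are hypothetical and are not taken from the original post.

```r
# Compare a classifier's test accuracy with a naive "always guess the
# majority class" baseline. compare_to_guess is a hypothetical helper.
compare_to_guess <- function(y_train, y_test, pred) {
  majority <- as.integer(mean(y_train) >= 0.5)  # most frequent training label
  c(model = mean(pred == y_test),               # accuracy of the classifier
    guess = mean(y_test == majority))           # accuracy of the naive guess
}

# Simulated data with a very weak signal (illustration only, not real market data)
set.seed(1)
n   <- 2000
x   <- rnorm(n)                                 # hypothetical predictor
y   <- rbinom(n, 1, plogis(0.2 + 0.05 * x))     # hypothetical up/down label
dat <- data.frame(x = x, y = y)
train <- 1:1000
test  <- 1001:2000

# A plain logistic regression stands in for "the classifier"
fit  <- glm(y ~ x, family = binomial, data = dat[train, ])
prob <- predict(fit, newdata = dat[test, ], type = "response")
pred <- as.integer(prob > 0.5)

result <- compare_to_guess(dat$y[train], dat$y[test], pred)
print(result)
```

If the `model` entry is not clearly above the `guess` entry on held-out data, the classifier has not yet earned its complexity.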
Deep Learning Research Platform using R
In this article, you will learn how to set up a research environment for modern machine learning techniques using R, RStudio, Keras, TensorFlow, and an NVIDIA GPU.
AWS EC2 users
This is probably the easiest approach: the following steps set up an RStudio Server on an AWS EC2 instance with a GPU, with TensorFlow and Keras pre-installed. If you use a non-Linux operating system, this might be the best choice for avoiding potential hassles.
R tricks
Here I maintain a list of R tips and tricks that I find useful. Please comment below if you find something useful that is not listed.
R tips and tricks
Avoid for loops in R as much as possible; use apply, sapply, etc., or built-in vectorized functions such as cumsum and aggregate instead.
Dealing with large datasets:
The function fread in the data.
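The advice above on avoiding for loops can be illustrated with a small base R sketch. The vector `x` and the `trades` data frame are made-up examples, not data from the original post; the point is only that cumsum, sapply, and aggregate each replace an explicit loop.

```r
x <- c(3, 1, 4, 1, 5, 9, 2, 6)

# Explicit for loop: running total
running_loop <- numeric(length(x))
total <- 0
for (i in seq_along(x)) {
  total <- total + x[i]
  running_loop[i] <- total
}

# Vectorized equivalent: one call to cumsum
running_vec <- cumsum(x)

# sapply applies a function to each element without writing the loop
squares <- sapply(x, function(v) v^2)

# aggregate replaces a loop over groups (hypothetical trade volumes)
trades <- data.frame(symbol = c("A", "B", "A", "B"),
                     volume = c(100, 200, 300, 400))
by_symbol <- aggregate(volume ~ symbol, data = trades, FUN = sum)

print(running_vec)
print(by_symbol)
```

For simple element-wise arithmetic such as squaring, `x^2` is already vectorized; sapply is shown here only as the general pattern for functions that are not.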
Linux tricks
Linux-based operating systems such as Ubuntu 14.04/16.04/18.04, which I have been using, are great for scientific computing and for testing and using a wide range of open source software. I used to be a heavy Windows user, but now I use Windows less and less, and only when I have to.
I haven't studied Linux systematically; my main approach has been to google what I need. The following are some useful Linux tricks that a Linux newbie like me probably most wants to know.