When training a Deep Learning model, one must often read and pre-process data before it can be passed through the model. Depending on the data source and transformations needed, this step can amount to a non-negligable amount of time, which leads to unecessarily longer training times. This bottleneck is often remedied using a
torch.utils.data.DataLoaderfor PyTorch, or a
tf.data.Datasetfor Tensorflow. These structures leverage parallel processing and pre-fetching in order reduce data loading time as much as possible. In this post we will build a simple version of PyTorch’s
DataLoader, and show the benefits of parallel pre-processing.
Google AI recently released a paper, Rethinking Attention with Performers (Choromanski et al., 2020), which introduces Performer, a Transformer architecture which estimates the full-rank-attention mechanism using orthogonal random features to approximate the softmax kernel with linear space and time complexity. In this post we will investigate how this works, and how it is useful for the machine learning community.
Since the 1940s, electric guitarists, keyboardists, and other instrumentalists have been using effects pedals, devices that modify the sound of the original audio source. Typical effects include distortion, compression, chorus, reverb, and delay. Early effects pedals consisted of basic analog circuits, often along with vacuum tubes, which were later replaced with transistors. Although many pedals today apply effects digitally with modern signal processing techniques, many purists argue that the sound of analog pedals can not be replaced by their digital counterparts. We’ll follow a deep learning approach to see if we can use machine learning to replicate the sound of an iconic analog effect pedal, the Ibanez Tube Screamer. This post will be mostly a reproduction of the work done by Alec Wright et al. in Real-Time Guitar Amplifier Emulation with Deep Learning1.
Alec Wright et al., “Real-Time Guitar Amplifier Emulation with ↩
This post is the first in a series of articles about natural language processing (NLP), a subfield of machine learning concerning the interaction between computers and human language. This article will be focused on attention, a mechanism that forms the backbone of many state-of-the art language models, including Google’s BERT (Devlin et al., 2018), and OpenAI’s GPT-2 (Radford et al., 2019).
Inspired by the story of Bill Benter, a gambler who developed a computer model that made him close to a billion dollars1 betting on horse races in the Hong Kong Jockey Club (HKJC), I set out to see if I could use machine learning to identify inefficiencies in horse racing wagering.
Kit Chellel, The Gambler Who Cracked the Horse-Racing Code, ↩
In this post we will be using a method known as transfer learning in order to detect metastatic cancer in patches of images from digital pathology scans.
In my last post, we learned what Logistic Regression is, and how it can be used to classify flowers in the Iris Dataset. In this post we will see how Logistic Regression can be applied to social networks in order to predict future collaboration between researchers. As usual we’ll start by importing a few libraries:
A few posts back I wrote about a common parameter optimization method known as Gradient Ascent. In this post we will see how a similar method can be used to create a model that can classify data. This time, instead of using gradient ascent to maximize a reward function, we will use gradient descent to minimize a cost function. Let’s start by importing all the libraries we need:
In my last post we learned what gradient ascent is, and how we can use it to maximize a reward function. This time, instead of using mean squared error as our reward function, we will use the Sharpe Ratio. We can use reinforcement learning to maximize the Sharpe ratio over a set of training data, and attempt to create a strategy with a high Sharpe ratio when tested on out-of-sample data.
In the next few posts, I will be going over a strategy that uses Machine Learning to determine what trades to execute. Before we start going over the strategy, we will go over one of the algorithms it uses: Gradient Ascent.
In this post we will look at the momentum strategy from Andreas F. Clenow’s book Stocks on the Move: Beating the Market with Hedge Fund Momentum Strategy and backtest its performance using the survivorship bias-free dataset we created in my last post.
When developing a stock trading strategy, it is important that the backtest be as accurate as possible. In some of my previous strategies, I have noted that the backtest did not account for survivorship bias. Survivorship bias is a form of selection bias caused by only focusing on assets that have already passed some sort of selection process.
In my last post we implemented a cross-sectional mean reversion strategy from Ernest Chan’s Algorithmic Trading: Winning Strategies and Their Rationale. In this post we will look at a few improvements we can make to the strategy so we can start live trading!
In this post we will look at a cross-sectional mean reversion strategy from Ernest Chan’s book Algorithmic Trading: Winning Strategies and Their Rationale and backtest its performance using Backtrader.
In my last post we discussed simulation of the 3x leveraged S&P 500 ETF, UPRO, and demonstrated why a 100% long UPRO portfolio may not be the best idea. In this post we will analyze the simulated historical performance of another 3x leveraged ETF, TMF, and explore a leveraged variation of Jack Bogle’s 60 / 40 equity/bond allocation.
In this post we will look at the long term performance of leveraged ETFs, as well as simulate how they may have performed in time periods before their inception.
subscribe via RSS