Bayesian performance analysis example in pyfolio

There are a few advanced analysis methods in pyfolio based on Bayesian statistics.

The main benefit of these methods is uncertainty quantification. Traditional measures of performance, like the Sharpe ratio, are single numbers (point estimates). These estimates are noisy because they are computed from a limited number of data points. So how much can you trust them? You can't tell, because a single number carries no sense of uncertainty. This is where Bayesian statistics helps: instead of single values, we work with probability distributions that assign degrees of belief to all possible parameter values.
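
To make the noise in a point estimate concrete, here is a minimal sketch that is not part of pyfolio. It assumes daily returns are drawn from a normal distribution with a hypothetical, known mean and standard deviation, then recomputes the annualized Sharpe ratio on many simulated one-year samples. The constants and the helper name `annualized_sharpe` are illustrative assumptions.

```python
import math
import random
import statistics

def annualized_sharpe(returns, periods_per_year=252):
    """Annualized Sharpe ratio of simple daily returns (risk-free rate assumed to be 0)."""
    return statistics.fmean(returns) / statistics.stdev(returns) * math.sqrt(periods_per_year)

rng = random.Random(42)

# Hypothetical "true" parameters of the return-generating process.
TRUE_DAILY_MEAN = 0.0005
TRUE_DAILY_STD = 0.01
true_sharpe = TRUE_DAILY_MEAN / TRUE_DAILY_STD * math.sqrt(252)

# Estimate the Sharpe ratio on 500 independent one-year samples.
estimates = []
for _ in range(500):
    sample = [rng.gauss(TRUE_DAILY_MEAN, TRUE_DAILY_STD) for _ in range(252)]
    estimates.append(annualized_sharpe(sample))

# Approximate 2.5th and 97.5th percentiles of the estimates.
estimates.sort()
low, high = estimates[12], estimates[487]

print("True Sharpe ratio: {:.2f}".format(true_sharpe))
print("95% of one-year estimates fall between {:.2f} and {:.2f}".format(low, high))
```

Even though every sample comes from the same process, the one-year estimates spread widely around the true value, which is the uncertainty that a single reported Sharpe ratio hides.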

Let's create the Bayesian tear sheet. Under the hood, this runs MCMC sampling in PyMC3 to estimate the posteriors, which can take quite a while (that is why it is not generated by default in create_full_tear_sheet).
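
For readers new to MCMC, the following toy sketch gives a feel for what "sampling a posterior" means. It is a plain random-walk Metropolis sampler for the mean of normally distributed returns, assuming a known standard deviation and a flat prior. It is not how pyfolio or PyMC3 implement sampling (PyMC3 uses more sophisticated samplers), and all names and constants are hypothetical.

```python
import math
import random
import statistics

def log_likelihood(mu, data, sigma):
    """Log-likelihood (up to a constant) of normal data with mean mu and known sigma."""
    return -sum((x - mu) ** 2 for x in data) / (2.0 * sigma ** 2)

def metropolis(data, sigma, n_samples, step, rng):
    """Random-walk Metropolis sampler for mu under a flat prior."""
    mu = 0.0
    current = log_likelihood(mu, data, sigma)
    samples = []
    for _ in range(n_samples):
        proposal = mu + rng.gauss(0.0, step)
        proposed = log_likelihood(proposal, data, sigma)
        # Accept with probability min(1, likelihood ratio); 1 - random() lies in (0, 1].
        if math.log(1.0 - rng.random()) < proposed - current:
            mu, current = proposal, proposed
        samples.append(mu)
    return samples

rng = random.Random(0)
SIGMA = 0.01  # hypothetical known daily volatility
data = [rng.gauss(0.001, SIGMA) for _ in range(100)]

samples = metropolis(data, SIGMA, n_samples=5000, step=0.001, rng=rng)
burned = samples[1000:]  # discard the first draws as burn-in

sample_mean = statistics.fmean(data)
posterior_mean = statistics.fmean(burned)
posterior_std = statistics.stdev(burned)
print("Sample mean of returns: {:.5f}".format(sample_mean))
print("Posterior mean of mu:   {:.5f} (std {:.5f})".format(posterior_mean, posterior_std))
```

The result is not one number but a collection of draws whose spread reflects how uncertain the mean return is. Each draw requires evaluating the likelihood over all the data, which hints at why the full tear sheet takes a while.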

Import pyfolio

%matplotlib inline
import pyfolio as pf

Fetch the daily returns for a stock

stock_rets = pf.utils.get_symbol_rets('FB')

Create Bayesian tear sheet

out_of_sample = stock_rets.index[-40]
pf.create_bayesian_tear_sheet(stock_rets, live_start_date=out_of_sample)
Running T model


Applied log-transform to volatility and added transformed volatility_log to model.
Applied log-transform to nu_minus_two and added transformed nu_minus_two_log to model.
 [-----------------100%-----------------] 2000 of 2000 complete in 14.8 sec
Finished T model (required 279.00 seconds).

Running BEST model
Applied interval-transform to group1_std and added transformed group1_std_interval to model.
Applied interval-transform to group2_std and added transformed group2_std_interval to model.
Applied log-transform to nu_minus_two and added transformed nu_minus_two_log to model.
 [-----------------100%-----------------] 2000 of 2000 complete in 12.6 sec
Finished BEST model (required 132.06 seconds).

Finished plotting Bayesian cone (required 0.21 seconds).



Finished plotting BEST results (required 1.45 seconds).

Finished computing Bayesian predictions (required 0.26 seconds).

Finished plotting Bayesian VaRs estimate (required 0.12 seconds).

Running alpha beta model
Applied log-transform to sigma and added transformed sigma_log to model.
Applied log-transform to nu_minus_two and added transformed nu_minus_two_log to model.
 [-----------------100%-----------------] 2000 of 2000 complete in 7.2 sec
Finished running alpha beta model (required 130.28 seconds).

Finished plotting alpha beta model (required 0.24 seconds).

Total runtime was 543.62 seconds.

[Figure: Bayesian tear sheet plots]

Let's go through these row by row:

Running models directly

You can also run individual models. All models can be found in pyfolio.bayesian and run via the run_model function.

help(pf.bayesian.run_model)
Help on function run_model in module pyfolio.bayesian:

run_model(model, returns_train, returns_test=None, bmark=None, samples=500, ppc=False)
    Run one of the Bayesian models.

    Parameters
    ----------
    model : {'alpha_beta', 't', 'normal', 'best'}
        Which model to run
    returns_train : pd.Series
        Timeseries of simple returns
    returns_test : pd.Series (optional)
        Out-of-sample returns. Datetimes in returns_test will be added to
        returns_train as missing values and predictions will be generated
        for them.
    bmark : pd.Series or pd.DataFrame (optional)
        Only used for alpha_beta to estimate regression coefficients.
        If bmark has more recent returns than returns_train, these dates
        will be treated as missing values and predictions will be
        generated for them taking market correlations into account.
    samples : int (optional)
        Number of posterior samples to draw.
    ppc : boolean (optional)
        Whether to run a posterior predictive check. Will generate
        samples of length returns_test.  Returns a second argument
        that contains the PPC of shape samples x len(returns_test).

    Returns
    -------
    trace : pymc3.sampling.BaseTrace object
        A PyMC3 trace object that contains samples for each parameter
        of the posterior.

    ppc : numpy.array (if ppc==True)
       PPC of shape samples x len(returns_test).
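
To illustrate the shape of the PPC output described in the docstring above, here is a minimal sketch that does not call pyfolio. It assumes a normal returns model and uses hypothetical posterior draws for the parameters `mu` and `sigma` (pyfolio's actual variable names and internals may differ). Each posterior draw generates one simulated return path as long as the out-of-sample period.

```python
import random

def posterior_predictive(mu_samples, sigma_samples, n_test, rng):
    """Simulate one return path of length n_test for each posterior draw."""
    return [
        [rng.gauss(mu, sigma) for _ in range(n_test)]
        for mu, sigma in zip(mu_samples, sigma_samples)
    ]

rng = random.Random(0)

# Hypothetical posterior draws standing in for the contents of a trace.
n_samples = 500
mu_samples = [rng.gauss(0.0005, 0.0002) for _ in range(n_samples)]
sigma_samples = [abs(rng.gauss(0.01, 0.001)) for _ in range(n_samples)]

# 40 out-of-sample days, matching len(returns_test) in this notebook's split.
ppc = posterior_predictive(mu_samples, sigma_samples, n_test=40, rng=rng)
print(len(ppc), len(ppc[0]))  # rows = samples, columns = len(returns_test)
```

Comparing the actual out-of-sample returns against the spread of these simulated paths is what makes a posterior predictive check useful.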

For example, to run a model that assumes returns to be T-distributed, you can call:

# Run model that assumes returns to be T-distributed
trace = pf.bayesian.run_model('t', stock_rets)
Applied log-transform to volatility and added transformed volatility_log to model.
Applied log-transform to nu_minus_two and added transformed nu_minus_two_log to model.
 [-----------------100%-----------------] 500 of 500 complete in 1.9 sec

The returned trace object can be queried directly. For example, we might ask what the probability of the Sharpe ratio being larger than 0 is by checking what percentage of the posterior samples of the Sharpe ratio are > 0:

# Check what frequency of samples from the sharpe posterior are above 0.
print('Probability of Sharpe ratio > 0 = {:3}%'.format((trace['sharpe'] > 0).mean() * 100))
Probability of Sharpe ratio > 0 = 92.0%
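
The same sample-counting idea extends to other posterior summaries, such as a credible interval. The sketch below uses a hypothetical list of draws as a stand-in for `trace['sharpe']`, so it runs without PyMC3. The helper names are illustrative and not part of pyfolio, and the interval uses simple index rounding, so it is approximate.

```python
import random

def prob_greater_than(samples, threshold=0.0):
    """Fraction of posterior samples above the threshold."""
    return sum(s > threshold for s in samples) / len(samples)

def credible_interval(samples, mass=0.95):
    """Approximate equal-tailed credible interval read off the sorted samples."""
    ordered = sorted(samples)
    tail = (1.0 - mass) / 2.0
    lo_idx = round(tail * (len(ordered) - 1))
    hi_idx = round((1.0 - tail) * (len(ordered) - 1))
    return ordered[lo_idx], ordered[hi_idx]

# Hypothetical posterior draws of the Sharpe ratio (not real model output).
rng = random.Random(1)
sharpe_samples = [rng.gauss(0.8, 0.6) for _ in range(500)]

p_positive = prob_greater_than(sharpe_samples)
ci_low, ci_high = credible_interval(sharpe_samples, mass=0.95)
print("Probability of Sharpe ratio > 0 = {:.1f}%".format(p_positive * 100))
print("95% credible interval: [{:.2f}, {:.2f}]".format(ci_low, ci_high))
```

With a real trace, the list of draws would be replaced by the posterior samples of the quantity of interest.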

We can also interact with it like any other PyMC3 trace:

import pymc3 as pm
pm.traceplot(trace);

[Figure: trace plot of the posterior samples]

Further reading

For more information on Bayesian statistics, check out these resources: