Alphalens¶
Tear Sheets¶
-
class
alphalens.tears.
GridFigure
(rows, cols)¶ Bases:
object
Helper that simplifies laying out plots on a grid.
Methods
-
close
()¶
-
next_cell
()¶
-
next_row
()¶
-
-
alphalens.tears.
create_event_returns_tear_sheet
(factor_data, prices, avgretplot=(5, 15), long_short=True, group_neutral=False, by_group=False)¶ Creates a tear sheet to view the average cumulative returns for a factor within a window (pre and post event).
Parameters: factor_data : pd.DataFrame - MultiIndex
A MultiIndex DataFrame indexed by date (level 0) and asset (level 1), containing the values for a single alpha factor, the factor quantile/bin that factor value belongs to and (optionally) the group the asset belongs to. - See full explanation in utils.get_clean_factor_and_forward_returns
prices : pd.DataFrame
A DataFrame indexed by date with assets in the columns containing the pricing data. - See full explanation in utils.get_clean_factor_and_forward_returns
avgretplot: tuple (int, int) - (before, after)
If not None, plot quantile average cumulative returns
long_short : bool
Should this computation happen on a long short portfolio? If so, factor returns will be demeaned across the factor universe.
group_neutral : bool
Should this computation happen on a group neutral portfolio? if so, returns demeaning will occur on the group level.
by_group : bool
If True, display graphs separately for each group.
-
alphalens.tears.
create_event_returns_tear_sheet_api_change_warning
(func)¶ Decorator used to help the API transition: it keeps the function backward compatible and warns the user about the API change.
Old API: create_event_returns_tear_sheet(factor_data, prices, avgretplot=(5, 15), long_short=True, by_group=False)
New API: create_event_returns_tear_sheet(factor_data, prices, avgretplot=(5, 15), long_short=True, group_neutral=False, by_group=False)
Eventually this function can be deleted.
-
alphalens.tears.
create_event_study_tear_sheet
(factor_data, prices=None, avgretplot=(5, 15))¶ Creates an event study tear sheet for analysis of a specific event.
Parameters: factor_data : pd.DataFrame - MultiIndex
A MultiIndex DataFrame indexed by date (level 0) and asset (level 1), containing the values for a single event, forward returns for each period, the factor quantile/bin that factor value belongs to, and (optionally) the group the asset belongs to.
prices : pd.DataFrame, required only if ‘avgretplot’ is provided
A DataFrame indexed by date with assets in the columns containing the pricing data. - See full explanation in utils.get_clean_factor_and_forward_returns
avgretplot: tuple (int, int) - (before, after), optional
If not None, plot event style average cumulative returns within a window (pre and post event).
-
alphalens.tears.
create_full_tear_sheet
(factor_data, long_short=True, group_neutral=False, by_group=False)¶ Creates a full tear sheet for analysis and evaluating single return predicting (alpha) factor.
Parameters: factor_data : pd.DataFrame - MultiIndex
A MultiIndex DataFrame indexed by date (level 0) and asset (level 1), containing the values for a single alpha factor, forward returns for each period, the factor quantile/bin that factor value belongs to, and (optionally) the group the asset belongs to. - See full explanation in utils.get_clean_factor_and_forward_returns
long_short : bool
Should this computation happen on a long short portfolio? - See tears.create_returns_tear_sheet for details on how this flag affects returns analysis
group_neutral : bool
Should this computation happen on a group neutral portfolio? - See tears.create_returns_tear_sheet for details on how this flag affects returns analysis - See tears.create_information_tear_sheet for details on how this flag affects information analysis
by_group : bool
If True, display graphs separately for each group.
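The factor_data layout described above can be pictured with a small hand-built frame. This is only an illustrative sketch: the column names ('factor', 'factor_quantile', 'group', and forward-return columns such as '1D' and '5D') and all values are assumptions made for the example. In practice the frame is produced by utils.get_clean_factor_and_forward_returns.

```python
import pandas as pd

# Two dates x three assets, indexed by (date, asset) as the description requires.
index = pd.MultiIndex.from_product(
    [pd.to_datetime(["2020-01-02", "2020-01-03"]), ["AAA", "BBB", "CCC"]],
    names=["date", "asset"],
)

# Column names and values are assumptions chosen for illustration.
factor_data = pd.DataFrame(
    {
        "1D": [0.01, -0.02, 0.00, 0.03, 0.01, -0.01],  # forward returns, 1-day period
        "5D": [0.04, -0.01, 0.02, 0.05, 0.00, -0.03],  # forward returns, 5-day period
        "factor": [1.5, -0.3, 0.2, 1.1, 0.4, -0.9],    # alpha factor values
        "group": ["tech", "tech", "energy", "tech", "tech", "energy"],  # optional
        "factor_quantile": [3, 1, 2, 3, 2, 1],         # bin of each factor value
    },
    index=index,
)
print(factor_data)
```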
-
alphalens.tears.
create_full_tear_sheet_api_change_warning
(func)¶ Decorator used to help the API transition: it keeps the function backward compatible and warns the user about the API change.
Old API: create_full_tear_sheet(factor_data, long_short=True, group_adjust=False, by_group=False)
New API: create_full_tear_sheet(factor_data, long_short=True, group_neutral=False, by_group=False)
Eventually this function can be deleted.
-
alphalens.tears.
create_information_tear_sheet
(factor_data, group_neutral=False, by_group=False)¶ Creates a tear sheet for information analysis of a factor.
Parameters: factor_data : pd.DataFrame - MultiIndex
A MultiIndex DataFrame indexed by date (level 0) and asset (level 1), containing the values for a single alpha factor, forward returns for each period, the factor quantile/bin that factor value belongs to, and (optionally) the group the asset belongs to. - See full explanation in utils.get_clean_factor_and_forward_returns
group_neutral : bool
Demean forward returns by group before computing IC.
by_group : bool
If True, display graphs separately for each group.
-
alphalens.tears.
create_information_tear_sheet_api_change_warning
(func)¶ Decorator used to help the API transition: it keeps the function backward compatible and warns the user about the API change.
Old API: create_information_tear_sheet(factor_data, group_adjust=False, by_group=False)
New API: create_information_tear_sheet(factor_data, group_neutral=False, by_group=False)
Eventually this function can be deleted.
-
alphalens.tears.
create_returns_tear_sheet
(factor_data, long_short=True, group_neutral=False, by_group=False)¶ Creates a tear sheet for returns analysis of a factor.
Parameters: factor_data : pd.DataFrame - MultiIndex
A MultiIndex DataFrame indexed by date (level 0) and asset (level 1), containing the values for a single alpha factor, forward returns for each period, the factor quantile/bin that factor value belongs to, and (optionally) the group the asset belongs to. - See full explanation in utils.get_clean_factor_and_forward_returns
long_short : bool
Should this computation happen on a long short portfolio? if so, then mean quantile returns will be demeaned across the factor universe. Additionally factor values will be demeaned across the factor universe when factor weighting the portfolio for cumulative returns plots
group_neutral : bool
Should this computation happen on a group neutral portfolio? If so, returns demeaning will occur on the group level. Additionally, each group will be weighted equally in cumulative returns plots.
by_group : bool
If True, display graphs separately for each group.
-
alphalens.tears.
create_returns_tear_sheet_api_change_warning
(func)¶ Decorator used to help the API transition: it keeps the function backward compatible and warns the user about the API change.
Old API: create_returns_tear_sheet(factor_data, long_short=True, by_group=False)
New API: create_returns_tear_sheet(factor_data, long_short=True, group_neutral=False, by_group=False)
Eventually this function can be deleted.
-
alphalens.tears.
create_summary_tear_sheet
(factor_data, long_short=True, group_neutral=False)¶ Creates a small summary tear sheet with returns, information, and turnover analysis.
Parameters: factor_data : pd.DataFrame - MultiIndex
A MultiIndex DataFrame indexed by date (level 0) and asset (level 1), containing the values for a single alpha factor, forward returns for each period, the factor quantile/bin that factor value belongs to, and (optionally) the group the asset belongs to. - See full explanation in utils.get_clean_factor_and_forward_returns
long_short : bool
Should this computation happen on a long short portfolio? if so, then mean quantile returns will be demeaned across the factor universe.
group_neutral : bool
Should this computation happen on a group neutral portfolio? if so, returns demeaning will occur on the group level.
-
alphalens.tears.
create_turnover_tear_sheet
(factor_data, turnover_periods=None)¶ Creates a tear sheet for analyzing the turnover properties of a factor.
Parameters: factor_data : pd.DataFrame - MultiIndex
A MultiIndex DataFrame indexed by date (level 0) and asset (level 1), containing the values for a single alpha factor, forward returns for each period, the factor quantile/bin that factor value belongs to, and (optionally) the group the asset belongs to. - See full explanation in utils.get_clean_factor_and_forward_returns
turnover_periods : sequence[string], optional
Periods to compute turnover analysis on. By default the periods in ‘factor_data’ are used, but custom periods can be provided instead. This can be useful when the periods in ‘factor_data’ are not multiples of the frequency at which factor values are computed, e.g. the periods are 2h and 4h and the factor is computed daily, so values like [‘1D’, ‘2D’] could be used instead.
Performance¶
-
alphalens.performance.
average_cumulative_return_by_quantile
(factor_data, prices, periods_before=10, periods_after=15, demeaned=True, group_adjust=False, by_group=False)¶ Plots average cumulative returns by factor quantiles in the period range defined by -periods_before to periods_after
Parameters: factor_data : pd.DataFrame - MultiIndex
A MultiIndex DataFrame indexed by date (level 0) and asset (level 1), containing the values for a single alpha factor, forward returns for each period, the factor quantile/bin that factor value belongs to, and (optionally) the group the asset belongs to. - See full explanation in utils.get_clean_factor_and_forward_returns
prices : pd.DataFrame
A wide form Pandas DataFrame indexed by date with assets in the columns. Pricing data should span the factor analysis time period plus/minus an additional buffer window corresponding to periods_after/periods_before parameters.
periods_before : int, optional
How many periods before factor to plot
periods_after : int, optional
How many periods after factor to plot
demeaned : bool, optional
Compute demeaned mean returns (long short portfolio)
group_adjust : bool
Returns demeaning will occur on the group level (group neutral portfolio)
by_group : bool
If True, compute cumulative returns separately for each group
Returns: cumulative returns and std deviation : pd.DataFrame
A MultiIndex DataFrame indexed by quantile (level 0) and mean/std (level 1), with the values in columns ranging from -periods_before to periods_after. If by_group=True the index will have an additional ‘group’ level.
---------------------------------------------------
            |       | -2  | -1  |  0  |  1  | ...
---------------------------------------------------
  quantile  |       |     |     |     |     |
---------------------------------------------------
            | mean  |  x  |  x  |  x  |  x  |
     1      ---------------------------------------
            | std   |  x  |  x  |  x  |  x  |
---------------------------------------------------
            | mean  |  x  |  x  |  x  |  x  |
     2      ---------------------------------------
            | std   |  x  |  x  |  x  |  x  |
---------------------------------------------------
    ...     |                 ...
---------------------------------------------------
-
alphalens.performance.
common_start_returns
(factor, prices, before, after, cumulative=False, mean_by_date=False, demean_by=None)¶ A date and equity pair is extracted from each index row in the factor dataframe and for each of these pairs a return series is built starting from ‘before’ the date and ending ‘after’ the date specified in the pair. All those returns series are then aligned to a common index (-before to after) and returned as a single DataFrame
Parameters: factor : pd.DataFrame
DataFrame with at least date and equity as index, the columns are irrelevant
prices : pd.DataFrame
A wide form Pandas DataFrame indexed by date with assets in the columns. Pricing data should span the factor analysis time period plus/minus an additional buffer window corresponding to after/before period parameters.
before:
How many returns to load before factor date
after:
How many returns to load after factor date
cumulative: bool, optional
Return cumulative returns
mean_by_date: bool, optional
If True, compute mean returns for each date and return that instead of a return series for each asset
demean_by: pd.DataFrame, optional
DataFrame with at least date and equity as index, the columns are irrelevant. For each date a list of equities is extracted from ‘demean_by’ index and used as universe to compute demeaned mean returns (long short portfolio)
Returns: aligned_returns : pd.DataFrame
Dataframe containing returns series for each factor aligned to the same index: -before to after
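The alignment step can be sketched with plain pandas. This is a simplified illustration and not the library's implementation: the helper name aligned_event_returns is hypothetical, events are passed as (date, asset) pairs (for instance taken from the factor index), and the cumulative, mean_by_date and demean_by options are left out.

```python
import pandas as pd

def aligned_event_returns(events, prices, before, after):
    """Build one return series per (date, asset) pair on a -before..after index."""
    returns = prices.pct_change()
    columns = {}
    for date, asset in events:
        pos = prices.index.get_loc(date)
        lo, hi = pos - before, pos + after
        if lo < 0 or hi >= len(prices.index):
            continue  # not enough price history around this event
        window = returns[asset].iloc[lo:hi + 1].to_numpy()
        columns[(date, asset)] = pd.Series(window, index=range(-before, after + 1))
    return pd.DataFrame(columns)

# Toy wide-form prices: dates in the index, assets in the columns.
dates = pd.date_range("2020-01-01", periods=6, freq="D")
prices = pd.DataFrame(
    {"AAA": [100, 101, 102, 103, 104, 105.0], "BBB": [50, 49, 48, 47, 46, 45.0]},
    index=dates,
)
events = [(dates[2], "AAA"), (dates[3], "BBB")]
aligned = aligned_event_returns(events, prices, before=1, after=2)
print(aligned)
```

Row 0 of the result holds the return observed on each event date, so series that start on different calendar dates become directly comparable.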
-
alphalens.performance.
compute_mean_returns_spread
(mean_returns, upper_quant, lower_quant, std_err=None)¶ Computes the difference between the mean returns of two quantiles. Optionally, computes the standard error of this difference.
Parameters: mean_returns : pd.DataFrame
DataFrame of mean period wise returns by quantile. MultiIndex containing date and quantile. See mean_return_by_quantile.
upper_quant : int
Quantile of mean return from which we wish to subtract lower quantile mean return.
lower_quant : int
Quantile of mean return we wish to subtract from upper quantile mean return.
std_err : pd.DataFrame
Period wise standard error in mean return by quantile. Takes the same form as mean_returns.
Returns: mean_return_difference : pd.Series
Period wise difference in quantile returns.
joint_std_err : pd.Series
Period wise standard error of the difference in quantile returns.
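A minimal sketch of the spread computation follows. It assumes the quantile index level is named 'factor_quantile' and that the two standard errors are combined in quadrature (treating them as independent); both are assumptions made for this example, and the helper name is hypothetical.

```python
import numpy as np
import pandas as pd

def mean_returns_spread(mean_returns, upper_quant, lower_quant, std_err=None):
    """Difference of two quantiles' mean returns, plus an optional joint error."""
    upper = mean_returns.xs(upper_quant, level="factor_quantile")
    lower = mean_returns.xs(lower_quant, level="factor_quantile")
    diff = upper - lower
    if std_err is None:
        return diff, None
    se_upper = std_err.xs(upper_quant, level="factor_quantile")
    se_lower = std_err.xs(lower_quant, level="factor_quantile")
    # Assumption: independent errors, so variances add.
    return diff, np.sqrt(se_upper ** 2 + se_lower ** 2)

idx = pd.MultiIndex.from_product(
    [[1, 3], pd.to_datetime(["2020-01-02", "2020-01-03"])],
    names=["factor_quantile", "date"],
)
mean_returns = pd.DataFrame({"1D": [0.001, 0.002, 0.004, 0.006]}, index=idx)
std_err = pd.DataFrame({"1D": [0.003, 0.003, 0.004, 0.004]}, index=idx)

spread, joint_err = mean_returns_spread(mean_returns, 3, 1, std_err)
print(spread)
print(joint_err)
```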
-
alphalens.performance.
create_pyfolio_input
(factor_data, period, long_short=True, group_neutral=False, quantiles=None, groups=None)¶ WARNING: this API is still in an experimental phase and input/output parameters might change in the future.
Simulates a portfolio using the input factor and returns the portfolio returns formatted for pyfolio.
For more details on how this portfolio is built see: - performance.factor_returns (how assets weights are computed) - performance.cumulative_returns (how the portfolio returns are computed)
Parameters: factor_data : pd.DataFrame - MultiIndex
A MultiIndex DataFrame indexed by date (level 0) and asset (level 1), containing the values for a single alpha factor, forward returns for each period, the factor quantile/bin that factor value belongs to, and (optionally) the group the asset belongs to. - See full explanation in utils.get_clean_factor_and_forward_returns
period : string
‘factor_data’ column name corresponding to the ‘period’ returns to be used in the computation of portfolio returns
long_short : bool, optional
Should this computation happen on a long short portfolio? if so, then factor values will be demeaned across the factor universe when factor weighting the portfolio for cumulative returns plots
group_neutral : bool, optional
Should this computation happen on a group neutral portfolio? If so, factor values demeaning will occur on the group level. Additionally, each group will be weighted equally in cumulative returns plots.
quantiles: sequence[int], optional
Use only specific quantiles in the computation. By default all quantiles are used
groups: sequence[string], optional
Use only specific groups in the computation. By default all groups are used
Returns: returns : pd.Series
Daily returns of the strategy, noncumulative.
benchmark : pd.Series
Benchmark returns computed as the factor universe mean daily returns. If the ‘1D’ period column is not present in factor_data, the benchmark returns are not computed and ‘None’ is returned instead.
-
alphalens.performance.
cumulative_returns
(returns, period, freq=None)¶ Builds cumulative returns from ‘period’ returns. This function simulates the cumulative effect that a series of gains or losses (the ‘returns’) has on an original amount of capital over a period of time.
If F is the frequency at which returns are computed (e.g. 1 day if ‘returns’ contains daily values) and N is the period for which the returns are computed (e.g. returns after 1 day, 5 hours or 3 days) then:
- if N <= F, the cumulative returns are trivially computed as the compound return
- if N > F (e.g. F is 1 day and N is 3 days), the returns overlap and the cumulative returns are computed by building and averaging N interleaved sub-portfolios (started at subsequent periods 1, 2, .., N), each one rebalancing every N periods. This corresponds to an algorithm that trades the factor every single time it is computed, which is statistically more robust and has lower volatility than an algorithm that trades the factor every N periods and whose returns depend on the specific starting day of trading.
Also note that when the factor is not computed at a specific frequency, for example a factor representing a random event, it is not efficient to create multiple sub-portfolios, as it is not certain when the factor will be traded and this would result in an underleveraged portfolio. In this case the simulated portfolio is fully invested whenever an event happens, and if a subsequent event occurs while the portfolio is still invested in a previous event, the portfolio is rebalanced and split equally among the active events.
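The two regular-frequency cases can be sketched as follows. This is a simplified illustration under stated assumptions, not the library's implementation: N is an integer number of rows, returns are evenly spaced, each N-row return is converted to an equivalent per-row rate, and the value at row t represents growth through the end of the period that starts at t. The function names are hypothetical.

```python
import numpy as np
import pandas as pd

def compound(returns):
    """N <= F case: plain compounding of consecutive returns."""
    return (1 + returns).cumprod() - 1

def interleaved_cumulative_returns(returns, n):
    """N > F case: average n sub-portfolios, each rebalancing every n rows."""
    per_row = (1 + returns) ** (1.0 / n) - 1  # n-row return expressed per row
    positions = np.arange(len(returns))
    subs = []
    for k in range(n):
        # Sub-portfolio k opens trades at rows k, k+n, k+2n, ... and holds n rows.
        sub = per_row.where((positions - k) % n == 0)
        if n > 1:
            sub = sub.ffill(limit=n - 1)
        subs.append(sub)
    # Average over the sub-portfolios already started (NaN rows are skipped).
    blended = pd.concat(subs, axis=1).mean(axis=1)
    return compound(blended)

two_day_returns = pd.Series([0.21, 0.21, 0.21, 0.21])  # 2-row forward returns
cumulative = interleaved_cumulative_returns(two_day_returns, n=2)
print(cumulative)
```

With a constant 21% two-row return, each row earns 10%, so the blended portfolio compounds at 10% per row regardless of which sub-portfolio is rebalancing.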
Parameters: returns: pd.Series
pd.Series containing factor ‘period’ forward returns, the index contains timestamps at which the trades are computed and the values correspond to returns after ‘period’ time
period: pandas.Timedelta or string
Length of period for which the returns are computed (1 day, 2 mins, 3 hours etc). It can be a Timedelta or a string in the format accepted by Timedelta constructor (‘1 days’, ‘1D’, ‘30m’, ‘3h’, ‘1D1h’, etc)
freq : pandas DateOffset, optional
Used to specify a particular trading calendar. If not present returns.index.freq will be used
Returns: pd.Series
Cumulative returns series
-
alphalens.performance.
factor_alpha_beta
(factor_data, returns=None, demeaned=True, group_adjust=False, equal_weight=False)¶ Compute the alpha (excess returns), alpha t-stat (alpha significance), and beta (market exposure) of a factor. A regression is run with the period wise factor universe mean return as the independent variable and mean period wise return from a portfolio weighted by factor values as the dependent variable.
Parameters: factor_data : pd.DataFrame - MultiIndex
A MultiIndex DataFrame indexed by date (level 0) and asset (level 1), containing the values for a single alpha factor, forward returns for each period, the factor quantile/bin that factor value belongs to, and (optionally) the group the asset belongs to. - See full explanation in utils.get_clean_factor_and_forward_returns
returns : pd.DataFrame, optional
Period wise factor returns. If this is None then it will be computed with ‘factor_returns’ function and the passed flags: ‘demeaned’, ‘group_adjust’, ‘equal_weight’
demeaned : bool
Control how to build factor returns used for alpha/beta computation – see performance.factor_returns for a full explanation
group_adjust : bool
Control how to build factor returns used for alpha/beta computation – see performance.factor_returns for a full explanation
equal_weight : bool, optional
Control how to build factor returns used for alpha/beta computation – see performance.factor_returns for a full explanation
Returns: alpha_beta : pd.Series
A Series containing the alpha, the beta, and the t-stat of alpha for the given factor and forward returns.
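The regression itself can be sketched with an ordinary least-squares fit in NumPy. This is an illustration only: the helper name is hypothetical, the inputs are toy numbers, and the t-stat of alpha and any rescaling of alpha that the library may apply are not reproduced.

```python
import numpy as np

def alpha_beta(factor_returns, universe_mean_returns):
    """Fit factor_returns = alpha + beta * universe_mean_returns by least squares."""
    x = np.asarray(universe_mean_returns, dtype=float)  # independent variable
    y = np.asarray(factor_returns, dtype=float)         # dependent variable
    design = np.column_stack([np.ones_like(x), x])      # intercept + market term
    (alpha, beta), *_ = np.linalg.lstsq(design, y, rcond=None)
    return alpha, beta

# Toy data built so that the true alpha is 0.001 and the true beta is 0.5.
universe = [0.01, -0.02, 0.03, 0.00]
portfolio = [0.001 + 0.5 * r for r in universe]
alpha, beta = alpha_beta(portfolio, universe)
print(alpha, beta)
```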
-
alphalens.performance.
factor_information_coefficient
(factor_data, group_adjust=False, by_group=False)¶ Computes the Spearman Rank Correlation based Information Coefficient (IC) between factor values and N period forward returns for each period in the factor index.
Parameters: factor_data : pd.DataFrame - MultiIndex
A MultiIndex DataFrame indexed by date (level 0) and asset (level 1), containing the values for a single alpha factor, forward returns for each period, the factor quantile/bin that factor value belongs to, and (optionally) the group the asset belongs to. - See full explanation in utils.get_clean_factor_and_forward_returns
group_adjust : bool
Demean forward returns by group before computing IC.
by_group : bool
If True, compute period wise IC separately for each group.
Returns: ic : pd.DataFrame
Spearman Rank correlation between factor and provided forward returns.
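A per-date Spearman IC can be sketched by ranking both columns within each date and taking the Pearson correlation of the ranks, which is the definition of Spearman's coefficient. The column names 'factor' and '1D' and the helper name are assumptions for this example; group adjustment and by_group are omitted.

```python
import pandas as pd

def spearman_ic(factor_data, factor_col="factor", returns_col="1D"):
    """Rank correlation between factor values and forward returns, per date."""
    ic = {}
    for date, day in factor_data.groupby(level="date"):
        ic[date] = day[factor_col].rank().corr(day[returns_col].rank())
    return pd.Series(ic, name=returns_col)

index = pd.MultiIndex.from_product(
    [pd.to_datetime(["2020-01-02", "2020-01-03"]), ["AAA", "BBB", "CCC"]],
    names=["date", "asset"],
)
factor_data = pd.DataFrame(
    {
        "factor": [1.0, 2.0, 3.0, 1.0, 2.0, 3.0],
        "1D": [0.01, 0.02, 0.03, 0.03, 0.02, 0.01],  # second date is reversed
    },
    index=index,
)
ic = spearman_ic(factor_data)
print(ic)
```

On the first date the factor orders the returns perfectly (IC of 1); on the second date the ordering is inverted (IC of -1).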
-
alphalens.performance.
factor_rank_autocorrelation
(factor_data, period=1)¶ Computes autocorrelation of mean factor ranks in specified time spans. We must compare period to period factor ranks rather than factor values to account for systematic shifts in the factor values of all names or names within a group. This metric is useful for measuring the turnover of a factor. If the value of a factor for each name changes randomly from period to period, we’d expect an autocorrelation of 0.
Parameters: factor_data : pd.DataFrame - MultiIndex
A MultiIndex DataFrame indexed by date (level 0) and asset (level 1), containing the values for a single alpha factor, forward returns for each period, the factor quantile/bin that factor value belongs to, and (optionally) the group the asset belongs to. - See full explanation in utils.get_clean_factor_and_forward_returns
period: string or int, optional
Period over which to calculate the turnover. If it is a string it must follow pandas.Timedelta constructor format (e.g. ‘1 days’, ‘1D’, ‘30m’, ‘3h’, ‘1D1h’, etc).
Returns: autocorr : pd.Series
Rolling 1 period (defined by time_rule) autocorrelation of factor values.
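The reason for comparing ranks rather than raw values can be seen in a short sketch. It assumes an integer period (a number of rows) and a factor Series indexed by (date, asset); the helper name is hypothetical and this is not the library's implementation.

```python
import pandas as pd

def rank_autocorrelation(factor, period=1):
    """Correlation between each date's factor ranks and those `period` rows earlier."""
    ranks = factor.groupby(level="date").rank()
    wide = ranks.unstack("asset")  # rows: dates, columns: assets
    return wide.corrwith(wide.shift(period), axis=1)

index = pd.MultiIndex.from_product(
    [pd.to_datetime(["2020-01-02", "2020-01-03", "2020-01-06"]), ["AAA", "BBB", "CCC"]],
    names=["date", "asset"],
)
# Second date: every value scaled by 10 (same ordering). Third date: ordering reversed.
factor = pd.Series([1.0, 2.0, 3.0, 10.0, 20.0, 30.0, 3.0, 2.0, 1.0], index=index)
autocorr = rank_autocorrelation(factor)
print(autocorr)
```

The systematic shift on the second date leaves the rank autocorrelation at 1, while the reversal on the third date drives it to -1. The first date has no earlier period to compare with and is NaN.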
-
alphalens.performance.
factor_returns
(factor_data, demeaned=True, group_adjust=False, equal_weight=False, by_asset=False)¶ Computes period wise returns for portfolio weighted by factor values.
Parameters: factor_data : pd.DataFrame - MultiIndex
A MultiIndex DataFrame indexed by date (level 0) and asset (level 1), containing the values for a single alpha factor, forward returns for each period, the factor quantile/bin that factor value belongs to, and (optionally) the group the asset belongs to. - See full explanation in utils.get_clean_factor_and_forward_returns
demeaned : bool
Control how to build factor weights – see performance.factor_weights for a full explanation
group_adjust : bool
Control how to build factor weights – see performance.factor_weights for a full explanation
equal_weight : bool, optional
Control how to build factor weights – see performance.factor_weights for a full explanation
by_asset: bool, optional
If True, returns are reported separately for each asset.
Returns: returns : pd.DataFrame
Period wise factor returns
-
alphalens.performance.
factor_weights
(factor_data, demeaned=True, group_adjust=False, equal_weight=False)¶ Computes asset weights from factor values by dividing each value by the sum of their absolute values (achieving gross leverage of 1). Positive factor values will result in positive weights and negative values in negative weights.
Parameters: factor_data : pd.DataFrame - MultiIndex
A MultiIndex DataFrame indexed by date (level 0) and asset (level 1), containing the values for a single alpha factor, forward returns for each period, the factor quantile/bin that factor value belongs to, and (optionally) the group the asset belongs to. - See full explanation in utils.get_clean_factor_and_forward_returns
demeaned : bool
Should this computation happen on a long short portfolio? if True, weights are computed by demeaning factor values and dividing by the sum of their absolute value (achieving gross leverage of 1). The sum of positive weights will be the same as the negative weights (absolute value), suitable for a dollar neutral long-short portfolio
group_adjust : bool
Should this computation happen on a group neutral portfolio? If True, compute group neutral weights: each group will be weighted equally and, if ‘demeaned’ is enabled, the factor values demeaning will occur on the group level.
equal_weight : bool, optional
if True the assets will be equal-weighted instead of factor-weighted
Returns: returns : pd.Series
Assets weighted by factor value.
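The weighting rule can be sketched in a few lines of pandas. The helper name is hypothetical, and group_adjust and equal_weight are left out to keep the example short; it assumes a factor Series indexed by (date, asset).

```python
import pandas as pd

def factor_weights_sketch(factor, demeaned=True):
    """Per-date weights proportional to factor values, with gross leverage of 1."""
    if demeaned:
        # Long-short case: subtract the cross-sectional mean on each date.
        factor = factor - factor.groupby(level="date").transform("mean")
    gross = factor.abs().groupby(level="date").transform("sum")
    return factor / gross

index = pd.MultiIndex.from_product(
    [pd.to_datetime(["2020-01-02"]), ["AAA", "BBB", "CCC", "DDD"]],
    names=["date", "asset"],
)
factor = pd.Series([1.0, 2.0, 3.0, 6.0], index=index)

long_short = factor_weights_sketch(factor, demeaned=True)
long_only = factor_weights_sketch(factor, demeaned=False)
print(long_short)
print(long_only)
```

With demeaning, the positive and negative weights each sum to 0.5 in absolute value, which is the dollar-neutral property described for the demeaned flag.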
-
alphalens.performance.
mean_information_coefficient
(factor_data, group_adjust=False, by_group=False, by_time=None)¶ Get the mean information coefficient of specified groups. Answers questions like: What is the mean IC for each month? What is the mean IC for each group for our whole timerange? What is the mean IC for each group, each week?
Parameters: factor_data : pd.DataFrame - MultiIndex
A MultiIndex DataFrame indexed by date (level 0) and asset (level 1), containing the values for a single alpha factor, forward returns for each period, the factor quantile/bin that factor value belongs to, and (optionally) the group the asset belongs to. - See full explanation in utils.get_clean_factor_and_forward_returns
group_adjust : bool
Demean forward returns by group before computing IC.
by_group : bool
If True, take the mean IC for each group.
by_time : str (pd time_rule), optional
Time window to use when taking mean IC. See http://pandas.pydata.org/pandas-docs/stable/timeseries.html for available options.
Returns: ic : pd.DataFrame
Mean Spearman Rank correlation between factor and provided forward price movement windows.
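The "mean IC for each month" question can be sketched by grouping a per-date IC frame by calendar month. Grouping on a monthly Period is used here as a stand-in for the by_time rule, and the IC values are made up for the example.

```python
import pandas as pd

# Per-date IC, one column per forward-returns period (toy values).
ic = pd.DataFrame(
    {"1D": [0.10, 0.20, 0.30, -0.10], "5D": [0.05, 0.15, 0.00, 0.10]},
    index=pd.to_datetime(["2020-01-10", "2020-01-20", "2020-02-05", "2020-02-25"]),
)

# Average the IC within each calendar month.
monthly_ic = ic.groupby(ic.index.to_period("M")).mean()
print(monthly_ic)
```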
-
alphalens.performance.
mean_return_by_quantile
(factor_data, by_date=False, by_group=False, demeaned=True, group_adjust=False)¶ Computes mean returns for factor quantiles across provided forward returns columns.
Parameters: factor_data : pd.DataFrame - MultiIndex
A MultiIndex DataFrame indexed by date (level 0) and asset (level 1), containing the values for a single alpha factor, forward returns for each period, the factor quantile/bin that factor value belongs to, and (optionally) the group the asset belongs to. - See full explanation in utils.get_clean_factor_and_forward_returns
by_date : bool
If True, compute quantile bucket returns separately for each date.
by_group : bool
If True, compute quantile bucket returns separately for each group.
demeaned : bool
Compute demeaned mean returns (long short portfolio)
group_adjust : bool
Returns demeaning will occur on the group level.
Returns: mean_ret : pd.DataFrame
Mean period wise returns by specified factor quantile.
std_error_ret : pd.DataFrame
Standard error of returns by specified quantile.
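A reduced sketch of the quantile aggregation is shown here. It assumes columns named '1D' and 'factor_quantile', uses a hypothetical helper name, and omits by_date, by_group and group_adjust.

```python
import pandas as pd

def mean_return_by_quantile_sketch(factor_data, returns_col="1D", demeaned=True):
    """Mean and standard error of forward returns for each factor quantile."""
    rets = factor_data[returns_col]
    if demeaned:
        # Long-short view: remove the universe mean return on each date.
        rets = rets - rets.groupby(level="date").transform("mean")
    grouped = rets.groupby(factor_data["factor_quantile"])
    return grouped.mean(), grouped.sem()

index = pd.MultiIndex.from_product(
    [pd.to_datetime(["2020-01-02", "2020-01-03"]), ["AAA", "BBB", "CCC", "DDD"]],
    names=["date", "asset"],
)
factor_data = pd.DataFrame(
    {
        "1D": [0.01, 0.02, 0.03, 0.04, 0.00, 0.02, 0.02, 0.04],
        "factor_quantile": [1, 1, 2, 2, 1, 1, 2, 2],
    },
    index=index,
)
mean_ret, std_error_ret = mean_return_by_quantile_sketch(factor_data)
print(mean_ret)
print(std_error_ret)
```

After demeaning, the bottom quantile averages -1% and the top quantile +1%, so the quantile means sum to zero when the quantiles hold equal numbers of assets.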
-
alphalens.performance.
quantile_turnover
(quantile_factor, quantile, period=1)¶ Computes the proportion of names in a factor quantile that were not in that quantile in the previous period.
Parameters: quantile_factor : pd.Series
Series indexed by date and asset, containing the factor quantile.
quantile : int
Quantile on which to perform turnover analysis.
period: string or int, optional
Period over which to calculate the turnover. If it is a string it must follow pandas.Timedelta constructor format (e.g. ‘1 days’, ‘1D’, ‘30m’, ‘3h’, ‘1D1h’, etc).
Returns: quant_turnover : pd.Series
Period by period turnover for that quantile.
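The turnover definition can be sketched with Python sets: collect the names in the quantile on each date and count how many were absent `period` rows earlier. The helper name is hypothetical and the sketch assumes an integer period; string periods are not handled.

```python
import numpy as np
import pandas as pd

def quantile_turnover_sketch(quantile_factor, quantile, period=1):
    """Share of names in `quantile` that were not in it `period` dates earlier."""
    in_quantile = quantile_factor[quantile_factor == quantile]
    names = {
        date: set(s.index.get_level_values("asset"))
        for date, s in in_quantile.groupby(level="date")
    }
    dates = sorted(names)
    turnover = {}
    for i, date in enumerate(dates):
        if i < period:
            turnover[date] = np.nan  # no earlier period to compare with
            continue
        current, previous = names[date], names[dates[i - period]]
        turnover[date] = len(current - previous) / len(current)
    return pd.Series(turnover, name=quantile)

index = pd.MultiIndex.from_product(
    [pd.to_datetime(["2020-01-02", "2020-01-03"]), ["AAA", "BBB", "CCC", "DDD"]],
    names=["date", "asset"],
)
# Quantile 1 holds {AAA, BBB} on the first date and {AAA, CCC} on the second.
quantile_factor = pd.Series([1, 1, 2, 2, 1, 2, 1, 2], index=index)
turnover = quantile_turnover_sketch(quantile_factor, quantile=1)
print(turnover)
```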
Plotting¶
-
alphalens.plotting.
axes_style
(style='darkgrid', rc=None)¶ Create alphalens default axes style context.
Under the hood, calls and returns seaborn.axes_style() with some custom settings. Usually you would use in a with-context.
Parameters: style : str, optional
Name of seaborn style.
rc : dict, optional
Config flags.
Returns: seaborn plotting context
-
alphalens.plotting.
customize
(func)¶ Decorator to set plotting context and axes style during function call.
-
alphalens.plotting.
plot_cumulative_returns
(factor_returns, period, title=None, ax=None)¶ Plots the cumulative returns of the returns series passed in.
Parameters: factor_returns : pd.Series
Period wise returns of dollar neutral portfolio weighted by factor value.
period: pandas.Timedelta or string
Length of period for which the returns are computed (e.g. 1 day) if ‘period’ is a string it must follow pandas.Timedelta constructor format (e.g. ‘1 days’, ‘1D’, ‘30m’, ‘3h’, ‘1D1h’, etc)
title: string, optional
Custom title
ax : matplotlib.Axes, optional
Axes upon which to plot.
Returns: ax : matplotlib.Axes
The axes that were plotted on.
-
alphalens.plotting.
plot_cumulative_returns_by_quantile
(quantile_returns, period, ax=None)¶ Plots the cumulative returns of various factor quantiles.
Parameters: quantile_returns : pd.DataFrame
Returns by factor quantile
period: pandas.Timedelta or string
Length of period for which the returns are computed (e.g. 1 day) if ‘period’ is a string it must follow pandas.Timedelta constructor format (e.g. ‘1 days’, ‘1D’, ‘30m’, ‘3h’, ‘1D1h’, etc)
ax : matplotlib.Axes, optional
Axes upon which to plot.
Returns: ax : matplotlib.Axes
-
alphalens.plotting.
plot_events_distribution
(events, num_bars=50, ax=None)¶ Plots the distribution of events in time.
Parameters: events : pd.Series
A pd.Series whose index contains at least ‘date’ level.
num_bars : integer, optional
Number of bars to plot
ax : matplotlib.Axes, optional
Axes upon which to plot.
Returns: ax : matplotlib.Axes
-
alphalens.plotting.
plot_factor_rank_auto_correlation
(factor_autocorrelation, period=1, ax=None)¶ Plots factor rank autocorrelation over time. See factor_rank_autocorrelation for more details.
Parameters: factor_autocorrelation : pd.Series
Rolling 1 period (defined by time_rule) autocorrelation of factor values.
period: int, optional
Period over which the autocorrelation is calculated
ax : matplotlib.Axes, optional
Axes upon which to plot.
Returns: ax : matplotlib.Axes
The axes that were plotted on.
-
alphalens.plotting.
plot_ic_by_group
(ic_group, ax=None)¶ Plots Spearman Rank Information Coefficient for a given factor over provided forward returns. Separates by group.
Parameters: ic_group : pd.DataFrame
Group-wise mean IC for each forward returns period.
ax : matplotlib.Axes, optional
Axes upon which to plot.
Returns: ax : matplotlib.Axes
The axes that were plotted on.
-
alphalens.plotting.
plot_ic_hist
(ic, ax=None)¶ Plots Spearman Rank Information Coefficient histogram for a given factor.
Parameters: ic : pd.DataFrame
DataFrame indexed by date, with IC for each forward return.
ax : matplotlib.Axes, optional
Axes upon which to plot.
Returns: ax : matplotlib.Axes
The axes that were plotted on.
-
alphalens.plotting.
plot_ic_qq
(ic, theoretical_dist=scipy.stats.norm, ax=None)¶ Plots Spearman Rank Information Coefficient “Q-Q” plot relative to a theoretical distribution.
Parameters: ic : pd.DataFrame
DataFrame indexed by date, with IC for each forward return.
theoretical_dist : scipy.stats._continuous_distns
Continuous distribution generator. scipy.stats.norm and scipy.stats.t are popular options.
ax : matplotlib.Axes, optional
Axes upon which to plot.
Returns: ax : matplotlib.Axes
The axes that were plotted on.
-
alphalens.plotting.
plot_ic_ts
(ic, ax=None)¶ Plots Spearman Rank Information Coefficient and IC moving average for a given factor.
Parameters: ic : pd.DataFrame
DataFrame indexed by date, with IC for each forward return.
ax : matplotlib.Axes, optional
Axes upon which to plot.
Returns: ax : matplotlib.Axes
The axes that were plotted on.
-
alphalens.plotting.
plot_information_table
(ic_data)¶
-
alphalens.plotting.
plot_mean_quantile_returns_spread_time_series
(mean_returns_spread, std_err=None, bandwidth=1, ax=None)¶ Plots mean period wise returns for factor quantiles.
Parameters: mean_returns_spread : pd.Series
Series with difference between quantile mean returns by period.
std_err : pd.Series
Series with standard error of difference between quantile mean returns each period.
bandwidth : float
Width of displayed error bands in standard deviations.
ax : matplotlib.Axes, optional
Axes upon which to plot.
Returns: ax : matplotlib.Axes
The axes that were plotted on.
-
alphalens.plotting.
plot_monthly_ic_heatmap
(mean_monthly_ic, ax=None)¶ Plots a heatmap of the information coefficient or returns by month.
Parameters: mean_monthly_ic : pd.DataFrame
The mean monthly IC for N periods forward.
Returns: ax : matplotlib.Axes
The axes that were plotted on.
-
alphalens.plotting.
plot_quantile_average_cumulative_return
(avg_cumulative_returns, by_quantile=False, std_bar=False, title=None, ax=None)¶ Plots average cumulative returns by factor quantile over the window before and after the factor date.
Parameters: avg_cumulative_returns: pd.DataFrame
The format is the one returned by performance.average_cumulative_return_by_quantile
by_quantile : boolean, optional
Disaggregated figures by quantile (useful to clearly see std dev bars)
std_bar : boolean, optional
Plot standard deviation bars.
title: string, optional
Custom title
ax : matplotlib.Axes, optional
Axes upon which to plot.
Returns: ax : matplotlib.Axes
-
alphalens.plotting.
plot_quantile_returns_bar
(mean_ret_by_q, by_group=False, ylim_percentiles=None, ax=None)¶ Plots mean period wise returns for factor quantiles.
Parameters: mean_ret_by_q : pd.DataFrame
DataFrame with quantile, (group) and mean period wise return values.
by_group : bool
Disaggregated figures by group.
ylim_percentiles : tuple of integers
Percentiles of observed data to use as y limits for plot.
ax : matplotlib.Axes, optional
Axes upon which to plot.
Returns: ax : matplotlib.Axes
The axes that were plotted on.
-
alphalens.plotting.
plot_quantile_returns_violin
(return_by_q, ylim_percentiles=None, ax=None)¶ Plots a violin box plot of period wise returns for factor quantiles.
Parameters: return_by_q : pd.DataFrame - MultiIndex
DataFrame with date and quantile as rows MultiIndex, forward return windows as columns, returns as values.
ylim_percentiles : tuple of integers
Percentiles of observed data to use as y limits for plot.
ax : matplotlib.Axes, optional
Axes upon which to plot.
Returns: ax : matplotlib.Axes
The axes that were plotted on.
-
alphalens.plotting.
plot_quantile_statistics_table
(factor_data)¶
-
alphalens.plotting.
plot_returns_table
(alpha_beta, mean_ret_quantile, mean_ret_spread_quantile)¶
-
alphalens.plotting.
plot_top_bottom_quantile_turnover
(quantile_turnover, period=1, ax=None)¶ Plots period wise top and bottom quantile factor turnover.
Parameters: quantile_turnover : pd.DataFrame
Quantile turnover (each DataFrame column a quantile).
period: int, optional
Period over which to calculate the turnover
ax : matplotlib.Axes, optional
Axes upon which to plot.
Returns: ax : matplotlib.Axes
The axes that were plotted on.
-
alphalens.plotting.
plot_turnover_table
(autocorrelation_data, quantile_turnover)¶
-
alphalens.plotting.
plotting_context
(context='notebook', font_scale=1.5, rc=None)¶ Create alphalens default plotting style context.
Under the hood, calls and returns seaborn.plotting_context() with some custom settings. Usually you would use it in a with-context.
Parameters: context : str, optional
Name of seaborn context.
font_scale : float, optional
Scale font by factor font_scale.
rc : dict, optional
Config flags. By default, {‘lines.linewidth’: 1.5} is used and will be added to any rc passed in, unless explicitly overridden.
Returns: seaborn plotting context
See also
seaborn.plotting_context()
Utilities¶
-
exception
alphalens.utils.
MaxLossExceededError
¶ Bases:
Exception
-
exception
alphalens.utils.
NonMatchingTimezoneError
¶ Bases:
Exception
-
alphalens.utils.
add_custom_calendar_timedelta
(input, timedelta, freq)¶ Add timedelta to ‘input’ taking into consideration custom frequency, which is used to deal with custom calendars, such as a trading calendar
Parameters: input : pd.DatetimeIndex or pd.Timestamp
timedelta : pd.Timedelta
freq : DateOffset, optional
Returns: pd.DatetimeIndex or pd.Timestamp
input + timedelta
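The idea of adding a timedelta on a custom calendar can be sketched with pandas offsets. The helper name `add_calendar_timedelta` and the choice of `BDay` as the frequency are assumptions for illustration. The real function may handle edge cases (for example a `DatetimeIndex` input) differently.

```python
import pandas as pd
from pandas.tseries.offsets import BDay


def add_calendar_timedelta(timestamp, timedelta, freq):
    """Add `timedelta` to `timestamp`, counting whole days in units of `freq`."""
    days = timedelta.components.days
    # Any sub-day remainder (hours, minutes, ...) is added as plain clock time.
    remainder = timedelta - pd.Timedelta(days=days)
    return timestamp + days * freq + remainder


friday = pd.Timestamp("2014-01-03")
# One business day after a Friday is the following Monday.
next_session = add_calendar_timedelta(friday, pd.Timedelta("1D"), BDay())
```

With a plain calendar-day addition the result would have been Saturday 2014-01-04. Counting in business days skips the weekend.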
-
alphalens.utils.
compute_forward_returns
(factor_idx, prices, periods=(1, 5, 10), filter_zscore=None)¶ Finds the N period forward returns (as percent change) for each asset provided.
Parameters: factor_idx : pd.DatetimeIndex
The factor datetimes for which we are computing the forward returns
prices : pd.DataFrame
Pricing data to use in forward price calculation. Assets as columns, dates as index. Pricing data must span the factor analysis time period plus an additional buffer window that is greater than the maximum number of expected periods in the forward returns calculations.
periods : sequence[int]
periods to compute forward returns on.
filter_zscore : int or float, optional
Sets forward returns greater than X standard deviations from the mean to nan. Set it to ‘None’ to avoid filtering. Caution: this outlier filtering incorporates lookahead bias.
Returns: forward_returns : pd.DataFrame - MultiIndex
Forward returns indexed by date and asset. Separate column for each forward return window.
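A minimal sketch of N-period forward returns as percent change, using only pandas. The toy prices, the periods `(1, 2)` and the column labels (`1D`, `2D`) are assumptions. The library's own function additionally handles the `filter_zscore` option and infers column names from the data.

```python
import pandas as pd

prices = pd.DataFrame(
    {"AAPL": [100.0, 110.0, 121.0, 133.1], "BA": [50.0, 50.0, 55.0, 49.5]},
    index=pd.to_datetime(["2014-01-01", "2014-01-02", "2014-01-03", "2014-01-06"]),
)
prices.index.name = "date"
prices.columns.name = "asset"

periods = (1, 2)
columns = {}
for n in periods:
    # pct_change(n) looks n rows back; shift(-n) re-aligns it so that
    # the value at date t is the return from t to n rows after t.
    forward = prices.pct_change(n).shift(-n)
    # stack() turns the wide frame into a (date, asset) MultiIndex Series.
    columns["{}D".format(n)] = forward.stack()

# Rows near the end of the price history have no forward price; they are
# NaN or absent, depending on the pandas version.
forward_returns = pd.DataFrame(columns)
```

This also shows why the price data needs a buffer beyond the last factor date: without later prices, the forward return cannot be computed.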
-
alphalens.utils.
demean_forward_returns
(factor_data, grouper=None)¶ Convert forward returns to returns relative to mean period wise all-universe or group returns. Group-wise normalization incorporates the assumption of a group neutral portfolio constraint and thus allows the factor to be evaluated across groups.
For example, if AAPL 5 period return is 0.1% and mean 5 period return for the Technology stocks in our universe was 0.5% in the same period, the group adjusted 5 period return for AAPL in this period is -0.4%.
Parameters: factor_data : pd.DataFrame - MultiIndex
Forward returns indexed by date and asset. Separate column for each forward return window.
grouper : list
If True, demean according to group.
Returns: adjusted_forward_returns : pd.DataFrame - MultiIndex
DataFrame of the same format as the input, but with each security’s returns normalized by group.
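The AAPL example above can be reproduced with a pandas group-wise transform. The second Technology asset (MSFT), the Industrials rows and the column names `5D` and `group` are made up for illustration. This is a sketch of the idea, not the library's code.

```python
import pandas as pd

index = pd.MultiIndex.from_product(
    [pd.to_datetime(["2014-01-01"]), ["AAPL", "MSFT", "BA", "DAL"]],
    names=["date", "asset"],
)
factor_data = pd.DataFrame(
    {
        # 5-period forward returns; the Technology mean is 0.5%.
        "5D": [0.001, 0.009, 0.02, 0.04],
        "group": ["Technology", "Technology", "Industrials", "Industrials"],
    },
    index=index,
)

# Subtract the per-date, per-group mean from each asset's forward return.
group_mean = factor_data.groupby(["date", "group"])["5D"].transform("mean")
adjusted = factor_data["5D"] - group_mean
```

Here `adjusted` for AAPL is 0.1% minus 0.5%, that is -0.4%, matching the worked example. Grouping by `"date"` alone would demean against the whole universe instead.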
-
alphalens.utils.
diff_custom_calendar_timedeltas
(start, end, freq)¶ Compute the difference between two pd.Timestamp values, taking into consideration a custom frequency, which is used to deal with custom calendars, such as a trading calendar.
Parameters: start : pd.Timestamp
end : pd.Timestamp
freq : DateOffset, optional
Returns: pd.Timedelta
end - start
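The difference between a calendar-day and a custom-calendar difference can be seen with a small pandas sketch. It assumes a Monday to Friday business calendar via `pd.bdate_range` and that both timestamps fall on business days. A real trading calendar would also exclude holidays.

```python
import pandas as pd

start = pd.Timestamp("2014-01-03")  # Friday
end = pd.Timestamp("2014-01-06")    # Monday

# Plain subtraction counts calendar days, weekend included.
calendar_diff = end - start

# Counting business days instead: bdate_range includes both endpoints,
# so the number of steps between them is the length minus one.
business_days = len(pd.bdate_range(start, end)) - 1
business_diff = pd.Timedelta(days=business_days)
```

On the business calendar the two timestamps are one period apart, even though three calendar days separate them.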
-
alphalens.utils.
get_clean_factor_and_forward_returns
(factor, prices, groupby=None, binning_by_group=False, quantiles=5, bins=None, periods=(1, 5, 10), filter_zscore=20, groupby_labels=None, max_loss=0.35)¶ Formats the factor data, pricing data, and group mappings into a DataFrame that contains aligned MultiIndex indices of timestamp and asset. The returned data will be formatted to be suitable for Alphalens functions.
It is safe to skip a call to this function and still make use of Alphalens functionalities, as long as the factor data conforms to the format returned from get_clean_factor_and_forward_returns and documented here.
Parameters: factor : pd.Series - MultiIndex
A MultiIndex Series indexed by timestamp (level 0) and asset (level 1), containing the values for a single alpha factor.
-----------------------------------
    date    |    asset   |
-----------------------------------
            |   AAPL     |   0.5
            -----------------------
            |   BA       |  -1.1
            -----------------------
2014-01-01  |   CMG      |   1.7
            -----------------------
            |   DAL      |  -0.1
            -----------------------
            |   LULU     |   2.7
            -----------------------
prices : pd.DataFrame
A wide-form pandas DataFrame indexed by timestamp with assets in the columns. It is important to pass pricing data that matches the time at which your signal was generated, so as to avoid lookahead bias or delayed calculations. Pricing data must span the factor analysis time period plus an additional buffer window that is greater than the maximum number of expected periods in the forward returns calculations. ‘Prices’ must contain at least one entry for each timestamp/asset combination in ‘factor’. This entry must be the asset price at the time the asset factor value is computed, and it will be considered the buy price for that asset at that timestamp. ‘Prices’ must also contain entries for the timestamps following each timestamp/asset combination in ‘factor’, as many as the maximum value in ‘periods’. The asset price after ‘period’ timestamps will be considered the sell price for that asset when computing ‘period’ forward returns.
----------------------------------------------------
            | AAPL |  BA  |  CMG  |  DAL  |  LULU  |
----------------------------------------------------
   Date     |      |      |       |       |        |
----------------------------------------------------
2014-01-01  |605.12| 24.58| 11.72 | 54.43 |  37.14 |
----------------------------------------------------
2014-01-02  |604.35| 22.23| 12.21 | 52.78 |  33.63 |
----------------------------------------------------
2014-01-03  |607.94| 21.68| 14.36 | 53.94 |  29.37 |
----------------------------------------------------
groupby : pd.Series - MultiIndex or dict
Either a MultiIndex Series indexed by date and asset, containing the period wise group codes for each asset, or a dict of asset to group mappings. If a dict is passed, it is assumed that group mappings are unchanged for the entire time period of the passed factor data.
binning_by_group : bool
If True, compute quantile buckets separately for each group. This is useful when the range of factor values varies considerably across groups, so that it is wise to make the binning group relative. You should probably enable this if the factor is intended to be analyzed for a group neutral portfolio.
quantiles : int or sequence[float]
Number of equal-sized quantile buckets to use in factor bucketing. Alternatively, a sequence of quantiles, allowing non-equal-sized buckets, e.g. [0, .10, .5, .90, 1.] or [.05, .5, .95]. Only one of ‘quantiles’ or ‘bins’ can be not-None.
bins : int or sequence[float]
Number of equal-width (valuewise) bins to use in factor bucketing. Alternatively, a sequence of bin edges, allowing for non-uniform bin width, e.g. [-4, -2, -0.5, 0, 10]. Chooses the buckets to be evenly spaced according to the values themselves. Useful when the factor contains discrete values. Only one of ‘quantiles’ or ‘bins’ can be not-None.
periods : sequence[int]
periods to compute forward returns on.
filter_zscore : int or float, optional
Sets forward returns greater than X standard deviations from the mean to nan. Set it to ‘None’ to avoid filtering. Caution: this outlier filtering incorporates lookahead bias.
groupby_labels : dict
A dictionary keyed by group code with values corresponding to the display name for each group.
max_loss : float, optional
Maximum percentage (0.00 to 1.00) of factor data dropping allowed, computed by comparing the number of items in the input factor index and the number of items in the output DataFrame index. Factor data can be partially dropped due to being flawed itself (e.g. NaNs), not having provided enough price data to compute forward returns for all factor values, or because it is not possible to perform binning. Set max_loss=0 to avoid suppressing exceptions.
Returns: merged_data : pd.DataFrame - MultiIndex
A MultiIndex DataFrame indexed by date (level 0) and asset (level 1), containing the values for a single alpha factor, forward returns for each period, the factor quantile/bin that factor value belongs to, and (optionally) the group the asset belongs to.
- Forward returns column names follow the format accepted by pd.Timedelta (e.g. ‘1D’, ‘30m’, ‘3h15m’, ‘1D1h’, etc).
- The ‘date’ index freq property (merged_data.index.levels[0].freq) will be set to Calendar day or Business day (pandas DateOffset) depending on what was inferred from the input data. This is currently used only in cumulative returns computation, but it can later be set to any pd.DateOffset (e.g. US trading calendar) to increase the accuracy of the results.
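The input formats described for `factor`, `prices` and `groupby` can be built with pandas as follows. The values come from the two tables in the parameter descriptions. The sector labels in the dict and the final consistency check are illustrative assumptions.

```python
import pandas as pd

dates = pd.to_datetime(["2014-01-01", "2014-01-02", "2014-01-03"])
assets = ["AAPL", "BA", "CMG", "DAL", "LULU"]

# Wide-form prices: one row per timestamp, one column per asset.
prices = pd.DataFrame(
    [
        [605.12, 24.58, 11.72, 54.43, 37.14],
        [604.35, 22.23, 12.21, 52.78, 33.63],
        [607.94, 21.68, 14.36, 53.94, 29.37],
    ],
    index=dates,
    columns=assets,
)

# Long-form factor: a Series with a (date, asset) MultiIndex.
factor = pd.Series(
    [0.5, -1.1, 1.7, -0.1, 2.7],
    index=pd.MultiIndex.from_product(
        [dates[:1], assets], names=["date", "asset"]
    ),
    name="factor",
)

# Optional static group mapping (hypothetical sector labels).
groupby = {"AAPL": "Tech", "BA": "Industrials", "CMG": "Consumer",
           "DAL": "Industrials", "LULU": "Consumer"}

# Every (date, asset) pair in the factor should have a matching price.
has_price = all(
    date in prices.index and asset in prices.columns
    for date, asset in factor.index
)
```

With real data, objects shaped like these would be passed as `factor`, `prices` and `groupby`. The three price rows here are far too few for the default `periods=(1, 5, 10)`, so a realistic call needs a longer price history.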
-
alphalens.utils.
get_clean_factor_and_forward_returns_api_change_warning
(func)¶ Decorator used to help API transition: maintain the function backward compatible and warn the user about the API change.
Old API:
get_clean_factor_and_forward_returns(factor, prices, groupby=None, by_group=False, quantiles=5, bins=None, periods=(1, 5, 10), filter_zscore=20, groupby_labels=None, max_loss=0.25)
New API:
get_clean_factor_and_forward_returns(factor, prices, groupby=None, binning_by_group=False, quantiles=5, bins=None, periods=(1, 5, 10), filter_zscore=20, groupby_labels=None, max_loss=0.25)
Eventually this function can be deleted
-
alphalens.utils.
get_forward_returns_columns
(columns)¶ Utility that detects and returns the columns that are forward returns
-
alphalens.utils.
non_unique_bin_edges_error
(func)¶ Give user a more informative error in case it is not possible to properly calculate quantiles on the input dataframe (factor)
-
alphalens.utils.
print_table
(table, name=None, fmt=None)¶ Pretty print a pandas DataFrame.
Uses HTML output if running inside Jupyter Notebook, otherwise formatted text output.
Parameters: table : pd.Series or pd.DataFrame
Table to pretty-print.
name : str, optional
Table name to display in upper left corner.
fmt : str, optional
Formatter to use for displaying table elements. E.g. ‘{0:.2f}%’ for displaying 100 as ‘100.00%’. Restores original setting after displaying.
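The `fmt` example uses Python's `str.format` syntax. The snippet below only shows what that format string produces for a single value. How print_table applies it to a whole table is not shown here.

```python
# The example format string from the description, applied to one value.
fmt = "{0:.2f}%"
formatted = fmt.format(100)
print(formatted)  # 100.00%
```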
-
alphalens.utils.
quantize_factor
(*args, **kwargs)¶
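The signature above carries no description, so the following is only a sketch of the distinction between the ‘quantiles’ and ‘bins’ options described for get_clean_factor_and_forward_returns. It uses `pd.qcut` and `pd.cut` on made-up factor values. The library's actual bucketing logic may differ.

```python
import pandas as pd

factor = pd.Series([-3.0, -1.0, -0.2, 0.1, 0.4, 0.9, 1.5, 8.0])

# quantiles=4: four buckets holding the same number of values each.
by_quantile = pd.qcut(factor, 4, labels=False) + 1

# bins=4: four buckets of equal width over the value range [-3, 8].
by_bin = pd.cut(factor, 4, labels=False) + 1

print(by_quantile.tolist())  # [1, 1, 2, 2, 3, 3, 4, 4]
print(by_bin.tolist())       # [1, 1, 2, 2, 2, 2, 2, 4]
```

Quantile buckets stay balanced even with the outlier at 8.0, while equal-width bins leave bucket 3 empty and crowd most values into bucket 2.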
-
alphalens.utils.
rate_of_return
(period_ret, base_period)¶ Convert returns to ‘one_period_len’ rate of returns: that is the value the returns would have every ‘one_period_len’ if they had grown at a steady rate
Parameters: period_ret: pd.DataFrame
DataFrame containing returns values with column headings representing the return period.
base_period: string
The base period length used in the conversion. It must follow pandas.Timedelta constructor format (e.g. ‘1 days’, ‘1D’, ‘30m’, ‘3h’, ‘1D1h’, etc)
Returns: pd.DataFrame
DataFrame in same format as input but with ‘one_period_len’ rate of returns values.
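One natural reading of "grown at a steady rate" is compound growth, in which case the conversion is (1 + r)^(base / period) - 1. The helper below is a sketch under that assumption. Its name and its scalar interface are hypothetical, whereas the documented function works on a DataFrame whose column headings give the period.

```python
import pandas as pd


def to_base_period_rate(period_ret, period, base_period):
    """Convert a return earned over `period` to an equivalent compounded
    return per `base_period`."""
    # Number of base periods in the return period, e.g. 5D / 1D = 5.
    n = pd.Timedelta(period) / pd.Timedelta(base_period)
    return (1.0 + period_ret) ** (1.0 / n) - 1.0


# 1% per day compounded over 5 days gives about 5.1% in total ...
five_day_return = 1.01 ** 5 - 1
# ... and converting back recovers the 1% daily rate.
daily_rate = to_base_period_rate(five_day_return, "5D", "1D")
```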
-
alphalens.utils.
rethrow
(exception, additional_message)¶ Re-raise the last exception that was active in the current scope without losing the stacktrace but adding an additional message. This is hacky because it has to be compatible with both python 2/3
-
alphalens.utils.
std_conversion
(period_std, base_period)¶ one_period_len standard deviation (or standard error) approximation
Parameters: period_std: pd.DataFrame
DataFrame containing standard deviation or standard error values with column headings representing the return period.
base_period: string
The base period length used in the conversion. It must follow pandas.Timedelta constructor format (e.g. ‘1 days’, ‘1D’, ‘30m’, ‘3h’, ‘1D1h’, etc)
Returns: pd.DataFrame
DataFrame in same format as input but with one-period standard deviation/error values.
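A common approximation for rescaling a standard deviation across horizons is the square-root-of-time rule, which assumes independent returns with constant variance. The sketch below applies that rule. Whether alphalens uses exactly this scaling is an assumption, and the helper name is hypothetical.

```python
import math

import pandas as pd


def to_base_period_std(period_std, period, base_period):
    """Approximate the per-`base_period` standard deviation from one
    measured over `period`, assuming independent returns."""
    n = pd.Timedelta(period) / pd.Timedelta(base_period)
    # Variance scales linearly with time, so std scales with sqrt(time).
    return period_std / math.sqrt(n)


four_day_std = 0.02
daily_std = to_base_period_std(four_day_std, "4D", "1D")
```

Under these assumptions a 2% four-day standard deviation corresponds to a 1% daily one.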
-
alphalens.utils.
timedelta_to_string
(timedelta)¶ Utility that converts a pandas.Timedelta to a string representation compatible with pandas.Timedelta constructor format
Parameters: timedelta: pd.Timedelta
Returns: string
string representation of ‘timedelta’