Alphalens¶

Tear Sheets¶

class alphalens.tears.GridFigure(rows, cols)¶

Bases: object

Helper class that simplifies laying out plots on a grid.

Methods

close()¶

next_cell()¶

next_row()¶

alphalens.tears.create_event_returns_tear_sheet(factor_data, prices, avgretplot=(5, 15), long_short=True, group_neutral=False, by_group=False)¶

Creates a tear sheet to view the average cumulative returns for a factor within a window (pre and post event).

Parameters:

factor_data : pd.DataFrame - MultiIndex

A MultiIndex DataFrame indexed by date (level 0) and asset (level 1), containing the values for a single alpha factor, the factor quantile/bin that factor value belongs to and (optionally) the group the asset belongs to. - See full explanation in utils.get_clean_factor_and_forward_returns

prices : pd.DataFrame

A DataFrame indexed by date with assets in the columns containing the pricing data. - See full explanation in utils.get_clean_factor_and_forward_returns

avgretplot: tuple (int, int) - (before, after)

If not None, plot quantile average cumulative returns

long_short : bool

Should this computation happen on a long short portfolio? If so, factor returns will be demeaned across the factor universe.

group_neutral : bool

Should this computation happen on a group neutral portfolio? If so, returns demeaning will occur on the group level.

by_group : bool

If True, display graphs separately for each group.

alphalens.tears.create_event_returns_tear_sheet_api_change_warning(func)¶

Decorator used to help the API transition: it keeps the function backward compatible and warns the user about the API change.

Old API:

create_event_returns_tear_sheet(factor_data, prices, avgretplot=(5, 15), long_short=True, by_group=False)

New API:

create_event_returns_tear_sheet(factor_data, prices, avgretplot=(5, 15), long_short=True, group_neutral=False, by_group=False)

Eventually this function can be deleted.

alphalens.tears.create_event_study_tear_sheet(factor_data, prices=None, avgretplot=(5, 15))¶

Creates an event study tear sheet for analysis of a specific event.

Parameters:

factor_data : pd.DataFrame - MultiIndex

A MultiIndex DataFrame indexed by date (level 0) and asset (level 1), containing the values for a single event, forward returns for each period, the factor quantile/bin that factor value belongs to, and (optionally) the group the asset belongs to.

prices : pd.DataFrame, required only if ‘avgretplot’ is provided

A DataFrame indexed by date with assets in the columns containing the pricing data. - See full explanation in utils.get_clean_factor_and_forward_returns

avgretplot: tuple (int, int) - (before, after), optional

If not None, plot event style average cumulative returns within a window (pre and post event).

alphalens.tears.create_full_tear_sheet(factor_data, long_short=True, group_neutral=False, by_group=False)¶

Creates a full tear sheet for analysis and evaluating single return predicting (alpha) factor.

Parameters:

factor_data : pd.DataFrame - MultiIndex

A MultiIndex DataFrame indexed by date (level 0) and asset (level 1), containing the values for a single alpha factor, forward returns for each period, the factor quantile/bin that factor value belongs to, and (optionally) the group the asset belongs to. - See full explanation in utils.get_clean_factor_and_forward_returns

long_short : bool

Should this computation happen on a long short portfolio? - See tears.create_returns_tear_sheet for details on how this flag affects returns analysis

group_neutral : bool

Should this computation happen on a group neutral portfolio? - See tears.create_returns_tear_sheet for details on how this flag affects returns analysis - See tears.create_information_tear_sheet for details on how this flag affects information analysis

by_group : bool

If True, display graphs separately for each group.

alphalens.tears.create_full_tear_sheet_api_change_warning(func)¶

Decorator used to help the API transition: it keeps the function backward compatible and warns the user about the API change.

Old API:

create_full_tear_sheet(factor_data, long_short=True, group_adjust=False, by_group=False)

New API:

create_full_tear_sheet(factor_data, long_short=True, group_neutral=False, by_group=False)

Eventually this function can be deleted.

alphalens.tears.create_information_tear_sheet(factor_data, group_neutral=False, by_group=False)¶

Creates a tear sheet for information analysis of a factor.

Parameters:

factor_data : pd.DataFrame - MultiIndex

A MultiIndex DataFrame indexed by date (level 0) and asset (level 1), containing the values for a single alpha factor, forward returns for each period, the factor quantile/bin that factor value belongs to, and (optionally) the group the asset belongs to. - See full explanation in utils.get_clean_factor_and_forward_returns

group_neutral : bool

Demean forward returns by group before computing IC.

by_group : bool

If True, display graphs separately for each group.

alphalens.tears.create_information_tear_sheet_api_change_warning(func)¶

Decorator used to help the API transition: it keeps the function backward compatible and warns the user about the API change.

Old API:

create_information_tear_sheet(factor_data, group_adjust=False, by_group=False)

New API:

create_information_tear_sheet(factor_data, group_neutral=False, by_group=False)

Eventually this function can be deleted.

alphalens.tears.create_returns_tear_sheet(factor_data, long_short=True, group_neutral=False, by_group=False)¶

Creates a tear sheet for returns analysis of a factor.

Parameters:

factor_data : pd.DataFrame - MultiIndex

A MultiIndex DataFrame indexed by date (level 0) and asset (level 1), containing the values for a single alpha factor, forward returns for each period, the factor quantile/bin that factor value belongs to, and (optionally) the group the asset belongs to. - See full explanation in utils.get_clean_factor_and_forward_returns

long_short : bool

Should this computation happen on a long short portfolio? If so, mean quantile returns will be demeaned across the factor universe. Additionally, factor values will be demeaned across the factor universe when factor weighting the portfolio for cumulative returns plots.

group_neutral : bool

Should this computation happen on a group neutral portfolio? If so, returns demeaning will occur on the group level. Additionally, each group will have the same weight in cumulative returns plots.

by_group : bool

If True, display graphs separately for each group.

alphalens.tears.create_returns_tear_sheet_api_change_warning(func)¶

Decorator used to help the API transition: it keeps the function backward compatible and warns the user about the API change.

Old API:

create_returns_tear_sheet(factor_data, long_short=True, by_group=False)

New API:

create_returns_tear_sheet(factor_data, long_short=True, group_neutral=False, by_group=False)

Eventually this function can be deleted.

alphalens.tears.create_summary_tear_sheet(factor_data, long_short=True, group_neutral=False)¶

Creates a small summary tear sheet with returns, information, and turnover analysis.

Parameters:

factor_data : pd.DataFrame - MultiIndex

A MultiIndex DataFrame indexed by date (level 0) and asset (level 1), containing the values for a single alpha factor, forward returns for each period, the factor quantile/bin that factor value belongs to, and (optionally) the group the asset belongs to. - See full explanation in utils.get_clean_factor_and_forward_returns

long_short : bool

Should this computation happen on a long short portfolio? If so, mean quantile returns will be demeaned across the factor universe.

group_neutral : bool

Should this computation happen on a group neutral portfolio? If so, returns demeaning will occur on the group level.

alphalens.tears.create_turnover_tear_sheet(factor_data, turnover_periods=None)¶

Creates a tear sheet for analyzing the turnover properties of a factor.

Parameters:

factor_data : pd.DataFrame - MultiIndex

A MultiIndex DataFrame indexed by date (level 0) and asset (level 1), containing the values for a single alpha factor, forward returns for each period, the factor quantile/bin that factor value belongs to, and (optionally) the group the asset belongs to. - See full explanation in utils.get_clean_factor_and_forward_returns

turnover_periods : sequence[string], optional

Periods to compute turnover analysis on. By default the periods in ‘factor_data’ are used, but custom periods can be provided instead. This can be useful when the periods in ‘factor_data’ are not multiples of the frequency at which factor values are computed, e.g. the periods are 2h and 4h and the factor is computed daily, so values like [‘1D’, ‘2D’] could be used instead.

Performance¶

alphalens.performance.average_cumulative_return_by_quantile(factor_data, prices, periods_before=10, periods_after=15, demeaned=True, group_adjust=False, by_group=False)¶

Plots average cumulative returns by factor quantiles in the period range defined by -periods_before to periods_after

Parameters:

factor_data : pd.DataFrame - MultiIndex

A MultiIndex DataFrame indexed by date (level 0) and asset (level 1), containing the values for a single alpha factor, forward returns for each period, the factor quantile/bin that factor value belongs to, and (optionally) the group the asset belongs to. - See full explanation in utils.get_clean_factor_and_forward_returns

prices : pd.DataFrame

A wide form Pandas DataFrame indexed by date with assets in the columns. Pricing data should span the factor analysis time period plus/minus an additional buffer window corresponding to periods_after/periods_before parameters.

periods_before : int, optional

How many periods before factor to plot

periods_after : int, optional

How many periods after factor to plot

demeaned : bool, optional

Compute demeaned mean returns (long short portfolio)

group_adjust : bool

Returns demeaning will occur on the group level (group neutral portfolio)

by_group : bool

If True, compute cumulative returns separately for each group

Returns:

cumulative returns and std deviation : pd.DataFrame

A MultiIndex DataFrame indexed by quantile (level 0) and mean/std (level 1), with the values in columns ranging from -periods_before to periods_after. If by_group=True the index will have an additional ‘group’ level.

---------------------------------------------------
            |       | -2  | -1  |  0  |  1  | ...
---------------------------------------------------
  quantile  |       |     |     |     |     |
---------------------------------------------------
            | mean  |  x  |  x  |  x  |  x  |
     1      ---------------------------------------
            | std   |  x  |  x  |  x  |  x  |
---------------------------------------------------
            | mean  |  x  |  x  |  x  |  x  |
     2      ---------------------------------------
            | std   |  x  |  x  |  x  |  x  |
---------------------------------------------------
    ...     |                 ...
---------------------------------------------------

alphalens.performance.common_start_returns(factor, prices, before, after, cumulative=False, mean_by_date=False, demean_by=None)¶

A date and equity pair is extracted from each index row in the factor dataframe and for each of these pairs a return series is built starting from ‘before’ the date and ending ‘after’ the date specified in the pair. All those returns series are then aligned to a common index (-before to after) and returned as a single DataFrame

Parameters:

factor : pd.DataFrame

DataFrame with at least date and equity as index, the columns are irrelevant

prices : pd.DataFrame

A wide form Pandas DataFrame indexed by date with assets in the columns. Pricing data should span the factor analysis time period plus/minus an additional buffer window corresponding to after/before period parameters.

before:

How many returns to load before factor date

after:

How many returns to load after factor date

cumulative: bool, optional

Return cumulative returns

mean_by_date: bool, optional

If True, compute mean returns for each date and return that instead of a return series for each asset

demean_by: pd.DataFrame, optional

DataFrame with at least date and equity as index, the columns are irrelevant. For each date a list of equities is extracted from ‘demean_by’ index and used as universe to compute demeaned mean returns (long short portfolio)

Returns:

aligned_returns : pd.DataFrame

Dataframe containing returns series for each factor aligned to the same index: -before to after
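The alignment described above can be sketched as follows. This is a simplified illustration, not the library code: the helper name `aligned_return_windows`, the sample prices and the choice to skip events without a full window are all assumptions made for the example.

```python
import pandas as pd


def aligned_return_windows(events, prices, before, after):
    """Build one return series per (date, asset) event and re-index each
    series to the relative offsets -before..after (hypothetical helper)."""
    # Simple period-over-period returns for every asset.
    returns = prices / prices.shift(1) - 1
    windows = {}
    for date, asset in events:
        loc = prices.index.get_loc(date)  # integer position of the event date
        start, stop = loc - before, loc + after + 1
        if start < 0 or stop > len(prices):
            continue  # assumption: drop events that lack a full window
        values = returns[asset].iloc[start:stop].to_numpy()
        windows[(date, asset)] = pd.Series(values, index=range(-before, after + 1))
    return pd.DataFrame(windows)


# Made-up prices, for illustration only.
dates = pd.date_range("2014-01-01", periods=5, freq="D")
prices = pd.DataFrame({"AAPL": [100.0, 110.0, 99.0, 108.9, 108.9]}, index=dates)
events = [(pd.Timestamp("2014-01-03"), "AAPL")]

aligned = aligned_return_windows(events, prices, before=1, after=1)
print(aligned)
```

Each column holds the returns around one event, and every column shares the same relative index, so the columns can be averaged row by row.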

alphalens.performance.compute_mean_returns_spread(mean_returns, upper_quant, lower_quant, std_err=None)¶

Computes the difference between the mean returns of two quantiles. Optionally, computes the standard error of this difference.

Parameters:

mean_returns : pd.DataFrame

DataFrame of mean period wise returns by quantile. MultiIndex containing date and quantile. See mean_return_by_quantile.

upper_quant : int

Quantile of mean return from which we wish to subtract lower quantile mean return.

lower_quant : int

Quantile of mean return we wish to subtract from upper quantile mean return.

std_err : pd.DataFrame

Period wise standard error in mean return by quantile. Takes the same form as mean_returns.

Returns:

mean_return_difference : pd.Series

Period wise difference in quantile returns.

joint_std_err : pd.Series

Period wise standard error of the difference in quantile returns.
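A minimal sketch of the spread computation for a single period. It assumes the two quantile standard errors are independent and combines them in quadrature; the helper name and the numbers are hypothetical.

```python
import math


def mean_returns_spread(mean_upper, mean_lower, se_upper=None, se_lower=None):
    """Difference between two quantile mean returns and, optionally, the
    standard error of that difference (hypothetical helper)."""
    spread = mean_upper - mean_lower
    if se_upper is None or se_lower is None:
        return spread, None
    # Assumption: independent errors, so the variances add.
    joint_std_err = math.sqrt(se_upper ** 2 + se_lower ** 2)
    return spread, joint_std_err


# Made-up values: top quantile earns 1.2%, bottom quantile earns 0.2%.
spread, joint_se = mean_returns_spread(0.012, 0.002, se_upper=0.003, se_lower=0.004)
print(spread, joint_se)
```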

alphalens.performance.create_pyfolio_input(factor_data, period, long_short=True, group_neutral=False, quantiles=None, groups=None)¶

WARNING: this API is still in an experimental phase and the input/output parameters might change in the future.

Simulates a portfolio using the input factor and returns the portfolio returns formatted for pyfolio.

For more details on how this portfolio is built see:

- performance.factor_returns (how asset weights are computed)
- performance.cumulative_returns (how the portfolio returns are computed)

Parameters:

factor_data : pd.DataFrame - MultiIndex

A MultiIndex DataFrame indexed by date (level 0) and asset (level 1), containing the values for a single alpha factor, forward returns for each period, the factor quantile/bin that factor value belongs to, and (optionally) the group the asset belongs to. - See full explanation in utils.get_clean_factor_and_forward_returns

period : string

‘factor_data’ column name corresponding to the ‘period’ returns to be used in the computation of portfolio returns

long_short : bool, optional

Should this computation happen on a long short portfolio? If so, factor values will be demeaned across the factor universe when factor weighting the portfolio for cumulative returns plots.

group_neutral : bool, optional

Should this computation happen on a group neutral portfolio? If so, factor values demeaning will occur on the group level. Additionally, each group will have the same weight in cumulative returns plots.

quantiles: sequence[int], optional

Use only specific quantiles in the computation. By default all quantiles are used

groups: sequence[string], optional

Use only specific groups in the computation. By default all groups are used

Returns:

returns : pd.Series

Daily returns of the strategy, noncumulative.

benchmark : pd.Series

Benchmark returns computed as the factor universe mean daily returns. If the ‘1D’ period column is not present in factor_data, the benchmark returns are not computed and ‘None’ is returned instead.

alphalens.performance.cumulative_returns(returns, period, freq=None)¶

Builds cumulative returns from ‘period’ returns. This function simulates the cumulative effect that a series of gains or losses (the ‘returns’) has on an original amount of capital over a period of time.

If F is the frequency at which returns are computed (e.g. 1 day if ‘returns’ contains daily values) and N is the period for which the returns are computed (e.g. returns after 1 day, 5 hours or 3 days), then:

- if N <= F, the cumulative returns are trivially computed as compound returns;
- if N > F (e.g. F is 1 day and N is 3 days), the returns overlap and the cumulative returns are computed by building and averaging N interleaved sub-portfolios (started at subsequent periods 1, 2, ..., N), each one rebalancing every N periods. This corresponds to an algorithm that trades the factor every single time it is computed, which is statistically more robust and has lower volatility than an algorithm that trades the factor every N periods and whose returns depend on the specific starting day of trading.

Also note that when the factor is not computed at a specific frequency, for example a factor representing a random event, it is not efficient to create multiple sub-portfolios, as it is not certain when the factor will be traded and this would result in an underleveraged portfolio. In this case the simulated portfolio is fully invested whenever an event happens, and if a subsequent event occurs while the portfolio is still invested in a previous event, the portfolio is rebalanced and split equally among the active events.

Parameters:

returns: pd.Series

pd.Series containing factor ‘period’ forward returns, the index contains timestamps at which the trades are computed and the values correspond to returns after ‘period’ time

period: pandas.Timedelta or string

Length of period for which the returns are computed (1 day, 2 mins, 3 hours etc). It can be a Timedelta or a string in the format accepted by Timedelta constructor (‘1 days’, ‘1D’, ‘30m’, ‘3h’, ‘1D1h’, etc)

freq : pandas DateOffset, optional

Used to specify a particular trading calendar. If not present returns.index.freq will be used

Returns:

pd.Series

Cumulative returns series
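For the simple case where the return period does not exceed the frequency at which returns are computed (N <= F), compounding can be sketched as below. This only illustrates the arithmetic and is not the library implementation; the helper name and sample returns are made up, and the overlapping case (N > F) with interleaved sub-portfolios is not covered.

```python
from itertools import accumulate
import operator


def compound_growth(returns):
    """Growth of one unit of capital after each period (hypothetical helper).
    Subtract 1 from each value to express it as a cumulative return."""
    return list(accumulate((1.0 + r for r in returns), operator.mul))


# Made-up period returns: +10%, -5%, +2%.
growth = compound_growth([0.10, -0.05, 0.02])
print(growth)
```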

alphalens.performance.factor_alpha_beta(factor_data, returns=None, demeaned=True, group_adjust=False, equal_weight=False)¶

Compute the alpha (excess returns), alpha t-stat (alpha significance), and beta (market exposure) of a factor. A regression is run with the period wise factor universe mean return as the independent variable and mean period wise return from a portfolio weighted by factor values as the dependent variable.

Parameters:

factor_data : pd.DataFrame - MultiIndex

A MultiIndex DataFrame indexed by date (level 0) and asset (level 1), containing the values for a single alpha factor, forward returns for each period, the factor quantile/bin that factor value belongs to, and (optionally) the group the asset belongs to. - See full explanation in utils.get_clean_factor_and_forward_returns

returns : pd.DataFrame, optional

Period wise factor returns. If this is None then it will be computed with ‘factor_returns’ function and the passed flags: ‘demeaned’, ‘group_adjust’, ‘equal_weight’

demeaned : bool

Controls how to build the factor returns used for the alpha/beta computation: see performance.factor_returns for a full explanation

group_adjust : bool

Controls how to build the factor returns used for the alpha/beta computation: see performance.factor_returns for a full explanation

equal_weight : bool, optional

Controls how to build the factor returns used for the alpha/beta computation: see performance.factor_returns for a full explanation

Returns:

alpha_beta : pd.Series

The alpha, beta, and t-stat(alpha) for the given factor and forward returns.
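The regression described above can be illustrated with a plain ordinary least squares fit for a single return period. This is a sketch under stated assumptions: the helper name and the return series are invented, the alpha is per period (no annualization), and the t-stat is not computed.

```python
def ols_alpha_beta(universe_returns, portfolio_returns):
    """Fit portfolio = alpha + beta * universe by least squares
    (hypothetical helper; both inputs are equal-length lists)."""
    n = len(universe_returns)
    mean_x = sum(universe_returns) / n
    mean_y = sum(portfolio_returns) / n
    cov_xy = sum(
        (x - mean_x) * (y - mean_y)
        for x, y in zip(universe_returns, portfolio_returns)
    )
    var_x = sum((x - mean_x) ** 2 for x in universe_returns)
    beta = cov_xy / var_x             # market exposure
    alpha = mean_y - beta * mean_x    # excess return per period
    return alpha, beta


# Made-up data: the factor portfolio earns 0.1% per period plus half the
# universe mean return, so the fit should recover alpha=0.001, beta=0.5.
universe = [0.01, -0.02, 0.03, 0.00]
portfolio = [0.001 + 0.5 * x for x in universe]

alpha, beta = ols_alpha_beta(universe, portfolio)
print(alpha, beta)
```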

alphalens.performance.factor_information_coefficient(factor_data, group_adjust=False, by_group=False)¶

Computes the Spearman Rank Correlation based Information Coefficient (IC) between factor values and N period forward returns for each period in the factor index.

Parameters:

factor_data : pd.DataFrame - MultiIndex

A MultiIndex DataFrame indexed by date (level 0) and asset (level 1), containing the values for a single alpha factor, forward returns for each period, the factor quantile/bin that factor value belongs to, and (optionally) the group the asset belongs to. - See full explanation in utils.get_clean_factor_and_forward_returns

group_adjust : bool

Demean forward returns by group before computing IC.

by_group : bool

If True, compute period wise IC separately for each group.

Returns:

ic : pd.DataFrame

Spearman Rank correlation between factor and provided forward returns.
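As an illustration of the metric, the sketch below computes a Spearman rank correlation for one date by ranking both inputs (average ranks for ties) and taking the Pearson correlation of the ranks. The helper names and the forward returns are hypothetical; the factor values are the ones used in the example table of utils.get_clean_factor_and_forward_returns.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(xs, ys):
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def spearman_ic(factor_values, forward_returns):
    return pearson(average_ranks(factor_values), average_ranks(forward_returns))


factor_values = [0.5, -1.1, 1.7, -0.1, 2.7]        # AAPL, BA, CMG, DAL, LULU
forward_returns = [0.01, -0.02, 0.03, 0.00, 0.02]  # made-up forward returns

ic = spearman_ic(factor_values, forward_returns)
print(ic)
```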

alphalens.performance.factor_rank_autocorrelation(factor_data, period=1)¶

Computes autocorrelation of mean factor ranks in specified time spans. We must compare period to period factor ranks rather than factor values to account for systematic shifts in the factor values of all names or names within a group. This metric is useful for measuring the turnover of a factor. If the value of a factor for each name changes randomly from period to period, we’d expect an autocorrelation of 0.

Parameters:

factor_data : pd.DataFrame - MultiIndex

A MultiIndex DataFrame indexed by date (level 0) and asset (level 1), containing the values for a single alpha factor, forward returns for each period, the factor quantile/bin that factor value belongs to, and (optionally) the group the asset belongs to. - See full explanation in utils.get_clean_factor_and_forward_returns

period: string or int, optional

Period over which to calculate the turnover. If it is a string it must follow pandas.Timedelta constructor format (e.g. ‘1 days’, ‘1D’, ‘30m’, ‘3h’, ‘1D1h’, etc).

Returns:

autocorr : pd.Series

Rolling 1 period (defined by time_rule) autocorrelation of factor values.

alphalens.performance.factor_returns(factor_data, demeaned=True, group_adjust=False, equal_weight=False, by_asset=False)¶

Computes period wise returns for portfolio weighted by factor values.

Parameters:

factor_data : pd.DataFrame - MultiIndex

A MultiIndex DataFrame indexed by date (level 0) and asset (level 1), containing the values for a single alpha factor, forward returns for each period, the factor quantile/bin that factor value belongs to, and (optionally) the group the asset belongs to. - See full explanation in utils.get_clean_factor_and_forward_returns

demeaned : bool

Control how to build factor weights – see performance.factor_weights for a full explanation

group_adjust : bool

Control how to build factor weights – see performance.factor_weights for a full explanation

equal_weight : bool, optional

Control how to build factor weights – see performance.factor_weights for a full explanation

by_asset: bool, optional

If True, returns are reported separately for each asset.

Returns:

returns : pd.DataFrame

Period wise factor returns

alphalens.performance.factor_weights(factor_data, demeaned=True, group_adjust=False, equal_weight=False)¶

Computes asset weights from factor values by dividing each value by the sum of their absolute values (achieving gross leverage of 1). Positive factor values will result in positive weights and negative values in negative weights.

Parameters:

factor_data : pd.DataFrame - MultiIndex

A MultiIndex DataFrame indexed by date (level 0) and asset (level 1), containing the values for a single alpha factor, forward returns for each period, the factor quantile/bin that factor value belongs to, and (optionally) the group the asset belongs to. - See full explanation in utils.get_clean_factor_and_forward_returns

demeaned : bool

Should this computation happen on a long short portfolio? if True, weights are computed by demeaning factor values and dividing by the sum of their absolute value (achieving gross leverage of 1). The sum of positive weights will be the same as the negative weights (absolute value), suitable for a dollar neutral long-short portfolio

group_adjust : bool

Should this computation happen on a group neutral portfolio? If True, compute group neutral weights: each group will weight the same and if ‘demeaned’ is enabled the factor values demeaning will occur on the group level.

equal_weight : bool, optional

If True, the assets will be equal-weighted instead of factor-weighted.

Returns:

returns : pd.Series

Assets weighted by factor value.
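The weighting rule for a single date can be sketched as follows. This is an illustration of the arithmetic only; the helper name is hypothetical, group neutrality and equal weighting are not handled, and the factor values are taken from the example table of utils.get_clean_factor_and_forward_returns.

```python
def sketch_factor_weights(factor_values, demeaned=True):
    """Turn factor values into weights whose absolute values sum to 1
    (hypothetical helper operating on a dict of asset -> factor value)."""
    values = dict(factor_values)
    if demeaned:
        # Long-short case: subtract the universe mean so weights sum to zero.
        mean = sum(values.values()) / len(values)
        values = {asset: v - mean for asset, v in values.items()}
    gross = sum(abs(v) for v in values.values())
    return {asset: v / gross for asset, v in values.items()}


factor_values = {"AAPL": 0.5, "BA": -1.1, "CMG": 1.7, "DAL": -0.1, "LULU": 2.7}

weights = sketch_factor_weights(factor_values)
print(weights)
```

With demeaning enabled, the positive and negative weights offset each other, which is what makes the portfolio dollar neutral.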

alphalens.performance.mean_information_coefficient(factor_data, group_adjust=False, by_group=False, by_time=None)¶

Get the mean information coefficient of specified groups. Answers questions like: What is the mean IC for each month? What is the mean IC for each group for our whole time range? What is the mean IC for each group, each week?

Parameters:

factor_data : pd.DataFrame - MultiIndex

A MultiIndex DataFrame indexed by date (level 0) and asset (level 1), containing the values for a single alpha factor, forward returns for each period, the factor quantile/bin that factor value belongs to, and (optionally) the group the asset belongs to. - See full explanation in utils.get_clean_factor_and_forward_returns

group_adjust : bool

Demean forward returns by group before computing IC.

by_group : bool

If True, take the mean IC for each group.

by_time : str (pd time_rule), optional

Time window to use when taking mean IC. See http://pandas.pydata.org/pandas-docs/stable/timeseries.html for available options.

Returns:

ic : pd.DataFrame

Mean Spearman Rank correlation between factor and provided forward price movement windows.

alphalens.performance.mean_return_by_quantile(factor_data, by_date=False, by_group=False, demeaned=True, group_adjust=False)¶

Computes mean returns for factor quantiles across provided forward returns columns.

Parameters:

factor_data : pd.DataFrame - MultiIndex

A MultiIndex DataFrame indexed by date (level 0) and asset (level 1), containing the values for a single alpha factor, forward returns for each period, the factor quantile/bin that factor value belongs to, and (optionally) the group the asset belongs to. - See full explanation in utils.get_clean_factor_and_forward_returns

by_date : bool

If True, compute quantile bucket returns separately for each date.

by_group : bool

If True, compute quantile bucket returns separately for each group.

demeaned : bool

Compute demeaned mean returns (long short portfolio)

group_adjust : bool

Returns demeaning will occur on the group level.

Returns:

mean_ret : pd.DataFrame

Mean period wise returns by specified factor quantile.

std_error_ret : pd.DataFrame

Standard error of returns by specified quantile.
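A rough sketch of the per-quantile averaging for one date and one forward-return column. The helper name and the sample rows are assumptions made for illustration; grouping by date or by group and the standard error are left out.

```python
from collections import defaultdict


def sketch_mean_return_by_quantile(rows, demeaned=True):
    """rows is a list of (quantile, forward_return) pairs for one date
    (hypothetical helper). Returns a dict of quantile -> mean return."""
    all_returns = [r for _, r in rows]
    # Long-short case: measure returns relative to the universe mean.
    universe_mean = sum(all_returns) / len(all_returns) if demeaned else 0.0
    buckets = defaultdict(list)
    for quantile, ret in rows:
        buckets[quantile].append(ret - universe_mean)
    return {q: sum(v) / len(v) for q, v in sorted(buckets.items())}


# Made-up forward returns for six assets spread over three quantiles.
rows = [(1, -0.02), (1, 0.00), (2, 0.01), (2, 0.03), (3, 0.04), (3, 0.06)]

mean_ret = sketch_mean_return_by_quantile(rows)
print(mean_ret)
```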

alphalens.performance.quantile_turnover(quantile_factor, quantile, period=1)¶

Computes the proportion of names in a factor quantile that were not in that quantile in the previous period.

Parameters:

quantile_factor : pd.Series

Series indexed by date and asset, containing the factor quantile.

quantile : int

Quantile on which to perform turnover analysis.

period: string or int, optional

Period over which to calculate the turnover. If it is a string it must follow pandas.Timedelta constructor format (e.g. ‘1 days’, ‘1D’, ‘30m’, ‘3h’, ‘1D1h’, etc).

Returns:

quant_turnover : pd.Series

Period by period turnover for that quantile.
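The turnover definition above reduces to a set difference. The sketch below applies it to two consecutive periods; the helper name and the quantile memberships are invented for the example.

```python
def sketch_quantile_turnover(previous_names, current_names):
    """Proportion of names in the quantile now that were not in it in the
    previous period (hypothetical helper)."""
    current = set(current_names)
    new_names = current - set(previous_names)
    return len(new_names) / len(current)


# Made-up membership of one quantile on two consecutive dates.
previous = ["AAPL", "BA", "CMG"]
current = ["AAPL", "CMG", "DAL", "LULU"]

turnover = sketch_quantile_turnover(previous, current)
print(turnover)
```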

Plotting¶

alphalens.plotting.axes_style(style='darkgrid', rc=None)¶

Create alphalens default axes style context.

Under the hood, calls and returns seaborn.axes_style() with some custom settings. Usually you would use it in a with-context.

Parameters:

style : str, optional

Name of seaborn style.

rc : dict, optional

Config flags.

Returns:

seaborn plotting context

Utilities¶

exception alphalens.utils.MaxLossExceededError¶

Bases: Exception

exception alphalens.utils.NonMatchingTimezoneError¶

Bases: Exception

alphalens.utils.add_custom_calendar_timedelta(input, timedelta, freq)¶

Add timedelta to ‘input’ taking into consideration custom frequency, which is used to deal with custom calendars, such as a trading calendar

Parameters:

input : pd.DatetimeIndex or pd.Timestamp

timedelta : pd.Timedelta

freq : DateOffset, optional

Returns:

pd.DatetimeIndex or pd.Timestamp

input + timedelta
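One way to picture calendar-aware addition is to split the timedelta into whole days, which are advanced along the custom calendar, and an intraday remainder, which is added as plain time. The sketch below does this with the pandas business-day offset; the helper name is hypothetical and the approach is an assumption for illustration, not a statement of how the library implements it.

```python
import pandas as pd
from pandas.tseries.offsets import BDay


def add_calendar_timedelta(timestamp, timedelta, freq):
    """Advance whole days along `freq` and add the intraday remainder as
    ordinary time (hypothetical helper)."""
    days = timedelta.components.days
    intraday = timedelta - pd.Timedelta(days=days)
    return timestamp + freq * days + intraday


# 2014-01-03 is a Friday, so two business days later is Tuesday 2014-01-07.
result = add_calendar_timedelta(
    pd.Timestamp("2014-01-03"), pd.Timedelta(days=2, hours=1), BDay()
)
print(result)
```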

alphalens.utils.compute_forward_returns(factor_idx, prices, periods=(1, 5, 10), filter_zscore=None)¶

Finds the N period forward returns (as percent change) for each asset provided.

Parameters:

factor_idx : pd.DatetimeIndex

The factor datetimes for which we are computing the forward returns

prices : pd.DataFrame

Pricing data to use in forward price calculation. Assets as columns, dates as index. Pricing data must span the factor analysis time period plus an additional buffer window that is greater than the maximum number of expected periods in the forward returns calculations.

periods : sequence[int]

periods to compute forward returns on.

filter_zscore : int or float, optional

Sets forward returns greater than X standard deviations from the mean to nan. Set it to ‘None’ to avoid filtering. Caution: this outlier filtering incorporates lookahead bias.

Returns:

forward_returns : pd.DataFrame - MultiIndex

Forward returns indexed by date and asset. Separate column for each forward return window.
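The N period forward return is the percent change from the current price to the price N rows later. A minimal sketch with pandas follows; the helper name is hypothetical, the prices are a subset of the example table of utils.get_clean_factor_and_forward_returns, and the result is kept in wide form (one DataFrame per period) rather than the MultiIndex layout described above.

```python
import pandas as pd

prices = pd.DataFrame(
    {"AAPL": [605.12, 604.35, 607.94], "BA": [24.58, 22.23, 21.68]},
    index=pd.to_datetime(["2014-01-01", "2014-01-02", "2014-01-03"]),
)


def sketch_forward_returns(prices, periods=(1, 2)):
    """Return a dict of period -> wide DataFrame of forward returns
    (hypothetical helper)."""
    # shift(-n) aligns the price n rows ahead with the current row.
    return {n: prices.shift(-n) / prices - 1 for n in periods}


fwd = sketch_forward_returns(prices)
print(fwd[1])
```

The last N rows of each result are NaN because no future price is available, which is why the pricing data needs a buffer beyond the factor dates.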

alphalens.utils.demean_forward_returns(factor_data, grouper=None)¶

Convert forward returns to returns relative to mean period wise all-universe or group returns. Group-wise normalization incorporates the assumption of a group neutral portfolio constraint and thus allows the factor to be evaluated across groups.

For example, if AAPL 5 period return is 0.1% and mean 5 period return for the Technology stocks in our universe was 0.5% in the same period, the group adjusted 5 period return for AAPL in this period is -0.4%.

Parameters:

factor_data : pd.DataFrame - MultiIndex

Forward returns indexed by date and asset. Separate column for each forward return window.

grouper : list

If provided, demean according to these groupings; otherwise demean across the whole universe.

Returns:

adjusted_forward_returns : pd.DataFrame - MultiIndex

DataFrame of the same format as the input, but with each security’s returns normalized by group.
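The AAPL example above can be reproduced with a few lines. The helper name, the second Technology stock (MSFT) and the Industrials figures are made up so that the Technology group mean comes out at 0.5%.

```python
from collections import defaultdict


def demean_by_group(returns, groups):
    """Subtract each asset's group mean return from its own return
    (hypothetical helper; both arguments are dicts keyed by asset)."""
    members = defaultdict(list)
    for asset, ret in returns.items():
        members[groups[asset]].append(ret)
    group_means = {g: sum(v) / len(v) for g, v in members.items()}
    return {asset: ret - group_means[groups[asset]] for asset, ret in returns.items()}


# 5 period returns for one date; only the AAPL figure comes from the text.
returns = {"AAPL": 0.001, "MSFT": 0.009, "BA": 0.02, "DAL": 0.04}
groups = {
    "AAPL": "Technology",
    "MSFT": "Technology",
    "BA": "Industrials",
    "DAL": "Industrials",
}

adjusted = demean_by_group(returns, groups)
print(adjusted["AAPL"])
```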

alphalens.utils.diff_custom_calendar_timedeltas(start, end, freq)¶

Compute the difference between two pd.Timestamp values, taking into consideration a custom frequency, which is used to deal with custom calendars, such as a trading calendar

Parameters:

start : pd.Timestamp

end : pd.Timestamp

freq : DateOffset, optional

Returns:

pd.Timedelta

end - start

alphalens.utils.get_clean_factor_and_forward_returns(factor, prices, groupby=None, binning_by_group=False, quantiles=5, bins=None, periods=(1, 5, 10), filter_zscore=20, groupby_labels=None, max_loss=0.35)¶

Formats the factor data, pricing data, and group mappings into a DataFrame that contains aligned MultiIndex indices of timestamp and asset. The returned data will be formatted to be suitable for Alphalens functions.

It is safe to skip a call to this function and still make use of Alphalens functionalities, as long as the factor data conforms to the format returned from get_clean_factor_and_forward_returns and documented here.
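As a rough guide to the input shapes described under Parameters, the sketch below builds a `factor` Series and a `prices` DataFrame with pandas, reusing the values from the example tables in this entry. The group labels in the `groupby` dict are hypothetical, and the index level names `date` and `asset` follow the wording used in this documentation.

```python
import pandas as pd

dates = pd.to_datetime(["2014-01-01", "2014-01-02", "2014-01-03"])
assets = ["AAPL", "BA", "CMG", "DAL", "LULU"]

# Wide form prices: one row per timestamp, one column per asset.
prices = pd.DataFrame(
    [
        [605.12, 24.58, 11.72, 54.43, 37.14],
        [604.35, 22.23, 12.21, 52.78, 33.63],
        [607.94, 21.68, 14.36, 53.94, 29.37],
    ],
    index=dates,
    columns=assets,
)

# Factor values for the first date, indexed by (date, asset).
index = pd.MultiIndex.from_product([dates[:1], assets], names=["date", "asset"])
factor = pd.Series([0.5, -1.1, 1.7, -0.1, 2.7], index=index, name="factor")

# Optional static asset -> group mapping (labels are made up).
groupby = {
    "AAPL": "Technology",
    "BA": "Industrials",
    "CMG": "Consumer",
    "DAL": "Industrials",
    "LULU": "Consumer",
}

print(factor)
print(prices)
```

Here `prices` has an entry for every timestamp/asset pair in `factor`, plus two later timestamps that could serve as sell prices for forward returns of up to two periods.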

Parameters:

factor : pd.Series - MultiIndex

A MultiIndex Series indexed by timestamp (level 0) and asset (level 1), containing the values for a single alpha factor.

-----------------------------------
    date    |    asset   |
-----------------------------------
            |   AAPL     |   0.5
            -----------------------
            |   BA       |  -1.1
            -----------------------
2014-01-01  |   CMG      |   1.7
            -----------------------
            |   DAL      |  -0.1
            -----------------------
            |   LULU     |   2.7
            -----------------------

prices : pd.DataFrame

A wide form Pandas DataFrame indexed by timestamp with assets in the columns. It is important to pass the correct pricing data, depending on the time at which your signal was generated, so as to avoid lookahead bias or delayed calculations. Pricing data must span the factor analysis time period plus an additional buffer window that is greater than the maximum number of expected periods in the forward returns calculations. ‘Prices’ must contain at least an entry for each timestamp/asset combination in ‘factor’. This entry must be the asset price at the time the asset factor value is computed and it will be considered the buy price for that asset at that timestamp. ‘Prices’ must also contain entries for timestamps following each timestamp/asset combination in ‘factor’, as many more timestamps as the maximum value in ‘periods’. The asset price after ‘period’ timestamps will be considered the sell price for that asset when computing ‘period’ forward returns.

----------------------------------------------------
            | AAPL |  BA  |  CMG  |  DAL  |  LULU  |
----------------------------------------------------
   Date     |      |      |       |       |        |
----------------------------------------------------
2014-01-01  |605.12| 24.58|  11.72| 54.43 |  37.14 |
----------------------------------------------------
2014-01-02  |604.35| 22.23|  12.21| 52.78 |  33.63 |
----------------------------------------------------
2014-01-03  |607.94| 21.68|  14.36| 53.94 |  29.37 |
----------------------------------------------------
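
The buy-price and sell-price convention described above can be sketched with plain pandas. This is an illustration of the arithmetic only, not the library's internal code; the two-asset frame reuses numbers from the table and `period` is a hypothetical setting.

```python
import pandas as pd

prices = pd.DataFrame(
    {
        "AAPL": [605.12, 604.35, 607.94],
        "BA": [24.58, 22.23, 21.68],
    },
    index=pd.to_datetime(["2014-01-01", "2014-01-02", "2014-01-03"]),
)

period = 1  # number of timestamps between the buy price and the sell price

# Return from buying at row t and selling at row t + period.
forward_returns = prices.shift(-period) / prices - 1

# The last `period` rows have no sell price and come out as NaN,
# which is why the price data needs a buffer beyond the factor dates.
print(forward_returns)
```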

groupby : pd.Series - MultiIndex or dict

Either a MultiIndex Series indexed by date and asset, containing the period-wise group codes for each asset, or a dict of asset to group mappings. If a dict is passed, it is assumed that group mappings are unchanged for the entire time period of the passed factor data.
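
The two accepted forms carry the same information when the mapping is static. A minimal sketch, assuming a hypothetical `sector_map` dict and a small factor index, shows how a dict could be broadcast into the equivalent MultiIndex Series:

```python
import pandas as pd

index = pd.MultiIndex.from_product(
    [pd.to_datetime(["2014-01-01", "2014-01-02"]), ["AAPL", "BA", "CMG"]],
    names=["date", "asset"],
)
factor = pd.Series([0.5, -1.1, 1.7, 0.4, -0.9, 1.2], index=index)

# Hypothetical static mapping of asset to group code.
sector_map = {"AAPL": "G1", "BA": "G2", "CMG": "G2"}

# Repeat the static mapping for every (date, asset) pair in the factor index.
assets = factor.index.get_level_values("asset")
groupby = pd.Series(
    assets.map(sector_map).to_numpy(), index=factor.index, name="group"
)
print(groupby)
```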

binning_by_group : bool

If True, compute quantile buckets separately for each group. This is useful when the range of factor values varies considerably across groups, so that it is wise to make the binning group-relative. You should probably enable this if the factor is intended to be analyzed for a group neutral portfolio.

quantiles : int or sequence[float]

Number of equal-sized quantile buckets to use in factor bucketing. Alternatively, a sequence of quantiles, allowing non-equal-sized buckets, e.g. [0, .10, .5, .90, 1.] or [.05, .5, .95]. Only one of ‘quantiles’ or ‘bins’ can be not-None.

bins : int or sequence[float]

Number of equal-width (valuewise) bins to use in factor bucketing. Alternatively, a sequence of bin edges, allowing for non-uniform bin width, e.g. [-4, -2, -0.5, 0, 10]. Chooses the buckets to be evenly spaced according to the values themselves. Useful when the factor contains discrete values. Only one of ‘quantiles’ or ‘bins’ can be not-None.
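
The difference between equal-sized and equal-width bucketing can be seen with the pandas functions `qcut` and `cut`. The sketch below is only an illustration on made-up values; it does not claim to reproduce how Alphalens labels its buckets.

```python
import pandas as pd

values = pd.Series([-3.0, -1.0, -0.5, 0.0, 0.2, 0.4, 1.0, 9.0])

# Equal-sized buckets: each bucket receives the same number of observations.
by_quantile = pd.qcut(values, 4, labels=False) + 1

# Equal-width buckets: edges are evenly spaced between the min and max value,
# so an outlier such as 9.0 can leave some buckets empty.
by_bin = pd.cut(values, 4, labels=False) + 1

print(by_quantile.value_counts().sort_index())
print(by_bin.value_counts().sort_index())
```

With a skewed factor, the quantile version keeps bucket populations balanced, while the bin version keeps bucket boundaries meaningful in factor units.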

periods : sequence[int]

Periods to compute forward returns on.

filter_zscore : int or float, optional

Sets forward returns greater than X standard deviations from the mean to nan. Set it to ‘None’ to avoid filtering. Caution: this outlier filtering incorporates lookahead bias.
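
A minimal sketch of this kind of outlier filter, on made-up returns and with a hypothetical threshold of 2, is shown below. It illustrates the idea and the source of the lookahead caution; it is not the library's implementation.

```python
import pandas as pd

forward_returns = pd.Series([0.01, -0.02, 0.015, 0.0, 0.9, -0.01, 0.02, -0.015])
filter_zscore = 2  # hypothetical threshold for this illustration

# Distance of each return from the sample mean.
deviation = (forward_returns - forward_returns.mean()).abs()
is_outlier = deviation > filter_zscore * forward_returns.std()

# Outliers become NaN. The mean and standard deviation are computed over the
# whole sample, including later observations, hence the lookahead bias.
filtered = forward_returns.mask(is_outlier)
print(filtered)
```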

groupby_labels : dict

A dictionary keyed by group code with values corresponding to the display name for each group.

max_loss : float, optional

Maximum fraction (0.00 to 1.00) of factor data that may be dropped, computed by comparing the number of items in the input factor index with the number of items in the output DataFrame index. Factor data can be partially dropped because it is flawed itself (e.g. NaNs), because not enough price data was provided to compute forward returns for all factor values, or because it is not possible to perform binning. Set max_loss=0 to disable exception suppression.
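
The comparison described above reduces to simple arithmetic. In this sketch the row counts are hypothetical numbers chosen for illustration:

```python
initial_count = 1000  # hypothetical number of items in the input factor index
final_count = 820     # hypothetical number of items in the output index
max_loss = 0.25

# Fraction of the factor data that was dropped along the way.
fraction_lost = (initial_count - final_count) / initial_count
within_tolerance = fraction_lost <= max_loss

print(f"dropped {fraction_lost:.1%} of factor data, tolerated: {within_tolerance}")
```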

Returns:

merged_data : pd.DataFrame - MultiIndex

A MultiIndex DataFrame indexed by date (level 0) and asset (level 1), containing the values for a single alpha factor, forward returns for each period, the factor quantile/bin that factor value belongs to, and (optionally) the group the asset belongs to.

Forward returns column names follow the format accepted by pd.Timedelta (e.g. ‘1D’, ‘30m’, ‘3h15m’, ‘1D1h’, etc.).

The ‘date’ index freq property (merged_data.index.levels[0].freq) will be set to Calendar day or Business day (pandas DateOffset), depending on what was inferred from the input data. This is currently used only in the cumulative returns computation, but it can later be set to any pd.DateOffset (e.g. US trading calendar) to increase the accuracy of the results.

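-------------------------------------------------------------------
           |       | 1D  | 5D  | 10D  |factor|group|factor_quantile
-------------------------------------------------------------------
    date   | asset |     |     |      |      |     |
-------------------------------------------------------------------
           | AAPL  | 0.09|-0.01|-0.079|  0.5 |  G1 |      3
           --------------------------------------------------------
           | BA    | 0.02| 0.06| 0.020| -1.1 |  G2 |      5
           --------------------------------------------------------
2014-01-01 | CMG   | 0.03| 0.09| 0.036|  1.7 |  G2 |      1
           --------------------------------------------------------
           | DAL   |-0.02|-0.06|-0.029| -0.1 |  G3 |      5
           --------------------------------------------------------
           | LULU  |-0.03| 0.05|-0.009|  2.7 |  G1 |      2
           --------------------------------------------------------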

alphalens.utils.get_clean_factor_and_forward_returns_api_change_warning(func)¶

Decorator used to ease the API transition: it keeps the function backward compatible and warns the user about the API change. Old API:

get_clean_factor_and_forward_returns(factor, prices, groupby=None, by_group=False, quantiles=5, bins=None, periods=(1, 5, 10), filter_zscore=20, groupby_labels=None, max_loss=0.25)

New API:

get_clean_factor_and_forward_returns(factor, prices, groupby=None, binning_by_group=False, quantiles=5, bins=None, periods=(1, 5, 10), filter_zscore=20, groupby_labels=None, max_loss=0.25)

Eventually this function can be deleted.

alphalens.utils.get_forward_returns_columns(columns)¶

Utility that detects and returns the columns that are forward returns.
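
Since forward returns columns are named with pd.Timedelta-style strings such as ‘1D’ or ‘3h15m’, one way to detect them is a pattern match on the column labels. The helper `is_forward_return_column` and its regular expression below are hypothetical and shown only as a sketch; they are not taken from the library.

```python
import re

import pandas as pd

# Hypothetical pattern: one or more "<number><unit>" chunks, e.g. 1D, 30m, 3h15m.
TIMEDELTA_LABEL = re.compile(r"(\d+(D|h|m|s|ms|us|ns))+")


def is_forward_return_column(label):
    """Return True if `label` looks like a Timedelta-style column name."""
    return TIMEDELTA_LABEL.fullmatch(str(label)) is not None


columns = pd.Index(["1D", "5D", "10D", "factor", "group", "factor_quantile"])
forward_return_columns = [c for c in columns if is_forward_return_column(c)]
print(forward_return_columns)
```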

alphalens.utils.non_unique_bin_edges_error(func)¶

Give user a more informative error in case it is not possible to properly calculate quantiles on the input dataframe (factor).

alphalens.utils.print_table(table, name=None, fmt=None)¶

Pretty print a pandas DataFrame.

Uses HTML output if running inside Jupyter Notebook, otherwise formatted text output.

Parameters:

table : pd.Series or pd.DataFrame

Table to pretty-print.

name : str, optional

Table name to display in upper left corner.

fmt : str, optional

Formatter to use for displaying table elements. E.g. ‘{0:.2f}%’ for displaying 100 as ‘100.00%’. Restores original setting after displaying.
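
The `fmt` string is an ordinary Python format string. The sketch below shows one way such a formatter could be applied temporarily, assuming the pandas `display.float_format` option is the mechanism; that is an assumption made for illustration, and the small table is made up.

```python
import pandas as pd

table = pd.DataFrame({"mean": [100.0, 2.5]}, index=["q1", "q2"])
fmt = "{0:.2f}%"

# Apply the formatter only inside the block; the previous display option
# is restored automatically on exit.
with pd.option_context("display.float_format", fmt.format):
    rendered = table.to_string()

print(rendered)
```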

alphalens.utils.quantize_factor(*args, **kwargs)¶

alphalens.utils.rate_of_return(period_ret, base_period)¶

Convert returns to a ‘base_period’ rate of return: that is, the value the returns would have every ‘base_period’ if they had grown at a steady rate.

Parameters:

period_ret: pd.DataFrame

DataFrame containing returns values with column headings representing the return period.

base_period: string

The base period length used in the conversion. It must follow the pandas.Timedelta constructor format (e.g. ‘1 days’, ‘1D’, ‘30m’, ‘3h’, ‘1D1h’, etc.).

Returns:

pd.DataFrame

DataFrame in the same format as the input, but with ‘base_period’ rate of return values.
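
Growing at a steady rate means compounding, so the conversion can be sketched as raising the gross return to the ratio of the two period lengths. The function `rate_of_return_sketch` is hypothetical: unlike the documented function, it takes the return period as an explicit argument instead of reading it from the column headings.

```python
import pandas as pd


def rate_of_return_sketch(period_ret, period, base_period):
    """Convert returns earned over `period` to a steady `base_period` rate."""
    # Ratio of the two lengths, e.g. 1D / 5D = 0.2.
    exponent = pd.Timedelta(base_period) / pd.Timedelta(period)
    return (1 + period_ret) ** exponent - 1


five_day_returns = pd.Series([0.05, -0.02, 0.0])
daily_rate = rate_of_return_sketch(five_day_returns, period="5D", base_period="1D")
print(daily_rate)
```

Compounding the resulting daily rate over five days recovers the original five-day return.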

alphalens.utils.rethrow(exception, additional_message)¶

Re-raise the last exception that was active in the current scope without losing the stacktrace but adding an additional message. This is hacky because it has to be compatible with both Python 2 and 3.
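
A minimal Python 3 only sketch of the idea follows. The function `rethrow_sketch` is hypothetical and makes no attempt at Python 2 compatibility; it appends the message to the exception arguments and raises the same object, which keeps its original traceback attached.

```python
def rethrow_sketch(exception, additional_message):
    """Append a message to `exception` and raise it again (Python 3 only)."""
    if exception.args:
        first = str(exception.args[0]) + additional_message
        exception.args = (first,) + exception.args[1:]
    else:
        exception.args = (additional_message,)
    raise exception


try:
    try:
        int("not a number")
    except ValueError as original_error:
        rethrow_sketch(original_error, " (hint: check the input column)")
except ValueError as augmented_error:
    final_message = str(augmented_error)

print(final_message)
```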

alphalens.utils.std_conversion(period_std, base_period)¶

Approximate the ‘base_period’ standard deviation (or standard error) from values computed over a longer return period.

Parameters:

period_std: pd.DataFrame

DataFrame containing standard deviation or standard error values with column headings representing the return period.

base_period: string

The base period length used in the conversion. It must follow the pandas.Timedelta constructor format (e.g. ‘1 days’, ‘1D’, ‘30m’, ‘3h’, ‘1D1h’, etc.).

Returns:

pd.DataFrame

DataFrame in same format as input but with one-period standard deviation/error values.
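
A common approximation for this kind of conversion is square-root-of-time scaling, which assumes returns are independent across periods. The sketch below uses that assumption; `std_conversion_sketch` is a hypothetical helper that takes the return period explicitly instead of reading it from the column headings.

```python
import numpy as np
import pandas as pd


def std_conversion_sketch(period_std, period, base_period):
    """Scale a `period` standard deviation down to `base_period` length."""
    # Number of base periods contained in one return period, e.g. 5D / 1D = 5.
    scale = pd.Timedelta(period) / pd.Timedelta(base_period)
    return period_std / np.sqrt(scale)


five_day_std = pd.Series([0.10, 0.04])
daily_std = std_conversion_sketch(five_day_std, period="5D", base_period="1D")
print(daily_std)
```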

alphalens.utils.timedelta_to_string(timedelta)¶

Utility that converts a pandas.Timedelta to a string representation compatible with pandas.Timedelta constructor format

Parameters:

timedelta: pd.Timedelta

Returns:

string

string representation of ‘timedelta’
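
One possible way to build such a string is to join the non-zero components of the Timedelta. The function `timedelta_to_string_sketch` is a hypothetical illustration, not the library's code; it ignores sub-second components and returns an empty string for a zero Timedelta.

```python
import pandas as pd


def timedelta_to_string_sketch(timedelta):
    """Build a compact label such as '1D1h' from a pandas Timedelta."""
    c = timedelta.components
    parts = [(c.days, "D"), (c.hours, "h"), (c.minutes, "m"), (c.seconds, "s")]
    # Keep only the non-zero components, largest unit first.
    return "".join(f"{value}{unit}" for value, unit in parts if value > 0)


label = timedelta_to_string_sketch(pd.Timedelta(days=1, hours=1))
print(label)

# The label can be fed back to the pd.Timedelta constructor.
print(pd.Timedelta(label) == pd.Timedelta(days=1, hours=1))
```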