Quantitative

Alpha Generation using Data Science and Quantitative Analysis with integrated Risk Model

Primary language: Jupyter Notebook. License: MIT.

Quantitative Finance Project

Keyvan Tajbakhsh
July 26th, 2020

This research project is written in Python and developed in the Jupyter Notebook environment. Before diving in, please read all the requirements and instructions below carefully.

For decades, financial institutions and alpha generation platforms have focused primarily on quantitative investment research rather than the rapid trading of investments. While some of these platforms allow analysts to take their strategies to market, others focus solely on the research and development of highly complex mathematical and statistical models. Quantitative investing uses raw data to calculate potential stock values, earnings forecasts and other metrics that help investors make capital allocation decisions.
The purpose of this project is to define a liquid universe of stocks, apply alpha factors to it, and use factor analysis to judge whether the results have enough potential to be sent to production. After the factors are selected and combined using Machine Learning techniques, the combined factor is analyzed, improved with an optimizer function, and then integrated into the risk model.

The project workflow comprises the following stages:

  1. Parameters
  2. Universe definition
  3. Sector definition
  4. Alpha factors
  5. Factor analysis
  6. Factor combination
  7. Risk analysis for equal weights
  8. Integrating factor data to the optimizer
  9. Optimized alpha vector analysis
  10. Predicted portfolio

Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.

Prerequisites

PyPI Packages

  • NumPy - A fundamental package for scientific computing with Python. (version == 1.19.1)
  • Pandas - A library providing high-performance, easy-to-use data structures and data analysis tools. (version == 0.22.0)
  • scikit-learn - Simple and efficient tools for data mining and data analysis. (version == 0.0)
  • Matplotlib - A Python 2D plotting library which produces publication quality figures in a variety of hardcopy formats and interactive environments across platforms. (version == 3.3.0)
  • Seaborn - A Python data visualization library based on matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics. (version == 0.10.1)
  • Quandl - Quandl delivers market data from hundreds of sources via API, or directly into Python, R, Excel and many other tools.
  • Datetime - The datetime module (Python standard library) supplies classes for manipulating dates and times.
  • Pytz - World timezone definitions, modern and historical.
  • Talib - TA-Lib is used by trading software developers who need to perform technical analysis of financial market data. (version == 0.4.17)
  • Alphalens - Alphalens is a library for performance analysis of predictive (alpha) stock factors. (version == 0.3.6)
  • Pyfolio - Pyfolio is a library for performance and risk analysis of financial portfolios developed by Quantopian Inc. (version == latest github)
  • Itertools - This module (Python standard library) implements a number of iterator building blocks inspired by constructs from APL, Haskell, and SML. Each has been recast in a form suitable for Python.
  • Warnings - This module (Python standard library) issues warning messages, typically in situations where it is useful to alert the user of some condition in a program.
  • Os - This module (Python standard library) provides a portable way of using operating system dependent functionality.
  • Zipfile - The ZIP file format is a common archive and compression standard. This module (Python standard library) provides tools to create, read, write, append, and list a ZIP file.
  • Time - This module (Python standard library) provides various time-related functions.
  • Yfinance - Yahoo! Finance market data downloader (version == 0.1.54)
  • cvxpy - CVXPY is a Python-embedded modeling language for convex optimization problems. It allows you to express your problem in a natural way that follows the math, rather than in the restrictive standard form required by solvers. (version == 1.0.11)
  • ibapi - The TWS API is a simple yet powerful interface through which IB clients can automate their trading strategies, request market data and monitor their account balance and portfolio in real time. (version == 9.76.1)
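
Because several packages are pinned to specific versions, it can help to compare the installed versions against the pins before running the notebooks. The following is a minimal sketch using only the standard library; the distribution names in PINNED are assumptions about how each package is published, and check_versions and mismatches are hypothetical helpers that are not part of this project.

```python
from importlib import metadata

# Pinned versions copied from the list above. The distribution names are
# assumptions; adjust them to match your package index.
PINNED = {
    "numpy": "1.19.1",
    "pandas": "0.22.0",
    "matplotlib": "3.3.0",
    "seaborn": "0.10.1",
    "alphalens": "0.3.6",
    "yfinance": "0.1.54",
    "cvxpy": "1.0.11",
}


def check_versions(pinned):
    """Return {name: (expected, installed)}; installed is None if missing."""
    report = {}
    for name, expected in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        report[name] = (expected, installed)
    return report


def mismatches(report):
    """Keep only the entries whose installed version differs from the pin."""
    return {name: pair for name, pair in report.items() if pair[0] != pair[1]}


# Print one line per package that is missing or at a different version.
for name, (expected, installed) in mismatches(check_versions(PINNED)).items():
    print(f"{name}: expected {expected}, found {installed}")
```

A package that is not installed is reported with "found None".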

Local Modules

  • risk_model - This module provides functions used in risk modeling and risk management.
  • factorize - This module groups useful functions for factorizing raw data.
  • account - A package of functions built on the IBKR API for portfolio management.
  • utils_s - This module delivers functions used in preprocessing and cleaning data.
  • feature_weights - This Machine Learning module calculates the optimal weight distribution of factors for alpha factor combination.

Code

The project is divided into two parts. The code is provided in the alpha_research.ipynb and portfoilo_management.ipynb notebook files. To execute the code, you will need a Quandl API access key to download data and an Interactive Brokers account for trading.

Run

In a terminal or command window, navigate to the top-level project directory Quantitative/ (that contains this README) and run the following command:

jupyter notebook alpha_research.ipynb

This will open the Jupyter Notebook software and project file in your browser.

Data

This project uses multiple data sources from Sharadar and IFT, as described below:

  • Sharadar Equity Prices (SHARADAR/SEP) Daily updated End-Of-Day (EOD) price (OHLCV) data for more than 14,000 US public companies.
  • Indicator Descriptions (SHARADAR/INDICATORS) Description of indicators listed in SF1 table for more than 14,000 US public companies.
  • Tickers and Metadata (SHARADAR/TICKERS) Information and metadata for more than 14,000 US public companies.
  • Core US Fundamentals (SHARADAR/SF1) 150 essential fundamental indicators and financial ratios, for more than 14,000 US public companies.
  • Daily Metrics (SHARADAR/DAILY) 5 essential metrics and financial ratios, updated daily, for more than 14,000 US public companies.
  • Sentiment Analysis and News Analytics (IFT/NSA) News, blogs, social media and proprietary sources for thousands of stocks.
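
The tables above can be requested through the Quandl Python client. The sketch below is an illustration under stated assumptions, not the notebooks' actual loading code: build_table_query and fetch_table are hypothetical helpers, QUANDL_API_KEY is an assumed environment variable name, and the tickers and dates are placeholders. The quandl package is imported lazily so that the query can be built and inspected without it.

```python
import os


def build_table_query(tickers, start, end):
    """Build keyword filters for a table request (hypothetical helper)."""
    return {
        "ticker": list(tickers),
        "date": {"gte": start, "lte": end},  # inclusive date range filter
        "paginate": True,  # keep fetching beyond the first page of rows
    }


def fetch_table(code, **filters):
    """Download one table such as 'SHARADAR/SEP'. Needs the quandl package."""
    import quandl  # lazy import: only required when data is actually fetched

    # QUANDL_API_KEY is an assumed environment variable name.
    quandl.ApiConfig.api_key = os.environ["QUANDL_API_KEY"]
    return quandl.get_table(code, **filters)


query = build_table_query(["AAPL", "MSFT"], "2019-01-01", "2019-12-31")
# prices = fetch_table("SHARADAR/SEP", **query)  # needs a valid key and subscription
```

Keeping the API key in an environment variable avoids committing it to the notebooks.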

Features

Tickers and Metadata [SHARADAR/TICKERS] features
  • table : Sharadar Table : The database table which the ticker is featured in. Examples are: "SF1" or "SEP".
  • permaticker : Permanent Ticker Symbol : The permaticker is a unique and unchanging identifier for an issuer in the dataset which is issued by Sharadar.
  • name : Issuer Name : The name of the security issuer.
  • exchange : Stock Exchange : The exchange on which the security trades. Examples are: "NASDAQ";"NYSE";"NYSEARCA";"BATS";"OTC" and "NYSEMKT" (previously the American Stock exchange).
  • isdelisted : Is Delisted? : Is the security delisted? [Y]es or [N]o.
  • category : Issuer Category : The category of the issuer: "Domestic"; "Canadian" or "ADR".
  • cusips : CUSIPs : A security identifier. Space delimited in the event of multiple identifiers.
  • siccode : Standard Industrial Classification (SIC) Code : The Standard Industrial Classification (SIC) is a system for classifying industries by a four-digit code; as sourced from SEC filings. More on the SIC system here: https://en.wikipedia.org/wiki/Standard_Industrial_Classification
  • sicsector : SIC Sector : The SIC sector is based on the SIC code and the division tabled here: https://en.wikipedia.org/wiki/Standard_Industrial_Classification
  • sicindustry : SIC Industry : The SIC industry is based on the SIC code and the industry tabled here: https://www.sec.gov/info/edgar/siccodes.htm
  • famasector : Fama Sector : Not currently active - coming in a future update.
  • famaindustry : Fama Industry : Industry classifications based on the SIC code and classifications by Fama and French here: http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/Data_Library/det_48_ind_port.html
  • sector : Sector : Sharadar's sector classification based on SIC codes in a format which approximates to GICS.
  • industry : Industry : Sharadar's industry classification based on SIC codes in a format which approximates to GICS.
  • scalemarketcap : Company Scale - Market Cap : This field is experimental and subject to change. It categorises the company according to its maximum observed market cap as follows: 1 - Nano < 50m; 2 - Micro < 300m; 3 - Small < 2bn; 4 - Mid < 10bn; 5 - Large < 200bn; 6 - Mega >= 200bn
  • scalerevenue : Company Scale - Revenue : This field is experimental and subject to change. It categorises the company according to its maximum observed annual revenue as follows: 1 - Nano < 50m; 2 - Micro < 300m; 3 - Small < 2bn; 4 - Mid < 10bn; 5 - Large < 200bn; 6 - Mega >= 200bn
  • relatedtickers : Related Tickers : Where related tickers have been identified this field is populated. Related tickers can include the prior ticker before a ticker change; and tickers for alternative share classes.
  • currency : Currency : The company functional reporting currency for the SF1 Fundamentals table or the currency for EOD prices in SEP and SFP.
  • location : Location : The company location as registered with the Securities and Exchange Commission.
  • lastupdated : Last Updated Date : Last Updated represents the last date that this database entry was updated; which is useful to users when updating their local records.
  • firstadded : First Added Date : The date that the ticker was first added to coverage in the dataset.
  • firstpricedate : First Price Date : The date of the first price observation for a given ticker. Can be used as a proxy for IPO date. Minimum value of 1986-01-01 for IPOs that occurred prior to this date. Note: this does not necessarily represent the first price date available in our datasets since our end of day price history currently starts in December 1998.
  • lastpricedate : Last Price Date : The most recent price observation available.
  • firstquarter : First Quarter : The first financial quarter available in the dataset.
  • lastquarter : Last Quarter : The last financial quarter available in the dataset.
  • secfilings : SEC Filings URL : The URL pointing to the SEC filings which also contains the Central Index Key (CIK).
  • companysite : Company Website URL : The URL pointing to the company website.
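
The scale buckets used by scalemarketcap and scalerevenue can be written as a small lookup. This is a sketch only: it assumes the thresholds are in USD ("50m" read as 50 million, "2bn" as 2 billion), and scale_category is a hypothetical helper, not a Sharadar or project function.

```python
def scale_category(value_usd):
    """Map a USD amount to the 1-6 scale listed for scalemarketcap/scalerevenue."""
    # Upper bounds are exclusive, matching the "<" comparisons in the field text.
    thresholds = [
        (50e6, "1 - Nano"),
        (300e6, "2 - Micro"),
        (2e9, "3 - Small"),
        (10e9, "4 - Mid"),
        (200e9, "5 - Large"),
    ]
    for upper, label in thresholds:
        if value_usd < upper:
            return label
    return "6 - Mega"  # 200bn and above


print(scale_category(1.5e9))
```

Note that the field is based on the maximum observed value, so the input would be the historical maximum rather than the latest figure.
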
Core US Fundamentals [SHARADAR/SF1] features
  • accoci : Accumulated Other Comprehensive Income : [Balance Sheet] A component of [Equity] representing the accumulated change in equity from transactions and other events and circumstances from non-owner sources; net of tax effect; at period end. Includes foreign currency translation items; certain pension adjustments; unrealized gains and losses on certain investments in debt and equity securities.
  • assets : Total Assets : [Balance Sheet] Sum of the carrying amounts as of the balance sheet date of all assets that are recognized. Major components are [CashnEq]; [Investments];[Intangibles]; [PPNENet];[TaxAssets] and [Receivables].
  • assetsc : Current Assets : [Balance Sheet] The current portion of [Assets]; reported if a company operates a classified balance sheet that segments current and non-current assets.
  • assetsnc : Assets Non-Current : [Balance Sheet] Amount of non-current assets; for companies that operate a classified balance sheet. Calculated as the difference between Total Assets [Assets] and Current Assets [AssetsC].
  • bvps : Book Value per Share : [Metrics] Measures the ratio between [Equity] and [SharesWA] as adjusted by [ShareFactor].
  • capex : Capital Expenditure : [Cash Flow Statement] A component of [NCFI] representing the net cash inflow (outflow) associated with the acquisition & disposal of long-lived; physical & intangible assets that are used in the normal conduct of business to produce goods and services and are not intended for resale. Includes cash inflows/outflows to pay for construction of self-constructed assets & software.
  • cashneq : Cash and Equivalents : [Balance Sheet] A component of [Assets] representing the amount of currency on hand as well as demand deposits with banks or financial institutions.
  • cashnequsd : Cash and Equivalents (USD) : [Balance Sheet] [CashnEq] in USD; converted by [FXUSD].
  • cor : Cost of Revenue : [Income Statement] The aggregate cost of goods produced and sold and services rendered during the reporting period.
  • consolinc : Consolidated Income : [Income Statement] The portion of profit or loss for the period; net of income taxes; which is attributable to the consolidated entity; before the deduction of [NetIncNCI].
  • currentratio : Current Ratio : [Metrics] The ratio between [AssetsC] and [LiabilitiesC]; for companies that operate a classified balance sheet.
  • de : Debt to Equity Ratio : [Metrics] Measures the ratio between [Liabilities] and [Equity].
  • debt : Total Debt : [Balance Sheet] A component of [Liabilities] representing the total amount of current and non-current debt owed. Includes secured and unsecured bonds issued; commercial paper; notes payable; credit facilities; lines of credit; capital lease obligations; operating lease obligations; and convertible notes.
  • debtc : Debt Current : [Balance Sheet] The current portion of [Debt]; reported if the company operates a classified balance sheet that segments current and non-current liabilities.
  • debtnc : Debt Non-Current : [Balance Sheet] The non-current portion of [Debt] reported if the company operates a classified balance sheet that segments current and non-current liabilities.
  • debtusd : Total Debt (USD) : [Balance Sheet] [Debt] in USD; converted by [FXUSD].
  • deferredrev : Deferred Revenue : [Balance Sheet] A component of [Liabilities] representing the carrying amount of consideration received or receivable on potential earnings that were not recognized as revenue; including sales; license fees; and royalties; but excluding interest income.
  • depamor : Depreciation Amortization & Accretion : [Cash Flow Statement] A component of operating cash flow representing the aggregate net amount of depreciation; amortization; and accretion recognized during an accounting period. As a non-cash item; the net amount is added back to net income when calculating cash provided by or used in operations using the indirect method.
  • deposits : Deposit Liabilities : [Balance Sheet] A component of [Liabilities] representing the total of all deposit liabilities held; including foreign and domestic; interest and noninterest bearing. May include demand deposits; saving deposits; Negotiable Order of Withdrawal and time deposits among others.
  • divyield : Dividend Yield : [Metrics] Dividend Yield measures the ratio between a company's [DPS] and its [Price].
  • dps : Dividends per Basic Common Share : [Income Statement] Aggregate dividends declared during the period for each split-adjusted share of common stock outstanding. Includes spinoffs where identified.
  • ebit : Earning Before Interest & Taxes (EBIT) : [Income Statement] Earnings Before Interest and Tax is calculated by adding [TaxExp] and [IntExp] back to [NetInc].
  • ebitda : Earnings Before Interest Taxes & Depreciation Amortization (EBITDA) : [Metrics] EBITDA is a non-GAAP accounting metric that is widely used when assessing the performance of companies; calculated by adding [DepAmor] back to [EBIT].
  • ebitdamargin : EBITDA Margin : [Metrics] Measures the ratio between a company's [EBITDA] and [Revenue].
  • ebitdausd : Earnings Before Interest Taxes & Depreciation Amortization (USD) : [Metrics] [EBITDA] in USD; converted by [FXUSD].
  • ebitusd : Earning Before Interest & Taxes (USD) : [Income Statement] [EBIT] in USD; converted by [FXUSD].
  • ebt : Earnings before Tax : [Metrics] Earnings Before Tax is calculated by adding [TaxExp] back to [NetInc].
  • eps : Earnings per Basic Share : [Income Statement] Earnings per share as calculated and reported by the company. Approximates to the amount of [NetIncCmn] for the period per each [SharesWA] after adjusting for [ShareFactor].
  • epsdil : Earnings per Diluted Share : [Income Statement] Earnings per diluted share as calculated and reported by the company. Approximates to the amount of [NetIncCmn] for the period per each [SharesWADil] after adjusting for [ShareFactor].
  • epsusd : Earnings per Basic Share (USD) : [Income Statement] [EPS] in USD; converted by [FXUSD].
  • equity : Shareholders Equity : [Balance Sheet] A principal component of the balance sheet; in addition to [Liabilities] and [Assets]; that represents the total of all stockholders' equity (deficit) items; net of receivables from officers; directors; owners; and affiliates of the entity which are attributable to the parent.
  • equityusd : Shareholders Equity (USD) : [Balance Sheet] [Equity] in USD; converted by [FXUSD].
  • ev : Enterprise Value : [Metrics] Enterprise value is a measure of the value of a business as a whole; calculated as [MarketCap] plus [DebtUSD] minus [CashnEqUSD].
  • evebit : Enterprise Value over EBIT : [Metrics] Measures the ratio between [EV] and [EBITUSD].
  • evebitda : Enterprise Value over EBITDA : [Metrics] Measures the ratio between [EV] and [EBITDAUSD].
  • fcf : Free Cash Flow : [Metrics] Free Cash Flow is a measure of financial performance calculated as [NCFO] minus [CapEx].
  • fcfps : Free Cash Flow per Share : [Metrics] Free Cash Flow per Share is a valuation metric calculated by dividing [FCF] by [SharesWA] and [ShareFactor].
  • fxusd : Foreign Currency to USD Exchange Rate : [Metrics] The exchange rate used for the conversion of foreign currency to USD for non-US companies that do not report in USD.
  • gp : Gross Profit : [Income Statement] Aggregate revenue [Revenue] less cost of revenue [CoR] directly attributable to the revenue generation activity.
  • grossmargin : Gross Margin : [Metrics] Gross Margin measures the ratio between a company's [GP] and [Revenue].
  • intangibles : Goodwill and Intangible Assets : [Balance Sheet] A component of [Assets] representing the carrying amounts of all intangible assets and goodwill as of the balance sheet date; net of accumulated amortization and impairment charges.
  • intexp : Interest Expense : [Income Statement] Amount of the cost of borrowed funds accounted for as interest expense.
  • invcap : Invested Capital : [Metrics] Invested capital is an input into the calculation of [ROIC]; and is calculated as: [Debt] plus [Assets] minus [Intangibles] minus [CashnEq] minus [LiabilitiesC]. Please note this calculation method is subject to change.
  • inventory : Inventory : [Balance Sheet] A component of [Assets] representing the amount after valuation and reserves of inventory expected to be sold; or consumed within one year or operating cycle; if longer.
  • investments : Investments : [Balance Sheet] A component of [Assets] representing the total amount of marketable and non-marketable securities; loans receivable and other invested assets.
  • investmentsc : Investments Current : [Balance Sheet] The current portion of [Investments]; reported if the company operates a classified balance sheet that segments current and non-current assets.
  • investmentsnc : Investments Non-Current : [Balance Sheet] The non-current portion of [Investments]; reported if the company operates a classified balance sheet that segments current and non-current assets.
  • liabilities : Total Liabilities : [Balance Sheet] Sum of the carrying amounts as of the balance sheet date of all liabilities that are recognized. Principal components are [Debt]; [DeferredRev]; [Payables];[Deposits]; and [TaxLiabilities].
  • liabilitiesc : Current Liabilities : [Balance Sheet] The current portion of [Liabilities]; reported if the company operates a classified balance sheet that segments current and non-current liabilities.
  • liabilitiesnc : Liabilities Non-Current : [Balance Sheet] The non-current portion of [Liabilities]; reported if the company operates a classified balance sheet that segments current and non-current liabilities.
  • marketcap : Market Capitalization : [Metrics] Represents the product of [SharesBas]; [Price] and [ShareFactor].
  • ncf : Net Cash Flow / Change in Cash & Cash Equivalents : [Cash Flow Statement] Principal component of the cash flow statement representing the amount of increase (decrease) in cash and cash equivalents. Includes [NCFO]; investing [NCFI] and financing [NCFF] for continuing and discontinued operations; and the effect of exchange rate changes on cash [NCFX].
  • ncfbus : Net Cash Flow - Business Acquisitions and Disposals : [Cash Flow Statement] A component of [NCFI] representing the net cash inflow (outflow) associated with the acquisition & disposal of businesses; joint-ventures; affiliates; and other named investments.
  • ncfcommon : Issuance (Purchase) of Equity Shares : [Cash Flow Statement] A component of [NCFF] representing the net cash inflow (outflow) from common equity changes. Includes additional capital contributions from share issuances and exercise of stock options; and outflow from share repurchases.
  • ncfdebt : Issuance (Repayment) of Debt Securities : [Cash Flow Statement] A component of [NCFF] representing the net cash inflow (outflow) from issuance (repayment) of debt securities.
  • ncfdiv : Payment of Dividends & Other Cash Distributions : [Cash Flow Statement] A component of [NCFF] representing dividends and dividend equivalents paid on common stock and restricted stock units.
  • ncff : Net Cash Flow from Financing : [Cash Flow Statement] A component of [NCF] representing the amount of cash inflow (outflow) from financing activities; from continuing and discontinued operations. Principal components of financing cash flow are: issuance (purchase) of equity shares; issuance (repayment) of debt securities; and payment of dividends & other cash distributions.
  • ncfi : Net Cash Flow from Investing : [Cash Flow Statement] A component of [NCF] representing the amount of cash inflow (outflow) from investing activities; from continuing and discontinued operations. Principal components of investing cash flow are: capital (expenditure) disposal of equipment [CapEx]; business (acquisitions) disposition [NCFBus] and investment (acquisition) disposal [NCFInv].
  • ncfinv : Net Cash Flow - Investment Acquisitions and Disposals : [Cash Flow Statement] A component of [NCFI] representing the net cash inflow (outflow) associated with the acquisition & disposal of investments; including marketable securities and loan originations.
  • ncfo : Net Cash Flow from Operations : [Cash Flow Statement] A component of [NCF] representing the amount of cash inflow (outflow) from operating activities; from continuing and discontinued operations.
  • ncfx : Effect of Exchange Rate Changes on Cash : [Cash Flow Statement] A component of Net Cash Flow [NCF] representing the amount of increase (decrease) from the effect of exchange rate changes on cash and cash equivalent balances held in foreign currencies.
  • netinc : Net Income : [Income Statement] The portion of profit or loss for the period; net of income taxes; which is attributable to the parent after the deduction of [NetIncNCI] from [ConsolInc]; and before the deduction of [PrefDivIS].
  • netinccmn : Net Income Common Stock : [Income Statement] The amount of net income (loss) for the period due to common shareholders. Typically differs from [NetInc] to the parent entity due to the deduction of [PrefDivIS].
  • netinccmnusd : Net Income Common Stock (USD) : [Income Statement] [NetIncCmn] in USD; converted by [FXUSD].
  • netincdis : Net Loss Income from Discontinued Operations : [Income Statement] Amount of loss (income) from a disposal group; net of income tax; reported as a separate component of income.
  • netincnci : Net Income to Non-Controlling Interests : [Income Statement] The portion of income which is attributable to non-controlling interest shareholders; subtracted from [ConsolInc] in order to obtain [NetInc].
  • netmargin : Profit Margin : [Metrics] Measures the ratio between a company's [NetIncCmn] and [Revenue].
  • opex : Operating Expenses : [Income Statement] Operating expenses represents the total expenditure on [SGnA]; [RnD] and other operating expense items; it excludes [CoR].
  • opinc : Operating Income : [Income Statement] Operating income is a measure of financial performance before the deduction of [IntExp]; [TaxExp] and other Non-Operating items. It is calculated as [GP] minus [OpEx].
  • payables : Trade and Non-Trade Payables : [Balance Sheet] A component of [Liabilities] representing trade and non-trade payables.
  • payoutratio : Payout Ratio : [Metrics] The percentage of earnings paid as dividends to common stockholders. - Calculated by dividing [DPS] by [EPSUSD].
  • pb : Price to Book Value : [Metrics] Measures the ratio between [MarketCap] and [EquityUSD].
  • pe : Price Earnings (Damodaran Method) : [Metrics] Measures the ratio between [MarketCap] and [NetIncCmnUSD].
  • pe1 : Price to Earnings Ratio : [Metrics] An alternative to [PE] representing the ratio between [Price] and [EPSUSD].
  • ppnenet : Property Plant & Equipment Net : [Balance Sheet] A component of [Assets] representing the amount after accumulated depreciation; depletion and amortization of physical assets used in the normal conduct of business to produce goods and services and not intended for resale. Includes Operating Right of Use Assets.
  • prefdivis : Preferred Dividends Income Statement Impact : [Income Statement] Income statement item reflecting dividend payments to preferred stockholders. Subtracted from Net Income to Parent [NetInc] to obtain Net Income to Common Stockholders [NetIncCmn].
  • price : Share Price (Adjusted Close) : [Entity] The price per common share adjusted for stock splits but not adjusted for dividends; used in the computation of [PE1]; [PS1]; [DivYield] and [SPS].
  • ps : Price Sales (Damodaran Method) : [Metrics] Measures the ratio between [MarketCap] and [RevenueUSD].
  • ps1 : Price to Sales Ratio : [Metrics] An alternative calculation method to [PS]; that measures the ratio between a company's [Price] and its [SPS].
  • receivables : Trade and Non-Trade Receivables : [Balance Sheet] A component of [Assets] representing trade and non-trade receivables.
  • retearn : Accumulated Retained Earnings (Deficit) : [Balance Sheet] A component of [Equity] representing the cumulative amount of the entity's undistributed earnings or deficit. May only be reported annually by certain companies; rather than quarterly.
  • revenue : Revenues : [Income Statement] Amount of Revenue recognized from goods sold; services rendered; insurance premiums; or other activities that constitute an earning process. Interest income for financial institutions is reported net of interest expense and provision for credit losses.
  • revenueusd : Revenues (USD) : [Income Statement] [Revenue] in USD; converted by [FXUSD].
  • rnd : Research and Development Expense : [Income Statement] A component of [OpEx] representing the aggregate costs incurred in a planned search or critical investigation aimed at discovery of new knowledge with the hope that such knowledge will be useful in developing a new product or service.
  • sbcomp : Share Based Compensation : [Cash Flow Statement] A component of [NCFO] representing the total amount of noncash; equity-based employee remuneration. This may include the value of stock or unit options; amortization of restricted stock or units; and adjustment for officers' compensation. As noncash; this element is an add back when calculating net cash generated by operating activities using the indirect method.
  • sgna : Selling General and Administrative Expense : [Income Statement] A component of [OpEx] representing the aggregate total costs related to selling a firm's product and services; as well as all other general and administrative expenses. Direct selling expenses (for example; credit; warranty; and advertising) are expenses that can be directly linked to the sale of specific products. Indirect selling expenses are expenses that cannot be directly linked to the sale of specific products; for example telephone expenses; Internet; and postal charges. General and administrative expenses include salaries of non-sales personnel; rent; utilities; communication; etc.
  • sharefactor : Share Factor : [Entity] Share factor is a multiplicand in the calculation of [MarketCap] and is used to adjust for: American Depository Receipts (ADRs) that represent more or less than 1 underlying share; and; companies which have different earnings share for different share classes (eg Berkshire Hathaway - BRK.B).
  • sharesbas : Shares (Basic) : [Entity] The number of shares or other units outstanding of the entity's capital or common stock or other ownership interests; as stated on the cover of related periodic report (10-K/10-Q); after adjustment for stock splits.
  • shareswa : Weighted Average Shares : [Income Statement] The weighted average number of shares or units issued and outstanding that are used by the company to calculate [EPS]; determined based on the timing of issuance of shares or units in the period.
  • shareswadil : Weighted Average Shares Diluted : [Income Statement] The weighted average number of shares or units issued and outstanding that are used by the company to calculate [EPSDil]; determined based on the timing of issuance of shares or units in the period.
  • sps : Sales per Share : [Metrics] Sales per Share measures the ratio between [RevenueUSD] and [SharesWA] as adjusted by [ShareFactor].
  • tangibles : Tangible Asset Value : [Metrics] The value of tangible assets calculated as the difference between [Assets] and [Intangibles].
  • taxassets : Tax Assets : [Balance Sheet] A component of [Assets] representing tax assets and receivables.
  • taxexp : Income Tax Expense : [Income Statement] Amount of current income tax expense (benefit) and deferred income tax expense (benefit) pertaining to continuing operations.
  • taxliabilities : Tax Liabilities : [Balance Sheet] A component of [Liabilities] representing outstanding tax liabilities.
  • tbvps : Tangible Assets Book Value per Share : [Metrics] Measures the ratio between [Tangibles] and [SharesWA] as adjusted by [ShareFactor].
  • workingcapital : Working Capital : [Metrics] Working capital measures the difference between [AssetsC] and [LiabilitiesC].
  • roe : Return on Average Equity : [Metrics] Return on equity measures a corporation's profitability by calculating the amount of [NetIncCmn] returned as a percentage of [EquityAvg].
  • roa : Return on Average Assets : [Metrics] Return on assets measures how profitable a company is [NetIncCmn] relative to its total assets [AssetsAvg].
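
Several of the [Metrics] fields above are simple functions of the raw statement fields. The sketch below recomputes a few of them exactly as they are worded in the list; derived_metrics is a hypothetical helper, and the sample figures are made up for illustration and do not describe any real company.

```python
def derived_metrics(f):
    """Recompute a few [Metrics] fields from raw SF1-style fields in dict f."""
    gp = f["revenue"] - f["cor"]  # gross profit: revenue less cost of revenue
    return {
        "gp": gp,
        "grossmargin": gp / f["revenue"],
        "workingcapital": f["assetsc"] - f["liabilitiesc"],
        "currentratio": f["assetsc"] / f["liabilitiesc"],
        "tangibles": f["assets"] - f["intangibles"],
        "de": f["liabilities"] / f["equity"],
        "ev": f["marketcap"] + f["debtusd"] - f["cashnequsd"],
    }


# Made-up figures, all in the same currency unit.
sample = {
    "revenue": 1000.0, "cor": 600.0,
    "assetsc": 500.0, "liabilitiesc": 250.0,
    "assets": 2000.0, "intangibles": 300.0,
    "liabilities": 1200.0, "equity": 800.0,
    "marketcap": 5000.0, "debtusd": 700.0, "cashnequsd": 200.0,
}
metrics = derived_metrics(sample)
print(metrics)
```

Recomputing a metric this way is a quick sanity check on downloaded data, although small differences from the published values are possible because the vendor's exact conventions are not reproduced here.
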
Sharadar Equity Prices [SHARADAR/SEP] features
  • open : Open Price - Split Adjusted : The opening share price, adjusted for stock splits and stock dividends.
  • high : High Price - Split Adjusted : The high share price, adjusted for stock splits and stock dividends.
  • low : Low Price - Split Adjusted : The low share price, adjusted for stock splits and stock dividends.
  • close : Close Price - Split Adjusted : The closing share price, adjusted for stock splits and stock dividends.
  • volume : Volume - Split Adjusted : The traded volume, adjusted for stock splits and stock dividends.
Daily Metrics [SHARADAR/DAILY] features
  • ev : Enterprise Value - Daily : Enterprise value is a measure of the value of a business as a whole; calculated as [MarketCap] plus [DebtUSD] minus [CashnEqUSD]. [MarketCap] is calculated by us, and the remaining figures are sourced from the most recent SEC form 10 filings.
  • evebit : Enterprise Value over EBIT - Daily : Measures the ratio between [EV] and [EBITUSD]. EBITUSD is derived from the most recent SEC form 10 filings.
  • evebitda : Enterprise Value over EBITDA - Daily : Measures the ratio between [EV] and [EBITDAUSD]. EBITDAUSD is derived from the most recent SEC form 10 filings.
  • marketcap : Market Capitalization - Daily : Represents the product of [SharesBas], [Price] and [ShareFactor]. [SharesBas] is sourced from the most recent SEC form 10 filing.
  • pb : Price to Book Value - Daily : Measures the ratio between [MarketCap] and [EquityUSD]. [EquityUSD] is sourced from the most recent SEC form 10 filing.
  • pe : Price Earnings (Damodaran Method) - Daily : Measures the ratio between [MarketCap] and [NetIncCmnUSD]. [NetIncCmnUSD] is sourced from the most recent SEC form 10 filings.
  • ps : Price Sales (Damodaran Method) - Daily : Measures the ratio between [MarketCap] and [RevenueUSD]. [RevenueUSD] is sourced from the most recent SEC form 10 filings.
Sentiment Analysis and News Analytics [IFT/NSA] features
  • sentiment: a numeric measure of the bullishness / bearishness of news coverage of the stock.
  • sentiment_high: highest intraday sentiment scores.
  • sentiment_low: lowest intraday sentiment scores.
  • news_volume: the absolute number of news articles covering the stock.
  • news_buzz: a numeric measure of the change in coverage volume for the stock.

Factor Analysis Target Variables

The factor analysis is performed using alphalens and Pyfolio. These packages provide APIs for data processing and factor analysis over pre-defined periods. The functions and metrics used are listed below:

  • Cleaning and preparing data alphalens.utils.get_clean_factor_and_forward_returns: Formats the factor data, pricing data, and group mappings into a DataFrame that contains aligned MultiIndex indices of timestamp and asset. The returned data will be formatted to be suitable for Alphalens functions.
  • Cumulated factor return alphalens.performance.factor_returns: Builds cumulative returns from ‘period’ returns. This function simulates the cumulative effect that a series of gains or losses (the ‘returns’) has on an original amount of capital over a period of time.
  • Mean quantile return alphalens.performance.mean_return_by_quantile: Computes mean returns for factor quantiles across provided forward returns columns.
  • Factor Rank Autocorrelation alphalens.performance.factor_rank_autocorrelation: Computes autocorrelation of mean factor ranks in specified time spans. We must compare period to period factor ranks rather than factor values to account for systematic shifts in the factor values of all names or names within a group. This metric is useful for measuring the turnover of a factor. If the value of a factor for each name changes randomly from period to period, we’d expect an autocorrelation of 0.
  • Sharpe ratio sharpe_ratio: This function computes the annualized Sharpe ratio, which relates the return of an investment to its risk. The ratio is the average return earned in excess of the risk-free rate per unit of volatility (total risk). Here, volatility measures the fluctuation of the factor returns.
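Two of the metrics above can be illustrated with a minimal sketch in plain pandas and NumPy. This is not the alphalens implementation or the project's own sharpe_ratio function. The function names, the zero risk-free rate, the 252 periods per year, and the toy data are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def annualized_sharpe_ratio(factor_returns, periods_per_year=252):
    """Annualized Sharpe ratio of per-period returns.

    Assumption: the risk-free rate is taken as zero, so the excess
    return is simply the mean return.
    """
    return np.sqrt(periods_per_year) * factor_returns.mean() / factor_returns.std()


def rank_autocorrelation(factor, lag=1):
    """Correlation between each date's cross-sectional ranks and the
    ranks observed `lag` rows earlier (rows = dates, columns = assets)."""
    ranks = factor.rank(axis=1)
    values = {}
    for prev_date, date in zip(ranks.index[:-lag], ranks.index[lag:]):
        previous = ranks.loc[prev_date].to_numpy()
        current = ranks.loc[date].to_numpy()
        values[date] = np.corrcoef(previous, current)[0, 1]
    return pd.Series(values)


# Toy data (hypothetical): 3 dates, 4 assets.
dates = pd.date_range("2020-01-01", periods=3)
factor = pd.DataFrame(
    [[1.0, 2.0, 3.0, 4.0],      # baseline ordering
     [10.0, 20.0, 30.0, 40.0],  # values shift, ranks unchanged
     [4.0, 3.0, 2.0, 1.0]],     # ordering fully reversed
    index=dates,
    columns=["A", "B", "C", "D"],
)
autocorr = rank_autocorrelation(factor)
sharpe = annualized_sharpe_ratio(pd.Series([0.01, 0.03]))
```

Because ranks are compared rather than raw values, the second date counts as "no turnover" even though every factor value changed by a factor of ten.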

The Combined Alpha Vector

To get a single score for each stock, we have to combine the selected factors. This is an area where machine learning can be very helpful. In this context, the feature_weights module is implemented to give optimal weights to the selected alpha factors, resulting in the best combination.
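The combination step itself can be sketched as a weighted sum of standardized factors. The example below does not reproduce the feature_weights module: the weights are hypothetical placeholders standing in for whatever the machine learning step returns, and the helper names and toy factor values are assumptions.

```python
import numpy as np


def zscore(values):
    """Standardize one factor across stocks (cross-sectional z-score)."""
    return (values - values.mean()) / values.std()


def combine_factors(factors, weights):
    """Weighted sum of z-scored factor columns, one score per stock.

    factors: array of shape (n_stocks, n_factors)
    weights: one weight per factor (hypothetical here; in the project
             they would come from the feature_weights module)
    """
    standardized = np.column_stack([zscore(column) for column in factors.T])
    weights = np.asarray(weights, dtype=float)
    weights = weights / np.abs(weights).sum()  # normalize to unit gross weight
    return standardized @ weights


# Toy data (hypothetical): 3 stocks, 2 factors.
factors = np.array([
    [0.1, 3.0],
    [0.2, 1.0],
    [0.3, 2.0],
])
combined = combine_factors(factors, weights=[0.75, 0.25])
```

Standardizing each factor first keeps a factor with a large numeric scale from dominating the combined score.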

Risk Management

The predicted portfolio risk is measured using the risk_model module. The portfolio risk formula is $\sqrt{X^T (B F B^T + S) X}$, where:

  • $X$ is the portfolio weights
  • $B$ is the factor betas
  • $F$ is the factor covariance matrix
  • $S$ is the idiosyncratic variance matrix
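As a minimal NumPy sketch of this formula (not the risk_model module itself; the function name, array shapes, and numbers are assumptions for illustration):

```python
import numpy as np


def predicted_portfolio_risk(X, B, F, S):
    """Portfolio risk sqrt(X^T (B F B^T + S) X).

    X: portfolio weights, shape (n_assets,)
    B: factor betas, shape (n_assets, n_factors)
    F: factor covariance matrix, shape (n_factors, n_factors)
    S: idiosyncratic variance matrix, shape (n_assets, n_assets)
    """
    covariance = B @ F @ B.T + S
    return float(np.sqrt(X @ covariance @ X))


# Toy data (hypothetical): 2 assets, 1 factor, both assets with beta 1.
B = np.array([[1.0], [1.0]])
F = np.array([[0.04]])
S = np.diag([0.01, 0.01])

risk_long_short = predicted_portfolio_risk(np.array([0.5, -0.5]), B, F, S)
risk_long_only = predicted_portfolio_risk(np.array([0.5, 0.5]), B, F, S)
```

In this toy case the long-short weights cancel the common factor exposure, so only idiosyncratic variance remains, while the long-only weights carry the full factor risk.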

Optimization

Once the alpha model and the risk model are generated, we want to find a portfolio that trades as close as possible to the alpha model while limiting risk as measured by the risk_model module. The cvxpy package is used to implement the optimizer.

The CVXPY objective function is to maximize $\alpha^T x$, where x is the vector of portfolio weights and α is the alpha vector.

The optimization is subject to the following constraints:

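  • $r \leq risk_{\text{cap}}^2$
  • $B^T x \preceq factor_{\text{max}}$
  • $B^T x \succeq factor_{\text{min}}$
  • $x^T \mathbb{1} = 0$
  • $\|x\|_1 \leq 1$
  • $x \succeq weights_{\text{min}}$
  • $x \preceq weights_{\text{max}}$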

Where x is the vector of portfolio weights, B is the matrix of factor betas, and r is the portfolio risk calculated in the risk_model module.

The first constraint is that the predicted risk be less than some maximum limit. The second and third constraints are on the maximum and minimum portfolio factor exposures. The fourth constraint is the "market neutral" constraint: the sum of the weights must be zero. The fifth constraint is the leverage constraint: the sum of the absolute values of the weights must be less than or equal to 1.0. The last two constraints are minimum and maximum limits on individual holdings.