Sector ETF Performance Evaluations – Across Volatility, Timing Abilities, and Alpha
- patricktscott11
- Jul 17

*ETF prices range from March 2015 to April 2025.
This study evaluates the performance of two sector-based ETFs: the Energy Select Sector SPDR Fund (XLE) and the iShares U.S. Aerospace & Defense ETF (ITA). The analysis focuses on each fund’s historical returns and on how well those returns compensated investors for the risk taken. It uses several quantitative measures: the Sharpe ratio, the Treynor ratio, CAPM expected returns, Jensen’s alpha, and Treynor-Mazuy market-timing regressions. These metrics compare the two funds in terms of risk-adjusted return, market-timing skill, and sensitivity to outlier months, with a secondary focus on fund strategy and style.
The results indicate that both funds underperformed relative to their CAPM expected returns, with performance appearing to be driven by sector trends and volatility rather than management skill. While both XLE and ITA underperformed, ITA exhibited more stable results. This can be attributed to the defense sector’s lower volatility, which stems from reliance on relatively stable government contracts rather than on swings in global commodity prices.
Fund Overviews and Benchmark Rationale
XLE: XLE provides exposure to the U.S. energy sector by tracking the Energy Select Sector Index. Its holdings consist of major oil and gas producers and exporters along with equipment companies. The chosen benchmarks are SPY for broad-market comparison and VDE (Vanguard Energy ETF) for a sector-based comparison.
ITA: The iShares U.S. Aerospace & Defense ETF is a fund composed of U.S. aerospace, defense, and security sector equities. It is likewise measured against the broad-market SPY and a sector-based ETF, in this case XAR (SPDR S&P Aerospace & Defense ETF). Monthly Treasury bill rates were used as the risk-free rate for all ETF metrics.
Fund Style & Structure Differences
ITA and XAR: While both ITA and XAR track the U.S. aerospace and defense sector, their structures differ, and this difference drives differing risk-adjusted returns and systematic risk exposures. ITA primarily uses a market-cap-weighted approach, which produces heavy concentrations in established, industry-dominating companies such as Lockheed Martin, Boeing, and Raytheon. XAR follows an equal-weighted strategy, distributing exposure more evenly. XAR also has greater exposure to fast-growing mid-cap and small-cap defense stocks, holding a 40/40/20 mix, which contributed to its greater excess returns over the period from March 2015 to April 2025.
XLE and VDE: Both XLE and VDE use a market-cap-weighted approach, but their holdings differ slightly. The largest positions in both funds are ExxonMobil, Chevron, and ConocoPhillips; however, VDE is slightly more diversified, with its top 10 holdings making up only 64% of the portfolio, whereas XLE’s top 10 account for more than 73%. VDE holds 119 equities to XLE’s 26, giving VDE greater exposure to small- and mid-cap U.S. energy companies.
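To make the weighting difference concrete, here is a minimal sketch of how the same basket of stocks ends up with very different exposures under the two schemes. The company names and market caps are hypothetical placeholders, not either fund’s actual holdings, and the helper names `cap_weights` and `equal_weights` are invented for illustration.

```python
def cap_weights(market_caps):
    """Weight each holding in proportion to its market capitalization."""
    total = sum(market_caps.values())
    return {name: cap / total for name, cap in market_caps.items()}


def equal_weights(market_caps):
    """Give every holding the same weight, regardless of size."""
    n = len(market_caps)
    return {name: 1.0 / n for name in market_caps}


# Hypothetical market caps in $bn, for illustration only.
caps = {"LargeCo A": 150.0, "LargeCo B": 100.0, "MidCo": 20.0, "SmallCo": 5.0}

for name in caps:
    print(f"{name:10s} cap-weighted {cap_weights(caps)[name]:.3f}   "
          f"equal-weighted {equal_weights(caps)[name]:.3f}")
```

Under cap weighting the two largest names dominate the portfolio, while equal weighting shifts exposure toward the smaller companies.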
Risk Adjusted Return Performance Metrics
ITA: Using Excel’s stock history function, monthly returns for ITA, SPY, and XAR were compiled over the 10-year period. The risk-free rate was built from annualized T-bill rates, divided down to monthly rates. This allowed excess returns to be calculated for each ETF, from which a monthly Sharpe ratio was derived. ITA’s monthly Sharpe ratio of .10017 indicates that for each unit of risk, the fund delivered about .10 units of return above the risk-free rate. That was below SPY’s .14806, which could be expected given SPY’s more diversified profile. ITA delivered a modest positive Sharpe ratio but failed to beat XAR’s .1158. This difference is most notably due to XAR’s equal-weighted structure.
ITA’s and XAR’s Jensen’s alphas were -0.00065 and -0.000015, respectively. These negative alphas indicate that each fund underperformed its CAPM expected return after adjusting for beta. Although the funds underachieved relative to their systematic risk exposure, the deviations were quite small; ineffective security selection or mispricing may have contributed. ITA’s Treynor ratio was .00598 against a CAPM expected return of .008336, showing the fund’s inefficiency in converting market risk into excess returns. XAR’s Treynor ratio was .006598 against a CAPM expected return of .008817; XAR’s equal-weighted strategy, and thus its greater exposure to mid- and small-cap A&D equities, contributed to greater excess returns during the study period. XAR delivered the greater risk-adjusted returns despite having the higher beta, 1.09 versus ITA’s 1.02.
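The ITA and XAR metrics above all follow from two monthly series: the fund’s excess returns and the market’s excess returns. The sketch below shows one way the calculations could be laid out. It is not the spreadsheet used in this study. It assumes the annualized T-bill rate is converted to a monthly rate by dividing by 12, that the Sharpe ratio uses the sample standard deviation, and that CAPM expected return and alpha are expressed in excess-return terms. All function names and the sample figures are illustrative.

```python
from statistics import mean, stdev


def excess_returns(returns, annual_rf_rates):
    """Subtract a monthly risk-free rate (annualized rate / 12, an assumption)."""
    return [r - rf / 12.0 for r, rf in zip(returns, annual_rf_rates)]


def sharpe_ratio(fund_excess):
    """Average excess return per unit of total risk (sample SD)."""
    return mean(fund_excess) / stdev(fund_excess)


def beta(fund_excess, market_excess):
    """Slope of fund excess returns on market excess returns."""
    mf, mm = mean(fund_excess), mean(market_excess)
    cov = sum((f - mf) * (m - mm) for f, m in zip(fund_excess, market_excess))
    var = sum((m - mm) ** 2 for m in market_excess)
    return cov / var


def treynor_ratio(fund_excess, market_excess):
    """Average excess return per unit of systematic risk (beta)."""
    return mean(fund_excess) / beta(fund_excess, market_excess)


def jensens_alpha(fund_excess, market_excess):
    """Average excess return minus the CAPM-expected excess return."""
    expected = beta(fund_excess, market_excess) * mean(market_excess)
    return mean(fund_excess) - expected


# Made-up monthly excess returns, for illustration only.
market = [0.01, -0.02, 0.03, 0.02]
fund = [0.014, -0.031, 0.044, 0.029]  # equals 1.5 * market - 0.001 each month

print(beta(fund, market), treynor_ratio(fund, market), jensens_alpha(fund, market))
```

In this toy series the fund is constructed with a beta of 1.5 and a monthly alpha of -0.001, so the functions can be checked against known values before being pointed at real price data.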
XLE: XLE’s average excess return divided by the standard deviation (SD) of its excess returns yielded a Sharpe ratio of -.01413, indicating a return below the risk-free rate over the period. VDE yielded a Sharpe ratio of -.01492. Both funds had negative average monthly excess returns, and both ratios reflect minimal returns combined with a high SD due to higher volatility. XLE’s excess-return SD was nearly double that of SPY, 0.08842 compared to 0.0446. XLE and VDE also failed to compensate investors for their systematic risk, with Treynor ratios of -.00101 and -.00107, respectively. Their Jensen’s alphas of -.0094 and -.0098 show that they fell short of their CAPM expected returns of .0097 for XLE and .0100 for VDE.
Because XLE’s and VDE’s Sharpe ratios, Treynor ratios, Jensen’s alphas, and betas were so similar, it is difficult to prefer one fund’s holdings and style over the other’s for this period. The funds performed similarly despite their different exposures, both suffering mainly from sector-based volatility tied to the decline in oil prices after 2014 and from some capital flight as ESG regulation expanded. Neither fund provided excess return for its level of risk (or much return at all, for that matter); an investor’s choice between them should therefore rest not on how XLE and VDE manage risk but on the investor’s own risk preference, as informed by projected CAPM returns and each fund’s holdings.
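Since the Sharpe ratio is average excess return divided by SD, and the Treynor ratio is average excess return divided by beta, the reported XLE figures imply an average monthly excess return and a beta that can be backed out as a quick consistency check. The sketch below does that arithmetic. It uses only the Sharpe and Treynor ratios quoted above and takes the SD as a parameter; the function names are illustrative.

```python
def implied_mean_excess(sharpe, sd):
    """Sharpe = mean excess / SD, so mean excess = Sharpe * SD."""
    return sharpe * sd


def implied_beta(mean_excess, treynor):
    """Treynor = mean excess / beta, so beta = mean excess / Treynor."""
    return mean_excess / treynor


def xle_check(sd):
    """Back out XLE's implied mean excess return and beta for a given SD."""
    mean_excess = implied_mean_excess(-0.01413, sd)   # reported XLE Sharpe
    return mean_excess, implied_beta(mean_excess, -0.00101)  # reported Treynor


# Example: an SD roughly double SPY's reported 0.0446 (illustrative input).
print(xle_check(2 * 0.0446))
```

A monthly SD near twice SPY’s gives an implied beta in the neighborhood of 1.2, which is a plausible value for an energy sector fund.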


Best & Worst Month Sensitivities
ITA: To assess the sensitivity of ITA’s Sharpe ratio to outlier months, the best and worst monthly returns were each removed in turn and the ratio recalculated. ITA’s original Sharpe ratio of .10017 fell to .0977 with the best month removed and rose to .1688 with the worst month removed. The slight movement with the best month removed indicates the fund was not overly reliant on a single monthly gain. The more drastic rise with the worst month removed indicates that a single negative outlier weighed heavily on the fund’s average risk-adjusted performance. ITA therefore shows fair stability, and its underperformance can be attributed to a concentrated episode of volatility rather than persistent downside risk.
XLE: Removing XLE’s worst month raised its Sharpe ratio from -.01413 to +.0309, a measurable increase that shows how heavily that single month weighed on the ETF’s performance. Conversely, removing the best month dropped the ratio further to -.0338. This asymmetric outcome shows how reliant XLE was on its strongest months to offset losses from its weakest ones. That is a red flag for risk-averse investors, as the fund is prone to volatile setbacks and gains.
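The best-month and worst-month test can be sketched as a drop-one recalculation. This is an illustration rather than the exact procedure used here: it assumes the extreme month is identified by the fund’s excess return and that the Sharpe ratio uses the sample standard deviation. The function names and sample data are made up.

```python
from statistics import mean, stdev


def sharpe(excess):
    """Average excess return divided by its sample SD."""
    return mean(excess) / stdev(excess)


def sharpe_without(excess, which):
    """Recompute the Sharpe ratio after dropping the 'best' or 'worst' month."""
    target = max(excess) if which == "best" else min(excess)
    idx = excess.index(target)  # first occurrence if there are ties
    return sharpe(excess[:idx] + excess[idx + 1:])


# Made-up monthly excess returns containing one large loss.
sample = [0.019, -0.011, 0.029, -0.001, -0.08]

print("original     ", round(sharpe(sample), 4))
print("best removed ", round(sharpe_without(sample, "best"), 4))
print("worst removed", round(sharpe_without(sample, "worst"), 4))
```

A large gap between the original ratio and the worst-removed ratio points to a single damaging month, while a large gap on the best-removed side points to reliance on one strong month.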


Timing Abilities
ITA: Under the Treynor-Mazuy model, ITA presented the more compelling evidence of poor market timing, with a more negative B2 coefficient of -2.31. Its p-value of .058 is significant at the 10% level but falls just short of the conventional 5% threshold; slightly more compelling evidence is therefore needed before attributing poor market timing to fund management.
XLE: XLE’s Treynor-Mazuy regression produced less compelling evidence of poor market timing. The B2 coefficient was -.96, with a statistically insignificant p-value of .6455. Much more compelling evidence would therefore be needed to conclude that the fund has a history of mistimed market exposure.
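For reference, the Treynor-Mazuy model regresses fund excess returns on market excess returns and their square; a negative B2 on the squared term suggests the fund’s market exposure tended to move the wrong way. The sketch below estimates the three coefficients by ordinary least squares using only the standard library. It is an illustration under those assumptions, not the regression tool used for the figures above, and it does not compute p-values, which would need a statistics package. The helper names and the sample data are made up.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def treynor_mazuy(fund_excess, market_excess):
    """Fit fund = alpha + b1 * mkt + b2 * mkt**2 via the normal equations."""
    rows = [[1.0, mkt, mkt * mkt] for mkt in market_excess]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, fund_excess)) for i in range(3)]
    alpha, b1, b2 = solve_linear(xtx, xty)
    return alpha, b1, b2


# Made-up data built with alpha = 0.001, b1 = 1.1, b2 = -2.0 and no noise.
market = [0.01, -0.02, 0.03, 0.02, -0.04, 0.05]
fund = [0.001 + 1.1 * m - 2.0 * m * m for m in market]

print(treynor_mazuy(fund, market))
```

Because the sample series is noise-free, the fit should recover the coefficients it was built from; with real monthly returns, the size of B2 needs to be read alongside its standard error.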

