Venue analysis is an essential tool for fine-tuning liquidity access to maximize the performance of execution algorithms, and yet it remains one of the most challenging aspects of Transaction Cost Analysis (TCA). The US equities market is fragmented across sixteen exchanges and over thirty off-exchange venues. Comparing the performance of execution venues is often oversimplified to measuring markouts across fills executed in each location. This broad view can be misleading because it ignores the intent of the algorithm behind each fill and the order type used to place it.
Algorithms may utilize a venue for providing liquidity, taking liquidity, or seeking blocks, and this intent must be considered while evaluating venue performance. A venue may be good for liquidity taking but poor for liquidity providing, and vice versa. In this paper, we explore some of the intricacies of venue analysis, discuss the importance of algorithmic intent, and provide examples to illustrate how biases can skew our understanding of venue performance. We also explain how BestEx Research TCA’s new Venue Analysis Report allows users to evaluate venue performance more accurately and use it to fine-tune algorithmic execution performance.
Markouts are used to evaluate fill quality at the execution level. They are calculated by taking the average difference between the price at the time of each individual fill and the price of the same instrument shortly after the trade (e.g., 1 minute after), adjusting for the side of the child order. After a buy order is filled, if prices go up on average, markouts will be positive; if prices go down on average, markouts will be negative.
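As a rough illustration of that calculation, the sketch below computes an average side-adjusted markout from a hypothetical table of fills; the column names, sign convention, and one-minute horizon are assumptions for illustration rather than a description of any particular TCA implementation.

```python
import pandas as pd

def avg_markout_bps(fills: pd.DataFrame, start_col: str, end_col: str) -> float:
    """Average side-adjusted markout in basis points.

    Assumed columns:
      side       +1 for buy child orders, -1 for sell child orders
      start_col  reference price at fill time (execution price or midpoint)
      end_col    price shortly after the fill (e.g., the midpoint 1 minute later)

    Positive means prices moved in the trade's direction after the fill
    (up after buys, down after sells); for passive fills, a persistently
    negative value indicates adverse selection.
    """
    per_fill = fills["side"] * (fills[end_col] - fills[start_col]) / fills[start_col]
    return per_fill.mean() * 1e4  # express as basis points
```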
Markouts help traders measure a phenomenon called adverse selection. Adverse selection occurs when one counterparty in a transaction opportunistically times their trades, leaving the other with worse fill performance on average. For example, when a buyer and a seller are present, and the seller knows the price is likely to go down (because their own selling of a large order will push it down, or because they are acting on a short-term pricing signal, for example), the transaction price will represent a better deal for the seller than for the buyer. The buyer’s experience will be that they purchased at a higher price than is now available. It is important to note that whether this happens in one transaction is not the concern when it comes to institutional trading; markouts measure whether it is happening consistently. If a trader receives poor pricing on average compared to the prices available after their trades, the performance of the parent order made up of those smaller transactions is likely to be poor.
Generally speaking, child orders that are providing liquidity (e.g., orders resting at the near side or midpoint of the National Best Bid and Offer, the NBBO) experience adverse selection costs regardless of the venue; prices are expected to fall on average after a passive buy order is executed and expected to rise after a passive sell order is executed. The corresponding markouts of such orders reveal the relative toxicity of liquidity takers among venues.
Markouts can be measured from execution price (at fill time) to midpoint price (post-trade) or from midpoint price (at fill time) to midpoint price (post-trade). If measured from the execution price, markouts incorporate not just the adverse selection but the net effect of spread capture and adverse selection.
Depending on the situation, execution markouts or mid-to-mid markouts may be more appropriate. For example, when trying to understand the toxicity of liquidity-taking order flow across venues, mid-to-mid markouts are simple to understand. But sometimes we want to compare tactics rather than venues; for example, is it better to place orders at the midpoint or at the near side? Orders placed at the near side will likely incur higher adverse selection than midpoint orders, but that extra adverse selection may be offset by the spread captured. Execution markouts are more effective here because they capture and compare the corresponding net effects.
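As a toy illustration of the two flavors (all prices hypothetical), the sketch below computes both; they differ only in the starting reference price, and the gap between them reflects the spread captured or paid at the fill.

```python
import pandas as pd

# Hypothetical fills: a passive buy filled at the bid and a passive sell
# filled at the offer (all prices illustrative).
fills = pd.DataFrame({
    "side":        [ 1,     -1    ],   # +1 buy, -1 sell
    "exec_price":  [ 9.99,  10.02 ],   # fill price
    "mid_at_fill": [10.00,  10.01 ],   # NBBO midpoint at fill time
    "mid_post_1m": [ 9.98,  10.02 ],   # midpoint one minute later
})

side, post = fills["side"], fills["mid_post_1m"]
mid_to_mid = (side * (post - fills["mid_at_fill"]) / fills["mid_at_fill"]).mean() * 1e4
execution  = (side * (post - fills["exec_price"])  / fills["exec_price"]).mean()  * 1e4

# Mid-to-mid isolates adverse selection; the execution markout folds in the
# half-spread captured at the fill, i.e., the net effect discussed above.
print(f"mid-to-mid: {mid_to_mid:.1f} bps  execution: {execution:.1f} bps")
```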
Another critical aspect of markouts is their unit of measurement. Adverse selection is generally proportional to the bid-offer spread of an instrument. For example, the expected post-trade price movement for a stock with a bid-offer spread of 10 basis points is likely to be much higher than that of a stock with a bid-offer spread of 1 basis point. Since markouts are typically aggregated across stocks and venues, viewing markouts in cents or basis points can be misleading. Rather, viewing markouts as a percent of spread is preferred in most cases[1].
For example, in Table 1 below, Venue 2 may appear more toxic if the unit of measurement is basis points; however, when viewed relative to spread, Venue 1 appears to have more toxic flow.
Table 1. Illustrates example behavior in which viewing markouts in terms of basis points only–rather than relative to the spread of the instruments traded–can be misleading.
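To make the unit issue concrete, here is a small sketch with made-up numbers (they are not the values behind Table 1, only an illustration of the same effect): dividing each venue's average markout by the average quoted spread of the names it trades puts the venues on a comparable footing.

```python
# Hypothetical venue-level averages (illustrative numbers only).
venues = {
    # venue: (avg adverse-selection markout in bps, avg quoted spread in bps)
    "Venue 1": (0.6, 1.0),    # trades mostly tight-spread names
    "Venue 2": (2.0, 10.0),   # trades mostly wide-spread names
}

for name, (markout_bps, spread_bps) in venues.items():
    pct_of_spread = 100 * markout_bps / spread_bps
    print(f"{name}: {markout_bps:.1f} bps = {pct_of_spread:.0f}% of spread")

# Venue 2 looks worse in basis points (2.0 vs 0.6), but relative to the
# spreads actually traded, Venue 1's flow is more toxic (60% vs 20%).
```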
Finally, the time period over which markouts are measured matters a great deal. Markouts measure how much price movement results from the fill just received. As such, the duration of the calculation should depend on how long the short-term price impact, or the short-term signal, behind the counterparty’s trade is expected to last. Since there is no way to know this, a common solution is to use as long a time horizon as possible and note the point at which the rate of change in markouts becomes trivial.
However, if we measure post-trade price movement over too long a period, it will introduce noise in our markout data due to price volatility over that period. The longer the period, the more data will be required to be confident in these statistics. Alternatively, if we measure over too short a period, we may not capture the entire price movement and hence underestimate the associated adverse selection cost. Liquidity is also a consideration; the lower the liquidity of an instrument, the longer a period is required for proper measurement of the effect.
Many venues self-report their markouts and, not surprisingly, they often choose a very short time horizon (i.e., milliseconds), because a short horizon tends to underestimate the overall adverse selection faced by the party providing liquidity. Further, many do not disclose the period over which the markouts were measured, as if to imply it has no importance. In reality, adverse selection plays out over a much longer period than milliseconds.
In order to draw appropriate conclusions, we must strike a balance in the measurement time horizon. Our TCA currently publishes both 1-minute and 5-minute markouts with their standard errors[2]. Typically, we have enough observations to generate statistical significance among 1-minute markout results. And over a large enough sample, we find that the differences between 1-minute and 5-minute markouts are negligible, suggesting that most of the adverse selection has been realized over the 1-minute period.
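A minimal sketch of that horizon check is shown below, using purely synthetic per-fill markouts (expressed as a percent of spread); the column names and distributions are assumptions chosen only to illustrate comparing two horizons with their standard errors.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 50_000
# Synthetic per-fill, side-adjusted markouts as a percent of spread: the
# 5-minute horizon adds little extra mean but considerably more noise.
fills = pd.DataFrame({
    "markout_1m": rng.normal(loc=-15.0, scale=60.0, size=n),
    "markout_5m": rng.normal(loc=-16.0, scale=110.0, size=n),
})

def mean_and_se(x: pd.Series) -> tuple[float, float]:
    """Mean markout and its standard error across fills."""
    return x.mean(), x.std(ddof=1) / np.sqrt(len(x))

for horizon in ("markout_1m", "markout_5m"):
    mean, se = mean_and_se(fills[horizon])
    print(f"{horizon}: {mean:.1f}% of spread +/- {1.96 * se:.1f}")

# If the two horizons agree within sampling error, most of the adverse
# selection has been realized within the first minute; the longer horizon
# mostly adds noise (note its wider standard error).
```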
While markouts are commonly used to measure adverse selection, it is critical to view them through the lens of the algorithm's intent, also commonly referred to as its “tactic”. There are many tactics used by algorithms for order placement; broadly, they can be divided into the following styles (a minimal classification sketch follows this list):
● Liquidity-providing, where the algorithm aims to capture spread
● Liquidity-taking, where the algorithm pays spread to take liquidity for achieving immediate execution
● Midpoint liquidity-seeking, where the algorithm receives the midpoint price, neither paying nor capturing spread
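As a minimal sketch of how fills might be bucketed into these three styles before any markout aggregation, the snippet below infers a coarse label from two assumed fill attributes; in practice the algorithm's own routing records the intent directly, so this inference is only an approximation.

```python
from enum import Enum

class Tactic(Enum):
    LIQUIDITY_PROVIDING = "liquidity-providing"
    LIQUIDITY_TAKING = "liquidity-taking"
    MIDPOINT_SEEKING = "midpoint liquidity-seeking"

def classify_fill(filled_at_midpoint: bool, removed_liquidity: bool) -> Tactic:
    """Coarse tactic label from two assumed fill attributes.

    Midpoint fills are bucketed as midpoint liquidity-seeking regardless of
    whether they added or removed liquidity; otherwise the add/remove flag
    separates providing from taking.
    """
    if filled_at_midpoint:
        return Tactic.MIDPOINT_SEEKING
    return Tactic.LIQUIDITY_TAKING if removed_liquidity else Tactic.LIQUIDITY_PROVIDING

# Example: a fill that rested at the near side and added liquidity.
print(classify_fill(filled_at_midpoint=False, removed_liquidity=False).value)
```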
Viewing markouts at a venue level while ignoring algorithm tactics creates several issues. First, markouts do not always represent adverse selection. Both liquidity-providing and midpoint liquidity-seeking tactics are susceptible to adverse selection because execution is uncertain; whether you receive a fill depends on a counterparty being present. The more informed the counterparty, the more these tactics will be adversely selected. Liquidity-taking tactics, however, do not face adverse selection because they have certainty of execution. For liquidity-taking tactics, markouts represent price impact rather than adverse selection costs as discussed in the next section.
Viewing markouts across venues is misleading for a second reason as well. Even the tactics that do face adverse selection (e.g., liquidity-providing or midpoint liquidity-seeking) interact with different subsets of counterparties within a given venue. As a result, the adverse selection faced by each can differ dramatically, even within the same venue[3].
The most challenging aspect of venue or fill analysis is differentiating between adverse selection and price impact, because the same price movement can be considered good from an adverse selection perspective and bad from a price impact perspective. From the perspective of a buy order, prices rising after a fill mean the fill beat the subsequent price (favorable from an adverse selection standpoint), but they may also mean the order’s own trading pushed prices higher (unfavorable from a price impact standpoint).
The problem is that algorithms fall victim to both adverse selection cost and price impact. While not all tactics face adverse selection, all of them–regardless of whether they are passive, aggressive, or midpoint–create price impact. Even if an algorithm almost never takes liquidity, over the duration of the order it will likely have a positive price impact.
Figure 1 below illustrates the average price drift of an algorithm that takes liquidity less than 5% of the time. The chart shows the average intraday price impact measured from each parent order’s arrival at the market open, aggregated across more than 33,000 parent orders and six months of trading data. It is side-adjusted so that an upward price trajectory indicates prices rising for buy orders and falling for sell orders over the execution period. As shown, the algorithm still exhibits price impact throughout the day, even though it does not behave aggressively and no alpha was expected for the orders traded.
Figure 1. Illustrates the average price drift over the full execution duration of a sample of orders. The orders are side-adjusted such that upward movement indicates prices increasing during a buy order. For this sample of more than 33,000 parent orders collected over six months, orders take liquidity less than 5% of the time yet exhibit price impact throughout the day, even though no alpha is expected for them. For the same orders, markouts are negative on average: price impact is present at the parent-order level even while fill-level markouts indicate adverse selection.
It is quite confusing that, for the same algorithm, markouts are negative on average (i.e., prices go down after a fill on a buy order) while there is a clear trend of price impact on average in Figure 1. In Table 2 below, markouts for the same parent orders–including all child order tactics–are summarized. The aggregate markout across tactics is -1.93; this value seems to indicate adverse selection. Looking at the markouts by tactic, we see that, with the exception of liquidity-taking trades, all other tactics have negative mid-to-mid markouts, i.e., prices fall after our buys for liquidity-providing and midpoint liquidity-seeking trades.
Table 2. Illustrates the distinctly different qualities of markouts across order placement tactics. Liquidity-taking orders often have positive markouts, which can bias the view of toxicity if all fills in a venue are aggregated.
But if most fills have negative markouts, where is the price impact coming from? Price impact occurs not at the time of the fill but at the time the order information becomes public. For example, for liquidity-providing orders, that information becomes public as soon as they are sent. For close orders, impact occurs when the imbalances become public[4]. For liquidity-taking orders, since order placement and fill happen at the same time, the impact occurs at the time of the fill.
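One way to see this in data (offered as a hedged illustration, not a method described in this paper) is to anchor the side-adjusted price move at order placement rather than at the fill: for liquidity-providing child orders, the placement-anchored move picks up the impact of displaying the order, while the fill-anchored markout reflects the counterparty’s timing. The column names and values below are assumptions.

```python
import pandas as pd

# Hypothetical passive child orders: midpoint at placement, at the fill,
# and one minute after the fill (all values illustrative).
passive = pd.DataFrame({
    "side":         [ 1,     1,     -1   ],
    "mid_at_place": [10.00, 20.00, 50.10],
    "mid_at_fill":  [10.02, 20.03, 50.05],
    "mid_post_1m":  [10.01, 20.02, 50.07],
})

def side_adjusted_move_bps(df: pd.DataFrame, start_col: str) -> float:
    """Average side-adjusted move in bps from start_col to the post-fill mid."""
    move = df["side"] * (df["mid_post_1m"] - df[start_col]) / df[start_col]
    return move.mean() * 1e4

placement_anchored = side_adjusted_move_bps(passive, "mid_at_place")  # includes impact of posting
fill_anchored      = side_adjusted_move_bps(passive, "mid_at_fill")   # the usual markout
# Positive placement-anchored drift alongside a negative fill markout is the
# pattern described above: impact at posting, adverse selection at the fill.
print(f"from placement: {placement_anchored:+.1f} bps  from fill: {fill_anchored:+.1f} bps")
```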
Because liquidity-taking orders realize their impact at the moment of the fill, a positive markout is not a good thing for liquidity-taking tactics; it represents price impact rather than a “good outcome” for adverse selection. This means that the same metric can represent adverse selection for one tactic and price impact for another. Table 3 below summarizes the important distinctions between liquidity-providing and liquidity-taking child orders.
Table 3. Summarizes the characteristics of liquidity-taking and liquidity-providing child orders. The two tactics have distinct goals and measures of success.
In the sections above, we have established that markouts are appropriate measures of adverse selection as long as they are not applied to liquidity-taking tactics. Child order performance must therefore be broken down by algorithm intent (providing vs. taking), but taking the order type into account along with intent is just as important.
To illustrate the importance of order type, consider midpoint liquidity-seeking orders; there are several ways to seek this type of liquidity. An algorithm can rest at midpoint in a dark pool, it can ping (send an IOC order) at midpoint in a dark pool, or it can interact with midpoint liquidity using conditional orders. Adverse selection costs associated with each order type are different because they interact with different types of order flow. Let us begin by considering midpoint resting and midpoint IOC orders. Table 4 below summarizes the types of interactions orders of these two types will have.
Table 4. Summarizes the interaction of midpoint IOC and midpoint resting orders in dark pools. The two order types interact with overlapping but distinct subsets of counterparties in the dark pool.
While midpoint IOC orders will interact with only resting midpoint flow, resting midpoint orders interact with resting midpoint, midpoint IOC, and far side IOC orders sent to dark pools. As a result, when calculating markouts for resting midpoint orders, the result will be a function of the toxicity of not only other counterparties resting at midpoint, but also midpoint IOC and far side IOC orders that this dark pool or exchange attracts.
While the intent of an algorithm is the same whether it sends midpoint IOC orders or resting midpoint orders, the difference in order type effectively creates two different dark pools, because the two order types experience very different toxicity.
Measuring the markouts of our midpoint IOC order flow independently tells us about the toxicity of the midpoint resting order flow in each dark pool. The same is not true for resting midpoint orders; markouts of resting flow reflect the aggregate toxicity of both the resting and the IOC order flow. Because one would expect the toxicity of IOCs to be significantly higher than that of resting order flow, it is prudent to separate them for measurement purposes. But how can we measure the toxicity of IOC flow alone? To achieve this goal, our algorithms incorporate a tactic that rests on the near side in dark pools and is likely to interact only with IOC flow, as summarized in Table 5 below.
Table 5. Summarizes the interaction of midpoint IOC, midpoint resting, and near-side resting orders in dark pools. Midpoint IOC and near-side resting orders each carve out a specific group of counterparties, while midpoint resting orders interact with multiple groups.
The best way to measure the toxicity of a dark pool is to separate the markouts for midpoint IOC orders and the markouts for orders resting at the near side. At BestEx Research, we call the three tactics shown in Table 5 “Dark Mid Ping”, “Dark Mid Post”, and “Dark Near Post”, respectively. Table 6 below highlights the aggregate markouts for a sample of more than 981,000 Dark Mid Ping and Dark Near Post child orders placed from November 2023 to April 2024.
Table 6. Reports markouts as a percent of spread from a sample of more than 981,000 Dark Mid Ping and Dark Near Post child orders from November 2023 to April 2024. As a percent of spread, Dark Near Post orders experienced much greater adverse selection than Dark Mid Ping orders.
Though the two tactics access the same set of dark pools, their markouts clearly show that orders posted at the near side experience far higher toxicity than the midpoint IOC orders. The difference shown in the table underscores the importance of separating toxicity measurements not only by pool but also by the order type used to place each order.
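A hedged sketch of how such a comparison might be assembled is shown below, assuming a table of dark-pool fills already labeled with the tactic names above and carrying per-fill markouts as a percent of spread; the venue names, column names, and values are illustrative, not the data behind Table 6.

```python
import pandas as pd

# Illustrative child-order fills (synthetic values).
fills = pd.DataFrame({
    "venue":  ["Pool A", "Pool A", "Pool B", "Pool B", "Pool A", "Pool B"],
    "tactic": ["Dark Mid Ping", "Dark Near Post", "Dark Mid Ping",
               "Dark Near Post", "Dark Near Post", "Dark Mid Ping"],
    "markout_pct_spread": [-8.0, -35.0, -12.0, -55.0, -40.0, -10.0],
})

# Average markout by venue and tactic; more negative = more adverse selection.
by_venue_tactic = (
    fills.groupby(["venue", "tactic"])["markout_pct_spread"]
         .mean()
         .unstack("tactic")
)
print(by_venue_tactic)
# Ranking pools on the "Dark Mid Ping" column isolates the toxicity of each
# pool's resting midpoint flow; the "Dark Near Post" column isolates the
# toxicity of the IOC flow that pool attracts.
```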
There are other aspects outside of tactics and order types to consider when analyzing the quality of executions. We are continuing research on the factors below and will continue to share our findings with our clients as new information becomes available. In addition, when we find the factors to be significant in our empirical analysis, we incorporate them directly into our Venue Analysis reporting (described in detail in the next section) and optimize our algorithms appropriately. Covering the open questions regarding each of the factors listed below in detail is beyond the scope of this paper, but this list represents some of the elements we are currently considering:
As readers can see in the sections above, a straightforward comparison of markouts across venues can generate misleading results. It is critical to consider algorithm intent and order type in order to get a comprehensive view of the quality of an algorithm’s resulting fills. Once child orders have been partitioned appropriately, comparison across venues becomes much more meaningful. With that in mind, we have launched a new Venue Analysis Report so that our clients can evaluate fill quality most effectively.
Based on the philosophy of venue analysis detailed above and the dimensions we describe for appropriate analysis of fill quality, we have created a new Venue Analysis Report to showcase the measures of success for each fill type and deeply evaluate the quality of child order executions. This report is not a replacement but a supplement to the existing analyses we offer, creating new views specific to each tactic and order type described above. Each has its own tab in the report, addressing the relevant statistics and breakdowns specific to that tactic.
We include markouts in basis points, cents per share, and as a percent of spread to handle each of the use cases above appropriately.
The report contains an execution summary tab with breakdowns of venue usage and fees as well as a performance summary tab with all executions for a traditional view of child order performance including markouts by venue. We also include tabs breaking down markouts across algo destinations and strategies. But the power of this report comes from analyzing each tactic’s relevant performance statistics driven by the unique goals and concerns detailed above. For each tactic, a tab in the report presents usage, spread capture, and markouts in a number of breakdowns for complete analysis. And we customize those views:
The BestEx Research approach to understanding fill quality is multifaceted, because we’ve seen firsthand how fill quality affects parent order performance. Venue analysis is more than a study of markouts to decide which venues to access; by considering the underlying algorithmic intent and order placement tactics, users gain a clearer understanding of venue performance and also how their child order fills contribute to parent order performance. Our new Venue Analysis Report aims to provide the tools and insights our customers need to make informed decisions about their trading.
[1] There could also be cases for which markouts should be considered in cents per share. For example, we may want to ascertain whether markout differences between venues are explained by their fee differences. Fees are presented in cents per share in US equity markets, so calculating markouts in the same units would be appropriate.
[2] We are currently working on a methodology that will measure markouts while adjusting for the liquidity of the instrument.
[3] For this reason, using a metric like markouts for liquidity-taking trades is not appropriate. However, other aspects about the venue should be considered when taking liquidity. For example, it is interesting to measure how much price improvement an order received over the visible NBBO; if a venue generally has more price improvement–more hidden orders posted within the NBBO–it may be a better venue for liquidity taking.
[4] For Market On Close (MOC) orders, the impact actually occurs at the time the imbalance is known.
At BestEx Research, we care how you fill. We know from experience that systematic, quantitative decision-making around algorithm design contributes to globally optimal execution and results in significantly reduced execution costs. Reach out to us with questions at research@bestexresearch.com or learn more about us at bestexresearch.com.
This research paper reflects the views and opinions of BestEx Research Group LLC. It does not constitute legal, tax, investment, financial, or other professional advice. Nothing contained herein constitutes a solicitation, recommendation, endorsement, or offer to buy or sell securities, futures, or other financial instruments or to engage in financial strategies which may include algorithms. This material may not be a comprehensive or complete statement of the matters discussed herein. Nothing in this paper is a guarantee or assurance that any particular algorithmic solution fits you, or that you will benefit from it. You should consider whether our research is suitable for your particular circumstances and needs and, if appropriate, seek professional advice.
©2024 BestEx Research. All rights reserved.