Setup, Problem
We assume the FX market participant (FXMP hereafter, a client of Tradefeedr) trades with several liquidity providers (LPs) via an FX aggregator setup (see Defining FX Liquidity Metrics). The FXMP is a price taker. He observes FX quotes from all LPs in real time and selects the best quote to trade (highest bid to sell, lowest offer to buy). We assume for now that his trade size is smaller than the size attached to the quotes, so he does not have to hit several LPs at once (a so-called sweep). As a result of his order, the FXMP can be filled, partially filled or rejected.
The FXMP is looking for a way to measure the performance of the LPs in his stack. Ideally he would like a framework to minimize a chosen measure of cost.
The study assumes that the FXMP’s market data and trade messages (orders, fills, rejects) are loaded into the Tradefeedr platform.
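The selection rule described above can be sketched in a few lines of Python. This is a minimal illustration only: the `Quote` class, the `best_quote` helper, the LP labels and the prices are all hypothetical and are not part of the Tradefeedr platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quote:
    lp: str       # liquidity provider label (hypothetical)
    bid: float    # price at which the LP buys from the FXMP
    offer: float  # price at which the LP sells to the FXMP


def best_quote(quotes, side):
    """Pick the quote a price taker would hit for the given side."""
    if side == "buy":
        return min(quotes, key=lambda q: q.offer)  # lowest offer to buy
    if side == "sell":
        return max(quotes, key=lambda q: q.bid)    # highest bid to sell
    raise ValueError("side must be 'buy' or 'sell'")


# Made-up quote stack, for illustration only.
stack = [
    Quote("LP_A", 1.10000, 1.10012),
    Quote("LP_B", 1.10002, 1.10010),
    Quote("LP_C", 1.10003, 1.10013),
]

print(best_quote(stack, "buy").lp)   # LP_B has the lowest offer
print(best_quote(stack, "sell").lp)  # LP_C has the highest bid
```

The sketch ignores quote sizes, which is consistent with the assumption that the trade is smaller than the quoted size and no sweep is needed.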
The prices and quantities observed by FXMP look like those depicted in Figure 1.
Figure 1: Liquidity Stack
Market Impact Curves
We define market impact as the mid-price move after trade execution. Therefore, if the market moves upwards after a buy trade, then this trade has a positive market impact. At this stage the following concepts will be used interchangeably:
 Market Impact. Market impact is normally defined in the literature as the price move due to a change in the supply/demand balance. For example, a large buy order would have to drive prices up to find a new equilibrium between buyers and sellers.
 Short-Term Alpha. Some traders’ executions may not have any market impact in the sense of changing the supply/demand balance (they can be just too small). But if such a trader can forecast the future price move, his trades may look like they have a market impact.
Tracking market evolution around trade time often gives an insight into trading style (whether market impact is temporary or permanent, whether execution orders are read by other market participants before the trade, the so-called signaling risk, etc.). It is also frequently useful to track market evolution before the trade time. While what happens before the trade is not market impact, it can highlight events leading to the trade and explain the presence (or absence) of market impact.
The Tradefeedr platform calculates and plots the market impact of each trade before and after the trade. Those market impact curves are then aggregated across trades, LPs, currencies, accounts and any other labels provided by Tradefeedr clients. For example, it is possible to track and compare the market impact of several groups of traders against each other.
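To make the per-trade calculation concrete, here is a minimal sketch of a single-trade market impact curve. It is not the Tradefeedr implementation: the helper names `mid_at` and `impact_curve`, the convention of taking the last observed mid at or before each time, and the sample prices are all assumptions made for illustration. The move is expressed in $/m (relative mid move times one million) and signed from the trader's perspective.

```python
from bisect import bisect_right


def mid_at(times, mids, t):
    """Last observed mid at or before time t (times sorted ascending)."""
    i = bisect_right(times, t) - 1
    if i < 0:
        raise ValueError("no mid price observed at or before t")
    return mids[i]


def impact_curve(times, mids, trade_time, side, offsets):
    """Signed mid move in $/m relative to the mid at trade time.

    Positive values mean the market moved in the direction of the trade.
    Negative offsets describe what happened before the trade.
    """
    sign = 1.0 if side == "buy" else -1.0
    ref = mid_at(times, mids, trade_time)
    return {
        dt: sign * (mid_at(times, mids, trade_time + dt) - ref) / ref * 1e6
        for dt in offsets
    }


# Made-up mid prices sampled once per second.
times = [0, 1, 2, 3, 4]
mids = [1.0000, 1.0001, 1.0002, 1.0003, 1.0005]

curve = impact_curve(times, mids, trade_time=2, side="buy", offsets=[-1, 0, 1, 2])
sell_curve = impact_curve(times, mids, trade_time=2, side="sell", offsets=[-1, 0, 1, 2])
print(curve)
```

In this example the market was already rising before the buy (negative value at offset -1) and kept rising afterwards (positive values at offsets 1 and 2); the same prices give a mirror-image curve for a sell.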
A sample report is given in Figure 1. In this specific example the market impact curves are aggregated across LPs and trade status (Reject versus Fill). The shaded bands around the curves are one standard deviation wide and indicate the precision of the market impact estimates. At least three visualization scenarios are available:
 “Weighted” market impact from the trader perspective, where the trade-by-trade market impacts are aggregated by trade size. Trader perspective just means that if the price goes up after a buy trade the market impact is positive (as opposed to the LP View below, as the LP is on the opposite side of the same trade).
 “Average” market impact from the trader perspective, where the simple average of all trade-by-trade market impacts is calculated, so small trades have the same weight as big ones. This highlights the trader’s ability to forecast the market rather than to “impact” it. The difference between weighted and average market impact can give an idea of whether the trader actually changes the supply/demand balance or is good at short-term alpha.
 “LP View” market impact (the one actually shown in Figure 1). This is the market impact as seen from the LP perspective. The P&L is exactly opposite to that of the trader. When a trade is done against an LP, the instantaneous P&L of this trade is the Spread P&L. If the trade was a buy order and the price goes up, the LP’s P&L starts to decrease, so the curve slopes downward. The faster the curve goes down, the more difficult the flow received by an LP is compared to that of his peers. Also, rejected trades are normally more toxic than fills if the LP has alpha himself and makes fill/reject decisions intelligently.
Figure 1: Market Impact Curve by LP and Trade Status 
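The three aggregation views can be sketched as follows. This is an illustrative assumption rather than the platform's code: the trade records, the field names `size`, `impact` and `spread_pnl`, and the choice to size-weight the LP view are all hypothetical. Impacts are trader-perspective values in $/m at a given time offset.

```python
def average_impact(trades, offset):
    """Simple mean of trade-by-trade impacts: every trade counts equally."""
    return sum(t["impact"][offset] for t in trades) / len(trades)


def weighted_impact(trades, offset):
    """Size-weighted mean of trade-by-trade impacts."""
    total_size = sum(t["size"] for t in trades)
    return sum(t["size"] * t["impact"][offset] for t in trades) / total_size


def lp_view(trades, offset):
    """LP P&L in $/m: spread earned at trade time minus the trader's impact.

    Size-weighting here is an assumption made for this sketch.
    """
    total_size = sum(t["size"] for t in trades)
    return sum(
        t["size"] * (t["spread_pnl"] - t["impact"][offset]) for t in trades
    ) / total_size


# Made-up trades: one small trade with strong impact, one large trade with none.
trades = [
    {"size": 1.0, "impact": {1: 10.0}, "spread_pnl": 5.0},
    {"size": 9.0, "impact": {1: 0.0}, "spread_pnl": 5.0},
]

print(average_impact(trades, 1))   # 5.0
print(weighted_impact(trades, 1))  # 1.0
print(lp_view(trades, 1))          # 4.0
```

Here the average impact is well above the weighted one, which is the pattern associated above with short-term alpha in small trades rather than a genuine shift in the supply/demand balance.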
Trading Styles & Interpretation
The aggregated market impact around the trade time can tell a lot about the trading style. The general consensus is that one cannot have short-term alpha unless it is intentional (i.e. the result of well-researched short-term trading strategies). While this can be true for the monetization of short-term alpha, it is actually common to have meaningful short-term alpha within the spread. All one has to do is trade with flow momentum. While flow is not observable in FX, it is well correlated with FX price action, and momentum trades generally have small positive short-term alpha.
Short-term alpha is an unpleasant phenomenon for a market maker as it eats straight into the market maker’s P&L. However, it is important to understand the reasons behind client market impact. Figure 2 schematically presents four possible client profiles:
 Short-term momentum is the crowd trader. He either takes his signal explicitly from market movement or just follows the crowd (e.g. sees good economic numbers and needs to buy). Although the profile does not look good for a market maker, these flows are the majority. If the slope is not steep, the market maker can accommodate them by charging the required spread. There are different shades of momentum trades, but most of them have to be managed by the market maker rather than rejected as clients.
 Mean reversion strategies are very good for the market maker. However, if a trader finds himself in this category, the natural question is how to improve his market timing. The Tradefeedr platform can help quantify whether an improvement is possible.
 “No Direction” is what everyone expects of a “typical retail trader”. However, as retail traders trade on the news, they share a common factor, which makes a pure “No Direction” trader rare.
 “Toxic/Sniper” or “Overaggregated” is an unpleasant type which purely takes away from the market maker’s P&L. An “Overaggregated” trader is one who does not derive market direction from sophisticated modelling. He just connects to a large number of LPs and always trades with the LP that has the best price, which is likely to be a mispricing. Therefore an overaggregated trader looks toxic to all LPs and eventually the spread will be widened.

Figure 2: Client Classification Based on Market Impact Curves 
How much evidence is needed?
The natural question is how many observations one needs to collect before being reasonably confident that a client is toxic at a certain time horizon with a specific precision. Let’s start with a simple motivating example. For those with a statistics background, the entire analysis is nothing more than a calibration of the confidence interval definition to the real short-term volatility and drift of FX markets.
Let’s consider a one-second horizon. Assume we have a trader with a measured toxicity of 5 $/m at a one-second horizon (so we took all trades and averaged the toxicity). We could also measure the volatility of this toxicity, but it is more instructive to parametrize it with expected volatility numbers. Assume an annualized volatility of 10%, which is an average FX volatility during London hours. The expected volatility of toxicity is scaled from the annualized number to one second as 10%/sqrt(252*24*60*60). Expressed in $/m, the one-second volatility is around 21 $/m. The more observations we have, the smaller the volatility of the average toxicity: it is well known that the volatility of the average decreases with the square root of the number of observations.
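The scaling arithmetic can be checked with a short sketch. The function names are ours; the 252 trading days of 24 hours and the conversion to $/m (multiplying the relative move by one million) follow the numbers quoted above.

```python
import math


def per_second_vol_usd_per_million(annual_vol, days=252, seconds_per_day=24 * 60 * 60):
    """Scale an annualized volatility to a one-second volatility in $/m."""
    return annual_vol / math.sqrt(days * seconds_per_day) * 1e6


def vol_of_average(sigma, n):
    """Volatility of the average of n independent observations."""
    return sigma / math.sqrt(n)


sigma_1s = per_second_vol_usd_per_million(0.10)
print(round(sigma_1s, 2))                       # about 21.43 $/m
print(round(vol_of_average(sigma_1s, 100), 2))  # about 2.14 $/m after 100 trades
```

With 100 independent trades the uncertainty around the measured average shrinks by a factor of ten, from about 21 $/m to about 2 $/m.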
Assume we are interested in the interval [5 – P, 5 + P] $/m around the measured average toxicity of 5 $/m. “P” in the interval stands for precision. For a given P we would like to know the number of observations required to be able to say that the true (unobservable) average toxicity is inside this confidence interval with probability Alpha. Obviously the more precise we want to be, the more observations we need. Also, for a given precision, the more confidence we want to have (probability of being inside the interval), the more observations we need. The solution for the minimum number is a trivial inversion of the confidence interval definition and is given below.
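A reconstruction of this inversion, assuming the standard two-sided normal confidence interval, is sketched here. The symbols are our notation rather than the original's: n is the number of observations, h the horizon in seconds, sigma_{1s} the one-second volatility of toxicity in $/m, sigma_h its value at horizon h, and z the standard normal quantile.

```latex
\[
  \Pr\bigl(\lvert \bar{X}_n - \mu \rvert \le P\bigr) = \alpha
  \quad\Longrightarrow\quad
  z_{(1+\alpha)/2}\,\frac{\sigma_h}{\sqrt{n}} = P
  \quad\Longrightarrow\quad
  n = \left(\frac{z_{(1+\alpha)/2}\,\sigma_h}{P}\right)^{2},
  \qquad \sigma_h = \sigma_{1s}\sqrt{h}.
\]
```

For Alpha = 95% the quantile is approximately 1.96, so with a one-second volatility of about 21.43 $/m and P = 5 $/m this gives n of roughly (1.96 x 21.43 / 5)^2, or about 70.6. Because sigma_h squared is proportional to h, n grows linearly with the horizon.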
Applying the formula above, we get:
 If the required confidence is 95%, the required precision is 5 $/m and the time horizon is 1 second, the formula suggests we need 70 observations to achieve the required precision with this confidence. It means that we could say that the average trader toxicity is x $/m +/- 5 $/m with a probability of 95%.
 If the required horizon is 10 seconds, then the same precision and confidence can only be achieved with 705 observations (note that the required number of observations is linear in time, as follows from the formula above).
 If the required horizon is 10 seconds but we only need a precision of 10 $/m, we only need 176 observations. Let’s call this the core scenario for the sensitivity analysis below.
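The three scenarios above can be reproduced with a short sketch. It assumes the formula is the usual two-sided normal inversion, n = (z * sigma_h / P)^2 with sigma_h = sigma_1s * sqrt(horizon), and the 10% annualized volatility used earlier; the function name and parameters are ours.

```python
import math
from statistics import NormalDist


def required_observations(precision, confidence=0.95, horizon_s=1.0, annual_vol=0.10):
    """Real-valued sample size for a two-sided normal confidence interval.

    precision is the half-width P in $/m, horizon_s the horizon in seconds.
    """
    sigma_1s = annual_vol / math.sqrt(252 * 24 * 60 * 60) * 1e6  # $/m per sqrt(second)
    sigma_h = sigma_1s * math.sqrt(horizon_s)                    # scale to the horizon
    z = NormalDist().inv_cdf((1 + confidence) / 2)               # about 1.96 for 95%
    return (z * sigma_h / precision) ** 2


for p, h in [(5, 1), (5, 10), (10, 10)]:
    n = required_observations(precision=p, horizon_s=h)
    print(f"P={p} $/m, horizon={h}s: n = {n:.1f}")
```

Under these assumptions the real-valued results truncate to 70, 705 and 176, matching the numbers quoted above; rounding up to the next whole observation would give 71, 706 and 177.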
Figure 3: Illustration of Confidence and Precision around Toxicity Estimates. 
Figure 3 illustrates the uncertainty and precision. Basically we need to increase the number of observations until the support of the blue area (confidence) equals the given precision.
Of course the reality is more complex. Different trades happen at different times of the day, and it is not possible to assume that toxicity has the same volatility if one trade is done after non-farm payrolls and another during the Asia morning. However, it is useful to have ballpark numbers as a benchmark.
Let’s define the terms from the example above a bit more formally and do a simple sensitivity analysis:
 Precision is the half-width of the interval around the measured average toxicity of client trades. This interval is believed to contain the true (unobserved) toxicity. Precision is P in the [X – P, X + P] $/m interval, where X is the average measured toxicity.
 Confidence is the probability that the true (unobserved) toxicity of the client is within the precision (the interval above).
 Horizon is the time in seconds after the trade time at which the toxicity of trading is measured. We may be interested in toxicity over 1 second, over 10 seconds or over an even longer horizon.
A number of additional assumptions are required to apply the standard normal test to the average toxicity:
 We assume that the client toxicity distribution does not change over some measurement period. For example, the client’s trading style does not change over, say, one month. This assumption allows us to calculate average toxicity. If the client’s inherent toxicity changed on every single trade, nothing could be measured.
 We also need to assume that the average toxicity is normally distributed (to apply the normal test). This is not a strong assumption, as the average of most stationary variables converges to a normal distribution quite fast.
 Finally, we need to assume some volatility of toxicity.
In what follows we run further sensitivity analysis around the core scenario described above.
Scenarios: Time Horizon of Toxicity
First we consider sensitivity to the time horizon. As we increase the time horizon, we need more observations to achieve the same precision with the same confidence. From the figure below we conclude that:
 To estimate toxicity with a precision of 10 $/m at a 10-second horizon we need around 200 observations (176, the core scenario).
 However, a precision of 5 $/m would require a much bigger sample of 705 observations (as shown above).
 Increasing the time horizon requires more observations, as toxicity itself is more volatile over a longer horizon, so more observations are needed to reach the same precision.
Figure 4.
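The horizon sensitivity can also be tabulated directly. This is a sketch under the same assumptions as the worked example (two-sided normal interval, 10% annualized volatility, 95% confidence); the horizons chosen and the helper name `n_required` are ours, and the output is not a reproduction of Figure 4.

```python
import math
from statistics import NormalDist

# One-second volatility of toxicity in $/m implied by 10% annualized volatility.
SIGMA_1S = 0.10 / math.sqrt(252 * 24 * 60 * 60) * 1e6


def n_required(horizon_s, precision, confidence=0.95):
    """Real-valued number of observations for the given horizon and precision."""
    z = NormalDist().inv_cdf((1 + confidence) / 2)
    return (z * SIGMA_1S * math.sqrt(horizon_s) / precision) ** 2


# Sample sizes (truncated to whole observations) for two precision levels.
for h in (1, 5, 10, 30, 60):
    print(f"{h:>3}s   P=10 $/m: {int(n_required(h, 10)):>5}   P=5 $/m: {int(n_required(h, 5)):>5}")
```

The required sample grows linearly with the horizon: going from 1 second to 10 seconds multiplies the number of observations by ten at any fixed precision and confidence.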

Scenarios: Precision of Toxicity
We start with the same core scenario: slightly under 200 observations are needed to identify toxicity to within 10 $/m at a 10-second horizon with 95% confidence. As Figure 5 demonstrates, if we want to improve the precision to 2 $/m the sample size would have to be almost 5000! Precise measurements are very expensive.
Figure 5.

Scenarios: Confidence in Toxicity
We start with the same core scenario: slightly under 200 observations are needed to identify toxicity to within 10 $/m at a 10-second horizon with 95% confidence. As Figure 6 shows, increasing this confidence to 99% is very cheap. We need only slightly under 250 observations.
Figure 6.

The results above demonstrate a well-known statistical fact: it is quite easy to achieve high confidence, but very difficult to achieve high precision. Therefore, imprecise measurement is a condition under which all trading decisions have to be made.
Using Market Impact Curves
When combined with the spread paid, the market impact curve gives a good approximation of your LP’s P&L. Therefore, while the fair spread can be a theoretical conversation, a negative or zero P&L is not something your LP is going to tolerate for a long time.
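As a rough illustration of this approximation, the sketch below subtracts a trader-perspective impact curve from the spread paid and reports the first post-trade horizon at which the LP's P&L is no longer positive. The function names and all numbers are hypothetical, and treating LP P&L as spread paid minus impact (both in $/m) is a simplifying assumption.

```python
def lp_pnl_curve(spread_paid, impact_by_offset):
    """Approximate LP P&L in $/m at each offset: spread paid minus trader impact."""
    return {dt: spread_paid - imp for dt, imp in sorted(impact_by_offset.items())}


def breakeven_offset(spread_paid, impact_by_offset):
    """First post-trade offset at which the LP P&L is zero or negative, if any."""
    for dt, pnl in lp_pnl_curve(spread_paid, impact_by_offset).items():
        if dt >= 0 and pnl <= 0:
            return dt
    return None


# Made-up aggregated impact curve (seconds after trade -> $/m).
impact = {0: 0.0, 1: 2.0, 5: 4.0, 10: 6.0}

print(lp_pnl_curve(5.0, impact))      # {0: 5.0, 1: 3.0, 5: 1.0, 10: -1.0}
print(breakeven_offset(5.0, impact))  # 10
print(breakeven_offset(8.0, impact))  # None
```

In this example a spread of 5 $/m is eroded within ten seconds, while a spread of 8 $/m covers the impact over the whole observed window.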
This is especially true for DMA platforms. If they aggregate market impact from their clients and pass it on to LPs, they have to analyze the cumulative impact of their clients on their LPs. Tracking market impact curves over time can reveal trading style changes in the underlying clients. It also allows “toxicity” to be allocated optimally to the LPs that can handle it.
Conclusions
Market impact curves are a simple but extremely useful piece of analytics. They give insight into client trading style, and tracking them over time helps to understand whether that style is changing. However, it is important to keep in mind that it is expensive to measure toxicity precisely, especially over longer horizons. Therefore, while it is advisable to act upon the existing information, the exact measurements should not be taken at face value but with the uncertainty attached to them.