Hi!

Signals payouts are currently low, and the Numerai team is looking for possible improvements. One purpose of this post is to discuss whether the Signals payout curve could be modified to make Signals more attractive.

Below is the message that I originally posted in the chat:

Currently, on Signals, a very high variance signal that is correct 56% of the time loses money. However, combining signals from multiple users should reduce the variance, so such a signal may still be of interest for the meta-model. Here, very high variance means a signal whose result reaches the payout cutoff, in one direction or the other, every time.

Let’s now consider a more reasonable signal that is staked with 2xMMC to maximize profit. Applying a 2x coefficient to both correlation and MMC increases the variance. To simplify, let’s assume that the payout (equal to 2 x corr + 2 x MMC) only takes values in {+10%, -10%}. To break even, the signal must be correct 52.5% of the time. If the signal is correct 55% of the time, it has an average return of 0.5%. To increase the return of such a signal, we could change the payout curve.

What about attacks? It is important to notice that, to prevent attacks of type p | 1-p, the payout curve doesn’t have to be symmetric. Taking into account how returns compound, a payout function x |-> x if x >= 0 and x |-> x / (1 - x) if x < 0 would make p | 1-p attacks exactly break even. With such a payout function, the average return of the previous signal would go up from 0.5% to 0.96%, almost double. Furthermore, a signal that is correct only 52.5% of the time would no longer just break even; it would have a positive return of 0.48%.

I believe that the above payout function would be more in line with the purpose of Signals, which is to gather new, original signals that could improve the meta-model, even if these signals are very weak. Furthermore, this change would be particularly easy to implement.

I am going to give a few more details now.

To simplify the problem, let’s assume that our weekly result (for example 2corr + 2mmc) only takes values in \{r, -r\}. Let’s check what fraction p of the weeks must have a positive result for us to break even with the current symmetric payout curve:

(1 + r)^p (1 - r)^{1 - p} = 1 \iff p = \frac{-\log(1-r)}{\log(1+r) - \log(1-r)}.

That means that if our results only take values in \pm 25\%, we need to be correct \frac{-\log(1-0.25)}{\log(1+0.25) - \log(1-0.25)} = 56.3\% of the time to break even.

If the values are in \pm 10\% (resp. \pm 5\%), the predictions need to be correct 52.5\% (resp. 51.25\%) of the time to break even.
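These break-even frequencies can be checked numerically. Below is a minimal sketch of the formula above; `break_even_probability` is a helper name introduced here for illustration, and the results are assumed to only take the values +r and -r.

```python
import math

def break_even_probability(r):
    """Fraction of positive weeks needed to break even when the weekly
    result is either +r or -r and the payout equals the result."""
    return -math.log(1 - r) / (math.log(1 + r) - math.log(1 - r))

for r in [0.25, 0.10, 0.05]:
    p = break_even_probability(r)
    print(f'results in +-{r:.2f}: break-even at {p * 100:.2f}% correct')
```

The required frequency is always above 50% because a loss of r followed by a gain of r leaves the stake at (1 - r)(1 + r) = 1 - r^2, which is below 1.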

We see here that any signal without a high proportion of positive eras will probably either lose money or barely make any. That seems at odds with the purpose of Signals: we have to bring our own data, and while such data can be of interest to Numerai, we cannot expect results as good as those of the classic tournament, which uses expensive financial data.

Let’s now consider an asymmetric payout curve defined by:

f(x) = \left\{\begin{array}{ll} x & \text{if } x \ge 0 \\ \frac{x}{1 - x} & \text{if } x < 0 \end{array} \right.

This function has the following property: (1 + f(x)) (1 + f(-x)) = 1. Indeed, for x \ge 0 we have 1 + f(-x) = 1 - \frac{x}{1 + x} = \frac{1}{1 + x}.

That is, if we submit p and 1 - p, we get corr and -corr as results, and the two payouts compound to exactly break even.
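The property can also be checked numerically. The sketch below compares the compounded outcome of a mirrored pair of results (x and -x) under the asymmetric curve and under the current symmetric payout; the sample values of x are arbitrary.

```python
def f(x):
    # Asymmetric payout curve defined above.
    return x if x >= 0 else x / (1 - x)

for x in [0.01, 0.05, 0.10, 0.25]:
    # Growth factor after one result of +x and one result of -x.
    asymmetric = (1 + f(x)) * (1 + f(-x))
    symmetric = (1 + x) * (1 - x)  # payout equal to the result itself
    print(f'x = {x:.2f}: asymmetric {asymmetric:.6f}, symmetric {symmetric:.6f}')
```

Under the asymmetric curve the pair compounds to 1 for every x, while under the symmetric payout it compounds to 1 - x^2, so the mirrored pair loses money there.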

```
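def f(x):
    # Asymmetric payout: gains are unchanged, losses are reduced
    # so that (1 + f(x)) * (1 + f(-x)) == 1.
    if x >= 0:
        return x
    else:
        return x / (1 - x)

r = 0.1
print('With results in +-0.10:')
for c in [55, 52.5, 51]:  # c is the percentage of correct predictions
    # Geometric mean of the weekly growth factors, minus 1.
    payout = ((1 + f(r))**c * (1 + f(-r))**(100 - c)) ** (1 / 100) - 1
    print(f'- if the signal is correct {c}% of the time, the average payout is {payout * 100:.2f}%')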
```

```
With results in +-0.10:
- if the signal is correct 55% of the time, the average payout is 0.96%
- if the signal is correct 52.5% of the time, the average payout is 0.48%
- if the signal is correct 51% of the time, the average payout is 0.19%
```

We could also add a small penalty to this payout function. That would make signals with very high variance and no predictive value lose a bit of money rather than just break even:

```
def f(x, eps):
    if x >= 0:
        return x
    else:
        return x / (1 - x) + eps * x
```
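To get a feel for the penalty, here is a minimal sketch that computes the average payout with it, using the same +-0.10 setting as before. The value eps = 0.05 and the helper name `average_payout` are assumptions chosen only for illustration, not proposed values.

```python
def f(x, eps):
    # Asymmetric payout curve with a small extra penalty on losses.
    if x >= 0:
        return x
    return x / (1 - x) + eps * x

def average_payout(r, c, eps):
    """Geometric-average payout when the result is +r in c% of the weeks
    and -r in the remaining weeks."""
    return ((1 + f(r, eps))**c * (1 + f(-r, eps))**(100 - c)) ** (1 / 100) - 1

r, eps = 0.1, 0.05  # eps = 0.05 is an arbitrary illustrative value
for c in [55, 52.5, 50]:
    payout = average_payout(r, c, eps)
    print(f'- correct {c}% of the time: average payout {payout * 100:.2f}%')
```

With eps = 0, a signal that is correct 50% of the time breaks even; any eps > 0 pushes it slightly below zero, while a signal that is correct 55% of the time stays positive for this choice of eps.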

That’s all for today, thank you for reading