Marcos Lopez de Prado seems to be a renowned machine learning expert in the field of finance.
I am very far from his level, as I have not yet finished my PhD in economics and have only an applied level of statistical knowledge. I have encountered a much-cited paper of Lopez de Prado's. I cannot say I completely understand all the mathematical parts. But some claims in the paper seem to completely contradict what I have learned about statistics and economics thus far, or at least seem illogical to me.
For a specific example, the paper, under the sections Pitfall #4 and Solution #4, suggests that differencing time series to make them stationary for classical statistical models (ARIMA etc.) removes the memory of the series and thus makes them lose their predictive power:
The conclusion is that, for decades, most empirical studies have
worked with series where memory has been unnecessarily wiped-out. The
reason this is a dangerous practice is that fitting a memory-less
series will likely lead to a spurious pattern, a false discovery.
Incidentally, this over-differentiation of time series may explain why
the Efficient Markets Hypothesis is still so prevalent among academic
circles: Without memory, series will not be predictive, and
researchers may draw the false conclusion that markets are
In economics there really is a simplified theoretical model of equity returns, which posits that returns are memory-less white noise and that prices (the integrated returns) follow a random walk.
But from an empirical perspective, as far as I understand, the memory-less attribute of returns pertains only to the individual data points themselves, not to the series as a whole. A differenced series should still have a “collective” memory when put together, and it carries almost the same information as the integrated version, lacking only a constant (the initial level). So it should have the same predictive power as well, should it not? Or is it me who lacks understanding?
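To make the "same information, minus a constant" point concrete, here is a minimal sketch. The simulated random-walk log prices and all variable names are assumptions made for illustration; nothing here comes from the paper. It differences a level series and then rebuilds the levels from the differences plus the first observation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated log prices: a random walk built from white-noise returns
# (an assumption for illustration, not real market data).
returns = rng.normal(loc=0.0, scale=0.01, size=1000)
log_price = 4.0 + np.cumsum(returns)

# Difference the level series (this is what "taking returns" does).
diffed = np.diff(log_price)

# Re-integrate: the first level plus the running sum of the differences.
rebuilt = np.concatenate(([log_price[0]], log_price[0] + np.cumsum(diffed)))

# Largest gap between the original and the rebuilt levels.
max_error = np.max(np.abs(rebuilt - log_price))
print(max_error)
```

The full differenced history together with one starting value pins down the levels up to floating-point error. What a model can lose in practice is the level itself, if it is only ever fed a handful of lagged differences.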
This tries to answer the original question without getting into Marcos’s paper. If you think that the level of a variable (say, log price) has information, then differencing the series (to obtain returns) throws out information. If you don’t think that the level has information, then differencing is fine. Engle and Granger, in their 1987 econometrics paper, showed how it is possible to consider both levels and changes in the relationship between two series (X and Y) through the use of an error correction model (ECM). But that does not mean there can’t be cases where one doesn’t care about the levels and is only interested in changes (or vice versa).
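As a rough illustration of how an ECM uses levels and changes together, here is a sketch of the two-step procedure usually associated with Engle and Granger, run on simulated data. The data-generating process (X a random walk, Y equal to 2X plus a stationary AR(1) deviation) and every variable name are assumptions chosen for the example, and plain least squares stands in for a full econometric treatment (no unit-root or cointegration tests are run).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000

# Assumed data: x is a random walk; y tracks 2*x plus a stationary AR(1) gap u.
x = np.cumsum(rng.normal(size=n))
eps = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.5 * u[t - 1] + eps[t]
y = 2.0 * x + u

# Step 1: regress the levels, y = a + b*x, and keep the residual z
# (the deviation from the estimated long-run relationship).
levels_design = np.column_stack([np.ones(n), x])
(a_hat, b_hat), *_ = np.linalg.lstsq(levels_design, y, rcond=None)
z = y - a_hat - b_hat * x

# Step 2: regress the change in y on the change in x and the lagged
# level deviation z[t-1]. The coefficient on z[t-1] is the
# error-correction term: the part of the level that predicts changes.
dy = np.diff(y)
dx = np.diff(x)
ecm_design = np.column_stack([np.ones(n - 1), dx, z[:-1]])
(const, gamma_dx, alpha), *_ = np.linalg.lstsq(ecm_design, dy, rcond=None)

print("long-run slope b_hat:", b_hat)
print("short-run coefficient on dx:", gamma_dx)
print("error-correction coefficient alpha:", alpha)
```

In this setup the error-correction coefficient should come out negative: when Y sits above its long-run relation with X, its next change tends to be lower. A regression of changes on changes alone would drop the lagged deviation term, which is the sense in which differencing can throw out level information.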
On a different note, here’s a piece of advice. Whenever you read anything about strategies, techniques, and approaches in finance, don’t put too much weight on them because, if the author REALLY TRULY HAS SOMETHING THAT WORKS, he or she is not going to divulge it anyway. Most of the stuff you read will be purposely vague and general and, unless you know the details of what the person actually does, not terribly useful. That’s not to say that Marcos doesn’t write interesting papers, but he’s not going to tell you what he actually does, so it’s best to read his or anyone’s presentations with that in mind.