#3 Numerai AMA
I collected some of the Numerai questions I get asked most often. I hope this helps newcomers.
Q: Do you have any tips for someone starting in Numerai to progress quickly with their data models?
This is by far the most repeated question. I always recommend the same thing: start simple and automate as much of the process as possible. If you take the tournament seriously you will have to submit predictions every Saturday, so automating the process is key.
Numerai provides tooling to help with this. It comes not only with everything you need to automate submissions but also with an example script. Once you have everything up and running, it’s time to start running experiments, iterating, and applying as many fancy techniques as you want.
Despite it being a competition, the people participating in the tournament are open and the community is great. You can learn about different approaches by reading posts in the forum or joining the conversation in RocketChat.
Some examples:
Michael Oliver’s post explaining how to reduce feature exposure
Liz Experiment Review Q1 2021: Generating Features and Applying Feature Neutralization
Q: After seeing how much you are making every month in Numerai, are you thinking of leaving your day job and dedicating yourself exclusively to Numerai? Do you have an objective to achieve before even thinking about it?
I have asked myself the same question many times. My models’ combined returns are around 3% weekly, with NMR oscillating between 60 and 70 USD. My objective for 2020 was to reach 1k NMR staked, producing 30 NMR per week.
But everything changed after the main tournament reached 300k NMR staked, which added the payout factor to the equation. I guess I will try to accumulate a few more NMR, wait a couple of months to see how much the factor decreases the payout, and then start to fantasize again.
Forum post explaining the payout factor: Announcing new payouts system mini-release
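For anyone who wants to redo the back-of-the-envelope numbers behind that objective, here is a minimal sketch in Python. It only uses the figures quoted in the answer above. It assumes a flat 3% return every week and ignores the payout factor, compounding, and losing weeks, so treat it as a rough illustration rather than a forecast.

```python
# Figures quoted in the answer above; nothing here comes from Numerai's API.
stake_nmr = 1_000          # target stake for 2020
weekly_return = 0.03       # combined model returns, about 3% per week (assumed flat)
nmr_price_usd = (60, 70)   # NMR price range mentioned above

# NMR earned in one week at that return, before any payout factor is applied.
weekly_nmr = stake_nmr * weekly_return

# The same amount valued at the low and high end of the price range.
weekly_usd = tuple(weekly_nmr * price for price in nmr_price_usd)

print(f"{weekly_nmr:.0f} NMR per week")
print(f"{weekly_usd[0]:.0f} to {weekly_usd[1]:.0f} USD per week")
```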
Q: Do you ever combine neural nets with gradient boosting within the same model/training set?
Ensembles are powerful, and many data scientists use them in their Numerai approach. We have 15 slots, so it’s easy to set some aside to combine your best models and generate your own meta-model.
I’ve always tried to keep my approach as simple as possible. So far, I’m happy to sacrifice some performance if it means keeping a simple pipeline that is easy to automate, change, and retrain in a couple of hours.
Once you start creating multiple models and ensembling them, it’s really easy to end up with an unmanageable pipeline where there are never enough models.
As I pointed out before, you will find useful insights on almost any topic in the Numerai community. Read this thread in RocketChat, where some users recommend using rank averages to combine models, which makes sense since you are evaluated on Spearman correlation.
If you received this article directly in your email, you can ask any question by replying to it. You can also join the conversation in the comments. I’ll answer every question I get.
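To make the rank-average idea concrete, here is a minimal sketch in plain Python. It is not taken from the RocketChat thread. The function names (`percentile_ranks`, `rank_average`), the model names, and the prediction values are all made up for illustration. The idea is to convert each model’s predictions to percentile ranks first and then average those, so a model with a wide output range does not drown out one with a narrow range.

```python
from typing import Dict, List, Sequence


def percentile_ranks(values: Sequence[float]) -> List[float]:
    """Rank values into (0, 1], giving tied values their average rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # 1-based average position of the run
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank / n
        i = j + 1
    return ranks


def rank_average(predictions: Dict[str, Sequence[float]]) -> List[float]:
    """Average the percentile ranks of several models, row by row."""
    ranked = [percentile_ranks(p) for p in predictions.values()]
    n_models = len(ranked)
    return [sum(row) / n_models for row in zip(*ranked)]


# Hypothetical predictions from two models on the same four rows.
model_predictions = {
    "gradient_boosting": [0.48, 0.52, 0.50, 0.51],  # narrow output range
    "neural_net": [0.10, 0.90, 0.40, 0.30],         # wide output range
}
ensemble = rank_average(model_predictions)
print(ensemble)
```

Since Spearman correlation only looks at the ordering of the predictions, averaging ranks instead of raw values gives each model the same weight in the final ordering regardless of its output scale.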