optimism in the face of uncertainty
Exploring the state-value space of life is tricky.
There are two main issues:
- we have no clue what the shape of this function looks like
- exploring this function is expensive
The most beautiful result of RL is the concept of optimism in the face of uncertainty. I find that many beautiful philosophical concepts emerge from RL, which may be why I am drawn to the field too.
Optimism in the face of uncertainty is best embodied in the GP-UCB algorithm.
The algo is simple.
Quantify the risk you're willing to take as m.
For each state you are considering exploring, compute y = mx + c, where x is the uncertainty of the possible outcome at that state (GP-UCB uses the standard deviation, i.e. the square root of the variance) and c is the expected reward from it.
Then move to the state s where the value y is highest.
Essentially, you're moving to the area of the function where the upper confidence bound on the reward is highest.
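Here is a rough sketch of that selection rule in Python. It is not a full GP-UCB implementation: there is no Gaussian process, and the expected reward and variance of each state are simply handed in as made-up numbers. The names `ucb_score`, `pick_state`, `risk_appetite` and the example states are all my own, purely for illustration.

```python
import math


def ucb_score(expected_reward, variance, risk_appetite):
    """Optimistic estimate of a state: expected reward plus an uncertainty bonus.

    Assumption: the bonus is risk_appetite times the standard deviation
    (the square root of the variance), as in UCB-style rules.
    """
    return expected_reward + risk_appetite * math.sqrt(variance)


def pick_state(candidates, risk_appetite):
    """Return the name of the candidate state with the highest optimistic score.

    candidates maps a state name to a (expected_reward, variance) pair.
    """
    return max(
        candidates,
        key=lambda name: ucb_score(*candidates[name], risk_appetite),
    )


# Hypothetical numbers: a safe option and an uncertain one.
candidates = {
    "safe_job": (5.0, 0.25),  # decent reward, little uncertainty
    "startup": (3.0, 9.0),    # lower expected reward, lots of uncertainty
}

optimistic_choice = pick_state(candidates, risk_appetite=1.0)
cautious_choice = pick_state(candidates, risk_appetite=0.0)
```

With a risk appetite of 1.0 the uncertain option scores 3 + 3 = 6 against 5 + 0.5 = 5.5, so it wins. With a risk appetite of 0 the uncertainty bonus vanishes and the safe option wins on expected reward alone.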
Life is uncertain, but you should be optimistic, not least because it's the best strategy mathematically.
A corollary of this is that one of the best things you can do for your friends is to inspire optimism in the face of uncertainty.
This is why programs like EF or YC are so valuable.
They inherently embody this principle, which RL has shown to be a provably near-optimal way to balance exploration and exploitation.