Steven’s Homepage

I am a PhD student working on reinforcement learning in the Prorok Lab at The University of Cambridge, in collaboration with the Toshiba Cambridge Research Lab. My research focus is on applying deep reinforcement learning to partially observable Markov decision processes (POMDPs) using recurrent models. In other words, I am working to endow intelligent agents with long-term memory capabilities, enabling reasoning under the uncertainty present in the real world.

Under a more philosophical lens, I believe that long-term memory is necessary for truly intelligent agents. Our personality, our decisions – who we ultimately are as people is conditioned on a lifetime of individual experiences. How can you learn from your successes and failures, if you can’t remember them?