Working of PPO Agent

As far as I can tell, PPO does not consider historical data when making a decision on each frame. Is that correct, or am I missing something? Any ideas?

Hi, I’m not sure what you mean by “considering historical data”. Are you asking whether PPO uses memory?

If you’re referring to memory, then the default configuration provided does not use memory; it uses a simple CNN, so each decision depends only on the current observation. You can, however, add memory, see:
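
Separately from the configuration itself, here is a toy sketch of what "no memory" versus "memory" means for a policy. It is not the library's code or API: the function names, weights, and two-element "frames" are all made up for illustration, and a single `tanh` unit stands in for the real network. The point is only that a memoryless policy gives the same output for the same current frame, while a recurrent one can give different outputs depending on what came before.

```python
import math

def feedforward_policy(frame, w):
    """Memoryless (hypothetical): the score depends only on the current frame."""
    return math.tanh(sum(wi * xi for wi, xi in zip(w, frame)))

def recurrent_policy(frame, hidden, w_in, w_h):
    """With memory (hypothetical): mixes the current frame with the previous hidden state."""
    new_hidden = math.tanh(sum(wi * xi for wi, xi in zip(w_in, frame)) + w_h * hidden)
    return new_hidden, new_hidden  # (score, next hidden state)

def run_recurrent(frames, w_in, w_h):
    """Feed frames in order, carrying the hidden state; return the last score."""
    hidden, score = 0.0, 0.0
    for frame in frames:
        score, hidden = recurrent_policy(frame, hidden, w_in, w_h)
    return score

# Illustrative weights and two episodes that end on the same frame
# but have different histories.
w = [0.5, -0.3]
frames_a = [[1.0, 0.0], [0.0, 1.0]]
frames_b = [[0.0, 0.0], [0.0, 1.0]]

ff_a = feedforward_policy(frames_a[-1], w)
ff_b = feedforward_policy(frames_b[-1], w)
rec_a = run_recurrent(frames_a, w, 0.8)
rec_b = run_recurrent(frames_b, w, 0.8)

print("feed-forward:", ff_a, ff_b)  # identical: history is ignored
print("recurrent:   ", rec_a, rec_b)  # differ: history changes the hidden state
```

In a real agent the hidden state would come from something like an LSTM or GRU layer placed after the CNN, and it has to be reset at episode boundaries and stored alongside the rollout so PPO can replay it during training.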