Addressing Sample Inefficiency and Reward Bias in Inverse Reinforcement Learning

Kumar Krishna Agrawal

arXiv: Learning, Volume abs/1809.02925, 2018.


The Generative Adversarial Imitation Learning (GAIL) framework from Ho & Ermon (2016) is known for being surprisingly sample efficient in terms of demonstrations provided by an expert policy. However, the algorithm requires a significantly larger number of policy interactions with the environment in order to imitate the expert. In thi...