Distributed policy evaluation under multiple behavior strategies