Markov.Agent
Exposes the functor Agent.Make, which returns an Agent.S module parameterised by the provided implementations of Agent.MarkovCompressorType, Agent.RewardType and Agent.RLPolicyType.
Agent.S.act initial_policy starts an infinite loop that uses the policy to take actions and produce observers (functions returning a state). Each time an observer resolves to a state, the loop repeats with that state.
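The loop described above can be pictured with a small, self-contained sketch. Everything in it is an assumption made for illustration: the types state, action, observer and policy, the helper act_bounded and the value counting_policy are hypothetical and are not taken from the library. In particular, the observer is modelled as a plain thunk, and the loop is bounded by a step count so that the example terminates, whereas Agent.S.act is documented as looping forever.

```ocaml
(* Hypothetical types: the library's real definitions are not shown here. *)
type state = int
type action = Increment

(* Assumption: an observer is modelled as a thunk that yields the next state. *)
type observer = unit -> state

(* A policy maps a state to an action plus an observer for the next state. *)
type policy = state -> action * observer

(* Bounded stand-in for the infinite loop: stops after [steps] iterations. *)
let rec act_bounded ~steps (policy : policy) (s : state) : state =
  if steps <= 0 then s
  else
    let _action, observe = policy s in
    (* Wait for the observer to yield a state, then repeat with it. *)
    let next = observe () in
    act_bounded ~steps:(steps - 1) policy next

(* Toy policy: always choose Increment and observe the successor state. *)
let counting_policy : policy = fun s -> (Increment, fun () -> s + 1)

let final_state = act_bounded ~steps:3 counting_policy 0
```

In a real agent the observer would typically block or defer until the environment produces new information; the thunk here only stands in for that behaviour.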
Handles the continuous-time stream of information from a system and compresses it into a Markovian state representation, such that the sequence of states returned by sequential calls to observe has the Markov property.
A reward function which is a map from a state to a Reward.t option.
A policy for inferring an action and an observer given a state and, optionally, a reward.
module Make
(MarkovCompressor : MarkovCompressorType)
(Reward : RewardType with type state = MarkovCompressor.state)
(Policy :
RLPolicyType
with type state = MarkovCompressor.state
with type reward = Reward.t) :
S with type policy = Policy.t
A functor. Make MarkovCompressor Reward Policy returns an Agent module. Each argument must be a module whose signature includes the corresponding module type. For example, Policy must be a module whose signature includes RLPolicyType (i.e. its signature may be exactly RLPolicyType or a richer signature that extends it).
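The following sketch illustrates how a functor of this shape is applied and how the type-sharing constraints tie the three arguments together. It is only an illustration under stated assumptions: the contents of every signature (observe, fn, step, run) and the modules Counter, EvenReward and SumPolicy are hypothetical, and the library's actual module types are likely to differ. A bounded run is used instead of the infinite act loop so that the example terminates.

```ocaml
(* Hypothetical signatures: stand-ins for the library's own module types. *)
module type MarkovCompressorType = sig
  type state
  val observe : unit -> state
end

module type RewardType = sig
  type state
  type t
  val fn : state -> t option
end

module type RLPolicyType = sig
  type t
  type state
  type reward
  val step : t -> state -> reward option -> t
end

module type S = sig
  type policy
  val run : steps:int -> policy -> policy
end

(* Same shape as the documented functor; the body is an assumption. *)
module Make
    (MarkovCompressor : MarkovCompressorType)
    (Reward : RewardType with type state = MarkovCompressor.state)
    (Policy : RLPolicyType
              with type state = MarkovCompressor.state
               and type reward = Reward.t) :
  S with type policy = Policy.t = struct
  type policy = Policy.t

  (* Observe a state, compute its optional reward, update the policy. *)
  let rec run ~steps p =
    if steps <= 0 then p
    else
      let s = MarkovCompressor.observe () in
      run ~steps:(steps - 1) (Policy.step p s (Reward.fn s))
end

(* Toy compressor: successive observations are 1, 2, 3, ... *)
module Counter = struct
  type state = int
  let n = ref 0
  let observe () = incr n; !n
end

(* Toy reward: 1.0 on even states, no reward otherwise. *)
module EvenReward = struct
  type state = int
  type t = float
  let fn s = if s mod 2 = 0 then Some 1.0 else None
end

(* Richer than RLPolicyType: the extra [total] does not prevent its use. *)
module SumPolicy = struct
  type t = float
  type state = int
  type reward = float
  let step acc _state r = match r with Some x -> acc +. x | None -> acc
  let total acc = acc
end

module Agent = Make (Counter) (EvenReward) (SumPolicy)

(* Four observations (1, 2, 3, 4) give rewards on 2 and 4. *)
let result : float = Agent.run ~steps:4 0.0
```

SumPolicy is accepted even though it defines an extra value, because OCaml only requires a functor argument to provide at least what the parameter signature asks for. The constraint S with type policy = Policy.t is what lets the caller pass a plain float to Agent.run.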