The next approach, Littman, et al (1994), attacked the problem
slightly differently. This algorithm, called *witness*, uses
the same basic structure as Sondik and Cheng; i.e., it defines regions
for a vector and looks for a point where that vector is not
dominant.

Unlike Sondik's algorithm is doesn't worry about all the actions all
the time. It concentrate on finding the best value function for each
of the actions separately. Once it find these it will combine them
into the final `V'` value function.

Another way it simplifies the problem, aside from doing each action in
isolation, is that it also only worries about one observation at a
time. Remember that a vector at a belief point is constructed by
selecting a transformed vector from `V` for each observation.
The witness algorithm, like Cheng's algorithm, starts with an
arbitrary belief point and generates its vector. It adds this vector
to a set and assumes that this set is `V'` it then goes about
trying to prove or disprove whether or not this set truly is
`V'`.

In constructing a vector we make a choice for each observation; the
choice being the selection of one of the `V` vectors
representing a particular future strategy. The algorithm then looks
at the individual choices made, observation by observation to see
where a different choice would yield a better value. Like the other
algorithms, it defines the region where it is assured that the
particular choice is best. If it can find a belief point where a
different strategy would be better, then this serves as a witness to
the fact that our current set of vectors is not yet the real
`V'` we are after.

The regions defined by this algorithm are also best describe by the intersection of two simpler regions. We have actually seen these two regions before. One is exactly the same as the regions that Cheng's algorithm defines; namely the region over which the current vector dominates all the other vectors we have generated up to this point. This region is shown in the top of the next figure. Note that at this point in this figure, we are at a point in the algorithm where we have generated three vectors.

Stated another way: this region simply reduces the search for a point to the region over the current approximation we have (i.e., all the vectors we have generated to this point). This prevents us from looking for better points in a region where we already know there are better points.

### Witness regions

The second region defines the region over which the current future
strategy will be best for the observation `z1`. If we wander
out of this region, then we could substitute a different course of
action for this observation and get a better value. When considering
the strategy for one observation, we ignore the other observations.
This second region, is actually a small subset of one of Sondik's
regions.

However, if we find a belief point where we could substitute another future strategy, we cannot simply plug in this new strategy. Because we are considering this in isolation from the other observations, it could be that in addition to changing the choice for this observation, we could also change the choice for another observation. Thus, finding a belief point where the current observation's choice could be changed, just gives us a witness to the fact that there is a point where we can do better. We can then take this point and generate the real best vector for it (taking into account all the observation choices).

We repeat this for every observation to get a set of vectors for a
particular action and then we do this for all actions. We can then
combine and eliminate dominated vectors to get the real set we were
after, `V'`.

## Continue

**Back | POMDP Tutorial | Next**