pomdp-solve

POMDP Solver Software

The 'pomdp-solve' program (written in C) solves problems that are formulated as partially observable Markov decision processes, a.k.a. POMDPs. It uses the basic dynamic programming approach for all algorithms, solving one stage at a time working backwards in time. It does finite horizon problems with or without discounting. It will stop solving if the answer is within a tolerable range of the infinite horizon answer, and there are a couple of different stopping conditions (requires a discount factor less than 1.0). Alternatively you can solve a finite horizon problem for some fixed horizon length. The code actually implements a number of POMDP solution algorithms:

  • Enumeration (Sondik '71, Monahan '82, White '91)
  • Two Pass (Sondik '71)
  • Linear Support (Cheng '88) †
  • Witness (Littman '97, Cassandra '98)
  • Incremental Pruning (Zhang and Lui '96, Cassandra, Littman and Zhang '97)
  • Finite Grid -and instance of point-based value iteration (PBVI) (Cassandra '04)
    † Requires the commercial CPLEX linear programming solver software.

Version Notes

This v5.4 version was new as of May 2015 and has minor updates needed to work on more modern versions of GNU/Linux and Apple's OS-X (circa May 2015). The GCC compiler got a lot more picky and generated some errors and many warnings when run against the older v5.3 code. This also adds two long-ago contributed patches from Trey Smith around allowing scientific notion in the POMDP files and speeding up accessing immediate rewards for large problems.

Downloads

The recommended method to get the pomdp-solve running is to download the C source code and compile it on your own machine. Use the button below to download the source and the instructions in the following section for installing.

Binary Executables

Use the buttons below for pre-compiled executables if you are having problems compiling the source or if you want a possibly easier path to running the code. These are not guaranteed to be universal executables, but you can give them a try and see if they work for you.

Installing from Source

Download the pomdp-solve v5.4 code, then do the following:

       -- download the file --
       tar zxvf pomdp-solve-5.4.tar.gz
       cd pomdp-solve-5.4
       ./configure
       make
    

You will find the pomdp-solve executable in the src directory. I haven't yet tested what happens if you execute the 'make install' target. This generally works on Unix-style systems (GNU/Linux and Apple's OS-X) so you will likely need to do extra work to get it to run on a Microsoft Windows system.

Running

You would use the -pomdp command line option with the name of the file that defines the POMDP model. The POMDP File Format Page describes the syntax for this input file and the Example POMDP Models page has some examples of these files. Executing with the -h option will show all the command line options, which are explained on the Command Line Options Page.

Example

If you want to just ensure the compiled code is working and/or get familar with it, I suggest downloading the Tiger POMDP File and executing like this:

./src/pomdp-solve -pomdp tiger.95.POMDP
You should see output that looks something like this:
 //****************\
||   pomdp-solve    ||
||     v. 5.2       ||
 \****************//
      PID=21958
- - - - - - - - - - - - - - - - - - - -
time_limit = 0
mcgs_prune_freq = 100
verbose = context
stdout = 
inc_prune = normal
history_length = 0

.. Many Parameter Setting Lines Omitted ...

vi_variation = normal
horizon = 0
stat_summary = false
max_soln_size = 0.000000
witness_points = false
- - - - - - - - - - - - - - - - - - - -
[Initializing POMDP ... done.]
[Initial policy has 1 vectors.]
++++++++++++++++++++++++++++++++++++++++
Epoch: 1...3 vectors in 0.00 secs. (0.00 total) (err=inf)
Epoch: 2...5 vectors in 0.00 secs. (0.00 total) (err=inf)
Epoch: 3...9 vectors in 0.00 secs. (0.00 total) (err=inf)
Epoch: 4...7 vectors in 0.00 secs. (0.00 total) (err=inf)
Epoch: 5...13 vectors in 0.00 secs. (0.00 total) (err=inf)
Epoch: 6...15 vectors in 0.00 secs. (0.00 total) (err=inf)
Epoch: 7...19 vectors in 0.00 secs. (0.00 total) (err=inf)
Epoch: 8...25 vectors in 0.01 secs. (0.01 total) (err=inf)
Epoch: 9...27 vectors in 0.01 secs. (0.02 total) (err=inf)
Epoch: 10...27 vectors in 0.01 secs. (0.03 total) (err=inf)

... Many Execution Lines Omitted ...

Epoch: 473...9 vectors in 0.00 secs. (2.88 total) (err=3.20e-11)
Epoch: 474...9 vectors in 0.01 secs. (2.89 total) (err=3.04e-11)
Epoch: 475...9 vectors in 0.00 secs. (2.89 total) (err=2.89e-11)
Epoch: 476...9 vectors in 0.00 secs. (2.89 total) (err=2.75e-11)
Epoch: 477...9 vectors in 0.00 secs. (2.89 total) (err=2.61e-11)
++++++++++++++++++++++++++++++++++++++++
Solution found.  See file:
     tiger.95-21957.alpha
     tiger.95-21957.pg
++++++++++++++++++++++++++++++++++++++++
User time = 0 hrs., 0 mins, 2.61 secs. (= 2.61 secs)
System time = 0 hrs., 0 mins, 0.28 secs. (= 0.28 secs)
Total execution time = 0 hrs., 0 mins, 2.89 secs. (= 2.89 secs)

** Warning **
     lp_solve reported 2 LPs with numerical instability.
This shows the convergence of a solution for the infinite horizon. The optimal value function coefficients and the resulting "policy graph" will be in the files named in the output shown with the .alpha and .pg extensions (tiger.95-21957.alpha and tiger.95-21957.pg in the example above).

The format for these output files are described on the Value Function Format Page and the Policy Graph Format Page. Your solution should look similar to this optimal value function and this optimal policy graph.

Related Pages