Memoization
Introduction
The two best rules for
improving efficiency of
Prolog programs were given by Richard O'Keefe:
- Don't do it.
- Don't do it again.
To illustrate option 1: If we need to solve a task that
is exorbitantly hard, we could look for a different,
easier task whose results approximate optimality
within a sufficient range.
In this text, we discuss option 2, that is ways to not do
it again. The common theme is that we
remember results of previous computations, which is a
technique known as memoization in computer science.
Tabling
Tabling is a built-in method to apply memoization to
Prolog predicates. This means that the underlying Prolog system
can automatically perform memoization for you, if you
request it. Tabling is also known
as SLG resolution.
Different Prolog systems provide tabling in various forms and with
different
characteristics. XSB Prolog
is one of the pioneering systems in this area, and some of its
techniques are now becoming more widely available also in other
systems. Your Prolog system's manual should contain the exact
details.
For example, to enable tabling in Scryer Prolog, you need to:
- Use library(tabling) via the directive :-
use_module(library(tabling)). in your source file.
- Enable tabling via the table/1 directive.
Hence, to enable tabling
for adjacent/2, you can write:
:- use_module(library(tabling)).
:- table adjacent/2.
adjacent(a, b).
adjacent(e, f).
adjacent(X, Y) :- adjacent(Y, X).
In this case, tabling has made the predicate terminating:
?- adjacent(X, Y), false.
false.
As the second example, let us consider the series
of Fibonacci numbers,
which we can describe in Prolog for example as follows:
fibonacci(0, 1).
fibonacci(1, 1).
fibonacci(N, F) :-
N #> 1,
N1 #= N - 1,
N2 #= N - 2,
fibonacci(N1, F1),
fibonacci(N2, F2),
F #= F1 + F2.
The relation works as intended in the most general case:
?- fibonacci(N, F).
N = 0, F = 1
; N = 1, F = 1
; N = 2, F = 2
; N = 3, F = 3
; N = 4, F = 5
; ... .
It can also be used for more specific queries:
?- fibonacci(17, F).
F = 2584
; false.
However, the computation runs out of stack or takes too long for
larger arguments:
?- fibonacci(100, F).
ERROR: Out of local stack
We can easily enable tabling by adding the following directives to
the file:
:- use_module(library(tabling)).
:- table fibonacci/2.
Using SLG resolution, the answer to the previous query is now
readily found:
?- fibonacci(100, F).
F = 573147844013817084101
; false.
Tabling also allows you to
use left-recursive DCGs for
parsing. For example, you can add the directive:
:- table tree_nodes//1.
to apply tabling to the DCG nonterminal tree_nodes//1.
Scryer Prolog implements tabling via delimited
continuations. See the
paper Tabling
as a Library with Delimited Control by Benoit
Desouter et al. for more information.
Doing it manually
We can emulate tabling by manually storing results
in the global database.
Consider for example the following definition of memo/1:
:- dynamic(memo_/1).
memo(Goal) :-
( memo_(Goal)
-> true
; once(Goal),
assertz(memo_(Goal))
).
As long as Goal is semi-deterministic or deterministic,
memo(Goal) is equivalent to Goal
and reuses results that have already been computed. This
leads to a technique known as dynamic programming in
computer science.
Note that this is less powerful than tabling: First, it
requires modifications of the original program that go
beyond adding simple directives. You have to manually wrap the
goals for which you want to enable memoization
with memo/1. Second, this rather ad hoc
definition does not help to improve termination properties of
your programs. On the plus side, the technique can still help to
improve performance tremendously when it is applicable, and it
is portable to all Prolog systems.
Applied to fibonacci/2, it could look as follows:
fibonacci(0, 1).
fibonacci(1, 1).
fibonacci(N, F) :-
N #> 1,
N1 #= N - 1,
N2 #= N - 2,
memo(fibonacci(N1, F1)),
memo(fibonacci(N2, F2)),
F #= F1 + F2.
Sample query and answer:
?- fibonacci(100, F).
F = 573147844013817084101.
Doing it more explicitly
In many cases, we want more explicit control over what is being
stored. For example, we may want to ensure that we do not
accidentally clutter the global database or tabling storage. In
such cases, we can carry around a custom "database" of existing
results as predicate arguments.
We can use semicontext notation to
carry around the state implicitly while still retaining
full control over the storage.
As an example, we present the calculation of the minimum
edit distance between two lists, using the
nonterminals state//1
and state//2:
min_edit(As, Bs, Min-Es) :-
empty_assoc(Assoc0),
phrase(min_dist(As, Bs, Min-Es), [Assoc0], _).
min_dist([], [], 0-[]) --> [].
min_dist(As, Bs, Min-Es) -->
( state(S0), { get_assoc(store(As,Bs), S0, Min-Es) } -> []
; { findall(option(Action,Cost,As1,Bs1),
edit_option(As, Bs, Action, Cost, As1, Bs1),
Options) },
assess_options(Options, CostOptions),
state(S0, S),
{ keysort(CostOptions, [Min-Es|_]),
put_assoc(store(As,Bs), S0, Min-Es, S) }
).
assess_options([], []) --> [].
assess_options([option(Action,Cost,As,Bs)|Options], [Min-[Action|Es]|Rest]) -->
min_dist(As, Bs, Min0-Es),
{ Min #= Min0 + Cost },
assess_options(Options, Rest).
This code is quite flexible, and can be adapted to various use
cases by providing a suitable definition
of edit_option/6, declaratively describing the
actions that are allowed to transform one list into another. For
example, a typical use case may look like this:
edit_option([A|As], Bs, drop(A), 1, As, Bs).
edit_option(As, [B|Bs], insert(B), 1, As, Bs).
edit_option([A|As], [A|Bs], use(A), 0, As, Bs).
This means that there are three possible actions (dropping,
inserting and using an element), with associated costs of one, one
and zero, respectively.
Example query and answer:
?- min_edit("abcef", "axbdef", Min).
Min = 3-[use(a),insert(x),use(b),drop(c),insert(d),use(e),use(f)]
; false.
If it were implemented naively, min_edit/3 would exhibit
exponential runtime in the length of the lists. Using
memoization has reduced this to polynomial runtime.
More about Prolog
Main page