Declarative Programming

Most of the languages we’ve considered so far in this text have followed the imperative programming model, where a computation is decomposed into individual statements that modify the state of the program. These languages have also been procedural, grouping statements into subroutines that are then called explicitly.

We have also seen the functional programming model, which decomposes a computation into functions that are closely related to those in mathematics. In such a model, programming is done with expressions that avoid side effects. We have also considered specific languages that provide a mix of the functional and imperative paradigms.

Functional programs are declarative, since they declare a relationship between the inputs and outputs of a function. We turn our attention to other models that are declarative, including those that express computation using logical relations, constraints, and dependencies.

Logic Programming

Whereas functional programming is based on the theoretical foundations of \(\lambda\)-calculus, logic programming is based on the foundation of formal logic. More specifically, it is based on first-order predicate calculus, which expresses quantified statements such as:

\[\forall X.~ \exists Y.~ P(X) ~\vee~ \neg Q(Y).\]

This states that for every value X, over some implicit universe of values, there is some value Y such that either P(X) is true or Q(Y) is false or both. This specific statement can also be written in the form of an implication:

\[\forall X.~ \exists Y.~ Q(Y) ~\Longrightarrow~ P(X).\]

The implication \(a ~\Longrightarrow~ b\) is equivalent to \(\neg a ~\vee~ b\).

In most logic languages, a program is specified in terms of axioms that are assumed to be true, and a programmer specifies a goal that the system should attempt to prove from the set of axioms. An axiom is usually written in the form of a Horn clause, which has the following structure:

H :- B1, B2, ..., BN

The :- symbol specifies a reverse implication, and the comma is used for conjunction. The equivalent form in predicate calculus is:

\[(B_1 ~\wedge~ B_2 ~\wedge~ \dots ~\wedge~ B_N) ~\Longrightarrow~ H\]

In the Horn clause above, H is the head of the clause, while B1, B2, ..., BN is the body. In natural language, the Horn clause is stating that if B1 is true, and B2 is true, …, and BN is true, then it must be that H is also true. (Quantifiers are implicit in a Horn clause, though we will not discuss the details here.)

The individual elements of a Horn clause, such as H or B2 above, are called terms. A term may be a variable, an atom in the form of a symbol, or a compound term, such as a predicate applied to some arguments which are themselves terms.

A set of Horn clauses establishes relations among data, which we can then use to query whether a relation holds or what pieces of data satisfy a particular relation.

As a concrete example, consider the following clauses that represent familial relationships:

parent(P, C) :- mother(P, C).                 % rule 1
parent(P, C) :- father(P, C).                 % rule 2
sibling(A, B) :- parent(P, A), parent(P, B).  % rule 3

Here, we have stated three rules. The first establishes that if P is the mother of C, then P is also a parent of C. The second states that if P is the father of C, then P is also a parent of C. The last rule states that if P is a parent of A, and P is also a parent of B, then A and B are siblings.

We can state some specific relationships as facts, which are Horn clauses without a body and thus are unconditionally true:

mother(molly, bill).      % fact 1
mother(molly, charlie).   % fact 2

We can give the logic interpreter a query of the form sibling(bill, S). The interpreter will then attempt to solve this query using a process known as resolution, which applies rules to existing information. Part of this process is unification, which connects terms that match. One possible resolution sequence for the query above is:

   sibling(bill, S)
-> parent(P, bill), parent(P, S)                 (rule 3)
-> mother(P, bill), parent(P, S)                 (rule 1)
-> mother(molly, bill), parent(molly, S)         (fact 1)
-> mother(molly, bill), mother(molly, S)         (rule 1)
-> mother(molly, bill), mother(molly, charlie)   (fact 2)

The end result in this sequence would be that S = charlie.

In the process above, the third step unifies the term mother(P, bill) with mother(molly, bill), which in turn unifies P with molly. This unification is reflected in all occurrences of P, resulting in the second term becoming parent(molly, S). Unification is a generalized form of variable binding, except that full terms can be unified with each other rather than just binding variables to values.

In our formulation of familial relationships, however, there is nothing preventing the resolution engine from applying mother(molly, bill) in resolving mother(molly, S), so another perfectly valid solution is that S = bill. We will see later how to fix this specific problem in Prolog.

Prolog

Before we proceed further in our exploration of logic programming, let us introduce a concrete programming language to work with. Among logic languages, Prolog is by far the most popular, and many implementations are available. For the purposes of this text, we will use a specific interpreter called SWI-Prolog, of which there is also a web-based version.

The syntax we used above is actually that of Prolog. A Prolog program consists of a set of Horn clauses that are assumed to be true. A clause is composed of a head term and zero or more body terms, and a term may be atomic, compound, or a variable. An atomic term may be an atom, which is either a Scheme-like symbol or a quoted string. The following are all atoms:

hello    =<    +    'logic programming'

If an atom starts with a letter, then that letter must be lowercase. Thus, hEllo is an atom, but Hello is not. Numbers, which can be integer or floating-point, are also atomic terms.

Variables are identifiers that begin with a capital letter. Thus, Hello is a variable, as are A and X.

Compound terms consist of a functor, which is itself an atom, followed by a list of one or more argument terms. The following are compound terms:

pair(1, 2)    wizard(harry)    writeln(hello(world))

A compound term is interpreted as a predicate, meaning that it has a truth value, when it occurs as the head term or one of the body terms of a clause, as well as when it is the goal query. Otherwise, it is generally interpreted as data, as in hello(world) in writeln(hello(world)).

While the syntax of a compound term resembles that of a function call in many imperative or functional languages, Prolog does not have functions, so a compound term is never interpreted as such.

A Horn clause with no body is a fact, since it is always true. Thus, the following are facts:

mother(molly, bill).
mother(molly, charlie).

Notice the period that signifies the end of a clause.

A Horn clause with a body is called a rule, and it consists of a head term, the reverse implication symbol (:-), and one or more body terms, separated by commas. The comma signifies conjunction so that the head is true when all the body terms are true. The following are rules:

parent(P, C) :- mother(P, C).
sibling(A, B) :- parent(P, A), parent(P, B).

The first rule states that if mother(P, C) is true, then parent(P, C) is also true. The second rule states that if both parent(P, A) and parent(P, B) are true, then sibling(A, B) is true.

A program is composed of a set of facts and rules. Once these have been established, we can query the Prolog interpreter with a goal predicate. The interpreter will attempt to establish that the goal is true, and if it contains variables, instantiate them with terms that result in the satisfaction of the goal. If the query succeeds, the interpreter reports success, along with the terms that the variables unified with in order to establish the result. If more solutions may exist, we can ask for the next one using a semicolon in most interpreters. If we ask for another solution and no more exist, the interpreter reports failure.

As an example, consider the query sibling(bill, S). Loading a file containing the facts and rules above results in the interactive prompt ?-, at which we can enter our query (for now, we have elided the parent rule that depends on father, since we haven’t established any father facts and our Prolog interpreter would report an error as a result):

?- sibling(bill, S).
S = bill ;
S = charlie.

At the prompt, we’ve entered the query followed by a period to signify the end of the query. The interpreter reports S = bill as the first result, and we have the option of entering a semicolon to search for another or a period to end the query. We enter a semicolon, and the interpreter finds and reports S = charlie, terminating the output with a period to indicate that no more solutions exist.

The actual order in which Prolog searches for a result is deterministic, as we will see shortly. Thus, the query will always find S = bill as its first result and S = charlie as its second.
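
To illustrate that this order depends on the program text, suppose (hypothetically) that we had listed the two mother facts in the opposite order:

mother(molly, charlie).
mother(molly, bill).

The same query would then report its solutions in the opposite order, since mother(molly, S) now finds charlie before bill:

?- sibling(bill, S).
S = charlie ;
S = bill.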

Lists

Compound terms, which can relate multiple individual terms, allow us to represent data structures. For example, we can use the compound term pair(First, Second) to represent a pair composed of First and Second. The term will not appear on its own as a head or body term, so it will be treated as data. We can then define relations for lists as follows:

cons(First, Second, pair(First, Second)).
car(pair(First, _), First).
cdr(pair(_, Second), Second).
is_null(nil).

In the clauses above, an underscore represents an anonymous variable. Many Prolog implementations will raise a warning if a variable is used only once in a clause – such a variable is called a singleton, and it can be introduced inadvertently due to a typo, as in the following:

cons(First, Second, pair(Frist, Second)).

Here, we misspelled First as Frist, so that both are singleton variables. The warning from the Prolog interpreter directs our attention to this, so that we can fix it. If, on the other hand, we do intend a variable to be used only once, we can start it with an underscore to inform the implementation of that intent. We can also use a lone underscore, and each occurrence of a solitary underscore is considered a separate, anonymous variable.

We’ve set nil as our representation for an empty list. We can then make queries on lists as follows:

?- cons(1, nil, X).
X = pair(1, nil).

?- car(pair(1, pair(2, nil)), X).
X = 1.

?- cdr(pair(1, pair(2, nil)), X).
X = pair(2, nil).

?- cdr(pair(1, pair(2, nil)), X), car(X, Y), cdr(X, Z).
X = pair(2, nil),
Y = 2,
Z = nil.

?- is_null(nil).
true.

?- is_null(pair(1, pair(2, nil))).
false.

In the fourth example, we’ve used conjunction to obtain the cdr of the original list, as well as the car and the cdr of the result.

As in Scheme, lists are a fundamental data structure in Prolog, so Prolog provides its own syntax for lists. A list can be specified by placing elements in square brackets, separated by commas:

[]
[1, a]
[b, 3, foo(bar)]

A list can be decomposed into a number of items followed by a rest, much like the period in Scheme, using a pipe:

?- writeln([1, 2 | [3, 4]]).   % similar to (1 2 . (3 4)) in Scheme
[1,2,3,4]
true.

We can use this syntax to write rules on lists. For example, a contains predicate is as follows:

contains([Item|_Rest], Item).
contains([_First|Rest], Item) :-
    contains(Rest, Item).

The first clause asserts that a list whose first element is Item contains Item. The second clause states that a list contains Item if the remaining list, excluding the first item, contains Item. Thus:

?- contains([], a).
false.

?- contains([a], a).
true .

?- contains([b, c, a, d], a).
true .

The built-in member predicate works similarly to our definition of contains, except that it takes the arguments in reverse order:

?- member(a, []).
false.

?- member(a, [a]).
true.
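
Like contains, member can also be used to enumerate the elements of a list by leaving its first argument uninstantiated (the exact formatting of the responses may vary between interpreters):

?- member(X, [a, b, c]).
X = a ;
X = b ;
X = c.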

Arithmetic

Prolog provides numbers, as well as comparison predicates on numbers. For convenience, these predicates may be written in infix order:

?- 3 =< 4.    % less than or equal
true.

?- 4 =< 3.
false.

?- 3 =:= 3.   % equal (for arithmetic)
true.

?- 3 =\= 3.   % not equal (for arithmetic)
false.

Prolog also provides arithmetic operators, but they merely represent compound terms. Thus, 3 + 4 is another means of writing the compound term +(3, 4). If we attempt to unify this with 7 using the explicit unification operator =, it will fail:

?- 7 = 3 + 4.
false.

Similarly, if we attempt to unify a variable with an arithmetic expression, it will be unified with the compound term itself:

?- X = 3 + 4.
X = 3+4.

Comparison operators, however, do evaluate the arithmetic expressions in their operands:

?- 7 =:= 3 + 4.
true.

?- 2 + 5 =:= 3 + 4.
true.

?- 4 < 3 + 2.
true.

In order for the operands to be evaluated, variables in the operands must be instantiated with numeric values. A comparison cannot be applied to an uninstantiated variable:

?- X =:= 3 + 4.
ERROR: Arguments are not sufficiently instantiated

Instead, the is operator is defined to unify its first argument with the arithmetic result of its second argument, allowing the first argument to be an uninstantiated variable:

?- 7 is 3 + 4.
true.

?- X is 3 + 4.
X = 7.

?- X is 3 + 4, X =:= 7.
X = 7.

?- X is 3 + 4, X = 7.
X = 7.

In the third example, X is unified with 7, the result of adding 3 and 4. Since X is now instantiated with 7, it can be compared to 7. In the fourth example, X is 7 so it unifies with the number 7.
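
As with the comparison operators, the arithmetic expression on the right-hand side of is must be fully instantiated; otherwise we get an instantiation error like the one we saw above:

?- Y is X + 1.
ERROR: Arguments are not sufficiently instantiated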

We can use this to define a length predicate on our list representation above:

len(nil, 0).
len(pair(_First, Second), Length) :-
    len(Second, SecondLength), Length is SecondLength + 1.

Here, Length is unified with the arithmetic result of adding 1 to SecondLength. This must occur after the recursive application of len, so that SecondLength is sufficiently instantiated to be able to perform arithmetic on it. Then:

?- len(nil, X).
X = 0.

?- len(pair(1, pair(b, nil)), X).
X = 2.
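
Because len is a relation rather than a function, we can also run it in reverse to ask for a list of a given length. The interpreter responds with a list skeleton containing fresh variables (their machine-generated names are shown schematically here); asking for further solutions with this definition would not terminate, so we end the query with a period:

?- len(List, 2).
List = pair(_A, pair(_B, nil)) .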

Side Effects

Prolog provides several predicates that perform input and output. We’ve already used the writeln predicate, which writes a term to standard output followed by a newline. The write predicate also writes a term to standard output, but without a trailing newline:

?- X = 3, write('The value of X is: '), writeln(X).
The value of X is: 3
X = 3.

We will not discuss the remaining I/O routines here.

The Cut Operator

By default, Prolog considers each possible alternative in turn when it reaches a choice point. However, Prolog provides the cut operator, written as !, to eliminate choice points associated with the current predicate. For example, recall the contains predicate:

contains([Item|_Rest], Item).
contains([_First|Rest], Item) :-
    contains(Rest, Item).

A query such as contains([1, 2, 3, 4], 2) introduces a choice point as to which clause to unify with the goal. The first choice fails, since contains([1, 2, 3, 4], 2) cannot unify with contains([Item|_Rest], Item). However, the second choice succeeds, so that we have a new goal term of contains([2, 3, 4], 2). Here another choice point occurs, and the first choice succeeds, with Item instantiated with 2 and _Rest instantiated with [3, 4]. Since no goal terms remain, the query as a whole succeeds. However, the interpreter still has an unexplored choice available, so it will report that more solutions may exist, requiring us to manually either continue the query with a semicolon or end it with a dot:

?- contains([1, 2, 3, 4], 2).
true ;
false.

Instead, we can add the cut operator to tell the interpreter to stop searching for alternatives once it has found a solution for contains([1, 2, 3, 4], 2):

?- contains([1, 2, 3, 4], 2), !.
true.

We can similarly rewrite contains to eliminate choice points upon success:

contains([Item|_Rest], Item) :- !.
contains([_First|Rest], Item) :-
    contains(Rest, Item).

Here, as soon as a goal term unifies with contains([Item|_Rest], Item), the choice point of which contains clause to unify with that goal term is eliminated. Thus, only one solution is found:

?- contains([1, 2, 3, 4], 2).
true.

On the other hand, the fact that the cut operator prevents other choices from being considered can cause queries that previously succeeded to fail:

?- contains([1, 2, 3, 4], X), X = 3.
false.

Here, the first goal term succeeds by instantiating X with 1, and the cut operator prevents other choices for X from being considered. Then X, now instantiated as 1, fails to unify with 3 in the second goal term.

Given the potential negative consequences of eliminating choice points, using the cut operator is often considered bad practice, and it should be avoided in most cases. In this text, we only use the cut operator as part of a query, not as part of a rule.

Negation

Our earlier search with sibling(bill, S) produced the undesirable result S = bill; a query with the arguments in the other order, sibling(S, bill), has the same problem. In order to eliminate such results from consideration, Prolog provides a limited form of negation. For instance, we can rewrite the sibling rule as:

sibling(A, B) :- A \= B, mother(P, A), mother(P, B).

This states that A must not be unifiable with B. Unfortunately, our query will now fail completely:

?- sibling(S, bill).
false.

This is because when the body term A \= B is reached, A is uninstantiated while B is instantiated with bill. The unification rules above allow an uninstantiated variable to unify with anything: A can unify with B by instantiating A with bill. Since A is unifiable with B, the goal term A \= B fails.

On the other hand, if we write the rule as follows:

sibling(A, B) :- mother(P, A), mother(P, B), A \= B.

then our query succeeds:

?- sibling(S, bill).
S = charlie .

This is because A and B are instantiated with atoms by the time the last term is reached, and we can check that two distinct atoms do not unify.

Prolog also provides an explicit negation predicate, \+. We can therefore query whether harry and bill are not siblings:

?- \+(sibling(harry, bill)).
true.

Unfortunately, we cannot obtain a more general result from the search engine, such as asking it to find someone who is not a sibling of bill:

?- \+(sibling(S, bill)).
false.

This is because negation in Prolog is handled by attempting to prove the term being negated, and only if the proof fails is the negation true. However, the query sibling(S, bill) does indeed succeed with S = charlie, so negation results in false.

Thus, while Prolog does provide negation, it is of limited use. This is not a deficiency in Prolog itself, but rather follows from the limits of the logic-programming paradigm as a whole, which cannot provide the full expressiveness of first-order predicate calculus.

Examples

We conclude with some more interesting examples expressed in the logic paradigm.

  • Suppose we wish to find a seven-digit number such that the first digit is the count of zeroes in the digits of the number, the second digit is the count of ones, and so on. Using Prolog, we can express this computation as follows. We will represent our results as a list of digits. First, we define a predicate to count the occurrences of a particular numerical value in a list:

    count(_Item, [], 0).
    count(Item, [Item|Rest], Count) :-
        count(Item, Rest, RestCount),
        Count is RestCount + 1.
    count(Item, [Other|Rest], Count) :-
        Item =\= Other,
        count(Item, Rest, Count).
    

    The first rule states that an arbitrary item occurs zero times in an empty list. The second states that if a value is the first item in a list, then the number of times it occurs in the list is one more than the number of times it appears in the rest of the list. The last rule states that if a value is not equal to the first item, then its number of occurrences is the same as the number of times it appears in the rest of the list.
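
    For example (ending the query after the first solution, since a choice point may remain):

    ?- count(2, [1, 2, 3, 2], Count).
    Count = 2 .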

    Next, we define facts to restrict the values of a digit:

    is_digit(0).
    is_digit(1).
    is_digit(2).
    is_digit(3).
    is_digit(4).
    is_digit(5).
    is_digit(6).
    

    Alternatively, we can define the is_digit predicate using member:

    is_digit(Digit) :-
        member(Digit, [0, 1, 2, 3, 4, 5, 6]).
    

    Finally, we define a predicate to compute our result:

    digits(List) :-
        List = [Digit0, Digit1, Digit2, Digit3, Digit4, Digit5, Digit6],
        is_digit(Digit0),
        is_digit(Digit1),
        is_digit(Digit2),
        is_digit(Digit3),
        is_digit(Digit4),
        is_digit(Digit5),
        is_digit(Digit6),
        count(0, List, Digit0),
        count(1, List, Digit1),
        count(2, List, Digit2),
        count(3, List, Digit3),
        count(4, List, Digit4),
        count(5, List, Digit5),
        count(6, List, Digit6).
    

    We start by unifying the argument List with a list of seven items. We then specify that each item must be a digit. Finally, we require that the first item be the count of zeroes in the list, the second the count of ones, and so on.

    Entering our query, we get the sole result:

    ?- digits(List).
    List = [3, 2, 1, 1, 0, 0, 0] ;
    false.
    

    We can proceed to write a predicate that relates a list of digits to an actual number:

    digits_number([], 0).
    digits_number([First|Rest], Number) :-
        digits_number(Rest, RestNumber),
        length(Rest, RestLength),
        Number is First * 10 ^ RestLength + RestNumber.
    

    An empty list is related to zero. Otherwise, we compute the number represented by the list excluding its first item, as well as the length of that list. The number represented by the whole list is then the number for the smaller list plus the first digit times the appropriate power of 10. Then:

    ?- digits(List), digits_number(List, Number), !.
    List = [3, 2, 1, 1, 0, 0, 0],
    Number = 3211000.
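
    We can also apply digits_number on its own to an arbitrary (hypothetical) list of digits:

    ?- digits_number([3, 2, 1], Number).
    Number = 321.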
    
  • The Tower of Hanoi is a classic puzzle that consists of three rods and a set of \(N\) discs of different sizes that slide onto a rod. The puzzle starts with the discs in ascending order of size from top to bottom on a single rod, and the goal is to move the entire stack to another rod by moving one disc at a time. It is prohibited to place a larger disc on top of a smaller one. The solution is to recursively move the \(N-1\) smaller discs to the third rod, move the remaining (largest) disc to the second rod, and then recursively move the other \(N-1\) discs to the second rod.

    We can express this computation in Prolog as follows, using the write and writeln predicates to print a move to standard output:

    move(Disc, Source, Target) :-
        write('Move disc '), write(Disc), write(' from '),
        write(Source), write(' to '), writeln(Target).
    
    hanoi(1, Source, Target, _Temporary) :-
        move(1, Source, Target).
    hanoi(NumDiscs, Source, Target, Temporary) :-
        Previous is NumDiscs - 1,
        hanoi(Previous, Source, Temporary, Target),
        move(NumDiscs, Source, Target),
        hanoi(Previous, Temporary, Target, Source).
    

    The move predicate, given a disc and source and target rods, merely writes out the move to standard output. The hanoi predicate relates a number of discs and three rods: a source rod, a target rod, and a temporary rod. The base case is when there is one disc, and that disc can be moved directly from source to target. The second hanoi rule is the recursive case, which recursively moves all but the largest disc to the temporary rod, moves the largest disc to the target rod, and then moves the remaining discs from the temporary rod to the target rod. Since Prolog solves the body terms in order, the moves will occur in the right order.

    The following is the result of a query with \(N = 4\):

    ?- hanoi(4, a, b, c).
    Move disc 1 from a to c
    Move disc 2 from a to b
    Move disc 1 from c to b
    Move disc 3 from a to c
    Move disc 1 from b to a
    Move disc 2 from b to c
    Move disc 1 from a to c
    Move disc 4 from a to b
    Move disc 1 from c to b
    Move disc 2 from c to a
    Move disc 1 from b to a
    Move disc 3 from c to b
    Move disc 1 from a to c
    Move disc 2 from a to b
    Move disc 1 from c to b
    true .
    
  • The quicksort algorithm sorts a list by choosing a pivot, often the first item, partitioning the remaining list into elements that are less than and greater than or equal to the pivot, recursively sorting the partitions, and then appending them. The following Prolog code expresses this:

    quicksort([], []).
    quicksort([Pivot|Rest], Sorted) :-
        partition(Pivot, Rest, Smaller, GreaterOrEqual),
        quicksort(Smaller, SortedSmaller),
        quicksort(GreaterOrEqual, SortedGreaterOrEqual),
        append(SortedSmaller, [Pivot|SortedGreaterOrEqual], Sorted).
    

    The first item is chosen as the pivot, and the remaining items are then partitioned into the smaller items and those that are greater than or equal to the pivot. The two smaller lists are recursively sorted, and then the results are appended, with the pivot placed in front of the items that are greater than or equal to it, to produce the sorted result.

    The partition predicate is as follows:

    partition(_Pivot, [], [], []).
    partition(Pivot, [Item|Rest], [Item|Smaller], GreaterOrEqual) :-
        Item < Pivot,
        partition(Pivot, Rest, Smaller, GreaterOrEqual).
    partition(Pivot, [Item|Rest], Smaller, [Item|GreaterOrEqual]) :-
        Item >= Pivot,
        partition(Pivot, Rest, Smaller, GreaterOrEqual).
    

    The first item in the list is either less than the pivot or greater than or equal to it. In the first case, the item should be the first one in the smaller partition, and the rest of the list is partitioned to produce the rest of the smaller and greater-than-or-equal partitions. In the second case, the item should be the first one in the greater-than-or-equal partition, and recursion handles the rest of the list.
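
    For example, here is how the items following the pivot 4 in the query below are partitioned:

    ?- partition(4, [8, 5, 3, 1, 2, 6, 9, 7], Smaller, GreaterOrEqual).
    Smaller = [3, 1, 2],
    GreaterOrEqual = [8, 5, 6, 9, 7] .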

    Entering a query for a specific list produces:

    ?- quicksort([4, 8, 5, 3, 1, 2, 6, 9, 7], X).
    X = [1, 2, 3, 4, 5, 6, 7, 8, 9] .
    
  • The sieve of Eratosthenes is an algorithm for computing prime numbers up to some limit \(N\). We start by constructing a list of integers in order from 2 to \(N\). Then we repeat the following process, until no numbers remain:

    1. The first item in the list is prime.

    2. Filter out all multiples of the first item from the remaining list.

    3. Go to step 1.

    We can write this algorithm in Prolog as follows. First, we construct a list with the integers from 2 to the limit \(N\):

    numbers(2, [2]).
    numbers(Limit, Numbers) :-
        LimitMinusOne is Limit - 1,
        numbers(LimitMinusOne, NumbersToM),
        append(NumbersToM, [Limit], Numbers).
    

    We do so by recursively computing a list of integers from 2 to \(N-1\) and then appending \(N\) to the result.

    We can then write a predicate to filter out the multiples of a factor from a list:

    filter_not_multiple(_Factor, [], []).
    filter_not_multiple(Factor, [Number|Rest],
                        [Number|FilteredRest]) :-
        Number mod Factor =\= 0,
        filter_not_multiple(Factor, Rest, FilteredRest).
    filter_not_multiple(Factor, [Number|Rest], FilteredRest) :-
        Number mod Factor =:= 0,
        filter_not_multiple(Factor, Rest, FilteredRest).
    

    The filter_not_multiple predicate relates a factor and a list of numbers to a list with the multiples of the factor filtered out. The second rule retains Number in the resulting list if it is not a multiple of Factor. The third rule discards Number from the filtered list if it is a multiple of Factor.
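
    For example, filtering out the multiples of 3:

    ?- filter_not_multiple(3, [2, 3, 4, 5, 6, 7, 8, 9], Filtered).
    Filtered = [2, 4, 5, 7, 8] .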

    We can proceed to define a sieve predicate that relates a list of numbers to the result of applying the prime-sieve algorithm to the list:

    sieve([], []).
    sieve([Number|Rest], [Number|SievedRest]) :-
        filter_not_multiple(Number, Rest, FilteredRest),
        sieve(FilteredRest, SievedRest).
    

    The first number is retained in the result. All multiples of the first number are filtered out of the rest of the list. The sieve algorithm is then recursively applied to the filtered list to obtain the rest of the result list.

    Finally, we write a primes predicate that relates an integer limit to a list of primes up to and including that limit:

    primes(Limit, Primes) :-
        numbers(Limit, Numbers),
        sieve(Numbers, Primes).
    

    This rule constructs a list of numbers from 2 up to the limit and then applies the sieve algorithm to the list. We can then use the sieve to compute prime numbers up to 100:

    ?- primes(100, P).
    P = [2, 3, 5, 7, 11, 13, 17, 19, 23|...] [write]
    P = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53,
    59, 61, 67, 71, 73, 79, 83, 89, 97] .
    

    Pressing the w key when the solution is displayed in truncated form causes the interpreter to print out the non-truncated form in the second line above.

    We can also use the built-in numlist predicate rather than the numbers predicate we wrote:

    primes(Limit, Primes) :-
        numlist(2, Limit, Numbers),
        sieve(Numbers, Primes).
    

Constraints and Dependencies

In addition to functional and logic programming, the declarative paradigm includes programs that express constraints among variables and constants, as well as those that describe dependency graphs. We will look at the former in constraint logic programming and an instance of the latter in the make build automation tool.

Constraint Logic Programming

Constraint logic programming is an extension of logic programming to include constraints on variables. While logic programming allows a limited form of constraints, languages such as Prolog only allow arithmetic constraints to be applied to variables that have been instantiated. For example, suppose we wanted to find a number less than 1000 that is both a square and the sum of two squares. The following is an attempt to specify this in Prolog:

square_sum([N, X, Y, Z]) :-
    N =:= Z * Z, N =:= X * X + Y * Y,
    X > 0, Y > 0, Z > 0, X < Y, N < 1000.

We can attempt a query:

?- square_sum(S).
ERROR: =:=/2: Arguments are not sufficiently instantiated

Unfortunately, since N and Z are not instantiated in the comparison N =:= Z * Z, we get an error.

On the other hand, using the CLP(FD) library for Prolog, which allows constraint logic programming over finite domains, we can specify the solution as follows:

:- use_module(library(clpfd)).  % load the clpfd library

square_sum_c([N, X, Y, Z]) :-
    N #= Z * Z, N #= X * X + Y * Y,
    X #> 0, Y #> 0, Z #> 0, X #< Y, N #< 1000,
    label([N, X, Y, Z]).

The first line is a directive that loads the library. We can then specify arithmetic constraints using operators that begin with a pound symbol. For instance, the #= operator constrains its two arguments to be equal, while the #< operator constrains the first argument to be smaller than the second. Finally, the label predicate forces the solver to ground the given variables, computing actual values for them rather than leaving their results as constraints. Entering a query, we can now obtain all solutions:

?- square_sum_c(S).
S = [25, 3, 4, 5] ;
S = [100, 6, 8, 10] ;
S = [169, 5, 12, 13] ;
S = [225, 9, 12, 15] ;
S = [289, 8, 15, 17] ;
S = [400, 12, 16, 20] ;
S = [625, 7, 24, 25] ;
S = [625, 15, 20, 25] ;
S = [676, 10, 24, 26] ;
S = [841, 20, 21, 29] ;
S = [900, 18, 24, 30] ;
false.
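
The crucial difference from ordinary Prolog arithmetic is that constraints may be stated over variables before they have values; the solver records and propagates them, and label then enumerates the candidates that remain. The following minimal sketch, unrelated to the problem above, illustrates this:

?- X #> 2, X #< 6, Y #= X + 10, label([X, Y]).
X = 3,
Y = 13 ;
X = 4,
Y = 14 ;
X = 5,
Y = 15.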

Examples

As another example of using constraints, consider the canonical verbal arithmetic puzzle of finding a solution to the following equation:

    S E N D
+   M O R E
-----------
= M O N E Y

The requirements are that each letter stand for a distinct digit, and that the leading digit of a number not be zero. We can express this problem in plain Prolog as follows:

is_digit(Digit) :-
    numlist(0, 9, AllDigits),
    member(Digit, AllDigits).

money([S, E, N, D, M, O, R, Y]) :-
    is_digit(S), is_digit(E), is_digit(N), is_digit(D),
    is_digit(M), is_digit(O), is_digit(R), is_digit(Y),
    S \= 0, M \= 0,
    S \= E, S \= N, S \= D, S \= M, S \= O, S \= R, S \= Y,
    E \= N, E \= D, E \= M, E \= O, E \= R, E \= Y,
    N \= D, N \= M, N \= O, N \= R, N \= Y,
    D \= M, D \= O, D \= R, D \= Y,
    M \= O, M \= R, M \= Y,
    O \= R, O \= Y,
    R \= Y,
                    1000 * S + 100 * E + 10 * N + D
                  + 1000 * M + 100 * O + 10 * R + E
    =:= 10000 * M + 1000 * O + 100 * N + 10 * E + Y.

First, we require that each variable be a digit in the range \([0, 9]\), and we further require that S and M not be zero. We then specify the pairwise uniqueness requirements. Finally, we specify that the variables must satisfy the target equation.

We can enter a query as follows:

?- money(S), !.
S = [9, 5, 6, 7, 1, 0, 8, 2].

Computing this solution takes close to a minute on the author’s iMac computer, since the solver has to search a large portion of the solution space, with much backtracking.

We can simplify the implementation of money by using the higher-order maplist predicate, as well as the built-in is_set:

money(List) :-
   List = [S, E, N, D, M, O, R, Y],
   maplist(is_digit, List), S \= 0, M \= 0, is_set(List),
                   1000 * S + 100 * E + 10 * N + D
                 + 1000 * M + 100 * O + 10 * R + E
   =:= 10000 * M + 1000 * O + 100 * N + 10 * E + Y.

The term maplist(is_digit, List) applies the higher-order maplist predicate to map the is_digit predicate across the elements of List, and is_set(List) ensures that List has no duplicates. While this solution is shorter, it runs even slower than our original solution, taking about a minute and a half on the same machine.

On the other hand, we can use CLP(FD) to specify the problem as follows:

:- use_module(library(clpfd)).

money_c([S, E, N, D, M, O, R, Y]) :-
    List = [S, E, N, D, M, O, R, Y],
    List ins 0 .. 9, S #\= 0, M #\= 0, all_distinct(List),
                   1000 * S + 100 * E + 10 * N + D
                 + 1000 * M + 100 * O + 10 * R + E
    #= 10000 * M + 1000 * O + 100 * N + 10 * E + Y,
    label(List).

The ins predicate is defined by CLP(FD) to constrain each of the variables in its first argument to be contained in the set given as its second argument. The .. operator specifies a range, so that 0 .. 9 is the range \([0, 9]\). The all_distinct predicate constrains the variables in the argument to take on distinct values. Finally, we use label at the end to ground the given variables with actual values. We obtain the same result:

?- money_c(S), !.
S = [9, 5, 6, 7, 1, 0, 8, 2].

The solver can use the set of constraints to eliminate most of the search space, and the remaining candidates are checked when the label predicate is reached. The result is that computing this solution takes about 0.003 seconds on the author’s iMac, a speedup of about 18000x.

As another example, consider the problem of solving a Sudoku puzzle. The following predicate takes in a nested list of lists, in row-major order, with some entries provided but others filled with anonymous variables:

sudoku(Rows) :-
    length(Rows, 9), maplist(same_length(Rows), Rows),
    append(Rows, Values), Values ins 1..9,
    maplist(all_distinct, Rows),
    transpose(Rows, Columns),
    maplist(all_distinct, Columns),
    Rows = [Row1, Row2, Row3, Row4, Row5, Row6, Row7, Row8, Row9],
    blocks(Row1, Row2, Row3),
    blocks(Row4, Row5, Row6),
    blocks(Row7, Row8, Row9),
    maplist(label, Rows).

The first body term requires that the number of rows be 9. The second uses maplist, which maps a predicate over the items in a list, as we saw in the previous example. The same_length(Rows) argument is a partially applied predicate that, when applied to another argument, requires that the two argument lists have the same length. The term as a whole is checking that each row also has the same length as the number of rows. The append term takes a list of lists and concatenates them into the single list Values. We then constrain that each variable be in the range \([1, 9]\). The next term constrains each row to consist of distinct numbers, and the following two terms constrain each of the columns to consist of distinct numbers. The next four terms constrain each of the 3x3 blocks to be composed of distinct numbers, with the blocks predicate defined below. Finally, the last term ensures that each variable is grounded to a value.

The blocks predicate is as follows:

blocks([], [], []).
blocks([N1, N2, N3 | RestRow1],
       [N4, N5, N6 | RestRow2],
       [N7, N8, N9 | RestRow3]) :-
    all_distinct([N1, N2, N3, N4, N5, N6, N7, N8, N9]),
    blocks(RestRow1, RestRow2, RestRow3).

The predicate takes in three rows, ensures that the set consisting of the first three items from each row contains distinct values, and then recursively checks this for the remaining items in each row.

We can now provide a query to solve a specific puzzle. The following has been called the “world’s hardest Sudoku”:

?- S = [[8,_,_,_,_,_,_,_,_],
        [_,_,3,6,_,_,_,_,_],
        [_,7,_,_,9,_,2,_,_],
        [_,5,_,_,_,7,_,_,_],
        [_,_,_,_,4,5,7,_,_],
        [_,_,_,1,_,_,_,3,_],
        [_,_,1,_,_,_,_,6,8],
        [_,_,8,5,_,_,_,1,_],
        [_,9,_,_,_,_,4,_,_]],
   sudoku(S).
S = [[8, 1, 2, 7, 5, 3, 6, 4, 9],
     [9, 4, 3, 6, 8, 2, 1, 7, 5],
     [6, 7, 5, 4, 9, 1, 2, 8, 3],
     [1, 5, 4, 2, 3, 7, 8, 9, 6],
     [3, 6, 9, 8, 4, 5, 7, 2, 1],
     [2, 8, 7, 1, 6, 9, 5, 3, 4],
     [5, 2, 1, 9, 7, 4, 3, 6, 8],
     [4, 3, 8, 5, 2, 6, 9, 1, 7],
     [7, 9, 6, 3, 1, 8, 4, 5, 2]] .

Solving this takes less than a tenth of a second on the author’s iMac.

Make

Make is a family of tools used for automating the building of software packages, as well as tracking dependencies between the various components of a package. Make operates on programs called makefiles, which contain rules for how to build individual targets. A target may have dependencies, which are required to be satisfied before the target can be built, as well as commands that specify how the target should actually be built. Thus, a makefile is a combination of declarative components relating targets to dependencies and imperative actions specifying the actions required to build a target.

The structure of a rule in a makefile is as follows:

target: dependencies
        commands

Here, dependencies is a list of zero or more targets or files that the given target depends on, and commands is a list of zero or more actions to be taken, generally each on its own line and indented with a tab character.

As an example, consider the following simple makefile, located by convention in a file named Makefile (note the capitalization):

hello:
        echo "Hello world!"

We can run this from the terminal, if we are in the same directory, as:

$ make hello
echo "Hello world!"
Hello world!

This invokes the hello target, which has no dependencies and as its sole action invokes the shell command to print Hello world! to the screen. We can leave out the explicit target when invoking make, in which case it will build the first target in the makefile:

$ make
echo "Hello world!"
Hello world!

The target of a rule is commonly an executable file, and the dependencies are the files needed to build the target. For example, suppose we have a C++ project with the source files a.cpp, b.cpp, and c.cpp. We can structure our makefile as follows:

main: a.o b.o c.o
        g++ -o main a.o b.o c.o

a.o: a.cpp
        g++ --std=c++14 -Wall -Werror -pedantic -c a.cpp

b.o: b.cpp
        g++ --std=c++14 -Wall -Werror -pedantic -c b.cpp

c.o: c.cpp
        g++ --std=c++14 -Wall -Werror -pedantic -c c.cpp

Here, our default rule is main, which depends on the targets a.o, b.o, and c.o. In order to build main, those targets have to be built first, so Make will attempt to build each of those targets using their respective rules. The rule for a.o depends on the file a.cpp, and if it exists, the command invokes g++ to build the object file a.o. The rules for b.o and c.o have the same structure. Once those targets have been built, Make can then build main, which links together the object files into the final main executable. Running make indicates the sequence of operations:

$ make
g++ --std=c++14 -Wall -Werror -pedantic -c a.cpp
g++ --std=c++14 -Wall -Werror -pedantic -c b.cpp
g++ --std=c++14 -Wall -Werror -pedantic -c c.cpp
g++ -o main a.o b.o c.o

Thus, we can specify complex dependency trees with rules in a makefile, and the Make tool will automatically resolve the dependencies and build the required targets. The relationship between a target and its dependencies is specified declaratively in a rule.

A key feature of Make is that it rebuilds a target only if the target does not exist or it has a dependency, direct or indirect through other rules, that is newer than the target itself. For instance, if we follow up the preceding build by modifying the timestamp on b.cpp, we can see that it is newer than the targets b.o and main:

$ touch b.cpp
$ ls -l
-rw-r--r--  1 kamil  staff    229 Nov 17 01:01 Makefile
-rw-r--r--  1 kamil  staff     90 Nov 17 00:57 a.cpp
-rw-r--r--  1 kamil  staff   6624 Nov 17 01:01 a.o
-rw-r--r--  1 kamil  staff     31 Nov 17 01:12 b.cpp
-rw-r--r--  1 kamil  staff    640 Nov 17 01:01 b.o
-rw-r--r--  1 kamil  staff     33 Nov 17 00:58 c.cpp
-rw-r--r--  1 kamil  staff    640 Nov 17 01:01 c.o
-rwxr-xr-x  1 kamil  staff  15268 Nov 17 01:01 main

If we then run make, it will only rebuild those targets that depend on b.cpp:

$ make
g++ --std=c++14 -Wall -Werror -pedantic -c b.cpp
g++ -o main a.o b.o c.o

This is a crucial feature for working with large projects, as only the components that depend on a modification are rebuilt rather than every target in the project.

As a more complex example, consider the following makefile that was used to build a previous version of this text:

all: foundations functional theory data declarative

foundations: foundations.html foundations.tex

functional: functional.html functional.tex

theory: theory.html theory.tex

data: data.html data.tex

declarative: declarative.html declarative.tex

asynchronous: asynchronous.html asynchronous.tex

metaprogramming: metaprogramming.html metaprogramming.tex

%.html: %.rst
        rst2html.py --stylesheet=style/style.css $< > $@

%.tex: %.rst
        rst2latex.py --stylesheet=style/style.sty $< > $@
        pdflatex $@
        pdflatex $@
        pdflatex $@

clean:
        rm -vf *.html *.tex *.pdf *.aux *.log *.out

The default target is all, which depends on the foundations, functional, theory, data, and declarative targets. While there are also asynchronous and metaprogramming targets, they were not being built since we had not reached the corresponding units in the text.

Each of these targets has two dependencies, an .html file and a .tex file. In order to build an .html file, Make looks for an appropriate target. We have a pattern rule for .html files, which depends on a corresponding .rst file. Thus, in order to build, for example, declarative.html, Make applies the pattern rule and invokes rst2html.py. The special symbol $< stands for the first dependency, while $@ stands for the target. Thus, the result of rst2html.py is written to declarative.html, and the build for that target is complete.

We also have a pattern rule for .tex files, which invokes rst2latex.py, followed by several invocations of pdflatex. The end result is that building declarative.tex ends up creating declarative.pdf as well.

The last rule is to clean up target and temporary files. Thus, we can force the all target to be built from scratch with make clean all. Without requesting the clean target, only those targets that depend on an .rst file that has been modified will be rebuilt.
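
As an aside, the same pattern-rule mechanism could be used to shorten the earlier C++ makefile: the three nearly identical object-file rules could be replaced by a single pattern rule, sketched here using the same compiler flags as before:

%.o: %.cpp
        g++ --std=c++14 -Wall -Werror -pedantic -c $<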

Pattern Matching

Many languages that are primarily functional or imperative provide a declarative construct that does pattern matching, specifying separate cases that each define a pattern against which a value can match, and the computation to be done as a result. These separate cases are analogous to different axioms in Prolog or different pattern rules in Make, and the matching process is similar to unification in Prolog.

As an example, we take a look at the match statement in Python, which has the following syntax:

match <expression>:
    case <pattern>:
        <suite>
    case <pattern>:
        <suite>
    ...

A controlling expression provides the value to be matched, and one or more case clauses specify a matching pattern and a suite of statements to be executed upon a match. Only the first clause that matches the value is executed – the remaining clauses are skipped, even if their patterns also match the value. This is similar to the behavior of a sequence of if and elif branches, or a sequence of except clauses on a try statement.

There are several kinds of patterns that can be specified, a subset of which are the following:

  • A literal pattern specifies a number, string, or boolean literal, or the None literal, and it matches a value that compares equal to the literal. The following is an example:

    def https_error_description(code):
        match code:
            case 400:
                return 'Bad Request'
            case 401:
                return 'Unauthorized'
            case 403:
                return 'Forbidden'
            case 404:
                return 'Not Found'
        return f'Unknown code {code}'
    

    This is similar to a switch statement, but the match construct is much more powerful in that it supports other patterns as well.

  • A capture pattern specifies an identifier, and it matches against any value, binding the identifier to that value. In the example above, we can use such a pattern to incorporate the default result in the match statement:

    def https_error_description(code):
        match code:
            ...
            case 404:
                return 'Not Found'
            case unknown:
                return f'Unknown code {code}'
    

    Since we don’t actually use the variable introduced in the last case, we can use a lone underscore instead as an anonymous variable:

    def https_error_description(code):
        match code:
            ...
            case _:
                return f'Unknown code {code}'
    

    A case with a pattern that consists solely of an identifier matches any value, so such a case is only permitted as the last one in a match statement.

  • A class pattern only matches a value that is an instance of the class. The simplest class pattern consists of a type name followed by empty parentheses:

    def https_error_description(code):
        match code:
            ...
            case int():  # only matches an int
                return f'Unknown code {code}'
            case _:
                raise ValueError(f'expected an int, got {code}')
    

    A class pattern may also specify attributes that the object must have using syntax similar to keyword arguments:

    case Point(x=0, y=0):
        ...
    

    Alternatively, the class itself may define custom matching using syntax similar to positional arguments. For instance, several built-in types allow patterns of the form <type>(<subpattern>), which matches against a value that is an instance of <type> and that also matches <subpattern>. Thus, the pattern int(3) matches an int whose value is 3, and the pattern int(value) matches any int and binds the name value to that object.

  • A sequence pattern consists of a comma-separated list of subpatterns, which can be enclosed by either square brackets or parentheses. Such a pattern can match a variety of sequence types (including user-defined ones that meet a certain set of conditions), though strings are excluded from matching a sequence pattern.

    The following is an example of simple sequence patterns:

    def is_short(sequence):
        match sequence:
            case []:
                return True
            case [_]:
                return True
            case [_, _]:
                return True
            case _:
                return False
    

    The first case matches an empty sequence, the second a one-element sequence, and the third a two-element sequence. We use anonymous variables since we are not concerned with the actual element values. Note that the pattern [_, _] behaves differently than a pattern such as [var, var]. The latter is not permitted in Python: a capture variable may be bound at most once within a pattern, so repeating var is a syntax error. The anonymous variable _, on the other hand, does not do any binding, so each occurrence is independent of any others.

    A sequence pattern may also contain a variadic subpattern, matching zero or more occurrences. The syntax is similar to variadic positional arguments:

    def is_short(sequence):
        match sequence:
            case [_, _, *_]:
                return False
            case _:
                return True
    

    Here, the pattern [_, _, *_] matches a sequence with at least two elements – the first occurrence of _ matches the first element, the second occurrence matches the second element, and the *_ matches all remaining elements.

    The following is an example of recursively computing the length of a sequence using sequence patterns:

    def length(sequence):
        match sequence:
            case []:
                return 0
            case [_, *rest]:
                return 1 + length(rest)
    

    The first case matches an empty sequence, the base case of the computation. The second matches a sequence with at least one element, and all but the first element match the variadic *rest subpattern. These elements are encapsulated in a list bound to the rest variable, and we can recurse on this list.

    Compare this definition of length with an equivalent predicate in Prolog:

    len([], 0).
    len([_|Rest], Length) :-
      len(Rest, RestLength),
      Length is 1 + RestLength.
    

    In both definitions, we specify two separate cases with patterns that are matched against an input list.

There are several other patterns, including mapping patterns, as patterns, value patterns, and “or” patterns, and a case can include a guard expressed as an if clause to further restrict what the case matches. More details are in the original specification as well as the tutorial.
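
As a brief illustration, not drawn from the specification or the tutorial, the following hypothetical sketch combines an "or" pattern with a guard:

def describe(value):
    match value:
        case int(n) | float(n) if n < 0:
            return 'a negative number'
        case int() | float():
            return 'a non-negative number'
        case _:
            return 'not a number'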

We consider one more, complex example that illustrates the declarative nature of pattern matching. Suppose we want to compute the sum of all the elements in a nested list of numbers. The following expresses this computation in Prolog:

% deep_sum(NestedList, Sum).
% True if NestedList is a nested list of numbers, and Sum is the sum
% of all the numbers contained in NestedList.
deep_sum([], 0).
deep_sum([First|Rest], Sum) :-
  deep_sum(First, FirstSum),
  deep_sum(Rest, RestSum),
  Sum is FirstSum + RestSum.
deep_sum(Item, Item). % can restrict this to numbers with :- number(Item).

We have three cases: an empty list whose sum is zero, a non-empty list whose sum is the recursive sum of the first item (which itself may be a list) plus the recursive sum of the rest of the list, and a non-list item whose sum is itself. We can express this same computation in Python:

def deep_sum(nested_list):
    """Return the sum of all the numbers in nested_list.

    nested_list must be a nested list of numbers.
    """
    match nested_list:
        case []:
            return 0
        case [first, *rest]:
            return deep_sum(first) + deep_sum(rest)
        case item:  # can do int(item) | float(item) to restrict to numbers
            return item

There is a direct correspondence between the two implementations, reflecting the primarily declarative manner in which the computation is expressed. The fundamental difference between Prolog and Python here is that Prolog provides search and backtracking, so that other axioms can be tried if one fails (or more solutions are requested). In Python and other functional or imperative languages that have pattern matching, a value matches at most one case, and the remaining cases are never considered. Thus, while pattern matching is declarative, it does not provide the full expressiveness of logic programming.