3.1 Brackets as Pointers

To illustrate the process of program development in Refal, suppose we want to write a program which removes all repeated blanks in a given string of symbols, i.e., replaces each group of adjacent blanks by a single blank. Let us call the function to do this job Blanks . We come to the most direct solution by reasoning in this way: we want no adjacent blanks; hence whenever we see such a pair of blanks we must throw away one of them and repeat this as long as there are such pairs. The result is the sentence:

  <Blanks e1'bb'e2>  =  <Blanks e1'b'e2>
(for readability, we represent blanks by signs ' b ').

NOTE: When discussing one sentence of Refal, it is convenient to write its left side in full, i.e., as a function call which includes the function name and the activation brackets.

If this sentence becomes inapplicable, we have achieved what we wanted, and finish the work using the sentence:

  <Blanks e1> = e1
Thus the function definition is:
  Blanks {
     e1'bb'e2 = <Blanks e1'b'e2>;
     e1 = e1; }

Let us analyze the algorithmic efficiency of our program. What operations will a Refal interpreter perform while evaluating this function?

The variable e1 in the first sentence is open. Initially it takes an empty value, and then lengthened until the first combination ' b b ' (if any) is found. The variable e2 is closed; therefore, the remaining part of the argument is not examined. Since we obviously have to scan the argument in search of the first adjacent blanks, no unnecessary operations have yet been performed.

Now comes the replacement. A clever interpreter will figure out, by comparing the left and the right sides of the sentence, that in order to make the replacement it only needs to eliminate one of the blanks. A more straightforward interpreter will construct the replacement from the pieces in the view field on which the left side was projected. Depending on the degree of its cleverness, the interpreter may use the same blank and evaluation brackets as in the left side, or throw them away and insert new ones; but it certainly will not scan or copy the values of the variables e1 and e2 because this would violate requirement 2 for efficient interpretation. By manipulation with the addresses in the computer memory, the values of e1 and e2 will be simply moved from one place to another. This operation will require a constant time which is independent on the size of the expressions involved. When we analyze algorithms we usually want to know how the time required for the execution of the algorithm depends on the size of the objects with which it works. It is this dependence that is of decisive importance, while the constant times taken by constituent operations, which are dependent on the particular implementation of the algorithm or the speed of the computer, are of only marginal significance. Hence in our case the work of the Refal interpreter is still quite efficient.

In the next step of the Refal machine, the whole argument of Blanks , including the projection of e1 , will be scanned again in search of a pair of adjacent blanks. And this is clearly a waste of time because the projection of e1 cannot contain any. This is the fault of our program, and we can correct it by taking e1 out of the evaluation brackets in the right side. One blank, of course, must be left inside the brackets. The definition becomes:

  Blanks {
     e1'bb'e2 = e1 <Blanks 'b'e2>;
     e1 = e1; }
Now the algorithm behind our definition is quite efficient.

When function Blanks is working, the left activation bracket is used as a pointer which separates the processed part of the string from the part yet to be processed. Let us consider an example where we cannot use activation brackets in this way. We want to define the correction function Correct , which in a given string of symbols deletes a symbol if it is followed by the delete sign '#' . If there are several delete signs in the row it deletes an equal number of preceding symbols. Thus if we typed 'Crocodyle' and then noticed the error, we can go on:

  Crocodyle###iles were everywe#here 
and if Correct is used on this string, the result will be:
  Crocodiles were everywhere 

The straightforward definition, which we would have written if we did not care about efficiency, is:

  Correct {
     e1 sA'#' e2  =  <Correct e1 e2>;
     e1  =  e1; }
With this algorithm, the argument will be scanned from the beginning as many times as many delete signs are there. To amend this situation, we cannot simply put e1 outside of the activation brackets, because its right end may still be in need of correction. What would we do if we were programming in the machine language? After the first step, we would put a pointer at the place in the string where the action is, i.e., where we have just canceled a character:
  Crocodyl^##iles were everywe#here 

The pointer, which we represent by ^ , is an address in the computer which allows us to reach the place in one jump. At the next step then, we would not need to scan Crocodyl , but just jump by the pointer, see that the next character is again a delete sign, and cancel the preceding letter 'l' . At each next step we would start again with a jump by the pointer.

We can do the same in Refal. Structure brackets will serve us as pointers. Indeed, by efficiency requirement 1, with every parentheses the address of the conjugated parenthesis must be associated, so that we can jump from one to the other. With regard to the function Correct this means that we must enclose in parentheses the beginning of the string which was already scanned. The initial argument must then be:

  () Crocodyle###iles were everywe#here 
After the first use of the delete sign the argument will become:
  (Crocodyl) ##iles were everywe#here 
etc. The right parenthesis here is essentially a pointer.

If we still want to have the format of Correct as we defined before, i.e., simply <Correct_e.String> , we must introduce an auxiliary function, say Cor , with the format:

  <Cor (e.Scanned-already) e.Not-yet-scanned> 

The following definition will work efficiently:

  Correct {
     e1 = <Cor () e1>; }
   
  Cor {
  *1.A delete sign after the pointer;
  *  step back and delete
     (e1 sX) '#'e2 = <Cor (e1) e2>;
  *2.Look forward for the next delete sign;
  *  delete and move pointer
     (e1) e2 sX '#'e3 = <Cor (e1 e2) e3>;
  *3.No signes to delete
     (e1) e2 = e2; }
Exercise 3.1 Analyze what will happen if
  <Correct 'c##Crocodile'> 
is evaluated. Redefine Cor so that it prints the warning `Extra delete signs!' in such situations.