Developers Club geek daily blog

1 year, 5 months ago
The publication of the first part (habrahabr.ru/post/274603) caused quite extensive and interesting discussion on different aspects of language of application the PROLOGUE.
The purpose was – to show experienced, and not really, to programmers that there is nothing difficult in the Prologue, and everyone can apply it in work.
For some reason there were no questions directly in the text of the publication. I will think that there everything is clear.
Let's start consideration of more practical aspects of programming in the Prologue language.
In this section we will consider the main aspects of implementation in the PROLOGUE of broadcast of context-free languages language on the example of arithmetic expressions.
It is known that PROLOG arose from attribute grammars. Therefore parsing – the first and natural application of this language.
Unlike such means as regular expressions, on PROLOG it is easy to write translators for more difficult — context-free languages.
For the translator it is necessary to have lexical analyzer, the syntax analyzer and the generator of the object code.

1. Lexical analysis

Lexical analyzer on an input has a line of characters, on an output — the list of lexical units. Let's consider lexical analyzer for arithmetic expressions.
The initial predicate of lexr/2 executes conversion of a line of characters to the list of codes of characters, space suppression (delb/2), with the subsequent transition to lexr1/2 which recursively browses the input list, selecting from it lexical units — sample digits and integral numbers — sequences of digits.
The predicate of lexr1/2 completes work at exhaustion of the input list of characters.
term/4 predicate from the input list of characters selects the next lexical unit of Hs which registers in the beginning of the output list of a defiant predicate. The fourth argument — the remained list which will be processed on the following step of a recursion. The second argument — office, for accumulation of the list of digits of an integral number. At the beginning we set it in [].
The predicate of term/4 consists of four procedures for four situations which can arise when viewing the input list.
If in the head of the list there is a sample digit, enter it in the output list, previously having transformed it to a line. It provided that the working variable contains the empty list.
If the next element of the list — a sample digit, but in a working variable is values, the saved-up list of digits from the second argument is convertible at line and we enter in the output list. At the same time the sample digit is entered in the output list for reconsideration on the following recursive step.
If the next character — digit, then it is brought in the end of the second argument — the office list and the recursion proceeds.
The fourth procedure joins at exhaustion of the input list by search of the next digit of number. Exhaustion of the list means completion of viewing of an integral number.
Service predicates of spec/1, digit/1 provide check of a code of the character on compliance to a sample digit or digit.
The predicate of member/2 executes check of entry of an element into the list.
The predicate of append/3 executes connection of two lists.
lexr(S,R):- list_text(L,S), delb(L,L1),lexr1(L1,R),!.

lexr1(L,[Hs|T1]):-
        term(L,[],Hs,L1),
        lexr1(L1,T1).
lexr1([],[]).

term([H|T],[],Hs,T):- spec(H),list_text([H],Hs).
term([H|T],L,I,[H|T]):- 
       L\=[],        
        spec(H),
        list_text(L,Ls),
        int_text(I,Ls).
term([H|T],L,R,Lr):- 
       digit(H),        
        append(L,[H],L1),
        term(T,L1,R,Lr).
term([],L,I,[]):-
        L\=[],
        list_text(L,Ls),
        int_text(I,Ls).

delb([32|T],L):-delb(T,L).
delb([H|T],[H|T1]):-delb(T,T1).
delb([],[]).

spec(H):-member(H,"+-*/()").
digit(H):- H>47, H<58.
append([H|T],L,[H|L2]):-append(T,L,L2).
append([],L,L).
member(H,[H|T]).
member(H,[H1|T]):-H\=H1,member(H,T).
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%   
lr:-
        A = $(10-237*65+3/837467)-(342-678)$,
        lexr(A,L),
        write(A),nl,
        write(L),nl.

It should be noted that the predicate of lexr/2 implements the determined calculations — without bektreking as lexical analysis does not demand returns.
At emergence of a bektreking, it can issue the wrong result.
For an exception of a bektreking which can appeal external procedure at the end of lexr/2 cutting off operation is inserted - "!", which disconnects a possibility of a bektreking of a predicate after its end.

2.DCG – grammar

All textbooks on the PROLOGUE write about DCG – Definite Clause Grammar and describe it as magic means for creation of the syntax analyzer, with reasonings about differential lists and declarative semantics.
In my opinion, DCG – just convenient operator for reduction of record at the description of grammars.
The PROLOGUE in itself is so arranged that in it it is possible, having directly described grammar, to receive the syntax analyzer, more precisely, its basis.

Grammar of simple arithmetic expression.
Ex =:: Sterm | Sterm AdSign Ex
Sterm =:: AdSign Term | Term
Term =:: Factor | Factor MuSign Term1
Term1 =:: Term | Term MuSign Term1
Factor =:: (Ex) | Number
AdSign =:: '+’ | ‘-‘
MuSign =::‘ * '|/

It is possible to rewrite it directly in the form of sentences of DCG.
ex -->  sterm. 
ex -->  sterm, adsign,ex.
sterm --> adsign, term.
sterm --> term.
term --> factor.
term --> factor,musign,term1.
term1 --> term.
term1 --> term, musign, term1.
factor --> [N],{number(N)}.
factor --> lb,ex,rb.
adsign --> ['+']. adsign --> ['-'].
musign --> ['*']. musign --> ['/'].
lb --> ['(']. rb --> [')'].


On an input it is necessary to submit the list of lexemes.
e:
A=[12, ’-‘, 98,’ * ’,’ (', 19, ’+ ', 34, ’*’ '(, 200, ’-', 23, ')', ')'],
ex (A, []).

If to use lexical analyzer, the input string can be entered from the device:

es:-
read_line (0, S),
lexr (S, L),
ex (L, []).

Let's look how it works. There is such built-in predicate of expand_term/2 which shows how the system processes DCG – the sentence.
If to gather
expand_term ((a-> b, c, d), L).
we will receive that:
L = a (A, B):-b (A, C), c (C, D), d (D, B)

If to gather the same predicate for the terminal rule

expand_term ((a-> ['+’]), L).

we will receive that:
L = a (['+’ | T], T)

The terminal rule well illustrates the processes happening at execution of grammatical analysis on the PROLOGUE – each application of the rule subtracts the necessary characters satisfying to this rule from the input list. In case of a terminal symbol, it has to be in the head of the current list, "tail" of the list is transferred further.
By a challenge of the first, input, rules of grammar, the second argument is reported to be the empty list that all input string up to the end was considered.
It is obvious that it is easy to rewrite any rule in the DCG format in the form of the main type of sentences of the PROLOGUE – for non-terminal rules it is necessary to add two arguments to each challenge and heading, and for the terminal rule – it is simple to substitute a constant in heading.
From here it is clear that "magic" of simplicity of implementation of parsing is based not on DCG properties – operators, and on own nature of the PROLOGUE language. DCG is useful for what exempts from need to insert two variables into each challenge and to monitor the correct naming of the input and output list of each argument.
Of course, it is a little advantage of such program as as result you receive only "Yes" or "No".
It is also easy to receive a syntax tree of expression – to add the corresponding arguments enough. But in this case it is necessary to consider a priority of operations.

3. Parsing

Having added one argument to each rule, we will receive a syntax tree of expression on an output. Here we do not need a straight line or reverse Polish notation as the syntax tree will be provided directly, in the form of the enclosed lists.
ex(R) -->  sterm(R). 
ex([S,R1,R2]) -->  sterm(R1), adsign(S),ex(R2).
sterm([S,R,[]]) --> adsign(S), term(R).
sterm(R) --> term(R).
term(R) --> factor(R).
term([S,R1,R2]) --> factor(R1),musign(S),term1(R2).
term1(R) --> term(R).
term1([S,R1,R2]) --> term(R1), musign(S), term1(R2).
factor(N) --> [N],{number(N)}.
factor(R) --> lb,ex(R),rb.
adsign('+') --> ['+']. adsign('-') --> ['-'].
musign('*') --> ['*']. musign('/') --> ['/'].
lb --> ['(']. rb --> [')'].

For a challenge of such program it is necessary to specify three parameters
e3:-
        S='10-3-5+4',
        lexr(S,L),
        ex(R,L,[]),
        calc(R,Res),
        write(S),nl,
        write(R),nl,
        write(Res),nl.

The result of parsing in the given example will have an appearance:

[-, 10, [-, 3, [+, 5,4]]]

The recursive predicate of calc/2 executes calculation of value of arithmetic expression on its syntax tree.
calc([S,A1,A2],Nr):-calc(A1,N1),calc(A2,N2),calc1(S,N1,N2,Nr),!.
calc(A1,A1):-A1\=[_|_].
calc1(*,N1,N2,Nr):- Nr is N1*N2.
calc1(/,N1,N2,Nr):- Nr is N1/N2.
calc1(+,N1,N2,Nr):- Nr is N1+N2.
calc1(-,N1,N2,Nr):- Nr is N1-N2.

For this example the result will be – "16". However it is incorrect, there have to be "6"! Really – the syntax tree is constructed incorrectly – it corresponds to execution of operations from right to left. As arithmetic operations of a levoassotsiativna, the incorrect result is received.
The grammar which suited for validation of arithmetic expression was constructed without associativity of operations.
Rule
ex ([S, R1,R2])-> sterm(R1), adsign(S), ex1(R2).

it is necessary to replace with the rule of a type

ex ([S, R1,R2])-> ex(R1), adsign(S), term(R2).

However such rule contains a left recursion and will go in cycles. What to do? A way out such – "to shatter" a recursive call into components, then already the recursion will not be left – the input argument will change!
ex(E)-->eterm(E).
ex([S,E1,E2])-->sterm(E1),sn(S),eterm(E2).

sterm(E)-->eterm(E).
sterm([S,E1,E2])-->eterm(E1),sn(S),eterm(E2).
sterm([S2,[S1,E1,E2],E3])--> eterm(E1),sn(S1),sterm(E2),sn(S2),eterm(E3).

eterm(E)-->fct(E).
eterm([S2,[S1,E1,E2],E3])--> fct(E1),sn2(S1),eterm(E2),sn2(S2),fct(E3).
eterm([S,E1,E2])-->fct(E1),sn2(S),fct(E2).
sn2(*)-->[*]. sn2(/)-->[/]. sn2(div)-->[div]. sn2(mod)-->[mod]. sn2(and)-->[and]. 
fct(E)-->number(E).
fct(E)-->lb,ex(E),rb.
fct(E)-->snot,fct(E).
number(X)-->[X],{number(X)}.
lb-->['('].
rb-->[')'].
sg(+)-->[+].
sg(-)-->[-].
sn(E)-->sg(E).

For this grammar the syntax tree of our example will have an appearance

[+, [-,-10,3], 5], 4]

and the received value will be equal to "6".

4. Direct calculation

The syntax tree of analysis of expression can be used for generation of the object code by recursive viewing by the procedure similar to the above-stated calc/2.
When only calculation of arithmetic expression is required, it is possible not to build a tree, and to execute calculation during parsing.
ex(E)-->eterm(E).
ex(R)-->sterm(E1),sn(S),eterm(E2),{clc([S,E1,E2],R)}.

sterm(E)-->eterm(E).
sterm(R)-->eterm(E1),sn(S),eterm(E2),{clc([S,E1,E2],R)}.
sterm(R)-->eterm(E1),sn(S1),sterm(E2),sn(S2),eterm(E3),{clc([S1,E1,E2],N),
        clc([S2,N,E3],R)}.

eterm(E)-->fct(E).
eterm(R)-->fct(E1),sn2(S1),eterm(E2),sn2(S2),fct(E3),{clc([S1,E1,E2],N),
        clc([S2,N,E3],R)}.
eterm(R)-->fct(E1),sn2(S),fct(E2),{clc([S,E1,E2],R)}.
sn2(*)-->[*]. sn2(/)-->[/]. sn2(div)-->[div]. sn2(mod)-->[mod]. sn2(and)-->[and]. 
fct(E)-->number(E).
fct(E)-->lb,ex(E),rb.
number(X)-->[X],{number(X)}.
lb-->['('].
rb-->[')'].
sg(+)-->[+].
sg(-)-->[-].
sn(E)-->sg(E).
clc(A1,A1):-A1\=[_|_].
clc([*,N1,N2],Nr):- Nr is N1*N2.
clc([/,N1,N2],Nr):- Nr is N1/N2.
clc([+,N1,N2],Nr):- Nr is N1+N2.
clc([-,N1,N2],Nr):- Nr is N1-N2.   

In the provided program it is possible to notice curly brackets. Such brackets are used in case of application of "normal" predicates of the PROLOGUE, i.e. for them it is not necessary to execute adding of two arguments.
On the basis of such approach it is possible to build translators of different context-free languages, including, HTML, XML.

In the third part we will consider application of the Prologue for creation of one type of intellectual systems — expert systems.

This article is a translation of the original post at habrahabr.ru/post/274799/
If you have any questions regarding the material covered in the article above, please, contact the original author of the post.
If you have any complaints about this article or you want this article to be deleted, please, drop an email here: sysmagazine.com@gmail.com.

We believe that the knowledge, which is available at the most popular Russian IT blog habrahabr.ru, should be accessed by everyone, even though it is poorly translated.
Shared knowledge makes the world better.
Best wishes.

comments powered by Disqus