A Semantics for While with Break, Continue and Goto

This work presents a formal description of a subset of a C-like language in the form of an operational semantics. We give semantics to the following statements (in alphabetical order): assignment, break, composition, continue, goto, if, skip and while. The semantics is given by an abstract machine composed of a stack, two counters and three functions. We prove some expected properties of the semantics.


Introduction
There is little work on the operational semantics of the statements break, continue and goto for C-like languages. Previous work either does not treat these statements or takes a different point of view. In particular, the goto statement was not treated at all.
The two counters of our abstract machine index while-loops. Using them together with the second function, we can tell whether a break, continue or goto statement is inside an active while-loop (we say a while-loop is active when its boolean condition evaluates to true).
The stack works as a continuation: we put in it the statements that have to be executed after the current one.
The first function maps labels to environments; we use it to give meaning to goto statements.
The third function represents a state, in which we store the values of program variables.
The outline of the paper is as follows. Section 2 presents the abstract syntax of the language (arithmetic and boolean expressions, and statements). Section 3 presents the semantics of expressions and Section 4 the semantics of statements; there we also show that the semantics follows the intended meaning of the statements continue, break and goto. In Section 5 we prove that the semantics is deterministic. Section 6 presents related work and Section 7 the conclusions.

Abstract syntax of the language
We use a syntactic notation based on BNF. Parentheses (not indicated in our BNF) can be used to resolve ambiguities and uniquely determine the corresponding parse tree.
We have the following syntactic categories and meta-variables ranging over them:
n will range over numerals, Num
a will range over arithmetic expressions, AExp
x will range over arithmetic variables, AVar
b will range over boolean expressions, BExp
x will range over boolean variables, BVar
S will range over statements, Stm
The meta-variables can be primed or subscripted; for example, a, a′, a₁, a₂ all stand for arithmetic expressions. x refers both to arithmetic and to boolean variables; which one is meant can be inferred from the context.
Definition 2.1. Abstract syntax for Arithmetic Expressions

Semantics of expressions
We define the following semantic functions:

Semantics of Arithmetic Expressions
The transition relation is specified by the rules of the transition system presented in Table 1.
In the rules, +, * and − denote syntax on the left and the corresponding operations over the semantic domains on the right.
The semantic function A is defined from this transition relation. A is a function: given a and s, there exists exactly one z such that ⟨a, s⟩ → z.
⟨n, s⟩ → n
⟨x, s⟩ → s(inl x)

Semantics of Statements
We use pattern matching and conditionals to specify the abstract machine. Conditionals supplement pattern matching: they allow us to impose conditions on pattern variables and to check the values of functions. Our pattern-matching definitions are read from top to bottom. Patterns are allowed to overlap (a variable overlaps every pattern). When a pattern does not match, we continue with the next one.
Our pattern matching is exhaustive; this is achieved by using variables in the patterns as the default case.
We specify a transition system with configurations of the form ⟨S, n, y, E, ρ, γ, s⟩.
S is the statement being executed, n is the first counter and y the second counter (both n and y are natural numbers). E is an environment, a stack of pairs (statement, counter), where Stack has constructors : and nil. ρ is the first function; its domain is the set of labels and its codomain the set of environments. γ is the second function, used to know whether a while statement is active.
We denote states by s. We write s[x := n] for the state such that (s[x := n] x) = n and (s[x := n] y) = (s y) for every y ≠ x.
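The state update s[x := n] can be illustrated with states represented as functions. This is only a sketch: the name update and the initial state s0 are assumptions introduced here.

```python
def update(s, x, n):
    """Return the state s[x := n]; the original state s is not modified."""
    def updated(y):
        # (s[x := n] x) = n, and (s[x := n] y) = (s y) for any other y.
        return n if y == x else s(y)
    return updated


s0 = lambda var: 0          # hypothetical initial state: every variable is 0
s1 = update(s0, "x", 7)
```

Representing the update as a new function keeps earlier states intact, which is convenient when an environment stored in ρ has to be resumed later.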
Table 3 presents the semantics of statements.

Use of the counters
We use the counters to number while-loops.
The first counter is the number associated with the statement being executed; the second counter equals the first natural number not yet assigned as first counter to a while statement.
When a while statement is at the top of the environment, one of the following two cases occurs:
1. The while-loop is active and was put in the stack to repeat the iteration. In this case its first counter does not change.
2. The while statement is not active. We assign to it as first counter the second counter.
We also use the first counter in relation to the statements continue and break.
When we find an active while statement, we have to execute its body and then repeat the execution of the statement. To do this, the body is put at the top of the environment and the while statement is put below it.
We associate with the while statement in the environment the value of its first counter, and with its body the successor of that value. This is used in relation to the statements continue and break, as explained later when we discuss the intended meaning of the statements.
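The bookkeeping described above can be sketched as follows. This is an illustration under assumptions, not the rule from Table 3: the function name enter_active_while, the repeating flag and the list and dictionary encodings are hypothetical, and the evaluation of the loop condition is left out.

```python
def enter_active_while(while_stmt, body, n, y, env, gamma, repeating):
    """Push the body above the while statement and update the counters.

    env       : list of (statement, counter) pairs, top of the stack first
    gamma     : dict from loop index to bool (True means active)
    repeating : hypothetical flag, True when the loop was put back in the
                stack to iterate again, False on its first activation
    Returns (new_y, new_env, new_gamma).
    """
    if repeating:
        index, new_y = n, y        # case 1: the first counter is kept
    else:
        index, new_y = y, y + 1    # case 2: take the next unused number
    new_gamma = dict(gamma)
    new_gamma[index] = True        # loop number `index` is active
    # The body gets the successor of the loop's counter and sits on top;
    # the while statement waits below it to be executed again.
    new_env = [(body, index + 1), (while_stmt, index)] + env
    return new_y, new_env, new_gamma


first = enter_active_while("while b do S", "S", 0, 0, [], {}, False)
again = enter_active_while("while b do S", "S", 0, 1, [], {0: True}, True)
```

In both calls the body ends up with counter 1 above a while statement with counter 0; only the first activation advances the second counter.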

Use of the functions ρ and γ
The first function (ρ) stores an environment for each label. When we meet a goto label statement, we continue the execution in the environment associated with label.
The statement label : S can be found before a corresponding goto; in this case, when we reach the goto statement, the function ρ already defines an environment for the corresponding label.
If this is not the case, the environment for the label is defined after finding the goto label statement. We go forward through the statements in the current environment until we find the statement labeled label. This statement and the ones below it in the environment form the environment for label.
The second function (γ) is used to know whether a while statement is active. It is defined in such a way that γ n = true if the while statement numbered n is active and γ n = false otherwise.
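Both functions are only ever changed at one point, which can be sketched with closures. The names bind_label and set_active, and the initial functions rho0 and gamma0, are assumptions made for this illustration; the update of ρ follows the form λz. if (z = label) then (S, n) : E else (ρ z) used in Table 3.

```python
def bind_label(rho, label, env):
    """Return the function that maps `label` to `env` and agrees with
    rho on every other label."""
    return lambda z: env if z == label else rho(z)


def set_active(gamma, n, active):
    """Return the function that maps loop n to `active` and agrees with
    gamma on every other loop index."""
    return lambda m: active if m == n else gamma(m)


rho0 = lambda z: []          # hypothetical: unbound labels give an empty environment
gamma0 = lambda n: False     # hypothetical: no loop is active initially

rho1 = bind_label(rho0, "L", [("S", 0)])
gamma1 = set_active(gamma0, 2, True)
```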

The auxiliary function h
When we meet an if statement, we choose the then or the else branch depending on the boolean condition. The chosen statement is put in the environment as the first statement to be executed.
We always put the next statement at the top of the environment because, if it is a while statement, it needs special treatment (which can be seen in the management of the indexes).
A labeled statement may appear in the then or else branch of an if statement. To define the corresponding environment in the function ρ, we use the function h, which appears at the end of Table 3.
We also use the function h in the case of a non-active while statement. Since an active while statement is put in the stack when it is found, any label found in its body has the while statement in its associated environment. If the while statement later becomes inactive, we have to associate with every label found in its body an environment without this while statement. For this, we compute the environment again when we reach the inactive while statement, applying the function h.

The semantics and the intended meaning of statements
We show that the transition rules of the semantics of statements continue, break and goto follow their intended meaning.

The meaning of continue
If continue is inside an active while-loop, it has a counter of the form s(n).
In this case γ n = true. We have to skip all the statements with the same number as continue (these are statements inside the same while-loop). We skip them until we find the while statement with index n. This is achieved by the first two clauses for continue.
If continue is at the outermost level of the program (outside while-loops) or inside a non-active while-loop, its counter is 0, so the first two clauses do not match and the clause applied is one of the following:
⟨continue, m, y, (S, n) : E, ρ, γ, s⟩ → ⟨skip, m, y, (S, n) : E, ρ, γ, s⟩
⟨continue, n, y, nil, ρ, γ, s⟩ → s
That is, continue behaves as skip: it continues with the next statement, and finishes the execution when there are no more statements in the program.
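The effect of continue on the environment can be sketched as a single function that performs the whole skipping phase at once. This is an illustration under assumptions rather than the clauses of the semantics: the name continue_env and the encoding of statements as tuples whose first component is a tag are hypothetical.

```python
def continue_env(counter, env, gamma):
    """Return the environment left after executing `continue`.

    counter : first counter of the continue statement
    env     : list of (statement, counter) pairs, top of the stack first;
              statements are tuples such as ("while", b, body)
    gamma   : dict from loop index to bool (True means active)
    """
    n = counter - 1
    if counter > 0 and gamma.get(n, False):
        # Counter of the form s(n) with loop n active: drop entries
        # until the while statement with index n is on top.
        rest = list(env)
        while rest and not (rest[0][0][0] == "while" and rest[0][1] == n):
            rest.pop(0)
        return rest
    # Counter 0 or inactive loop: behave as skip, nothing is dropped.
    return list(env)


loop = (("while", "b", "body"), 0)
env = [(("skip",), 1), (("assign", "x", "1"), 1), loop, (("skip",), 0)]
inside_active = continue_env(1, env, {0: True})
inside_inactive = continue_env(1, env, {0: False})
```

With loop 0 active, the two entries numbered 1 are dropped and the while statement is the next to run; with loop 0 inactive, the environment is left untouched.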

The meaning of break
If break is inside an active while-loop, it has a counter of the form s(n) where γ n = true. We have to skip all the statements in the environment until we find the corresponding while statement. This is achieved with the first two clauses for break.
⟨goto label, m, y, (while b do S, n) : E, ρ, γ, s⟩ → ⟨goto label, m, y, (S, 0) : …
If break is at the outermost level of the program (outside while-loops) or inside a non-active while-loop, its counter is 0, so the first two clauses do not match and the clause applied is one of the following:
⟨break, m, y, (S, n) : E, ρ, γ, s⟩ → ⟨skip, m, y, (S, n) : E, ρ, γ, s⟩
⟨break, n, y, nil, ρ, γ, s⟩ → s
That is, break outside a while-loop behaves as skip.
The reason why we check in the transitions for continue and break whether γ n = true is that we can reach a continue or break statement after a jump to a statement inside a while-loop. The behaviour of the continue or break statement depends on whether the while is active. If it is not active, we do not skip any statements and we continue with the next statement.

The meaning of goto
If the statement label : S is found before the goto label statement, the function ρ gives the environment whose statements we have to execute. The corresponding transition is:
⟨goto label, m, y, E, ρ, γ, s⟩ → ⟨skip, m, y, (ρ label), ρ, γ, s⟩
When we find at the top of the environment a labeled statement whose label differs from that of the goto statement, we redefine the function ρ to return the corresponding environment for that label.
A finite derivation sequence is a sequence δ₀, δ₁, …, δₖ of configurations where δᵢ → δᵢ₊₁ for 0 ≤ i < k and where δₖ is a final configuration or a stuck configuration.
We say that a statement S with counters n and y, environment E, functions α and β and state s terminates if and only if there is a finite derivation sequence starting in ⟨S, n, y, E, α, β, s⟩, and loops if and only if there is an infinite derivation sequence starting in ⟨S, n, y, E, α, β, s⟩.
We say that a statement S from a state s terminates if initial(S)s terminates, and loops if initial(S)s loops.
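A derivation sequence can be followed mechanically by iterating a transition function. The sketch below is generic and does not implement the paper's machine: the names run and countdown, the tagging of final configurations and the fuel bound are assumptions. Since looping cannot be observed in finite time, the function only reports that no final configuration was reached within the bound.

```python
def run(step, config, fuel):
    """Follow config -> step(config) -> ... for at most `fuel` transitions.

    Final configurations are assumed to be tagged as ("final", state).
    Returns (k, state) when a final configuration is reached after k
    transitions, or None when the bound is exhausted first.
    """
    k = 0
    while config[0] != "final":
        if k == fuel:
            return None
        config = step(config)
        k += 1
    return k, config[1]


def countdown(config):
    """Toy deterministic transition function used only for the example."""
    _, n = config
    return ("run", n - 1) if n > 0 else ("final", "done")


terminating = run(countdown, ("run", 3), 10)
out_of_fuel = run(countdown, ("run", 3), 2)
```

Because step is a function, each configuration has exactly one successor, so the sequence followed is the only one starting from the given configuration.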
Theorem 5.3. There are no stuck configurations.
Proof. All the possible configurations are covered by our semantics: whatever configuration we reach, there is a transition from it.
Corollary 5.5. A statement S from a state s either terminates or loops.
Proof. By Theorem 5.3 a derivation sequence cannot end in a stuck configuration, so at least one of the two cases holds. If S from s terminates, there exist k and s′ such that initial(S)s →ᵏ s′. If S from s loops, it has an infinite derivation sequence. Both terminating and looping would contradict determinism.

Related Work
The work in [2] presents a formal semantics based on evolving algebras. Their treatment of break, continue and goto is to call a function NextTask that is primitive, static and belongs to one of the algebras, so its implementation is not specified.
The treatment in [5,6] is rather different from ours. We base our definition on the index, while Norris defines special statement values (BreakVal, ContVal, RetVal and StmtVal) to which transitions lead when a break, a continue, a return or an ordinary termination of evaluation happens, respectively.
His semantics is big-step and does not define the semantics of goto.

Conclusions
We have presented a semantics for break, continue and goto using a stack machine with two counters and three functions.
It is known that the same programs can be expressed without using break, continue or goto. In the present paper we have tried to show that giving semantics to such constructions is neither as complicated nor as obscure as one might imagine beforehand.

true and false stand for constant truth values.
Definition 2.3. Abstract syntax for Statements
S ::= x := a | x := b | skip | S₁ ; S₂ | if b then S₁ else S₂ | while b do S | continue | break | label : S | goto label
label stands for an identifier.
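The abstract syntax for statements maps directly onto an algebraic data type. The following sketch uses Python dataclasses; the class and field names are assumptions chosen for this illustration, and expressions are kept as plain strings because only the statement structure matters here.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Assign:            # x := a  or  x := b
    var: str
    expr: str


@dataclass(frozen=True)
class Skip:
    pass


@dataclass(frozen=True)
class Seq:               # S1 ; S2
    first: "Stm"
    second: "Stm"


@dataclass(frozen=True)
class If:                # if b then S1 else S2
    cond: str
    then_branch: "Stm"
    else_branch: "Stm"


@dataclass(frozen=True)
class While:             # while b do S
    cond: str
    body: "Stm"


@dataclass(frozen=True)
class Continue:
    pass


@dataclass(frozen=True)
class Break:
    pass


@dataclass(frozen=True)
class Labeled:           # label : S
    label: str
    stmt: "Stm"


@dataclass(frozen=True)
class Goto:              # goto label
    label: str


Stm = Union[Assign, Skip, Seq, If, While, Continue, Break, Labeled, Goto]

# while x < 3 do (x := x + 1 ; if x = 2 then break else continue) ; goto end
program = Seq(
    While("x < 3", Seq(Assign("x", "x + 1"),
                       If("x = 2", Break(), Continue()))),
    Goto("end"),
)
```

Frozen dataclasses compare structurally, so two syntactically identical statements are equal, which suits pattern matching over the statement on top of the stack.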
A : AExp → (State → Z) for arithmetic expressions
B : BExp → (State → T) for boolean expressions
where Z is the semantic domain of integers, ranged over by meta-variables n and z, T is the semantic domain of truth values, ranged over by meta-variable t, and State = (AVar + BVar) → (Z + T) is a set of functions from syntactic domains to semantic domains, ranged over by s. + denotes disjoint union, with constructors inl and inr both over syntactic domains and over semantic domains. Every s ∈ State satisfies s(inl x) ∈ Z and s(inr x) ∈ T.
Definition 3.1.

Table 1: Semantics of arithmetic expressions

Table 2: Semantics of boolean expressions
Proof. Structural induction on a.

Table 3: Semantics of Statements (continuation (a))
… ′ : S, n) : E, ρ, γ, s⟩ → ⟨goto label, m, y, E, λz. if (z = label′) then (S, n) : E else (ρ z), γ, s⟩
⟨goto label, m, y, (label : S, n) : E, ρ, γ, s⟩ → ⟨skip, n, y, (S, n) : E, λz. if (z = label) then (S, n) : E else (ρ z), γ, s⟩
⟨goto label, m, y, (while b do S, n) : E, ρ, γ, s⟩ → ⟨goto label, m …
When we find the statement label : S on top of the stack, our transition is
If the goto label statement is inside an active while-loop and label : S is found below the while-loop, we have to jump outside the while-loop. This is achieved with the transition
It may be the case that label : S appears after the goto label inside a non-active while-loop; in this case we search for label : S in the body of the while-loop.
where δ has the form ⟨S′, n′, y′, E′, ρ′, γ′, s′⟩ or the form s′. The former are called intermediate configurations and the latter final configurations. We say that ⟨S, n, y, E, ρ, γ, s⟩ is stuck if there is no δ such that ⟨S, n, y, E, ρ, γ, s⟩ → δ.