Pumping Lemmas for Regular and Context-Free Languages

The two pumping lemmas are statements used to prove the limitations of two important classes of formal languages: regular and context-free. It is easy to see why these classes matter to programmers: regular expressions (one way of describing regular languages) come up constantly in everyday work, and programming languages, whose syntax is described by context-free grammars, even more so.

The lemmas are very similar to each other, both in their statements and in their proofs. This similarity struck me as so remarkable that I decided to devote an entire article to it.

The plan is as follows: we look at what regular languages are and how regular expressions relate to finite automata, then state and prove the pumping lemma for regular languages and use it to prove that several languages are not regular. Then we do the same with context-free languages, along the way figuring out how they relate to regular languages and how to get around the restrictions we discover by using general grammars. Let's go!

The cover image illustrates the pumping process for context-free grammars.

A formal language is an arbitrary set of strings (that is, finite sequences) over a finite alphabet of characters. The strings that make up a language are also called words. The alphabet is usually denoted by a capital sigma: $\Sigma$. The empty string, which contains no characters, is denoted $\varepsilon$.

(.. ), : . . , .

1. Regular languages

: , , .

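First, the following languages are regular:

$\varnothing$: the empty language, which contains no words;
$\{\varepsilon\}$: the language whose only word is the empty word;
$\{a\}, a \in \Sigma$: a language consisting of a single one-character word.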

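Second, if $A$ and $B$ are regular languages, then the following languages are regular as well:

$A \cup B$: the union;
$A \cdot B = \{\alpha \beta | \alpha \in A, \beta \in B \}$: the concatenation, that is, every word of $A$ followed by every word of $B$;
$A^*= \{\alpha_1 \alpha_2 ... \alpha_k | k \in \mathbb{N}_0, \alpha_i \in A \}$: the Kleene star, that is, concatenations of $k$ words of $A$ for every non-negative $k$.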

Notation: $\mathbb{N}_0 = \mathbb{N} \cup \{0\}$, that is, the natural numbers together with zero.

The concatenation sign is usually omitted: $A \cdot B = AB$.
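The three operations above can be tried out on small finite languages. The following is a minimal sketch, not part of the original article: the helper names `union`, `concat` and `star` are made up for illustration, and since $A^*$ is infinite in general, `star` only builds concatenations of at most `max_k` words.

```python
from itertools import product

def union(a, b):
    """A ∪ B for languages given as Python sets of strings."""
    return a | b

def concat(a, b):
    """A · B: every word of A followed by every word of B."""
    return {x + y for x, y in product(a, b)}

def star(a, max_k):
    """Finite approximation of A*: concatenations of at most max_k words of A."""
    result = {""}   # k = 0 gives the empty word
    layer = {""}
    for _ in range(max_k):
        layer = concat(layer, a)   # words made of exactly one more factor
        result |= layer
    return result

A = {"a", "bc"}
B = {"", "d"}
print(sorted(union(A, B)))
print(sorted(concat(A, B)))
print(sorted(star({"ab"}, 3)))
```

Here `""` plays the role of $\varepsilon$, so `B` is a language that contains the empty word.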

. , , . , PuTTY, - :

http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+

, url' , , , , url'. .

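Practical regular expressions add convenient shorthands: character ranges ([0-9]), optional elements ([s]?), one-or-more repetition (a+ is the same as aa*), and so on.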

In what follows we use only concatenation (written without a symbol), | (union) and * (Kleene star).

For example, consider the regular expression:

$abc(de^*f|de)$

It describes, among others, the following words:

abcde
abcdf
abcdef
abcdeef
abcdeeef
abcdeeeef
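The list above is easy to check mechanically. The sketch below uses Python's standard `re` module; the non-capturing group `(?:...)` is just the Python spelling of the parentheses, and the list `non_words` is my own choice of counterexamples, not something from the article.

```python
import re

# Python spelling of the expression abc(de*f|de)
pattern = re.compile(r"abc(?:de*f|de)")

def in_language(word):
    """True if the whole word is described by the expression."""
    return pattern.fullmatch(word) is not None

words = ["abcde", "abcdf", "abcdef", "abcdeef", "abcdeeef", "abcdeeeef"]
non_words = ["abc", "abcd", "abcdee", "abcdff"]  # hypothetical counterexamples

for w in words + non_words:
    print(w, in_language(w))
```

`fullmatch` is used rather than `search` because membership in a language is a statement about the whole word, not about some substring of it.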

2. Finite automata

. — , . , (). .

. , , .

, , . , , . , — , .

. , , , , .

For example, an automaton for the regular expression $bab|aab^{*}ca$.

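Let $a_i$, $a_j$, $a_k$ be states of the automaton. Suppose the transition from $a_i$ to $a_k$ is labeled with the language $L_{ik}$. Similarly, the transition from $a_i$ to $a_j$ is labeled with $L_{ij}$, and the transition from $a_j$ to $a_k$ with $L_{jk}$. Finally, the loop from $a_j$ to itself is labeled with $L_{jj}$.

To remove the state $a_j$, the transition from $a_i$ to $a_k$ has to be relabeled with the language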

$L^{'}_{ik} = L_{ik} \cup L_{ij} \cdot L^{*}_{jj} \cdot L_{jk}$

, $a_j$ . , , , .
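The relabeling formula can be tried out on a tiny example. This is only a sketch under my own assumptions: transition labels are represented as Python regular expression strings, the function name `eliminate` is hypothetical, and a missing transition (an empty language) is not handled.

```python
import re

def eliminate(l_ik, l_ij, l_jj, l_jk):
    """Label of the i -> k transition after removing state j.

    Each argument is a regular expression string labelling one transition;
    the result mirrors L_ik ∪ L_ij · L_jj* · L_jk.
    """
    return f"(?:{l_ik})|(?:{l_ij})(?:{l_jj})*(?:{l_jk})"

# Hypothetical automaton: i --d--> k, i --a--> j, j --b--> j, j --c--> k
label = eliminate("d", "a", "b", "c")
print(label)

for word in ["d", "ac", "abbbc", "ab", "dc"]:
    print(word, re.fullmatch(label, word) is not None)
```

After the elimination, every path from $a_i$ to $a_k$ that used to go through $a_j$ (enter, loop any number of times, leave) is covered by the new label.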

, , . , - , $\varepsilon$ . , . , .

, . : ; , , , , ; , , , , .

, . , , :

, .

3. Pumping lemma for regular languages

Consider the language $L = \{a^nb^n | n \in \mathbb{N}_0 \}$. It consists of the words $\varepsilon$, $ab$, $aabb$, $aaabbb$, and so on. Is it regular?

, , . , , : , $a$ — $b$ . , ? … , , ?

, ! , , .

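Pumping lemma for regular languages. For every regular language $L$ there exists $n \in \mathbb{N}$ such that for every word $w \in L$ with $|w| \ge n$ there exist words $x$, $y$, $z$ such that: $w = xyz$; $y \ne \varepsilon$; $|xy| \le n$; $\forall k \ge 0: xy^kz \in L$.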

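Let $L$ be a regular language, let $n$ be the number of states of a finite automaton that recognizes it, and let $w \in L$ be a word of length at least $n$.

While reading $w$, the automaton passes through a sequence of states: $a_0$, $a_1$, $a_2$, ..., $a_m$. There are $m+1$ states in this sequence, where $m = |w| \ge n$.

The automaton has only $n$ distinct states, so some state occurs in the sequence more than once. Let $a_i$ be the first state that repeats, and let $j$ be the index of its second occurrence. Let $x$ be the first $i$ characters of $w$, $y$ the part of $w$ read between $a_i$ and $a_j$, and $z$ the part of $w$ read between $a_j$ and $a_m$.

Since $a_i$ and $a_j$ are the same state, the cycle that reads $y$ can be traversed any number of times (including zero!) and the automaton still finishes in an accepting state, so $\forall k \in \mathbb{N}_0: xy^kz \in L$.

The state $a_i$ is the first one that repeats, and $a_j$ is its first repetition. Therefore the states $a_0$, $a_1$, ..., $a_{j-1}$ are pairwise distinct. There are at most $n$ of them, so $j \le n$ and $|xy| \le n$, which completes the proof.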
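The construction from the proof can be replayed in code: run the automaton, find the first repeated state, and cut the word there. This is a minimal sketch with assumptions of my own: the automaton is a partial deterministic one stored as a dictionary, the example automaton for $ab^*c$ is hypothetical, and `pump_split` expects a word that the automaton can read to the end.

```python
def run(dfa, start, word):
    """Return the states a_0, ..., a_m visited while reading word."""
    states = [start]
    for ch in word:
        states.append(dfa[(states[-1], ch)])
    return states

def pump_split(dfa, start, word):
    """Split word into x, y, z at the first repeated state, as in the proof."""
    first_seen = {}
    for j, state in enumerate(run(dfa, start, word)):
        if state in first_seen:
            i = first_seen[state]
            return word[:i], word[i:j], word[j:]
        first_seen[state] = j
    return None  # no state repeats: the word is shorter than the number of states

def accepts(dfa, start, accepting, word):
    """True if the automaton reads the whole word and stops in an accepting state."""
    state = start
    for ch in word:
        if (state, ch) not in dfa:
            return False
        state = dfa[(state, ch)]
    return state in accepting

# Hypothetical automaton for the language ab*c: 0 --a--> 1, 1 --b--> 1, 1 --c--> 2
dfa = {(0, "a"): 1, (1, "b"): 1, (1, "c"): 2}

x, y, z = pump_split(dfa, 0, "abbc")
print(repr(x), repr(y), repr(z))
for k in range(4):
    print(k, accepts(dfa, 0, {2}, x + y * k + z))
```

For the word `abbc` the visited states are 0, 1, 1, 2, so the cut happens at the first return to state 1, and every pumped word is again accepted.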

: , , ( .. ) , , .

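Let us prove that the language $L = \{a^nb^n | n \in \mathbb{N}_0\}$ is not regular. Suppose it is, and let $n$ be the constant from the lemma. Consider the word $a^nb^n$ and any decomposition $a^nb^n = xyz$ that satisfies the lemma. Since $|xy|\le n$, the prefix $xy$ consists only of the letters $a$. So $y$ also consists only of $a$, and it is not empty. Then for $k>1$ the word $xy^kz$ contains more letters $a$ than letters $b$, so it does not belong to $L$. This contradicts the lemma, so the assumption was wrong: $L$ is not regular!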

The same argument shows that the language $(^n)^n$ of nested parentheses is not regular.
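The argument can be illustrated by brute force: for small $n$, enumerate every split of $a^nb^n$ allowed by the lemma and check that pumping always leaves the language. This is a sketch for illustration only (the function names are mine), and a finite check like this is of course not a proof.

```python
def in_l(word):
    """Membership test for L = { a^n b^n }."""
    half, odd = divmod(len(word), 2)
    return odd == 0 and word == "a" * half + "b" * half

def pumpable_splits(word, n, k=2):
    """Splits word = xyz with y != '' and |xy| <= n for which x y^k z is still in L."""
    good = []
    for i in range(n + 1):              # i = |x|
        for j in range(i + 1, n + 1):   # j = |xy|, so y is not empty
            x, y, z = word[:i], word[i:j], word[j:]
            if in_l(x + y * k + z):
                good.append((x, y, z))
    return good

for n in range(1, 8):
    print(n, pumpable_splits("a" * n + "b" * n, n))
```

Every list printed is empty: no admissible split survives even a single extra copy of $y$.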

4. Context-free grammars

— , .

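A grammar works with two alphabets: the terminal (main) symbols $T$ and the nonterminal (auxiliary) symbols $N$; $\Sigma = N \cup T$. $S \in N$ is the start symbol.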

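$P$ is the set of production rules. It is given by a relation $\varphi$ on strings over $\Sigma$: $(s_1, s_2) \in \varphi$ means that $s_1$ may be replaced by $s_2$. Instead of $(s_1, s_2) \in \varphi$ one usually writes $s_1 \rightarrow s_2$.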

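A string $\beta$ is directly derivable from $\alpha$ if $\alpha = xs_1z$, $\beta = xs_2z$ and $(s_1, s_2) \in \varphi$. In other words, $\beta$ is obtained from $\alpha$ by applying one rule once. Notation: $\alpha \vdash \beta$.

We say that $\beta$ is derivable from $\alpha$ (in several steps) if there exist strings $s_0 = \alpha$, $s_1$, $s_2$, ..., $s_{k+1}=\beta$ such that $s_i \vdash s_{i+1}$ for every $i$. Notation: $\alpha \Rightarrow \beta$.

A string $s$ belongs to the language generated by the grammar if $s \in T^*$ and $S \Rightarrow s$. That is, it consists of terminal symbols only and is derivable from the start symbol.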
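The two derivability relations translate almost literally into code. The sketch below is my own illustration: rules are `(left, right)` pairs of strings, the function names are hypothetical, and `derivable` is bounded by `max_steps` because an unbounded search need not terminate. The sample grammar with the rules $S \rightarrow aSb$ and $S \rightarrow \varepsilon$ is chosen only as a small example.

```python
def direct_derivations(rules, alpha):
    """All strings beta with alpha ⊢ beta: one rule applied at one position."""
    result = set()
    for left, right in rules:
        pos = alpha.find(left)
        while pos != -1:
            result.add(alpha[:pos] + right + alpha[pos + len(left):])
            pos = alpha.find(left, pos + 1)
    return result

def derivable(rules, alpha, beta, max_steps):
    """Check alpha ⇒ beta using between 1 and max_steps rule applications."""
    frontier = {alpha}
    for _ in range(max_steps):
        # Replace the frontier by everything reachable in one more step.
        frontier = set().union(*(direct_derivations(rules, s) for s in frontier))
        if beta in frontier:
            return True
    return False

# Sample grammar (an assumption for this example): S -> aSb, S -> ε
rules = [("S", "aSb"), ("S", "")]
print(sorted(direct_derivations(rules, "aSb")))
print(derivable(rules, "S", "aabb", 3))
print(derivable(rules, "S", "aab", 5))
```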

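A grammar in which the left-hand side of every rule is a single nonterminal is called context-free (a CF grammar), and a language generated by such a grammar is called a context-free language.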

For example, here is a context-free grammar for the language of balanced parentheses:

$S \rightarrow (S)S$
$S \rightarrow \varepsilon$

, , . , - , . ; .

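To experiment with grammars, here is a small brute-force "solver" that searches for a derivation of a given word: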

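def BuildPath(queue, parents, parent):
    # Walk from the given queue index up to the root, collecting strings on the way.
    path = []
    while parents[parent] != parent:  # the root is its own parent
        path.append(queue[parent])
        parent = parents[parent]
    return path[::-1]  # root-to-leaf order; the root itself is not included

def Solve(rules, target):
    # Breadth-first search for a derivation of target from the start symbol 'S'.
    # rules is a list of (left, right) pairs; the result is the list of derivation steps.
    # The search may run forever if target is not derivable.
    queue = ['S']   # strings discovered so far, in BFS order
    parents = [0]   # parents[i] is the queue index of the string that produced queue[i]
    seen = {'S'}    # strings already queued, so that duplicates are not expanded again

    idx = 0
    while idx < len(queue):
        current = queue[idx]

        for left, right in rules:
            # Try the rule at every position where its left-hand side occurs.
            entryIdx = current.find(left)
            while entryIdx != -1:
                new = current[:entryIdx] + right + current[entryIdx + len(left):]

                if new == target:
                    return [queue[0]] + BuildPath(queue, parents, idx) + [new]

                if new not in seen:
                    seen.add(new)
                    queue.append(new)
                    parents.append(idx)

                entryIdx = current.find(left, entryIdx + 1)

        idx += 1

    return None  # the queue was exhausted without reaching target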
, , , , , ; ! :

rules = [
    ("S", "(S)S"),
    ("S", ""),
]
target = "(()())()"
print('\n'.join(Solve(rules, target)))

S
(S)S
((S)S)S
((S)(S)S)S
((S)(S)S)(S)S
(()(S)S)(S)S
(()()S)(S)S
(()())(S)S
(()())()S
(()())()

A context-free grammar for the language $L = \{a^nb^n | n \in \mathbb{N}_0\}$:

rules = [
    ("S", "aSb"),
    ("S", ""),
]
target = "aaabbb"
print('\n'.join(Solve(rules, target)))

S
aSb
aaSbb
aaaSbbb
aaabbb

- , . , -, . , - : , , , .

, - .

5. Pumping lemma for context-free languages

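Consider the language $L = \{ a^nb^nc^n | n \in \mathbb{N}_0 \}$. We already have a context-free grammar for $a^nb^n$, so a natural first attempt is the following context-free grammar: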

$S \rightarrow \varepsilon$
$S \rightarrow AB$
$A \rightarrow aAb$
$B \rightarrow Bc$
$A \rightarrow \varepsilon$
$B \rightarrow \varepsilon$

But this grammar generates the language $a^nb^nc^m$. How can $m$ be forced to equal $n$?

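Pumping lemma for context-free languages. For every context-free language $L$ there exists $n \in \mathbb{N}$ such that for every word $w \in L$ with $|w| \ge n$ there exist words $u$, $v$, $x$, $y$, $z$ such that: $w = uvxyz$; $vy \ne \varepsilon$; $|vxy| \le n$; $\forall k \ge 0: uv^kxy^kz \in L$.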

, .

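For the proof it is convenient to use a normal form. Every context-free grammar can be converted into an equivalent one in which every rule has one of the following forms: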
$S \rightarrow \varepsilon$
$A \rightarrow BC$
$A \rightarrow a$

, , : , — . . .

- acd

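Now the proof itself. Let $n=2^{|N|+1}$, where $|N|$ is the number of nonterminals of the grammar in normal form, and let $w \in L, |w| \ge n$. Consider the derivation tree of $w$. Every nonterminal in it has at most two children, so the tree is binary. A binary tree with that many leaves has height at least $|N|+1$, so its longest path from the root passes through at least $|N|+1$ nonterminals. There are only $|N|$ distinct nonterminals, so some nonterminal occurs on this path twice.

Let $B$ be such a repeated nonterminal, and among all repetitions on the path choose the pair of occurrences of $B$ that lies closest to the leaves.

Since $B$ occurs in the derivation tree, $S \Rightarrow uBz$. The lower occurrence of $B$ is derived from the upper one, that is, $B \Rightarrow vBy$, where $vy \ne \varepsilon$: in normal form the first step from the upper $B$ produces two nonterminals, and the one that does not lead to the lower $B$ derives a non-empty string. Finally, the lower $B$ derives some string $x$.

$S \Rightarrow uBz$
$B \Rightarrow vBy$
$B \Rightarrow x$

Repeating the middle derivation $k$ times gives $\forall k \in \mathbb{N}_0: S \Rightarrow uv^kxy^kz$, with $vy \ne \varepsilon$.

It remains to bound the length of $vxy$. Because the repetition of $B$ was chosen as low as possible, the subtree under the upper $B$ contains no other repetitions on its longest path, so its height is bounded in terms of $|N|$. Such a subtree has at most $2^{|N|+1} = n$ leaves. Hence $|vxy| \le n$.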

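Let us prove that the language $L = \{ a^nb^nc^n | n \in \mathbb{N}_0\}$ is not context-free. Suppose it is. Then let $n$ be the constant from the lemma, and consider the word $a^nb^nc^n$.

By the lemma, $a^nb^nc^n=uvxyz$, where $|vxy| \le n$ and $vy \ne \varepsilon$, and all the words $uv^kxy^kz$ belong to $L$.

The string $vxy$ cannot contain both $a$ and $c$, because in $w$ the letters $a$ and $c$ are separated by $n$ letters $b$, while $vxy$ has length at most $n$.

So as $k$ grows, the word $uv^kxy^kz$ gets longer while the count of at least one of the three letters stays the same. Hence for $k>1$ there is no $m$ such that $uv^kxy^kz = a^mb^mc^m$, and the word does not belong to $L$. This is a contradiction, so $L$ is not context-free!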
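As with regular languages, the argument can be illustrated by brute force for small $n$: enumerate every split $uvxyz$ of $a^nb^nc^n$ allowed by the lemma and check that pumping with $k=2$ always leaves the language. This is a sketch of my own (the function names are hypothetical), and checking a few values of $n$ is an illustration, not a proof.

```python
def in_l(word):
    """Membership test for L = { a^n b^n c^n }."""
    n, rest = divmod(len(word), 3)
    return rest == 0 and word == "a" * n + "b" * n + "c" * n

def pumpable_splits(word, n, k=2):
    """Splits word = uvxyz with vy != '' and |vxy| <= n for which u v^k x y^k z is in L."""
    good = []
    length = len(word)
    for start in range(length + 1):                # start = |u|
        end_max = min(start + n, length)           # keeps |vxy| <= n
        for p in range(start, end_max + 1):        # v = word[start:p]
            for q in range(p, end_max + 1):        # x = word[p:q]
                for end in range(q, end_max + 1):  # y = word[q:end]
                    u, v, x = word[:start], word[start:p], word[p:q]
                    y, z = word[q:end], word[end:]
                    if v + y and in_l(u + v * k + x + y * k + z):
                        good.append((u, v, x, y, z))
    return good

for n in range(1, 6):
    print(n, pumpable_splits("a" * n + "b" * n + "c" * n, n))
```

Every list printed is empty, in line with the argument above: a window of length at most $n$ never touches all three letters, so one of the three counts is left behind.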

6. General grammars

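General grammars, in which the left-hand side of a rule may be an arbitrary string, are more powerful than context-free ones. For example, let us build a grammar for the language $L = \{ a^nb^nc^n | n \in \mathbb{N}_0 \}$, which, as we have just seen, is not context-free.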

$S \rightarrow \varepsilon$
$S \rightarrow aHbCE$

$E$ , $b$ $c$ . :

$E \rightarrow \varepsilon$

$H$ :

$H \rightarrow aHbC$
$H \rightarrow \varepsilon$

, , $C$ $c$ . - !

$Cb \rightarrow bC$
$CE \rightarrow Ec$

! 5 , :

rules = [
    ("S", "aHbCE"),
    ("H", ""),
    ("H", "aHbC"),
    ("Cb", "bC"),
    ("CE", "Ec"),
    ("E", ""),
]
target = "aaabbbccc"
print('\n'.join(Solve(rules, target)))

S
aHbCE
aaHbCbCE
aaaHbCbCbCE
aaaHbbCCbCE
aaaHbbCbCCE
aaaHbbbCCCE
aaaHbbbCCEc
aaaHbbbCEcc
aaaHbbbEccc
aaaHbbbccc
aaabbbccc

: «» ; $C$ . .

( ) , . - .
