形式语言与自动机

发表于2023-02-23|更新于2023-04-23

|阅读量:

Induction 数学归纳法

regular language 正则语言

definition 定义

在字母表集合Σ上的正则语言定义如下：

空集合Ø是正则语言
只包含一个空串的语言{ε}是正则语言
对所有a∈Σ，{a}是正则语言
若A, B是正则语言，则$A \cdot B$，$A \bigcup B$，$A^*$都是正则语言
除此之外都不是正则语言

regular expression 正则表达式

regular expressions are a compact syntax for regular language

正则表达式是正则语言的紧凑语法

multiple regular expressions can denote the same language

多个正则表达式可以表示同一种语言，例如0+0、01+0、0都可以表示empty Ø

word over A={a,b} that have at least one a

其中${ a}\bigcup { b}$不可以写成{a, b}，因为单例集是基本的构建块(singleton sets are the basic building blocks)。但是可以化简成

句法和语义的关系

正则表达式与正则语言关系及相互转化

证明words starting with n a‘s and n b’s，$L={ a^nb^n|n∈N }$不是正则语言

Deterministic finite automata 确定有限状态自动机

Definition 定义

A DFA is a tuple$(Q,\delta,i,F)$

DFA接受word

a word $w∈A^*$ is accepted by a DFA$(Q,\delta,i,F)$ if $\delta ^*(i, w)=F$

DFA接受language

通过DFA归纳语言例子

DFA are not the most efficient data structure DFA 不是最高效的数据结构

Q1: is the language regular?

yes

Q2: if regular what is the regular expression?

$(a+b)^*a(a+b)(a+b)(a+b)$

Q3: How many states would one need for DFA?

total 16 states

为什么DFA不能表示成下面的形式

there are two a transitions
state 1 has zero transition

nondeterministic finite automata 非确定有限状态自动机

缘由：DFA的状态转移总数会基于状态数呈现指数级增长

知识点：P(s) powerset of a set S S集合的幂集，是一个由S全部子集组成的集合，S={s1, s2}，则P(S)={Ø, {s1}, {s2}, {s1, s2}}

definition 定义：

每个DFA也是一个NFA，根据定义{q}也是{q}的子集

there is always a equivalent DFA given an NFA 给定一个 NFA，总有一个等价的 DFA

word accepted by an NFA

there must be at least one path that accept the word

a word $w∈A^*$ is accepted by an NFA$(Q, i, \delta, F)$ if and only if

样例

Language accepted by an NFA NFA接受的语言

NFA with λ-transition

Goal: give NFAs the capability of selectiong a new state without reading a symbol.

$\delta:Q\times(A\cup\lambda)\rightarrow P(Q)$

知识点：λ-closure λ闭包

given a state q∈Q in an NFA returns the set of states one can reach using only λ.

λ-closure(q) is the smallest subset of Q.

使用λ-NFA，我们现在可以轻松地为语言并集、顺序组合和星形$(+，\cdot，*)$构建自动机。

Regular expression -> NFA-λ -> NFA -> DFA转化

goal 1: how to eliminate lambda transitions

rule 1: given a state q in the automata, the state is final if q was already final or 可以仅使用λ到达最终状态

rule 2:

construction

可靠性定理证明

NFA-λ和NFA有着同样的表现力

goal 2: how to construct an equivalent DFA from a given NFA

construction

样例

DFAs have $2^n$ states when building from NFAs but not all states are needed。因为不需要的状态无法从初始状态到达。对于一个有着n状态的NFA构建DFA，最坏情况是$|P(Q)|=2^n$

从NFA映射到DFA的结构有两个名称：subset construction 子集构造，determinisation 确定性

可靠性证明

chomsky hierarchy 语言体系

finite language 有限语言

regular language 正则语言

context-free language 上下文无关文法

context-concentive language 上下文相关文法

tree 数

DFA、NFA、NFA-lambda 有限状态自动机

pushdown automata 下推自动机

turing mechine 图灵机

kleene’s theorem

How to use kleene’s theorem to show that a language is not regular

proof of kleene’s theorem

DFA -> R.exp.：state-elimination algorithm 使用状态消解法

Take an automata with n states and delete state by state until having an automata with just two states.

R.exp.->DFA：Brzozowski Derivative

可以使用下面的方法，但是另一种方法Brzozowski Derivative可以直接转换

solution to a Linear system 针对线性系统的方法

Plan for today

closure properties of regular language 正则语言的闭包特性

$L^R$ is regular given L regular

根据kleene’s theorem，只需证明Produce a DFA for $L^R$ given a DFA for L

swap initial an final states
reverse all arrows

how to prove that a language is not regular: pumping lemma

if length of a word is long enough the path in the automata needs to visit same state over and over again

例题

context-free language 上下文无关语言

a CFG is a tuple(V, $\Sigma$, P, S)

V is a set of sumbols which we call terminal

$\Sigma$ is the alphabet(terminal symbols)

P a final set of rules. a rule is A->w where a∈ V, $w∈(V∪\Sigma)^*$

S∈V, starting symbol

words generated by a grammer

language accepted/generate by grammar

given a grammar G=(V, $\Sigma$, P, S) the setof words or languages accepted by G:

goal: prove $a^nb^n$ is generated by G for any n∈N

从语法生成words的不同策略（树结构）

there are multiple left most derivations of the same word

S->aS->aa和S->Sa->aa都可以表示aa

pushdown automata for CFL

$L={a^nb^n|n∈N}$ is not regular, there is no finate ddeterministic automata accepting L, because we need to count a’s. We need automata with memory to count. Use stack.

defination

多个状态，多个堆栈

example

language accepted by a pushdown automata

state, word, stack = configuration of PDA

state where reading w start, start with an empty stack

acceptance by empty stack and final state

GOAL: Given a CFG build a pushdown automaton accepting the same language

转化图的定义不太正确

第一个与过渡的类型有关，允许多个符号推送到堆栈是一个错误。只允许检查堆栈的顶部，并写一个符号。这个问题可以通过分解状态解决

第二个问题是为所有非终端提供的所有生产规则都有终端。这些语法比较特殊

PDA->CFG

closure properties for CFL

facts CFL are closed under union, concaleation, and kleene star

facts CFL are not closed under intesection

pumping lemma for CFL

在上下文无关语法中拥有的每一棵树都将具有最大深度，因此，我们不是像对常规语言那样查看字符串的长度，而是查看生成树的深度。

习题课

DFA

问题一：Design a DFA that accepts the language

例一：

解答：

例二：

解答：

例三：

NFA

nfa-λ

build $A(ab^*+a)$

DFA、NFA、NFA-λ相互转化

Kleene’s theorem

Regular expression -> NFA-λ -> NFA -> DFA转化

fact(*): if u,L,w are languages, λ不属于L, then if u=Lu+w, $u=L^ *w $

通过$aa^c+c=(aa^+1)c=a^*c$进行化简，对b同理

$a(a^c)+a(a^b)(a^b)^(a^c)=a((1+(a^b)(a^b)^)(a^c))=a(a^b)^a^c$

通过$(u^L)^u^*=(u+L)^*$进行化简得到$a(a+b)^*c$

R.exp.->DFA：Brzozowski Derivative

常规语言的闭包属性

Prove $A^*/L$ is a regular language

Main tool for showing closure properties: Kleene’s theorem

L is regular if L is recognized by a DFA

知识点：$w∈L^C$ if and only if w不属于L，表示补集

$b^a(a+bb^a)^*=b^a((1+bb^)a)^*=b^a(b^a)^*=(b^a)^b^a=(a+b)^a$

$[(a+b)^*]^c=?$

Fact: if $L_1$ and $L_2$ are regular languages, $L_1∩L_2$ is also regular.

a language that is not regular

context free grammar

what language is generated by the CFG below?

Design a CFG generating $L={0^n1^n|n>=0}∪{1^n0^n|n>=0} $

pushdown automata

another pumping lemma

正则语言抽水定理：如果我有无限的正则语言，并且在自动机中曾经识别过，应该能够找到并使用一些循环来pump

上下文无关语言抽水定理：如果我有上下文无关语言，会在某一阶段多次将值推入堆栈上，并且在循环的稍后阶段将其多次弹出。

附言：学习link

编译原理学习之：正则表达式（regular expression）和非正则语言（non-regular languages）_暖仔会飞的博客-CSDN博客_编译原理正则表达式

文章作者: 萌新QAQ

文章链接: https://jszmwq.github.io/2023/02/23/%E5%BD%A2%E5%BC%8F%E8%AF%AD%E8%A8%80%E4%B8%8E%E8%87%AA%E5%8A%A8%E6%9C%BA/

版权声明: 本博客所有文章除特别声明外，均采用 CC BY-NC-SA 4.0 许可协议。转载请注明来自 j52nnw9的博客！

相关推荐

ChatGPT对网络空间安全的影响

pintos操作系统实验

同构网络可分任务调度问题

大数据优化建模与算法笔记

怎样使用ARENA系统

操作系统课程设计大作业

本地搜索