2012/09/06

Implementing a Trie with Two Arrays

Each node of a trie means that the edges collected on the path from the root to that node form either a prefix of some word or a complete word. So from any node, the number of possible next steps is at most the number of letters in the alphabet. The simplest trie implementation keeps N pointers in each node, where N is the alphabet size; naturally some of the pointers are null.

In this implementation a flag bit can be added to each node, marking that the node ends a complete word; this distinguishes a lookup that matches only a prefix of a word from one that matches a whole word.


Implementing the trie with two arrays


A trie can be seen as a finite state machine whose states are the prefixes (including complete words) of the words in the vocabulary. For example, if "there" is in the vocabulary, then th is a state, and the transition function f satisfies f(th, e)=the, f(th, z)=th.

If we want to store a trie in an array, the most direct idea is shown in the figure below. In the example the alphabet is "abcd", and we assume every possible string is in the trie.



Each array element holds the base index of the next states. For any current state, the number of possible next states equals the alphabet size, here 4, so we allocate a contiguous block of size 4 to represent them. The base index is the index of the start of this block, and it is stored in the array element representing the current state; for example, the base indices of all the possible states bx (where x=a,b,c,d) are marked in the figure.

Stored this way, the array cannot handle a sparse trie. Suppose the states bx contain only "ba", "bb", "bc" but not "bd", and to save space we allocate a block of size 3 for bx instead of 4. This creates a problem: the slot that would have held the base index of "bdx" now holds the base index of "cax" instead.



The two cases can be distinguished by the predecessor state: if the slot belongs to bd, its pre-state is some xxxxb; if it belongs to ca, its pre-state is some xxxxxc. So the solution is to keep another array that records each state's pre-state.



Lookup algorithm. To search for bdc:
1. base index of state "": base[0]=0

2. state change from "" to "b":

index of state "b" is (i = base[0] + 'b' - 'a')

if( check[i] == "" ) bi = base[i] //(base index of bx)
else failed

3. state change from "b" to "bd":

index of state "bd" is (i = base[bi] + 'd' - 'a')

if( check[i] == "b" ) bi = base[i] //(base index of bdx)
else failed

Step 3 returns failed, i.e. "bdc" is not in the trie.

Complete words are distinguished from mere prefixes as follows: for the last input character the check succeeds, but base[i] is invalid. For example, if the base array is initialized to 0, then base[i]==0 with i!=0 is invalid.

Each element of the base array carries two meanings: its index represents a state, and its value is the base index of that state's immediate successors. So in the code above, the "b" in check[i] == "b" should be written as an index into base, because "b" denotes a state, and states are equivalent to base indices. It becomes check[i] == base[0] + 'b' - 'a'; for example, step 3 should read:

3. state change from "b" to "bd":

pre_i = i // keep the index of state "b"

index of state "bd" is (i = base[bi] + 'd' - 'a')

if( check[i] == pre_i ) bi = base[i] //(base index of bdx)
else failed




Building the trie


Building is a process of repeatedly inserting strings, and we will run into positions that are already occupied. For example, when inserting "bdc" in the example above, on reaching state "bd" we find the slot already taken by "ca". There are two choices: move the whole bx block to the end of the array, or move the cx block. Move whichever is smaller.

Moving the b block to position p:

base[p + i] = base[b + i]
check[p + i] = check[b + i] //==""

//update check[] of the states that had "b" as pre-state

The base index of the successor block of state p+i is bi=base[p + i], so the pre-state of base[bi] + j (assuming this index is valid) becomes p+i, that is, check[base[bi] + j] = p+i

// free b block
check[b + i] = 0
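The lookup and the block-move logic above can be sketched in Python. This is a minimal sketch, not a production double-array trie: the class name, the `end` flag array (playing the role of the "complete word" flag discussed earlier), and the naive free-slot search are my own choices.

```python
ALPHA = "abcd"          # the alphabet assumed in the running example
N = len(ALPHA)

class DATrie:
    """base[s] is the start of state s's child block; check[i] stores
    the index of node i's pre-state (parent); end[s] marks a whole word."""

    def __init__(self, cap=256):
        self.base = [1] * cap
        self.check = [-1] * cap   # -1 marks a free slot
        self.end = [False] * cap
        self.check[0] = 0         # the root state lives at index 0

    def _children(self, s):
        return [c for c in range(N) if self.check[self.base[s] + c] == s]

    def _free_base(self, codes):
        # find a base b whose slots b+c are all free for the given codes
        b = 1
        while any(self.check[b + c] != -1 for c in codes):
            b += 1
        return b

    def _relocate(self, s, extra):
        """Move s's child block so it can also hold the codes in `extra`."""
        old, kids = self.base[s], self._children(s)
        new = self._free_base(kids + extra)
        for c in kids:
            self.check[new + c] = s
            self.base[new + c] = self.base[old + c]
            self.end[new + c] = self.end[old + c]
            for g in range(N):   # grandchildren now point at the moved node
                if self.check[self.base[old + c] + g] == old + c:
                    self.check[self.base[old + c] + g] = new + c
            self.check[old + c] = -1          # free the old slot
        self.base[s] = new

    def insert(self, word):
        s = 0
        for ch in word:
            c = ALPHA.index(ch)
            t = self.base[s] + c
            if self.check[t] != s:            # edge does not exist yet
                if self.check[t] != -1:       # slot taken: move the block
                    self._relocate(s, [c])
                    t = self.base[s] + c
                self.check[t], self.base[t], self.end[t] = s, 1, False
            s = t
        self.end[s] = True

    def lookup(self, word):
        s = 0
        for ch in word:
            t = self.base[s] + ALPHA.index(ch)
            if self.check[t] != s:
                return False
            s = t
        return self.end[s]
```

Inserting "bdc" after "ba", "bb", "bc" exercises exactly the collision described above, and the `end` flag makes `lookup("bd")` fail even though "bd" is a valid prefix.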

2012/08/30

A Backtracking Algorithm for Euler Circuits



First, the algorithm does not produce all circuits; when it terminates we get just one circuit, and checking the edges in a different order yields a different circuit. Number all the edges; the variable g[node][i] denotes the node reached by the i-th edge out of node.

search can be seen as checking whether the current node node has any unused edges; if not, it returns, popping one level up the call stack. When search( g[node][i] ) returns, it means edge i reached node g[node][i] but that node had no usable edges left, so we record i (backtracking) and then check whether any other edge out of node lets us advance. If node had only that one edge, the algorithm returns one more level up; the level above concludes that node has no usable edges and records the edge that led to node during exploration.

Tracing the run: the algorithm starts at some node and keeps advancing as long as it finds unused edges — call this exploration mode — until no edge is usable. Note this does not mean all edges have been used; it may just be that all edges incident to the current node are used. With no usable edge, the algorithm starts backtracking, recording edges as it goes. If backtracking reaches a node that still has unused edges, the algorithm advances again until it is stuck, then backtracks again. When all edges are finally used, the algorithm backtracks all the way to the start, and the last edges recorded are the ones leaving the start.

Any edge is used only once during the run. When backtracking begins, we must have arrived at a previously visited node: if we reached an unvisited node that had no usable edges, that node would have degree 1, and no Euler circuit would exist.

When does the first backtrack happen? Before any backtracking, viewed from any single node, the algorithm must alternately enter and leave. For any non-start node, each enter-and-leave pair reduces its degree by 2, so a non-start node can never be entered without being left, since all its degree is even. The first backtrack must therefore occur at the start node, because the start is left directly, reducing its degree by only 1.

One more point: although the first backtrack occurs at the start, during backtracking we may still find nodes whose degree is not used up, and the algorithm may re-enter exploration mode. In short: an unused edge at the current node means exploration mode; no unused edge means backtracking mode.
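The explore-then-record-on-backtrack scheme above is Hierholzer's algorithm; a minimal sketch (the adjacency representation with shared used-flags is my own choice, assuming an Eulerian circuit exists):

```python
from collections import defaultdict

def euler_circuit(edges, start):
    """Walk unused edges greedily (exploration mode); record a node only
    when backtracking out of it.  The recorded sequence, reversed, is an
    Euler circuit.  Assumes all degrees are even and the graph is connected."""
    g = defaultdict(list)            # g[node] = edges incident to node
    for u, v in edges:
        e = [u, v, False]            # shared used-flag for both endpoints
        g[u].append(e)
        g[v].append(e)

    path = []
    def search(node):
        for e in g[node]:
            if not e[2]:             # an unused edge: keep exploring
                e[2] = True
                search(e[1] if e[0] == node else e[0])
        path.append(node)            # no usable edge: record while backtracking

    search(start)
    return path[::-1]
```

On a triangle (0,1),(1,2),(2,0) this returns a tour such as 0,1,2,0, using each edge exactly once.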

2012/08/25

Matching

A matching in a graph is a set of edges M such that no two edges in M share a vertex. The name probably originates from the marriage problem: a shared vertex would mean bigamy, which is not allowed.

Marriage Problem: there are $m$ boys, each of whom knows some girls. We now match them, requiring that every boy marry a girl he knows. Find the necessary and sufficient condition for such a matching to exist.

Hall's Theorem: the Marriage Problem has a solution if and only if any $k$ boys collectively know at least $k$ girls, for every $1\leq k\leq m$.

Maximum matching in a bipartite graph can be found with augmenting paths from network flow; first recall some network-flow concepts.

Residual network: for a flow on a network, the residual network records all the degrees of freedom by which the flow can change without violating the capacity constraint on any edge. Moreover, a flow on the residual network superimposed on the original flow is again a legal flow on the original network, satisfying the three flow constraints.

Augmenting path: a simple path from source to sink in the residual network.

Residual capacity of an augmenting path: the maximum flow the residual network allows along that path.

Maximum matching in a bipartite graph



An augmenting path is just a simple path in a directed graph, so either BFS or DFS can find one.

Maximum bipartite matching is converted to a max-flow problem as follows: add a node $s$ on the boys' side as the source, connected to every boy; add a node $t$ on the girls' side as the sink, connected to every girl; direct every boy-girl edge from boy to girl. Set every edge's capacity to 1.

Clearly the final flow cannot connect two boys to the same girl, otherwise the flow from that girl to $t$ would be 2, which is impossible. Starting from a zero flow, each augmenting path found can carry only capacity 1.

When searching for an augmenting path with DFS, we can simply iterate over all boys, since this is equivalent to iterating over the neighbors of $s$.

Consider two kinds of augmenting paths. In the simplest case, starting from some boy, the first step finds an unmatched girl, giving an augmenting path of one edge. If the first girl the DFS meets is already matched, the DFS must not stop: by the residual network's capacities, the edge from this girl back to her matched boy has reverse capacity, so we continue. Bad case: that boy knows only this one girl, so the DFS starts returning and this path is not augmenting. Good case: that boy also knows some other unmatched girl, the DFS reaches that new unmatched girl, and we have an augmenting path.

The second case is, in effect, persuading an already-matched boy to switch girlfriends.

The program works like this: bpm(u) can be read as finding u a girlfriend. If the girl found already has a boyfriend, persuade him to switch, but he must not switch to a girl the DFS has already seen; and since we make him switch, we must find him someone else, so bpm is called recursively.

The recursion finally succeeds when a new unmatched girl is found, at which point every boy along the augmenting path switches girlfriends.
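The bpm(u) procedure described above can be sketched as follows (a minimal version; the function and variable names are my own):

```python
def max_bipartite_matching(g, n_boys, n_girls):
    """DFS augmenting-path matching.  g[u] lists the girls boy u knows.
    bpm(u) tries to find u a partner, possibly persuading an
    already-matched boy to switch to a girl not yet seen in this DFS."""
    match = [-1] * n_girls           # match[v] = boy currently matched to girl v

    def bpm(u, seen):
        for v in g[u]:
            if not seen[v]:
                seen[v] = True
                # v is free, or her current boy can be re-matched elsewhere
                if match[v] == -1 or bpm(match[v], seen):
                    match[v] = u
                    return True
        return False

    return sum(bpm(u, [False] * n_girls) for u in range(n_boys))
```

Each top-level bpm call searches for one augmenting path; when it returns True, every boy on that path has switched, increasing the matching by one.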

2012/08/10

Program

Integer partitions

Counting the number of partitions recursively.
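The recursive count can be sketched as follows (a minimal version, recursing on the largest allowed part; the function name is my own):

```python
def partitions(n, max_part=None):
    """Number of ways to write n as a sum of positive integers,
    order ignored, counted recursively by the largest allowed part."""
    if max_part is None:
        max_part = n
    if n == 0:
        return 1                     # the empty partition
    if n < 0 or max_part == 0:
        return 0
    # either use one part of size max_part, or forbid that size entirely
    return partitions(n - max_part, max_part) + partitions(n, max_part - 1)
```

For example, partitions(5) counts 5, 4+1, 3+2, 3+1+1, 2+2+1, 2+1+1+1, 1+1+1+1+1.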

2012/08/05

An Overview of Design Patterns

Code reuse is a property of classes; communication is a property of run-time objects. Collaboration achieved via inheritance is described with classes; collaboration achieved via communication is described with objects.

The difference between class and type: a class defines the implementation of an object's internal state and operations, while a type corresponds to the notion of an interface.

Creational patterns force you into the habit of programming against interfaces rather than against implementations.

Comparing inheritance and composition: inheritance is mostly static code reuse, and a subclass knows its parent's implementation, which in a sense breaks encapsulation. Composition encapsulates better; it relies more on interface definitions and on run-time relationships between objects to implement the program's logic.

Using design patterns to cope with future change


Use creational patterns to produce object instances without naming classes.

Use Command or Chain of Responsibility to avoid hard-coding the handling of requests.

Use Abstract Factory and Bridge to separate design from implementation, coping with platform and API changes.

Use Adapter, Decorator and Visitor to change the behavior of existing objects.

Factories


The syntactic hallmark of a factory is that instances are produced without naming the class being instantiated. The program keeps a global pointer to a base factory class, so that the factory instance can be replaced at run time.

Factories may be classified by style or by platform. The program's logic decides when enough information is available to instantiate a particular factory. The factory type may be decided at compile time — for a cross-platform application the factory implementations must be split by platform — or only at run time, e.g. from the user's style settings for the program.

A more advanced approach maps each factory to a string (the factory's name): client code passes in a string, and the program logic returns a factory instance.

Once the factory instance is fixed, one aspect of the program is fixed; generally speaking, using one factory means using all the instances that factory produces.

Window System


The Window class is the system-defined abstract window system and determines what the system can do. Window delegates the actual operations to subclasses of WindowImp. WindowSystemFactory is the factory interface that produces instances of WindowImp subclasses; different factory implementations correspond to different Imp classes such as WindowImp and FontImp.

The Imp classes such as WindowImp and FontImp are like an API provided to the abstract Window system; this API hides the actual OS-specific implementation. Once the system has decided on a factory instance, the API is fixed.

Encapsulating operations and requests


In a system, requests come from many sources — the OS, mouse, keyboard, network. Once a request occurs, the user expects some operation to run, but the operation's implementation may be scattered across the codebase in many different classes.

The most direct approach is to encapsulate the operation's code inside the request's source, e.g. MenuItemChangeColor::Click(){}. But this couples the code too tightly.

Identifying the source of a request is essentially parameterizing over that source, e.g. MenuItemChangeColor. Design patterns parameterize it with an object: the command.

A command has an Execute method that encapsulates the operation's code; that code may also delegate the operation to other parts of the program.

Encapsulating access and traversal mechanisms



The Iterator pattern encapsulates these mechanisms. With an iterator, client code does not need to know the internal details of the structure being traversed, and the traversal state is kept inside the iterator, so multiple traversals can run simultaneously. There can also be different iterator implementations; for instance, if the structure is a tree, there can be PreIterator, PostIterator and MidIterator subclasses.

Traversal algorithms often need to accumulate information — e.g. traversing a piece of text while counting words — and different algorithms may accumulate different information. The bad approach is to integrate the accumulating code into the iterator, which couples them tightly; it is better to separate the traversal's action from the iterator.

Introducing Visitor

A typical traversal action, e.g. analysis code, encapsulates the action in a class such as SpellingChecker, one of whose methods takes a pointer to the object being traversed. The weakness of that code is the pile of if statements; as the system evolves, the method keeps being changed. A better implementation follows.

Abstracting this mechanism yields the Visitor pattern. The traversal action is not necessarily a check, so call the abstracted class that encapsulates the action a Visitor. Likewise, the visited object's abstract method (CheckMe) becomes Accept(Visitor&). Different traversal actions are encapsulated in different methods of the Visitor.

2012/08/04

Cut-Set

A cut-set is also called a cocycle. Pick any spanning tree; the tree's edges are called branches, and the graph's other edges are called chords.

Any cut-set must contain at least one edge of every spanning tree. Conversely, in a connected graph $G$, any minimal set of edges that contains at least one edge of every spanning tree is a cut-set.

Any cut-set contains an even number of edges of any cycle.

Take any spanning tree $T$ and let $b$ be any edge of $T$. Clearly $\{b\}$ is a cut-set of $T$, and it splits the graph's vertices into two parts. Consider a cut-set of the whole graph that splits the vertices into the same two parts and contains only this one branch of $T$, all its other edges being chords; this cut-set is called the fundamental cut-set with respect to $T$.

Each edge of a spanning tree $T$ corresponds to a unique fundamental cut-set.

Duality between fundamental cut-sets and fundamental cycles: we know each chord corresponds to a fundamental cycle, whose edges are all branches except for that one chord. Correspondingly, each branch corresponds to a fundamental cut-set, which contains that one branch while all its other edges are chords.

A cut-set with respect to two vertices $v_1, v_2$ means the cut-set separates the two vertices into the two different vertex sets.

Network Flows


In a weighted undirected graph, the capacity of a cut-set is the sum of the weights of all its edges.

Theorem: the maximum possible flow between $v_1$ and $v_2$ equals the minimum capacity over all cut-sets with respect to these two vertices.

2012/07/19

Sums

1) There are two ways to write a sum: $$\sum_{k=1}^na_k\quad \text{and} \quad \sum_{1\leq k \leq n}a_k$$ In other words, one uses index bounds, the other a set. Since a sum does not care about the order of addition, the set notation captures the meaning of a sum more accurately; moreover, the set notation is sometimes more convenient, because transformations of the set are easier to follow.

For example, replacing the index variable $k$ by $k+1$: in set notation, $$\sum_{1\leq k \leq n}a_k=\sum_{1\leq k+1 \leq n}a_{k+1}$$ while in index notation the boundary constants must be rewritten.

2) Set indicators and sums

Write the indicator as $[p(k)]$, where $p$ stands for a property: $[p(k)]=1$ if the term with index k has property $p$, and $[p(k)]=0$ otherwise.

A sum can then be written $$\sum_k a_k [p(k)]$$ For example, the sum of the reciprocals of all primes at most $N$ is $$\sum_k[k \text{ is prime}][k \leq N]\frac{1}{k}$$

3) Sums and recurrences

$$ \begin{array}{l} &S_0=a_0\\ &S_n=S_{n-1}+a_n \end{array}\tag{1} $$ A simple form of $a_n$ determines a simple form of $S_n$.

Suppose $a_n$ has the simple form $$ \begin{array}{l} &a_0=\alpha\\ &a_n=\beta+\gamma n \end{array} $$ and suppose the closed form of $S_n$ is $$S_n=A(n)\alpha+B(n)\beta+C(n)\gamma$$

We know the recurrence and the closed form of $S_n$ correspond one-to-one, which can be proved by induction, so given a closed form we can derive the quantities $\alpha, \beta, \gamma$ in the recurrence,

and then, through the assumed form of the sum, obtain an identity whose two sides come from the closed form just given:

$$S_n=1 \Rightarrow \alpha=1, \beta=0, \gamma=0 \Rightarrow 1=A(n)$$ Likewise $$S_n=n \Rightarrow \alpha=0, \beta=1, \gamma=0 \Rightarrow n=B(n)$$ and further $$S_n=n^2 \Rightarrow \alpha=0, \beta=-1, \gamma=2 \Rightarrow n^2=-B(n)+2C(n)$$ The first $\Rightarrow$ follows from the assumed form of $a_n$, the second from the assumed form of $S_n$.

Reducing a recurrence to a sum to get a closed form

Observe the recurrence form of the sum: in the second equation, $S_n$ and $S_{n-1}$ both have coefficient 1, and the remaining term is $a_n$.

For a general recurrence $$a_nT_n=b_nT_{n-1}+c_n$$ if we can multiply both sides by some function $s_n$ of $n$ to bring it into the form of equation (1), then by comparison we can read off the sum's $a_n$ and write the sum as $$S_n=\sum a_k$$ Having computed $S_n$, divide out $s_na_n$ to obtain $T_n$.

The factor $s_n$ is obtained from the relation $$s_nb_n=s_{n-1}a_{n-1}$$ that is, $$s_n=s_{n-1}\frac{a_{n-1}}{b_n}$$
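As a worked instance of the summation-factor method (taking the Tower of Hanoi recurrence as the example): for $T_n=2T_{n-1}+1$ we have $a_n=1$, $b_n=2$, $c_n=1$, so $$s_n=s_{n-1}\frac{a_{n-1}}{b_n}=\frac{s_{n-1}}{2}\Rightarrow s_n=2^{-n}$$ Multiplying the recurrence by $s_n$ gives $$2^{-n}T_n=2^{-(n-1)}T_{n-1}+2^{-n}$$ which is in the form of equation (1) with sum $$S_n=\sum_{k=1}^n 2^{-k}=1-2^{-n}$$ and dividing out $s_na_n=2^{-n}$ yields $T_n=2^n-1$.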

4) Merging two sums

$$\sum_{k\in K}a_k + \sum_{k'\in K'}a_{k'}=\sum_{k\in K\cap K'}a_k+\sum_{k\in K\cup K'}a_k$$

5) A way to obtain a recurrence for the sum $S_n$

Use the identity $$\sum_{k=0}^na_k+a_{n+1}=a_0+\sum_{k=1}^{n+1}a_k$$ when it can be brought into the form $$S_n+a_{n+1}=a_0+F(S_n)$$ For example, for the geometric sum $S_n=\sum_{k=0}^nax^k$, $$F(S_n)=xS_n$$

6) $S=\sum_{1\leq j\leq k\leq n}a_ja_k$

$$ \begin{array}{l} 2S&=\sum_{1\leq j\leq k\leq n}a_ja_k+\sum_{1\leq k\leq j\leq n}a_ja_k\\ &=\sum_{1\leq j,k\leq n}a_ja_k+\sum_{1\leq j=k\leq n}a_ja_k\\ &=(\sum_{1\leq k\leq n}a_k)^2+\sum_{1\leq k\leq n}a^2_k \end{array} $$

7) $S=\sum_{k=0}^n k^2$

Use the corresponding identity one degree up, for $k^3$: $$\sum_{k=0}^n k^3+(n+1)^3=\sum_{0\leq k\leq n+1} k^3=\sum_{0\leq k\leq n} (k+1)^3$$ that is, $$\sum_{0\leq k\leq n} (k+1)^3=\sum_{0\leq k\leq n} (k^3+3k^2+3k+1)$$ Cancelling $\sum_{k=0}^n k^3$ and solving the resulting algebraic equation gives the desired sum.
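Carrying the cancellation through gives $3S + 3\frac{n(n+1)}{2} + (n+1) = (n+1)^3$, i.e. $S = \frac{n(n+1)(2n+1)}{6}$; a quick check of that closed form (function name is my own):

```python
def sum_squares(n):
    """Closed form for S = 0^2 + 1^2 + ... + n^2, obtained from the
    k^3 perturbation identity: 3S + 3n(n+1)/2 + (n+1) = (n+1)^3."""
    return n * (n + 1) * (2 * n + 1) // 6
```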

2012/07/18

Recursion

1) The Tower of Hanoi

$T_n$ is defined as the number of moves needed, under the rules, to transfer $n$ disks from one peg to another (either of the other two pegs). For $n+1$ disks, treating the top $n$ as a single unit gives the recurrence $$ \begin{array}{l} T_0=0\\ T_n=2T_{n-1}+1\quad n > 0 \end{array} $$ The recurrence can be solved by adding 1 to both sides: $$T_n+1=2(T_{n-1}+1)$$

2) Lines dividing the plane

When the $n$-th line is added, how many new regions can it create at most?
1. Each added region comes from splitting some existing region in two.
2. To split an old region, the new line must cross that region's boundary at least once.
3. If there are $k$ intersection points, there are $k+1$ new regions. Each intersection involves one old line, and each old line is the boundary of two old regions, both of which get split; but some old regions are counted twice. A twice-counted old region is one whose two boundary segments lie on old lines that both intersect the new line. Since all regions are convex, a line that crosses a convex region's boundary twice crosses it exactly twice. Along the new line, two crossings of the same region must be adjacent intersection points (there is no old line strictly inside a region), and conversely every pair of adjacent intersection points corresponds to crossing some old region twice. In total, with $k$ intersection points the number of new regions is $2k-(k-1)=k+1$.
4. Clearly a newly added line has at most $n-1$ intersection points.

So the recurrence is $$ \begin{array}{l} T_0=0\\ T_n=T_{n-1}+n\quad n > 0 \end{array} $$

3) The Josephus Problem

$n$ people stand in a circle. In the first round, everyone whose number is a multiple of 2 is killed; the survivors are renumbered, and in the second round the multiples of 2 are killed again, and so on until one person remains. Find that person's original number.

Recursion transforms the original problem into a smaller problem with the same description, then relates the solutions of the two; the problem above naturally supplies the reduction from large to small.

Let $J(n)$ denote the number of the survivor among $n$ people. After the first round we get the smaller problem; let $J(m)$ be the survivor's number in the second round. The survivor's number in the first round can then be computed from $J(m)$, which relates the two problems, but the relation differs depending on whether n is odd or even. $$ \begin{array}{l} J(1)&=1\\ J(2n)&=2J(n)-1\\ J(2n+1)&=2J(n)+1 \end{array} $$ The solution is $$J(2^m+l)=2l+1$$ This relation has a property: a cyclic left shift of the binary representation of $2^m+l$ gives $2l+1$.
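The recurrence and the cyclic-shift property can be checked against each other (a minimal sketch; function names are my own):

```python
def josephus(n):
    """Survivor's number among n people, killing every second person.
    Recurrence: J(1)=1, J(2n)=2J(n)-1, J(2n+1)=2J(n)+1."""
    if n == 1:
        return 1
    if n % 2 == 0:
        return 2 * josephus(n // 2) - 1
    return 2 * josephus(n // 2) + 1

def josephus_shift(n):
    """Closed form J(2^m + l) = 2l + 1: a one-bit cyclic left shift
    of n's binary representation."""
    b = bin(n)[2:]
    return int(b[1:] + b[0], 2)
```

For example n = 6 = 110 in binary; shifting gives 101 = 5, matching J(6) = 2J(3) - 1 = 5.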

Consider the general recurrence $$ \begin{array}{l} f(1)&=\alpha\\ f(2n)&=2f(n)+\beta_0\\ f(2n+1)&=2f(n)+\beta_1 \end{array} $$ First assume $f(n)$ has the form $$f(n)=A(n)\alpha+B(n)\beta_0+C(n)\beta_1$$

4) Length-n ternary strings containing an odd number of 0s

Let $b_n$ be the number of such ternary strings. Classifying by the last digit (the last digit is 1 or 2 with an odd number of 0s in the prefix, or 0 with an even number of 0s in the prefix): $$b_n=b_{n-1}+b_{n-1}+(3^{n-1}-b_{n-1})$$
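The recurrence can be checked against brute-force enumeration (a minimal sketch; the function name is my own):

```python
def odd_zero_count(n):
    """b_n = number of length-n ternary strings with an odd number of 0s,
    via the last-digit recurrence b_n = b_{n-1} + b_{n-1} + (3^(n-1) - b_{n-1})."""
    b = 1                       # b_1: the single string "0"
    for k in range(2, n + 1):
        b = b + b + (3 ** (k - 1) - b)
    return b
```

The recurrence simplifies to $b_n = b_{n-1} + 3^{n-1}$, giving the closed form $(3^n-1)/2$.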

2012/07/14

Lie Theory

Rotations in two dimensions

Let $$z_\theta=\cos{\theta} + i \sin{\theta}$$ Then $$ \begin{array}{ll} z_\theta(x+iy) &=x\cos{\theta}-y\sin{\theta}+i(x\sin{\theta} + y\cos{\theta})\\ &= (1\ \ i) \left( \begin{array}{cc} \cos{\theta} & -\sin{\theta}\\ \sin{\theta} & \cos{\theta} \end{array} \right) \left( \begin{array}{c} x\\ y \end{array} \right) \end{array} $$ Note that $$ \begin{array}{ll} \left( \begin{array}{cc} \cos{\theta} & -\sin{\theta}\\ \sin{\theta} & \cos{\theta} \end{array} \right)^{-1} &=\frac{1}{\cos^2{\theta}+\sin^2{\theta}} \left( \begin{array}{cc} \cos{\theta} & \sin{\theta}\\ -\sin{\theta} & \cos{\theta} \end{array} \right) \\ &=\left( \begin{array}{cc} \cos{-\theta} & -\sin{-\theta}\\ \sin{-\theta} & \cos{-\theta} \end{array} \right) \end{array} $$ This process associates the complex number $z_\theta$ with the matrix $\left( \begin{array}{cc} \cos{\theta} & -\sin{\theta}\\ \sin{\theta} & \cos{\theta} \end{array} \right)$, written $R_\theta$.

Conversely, $$ \begin{array}{ll} R_\theta &=\cos{\theta} \left( \begin{array}{cc} 1 & 0\\ 0 & 1 \end{array} \right) + \sin{\theta} \left( \begin{array}{cc} 0 & -1\\ 1 & 0 \end{array} \right) \end{array} $$ and more generally $$ \begin{array}{ll} \left( \begin{array}{cc} a & -b\\ b & a \end{array} \right) &=a \left( \begin{array}{cc} 1 & 0\\ 0 & 1 \end{array} \right) + b \left( \begin{array}{cc} 0 & -1\\ 1 & 0 \end{array} \right) \end{array} $$ This gives a way to represent an arbitrary complex number as a matrix, and one can verify that the representation respects many complex-number operations, such as $z_1z_2, z^{-1}$ and $|z|$.

Quaternions



For dimensions n greater than 2 there is no algebraic structure quite like the complex numbers, but in dimension 4 the quaternions can be defined, satisfying the basic rules of arithmetic except commutativity. Map the 4-tuple $(a, b, c, d)$ to the matrix $$ \left( \begin{array}{cc} a+id & -b-ic\\ b-ic & a-id \end{array} \right) $$ One can verify that addition and multiplication produce matrices of the same form, and that the determinant of the matrix is $a^2+b^2+c^2+d^2$.

The matrix above can be written $$ a\left( \begin{array}{cc} 1 & 0\\ 0 & 1 \end{array} \right) + b\left( \begin{array}{cc} 0 & -1\\ 1 & 0 \end{array} \right) + c\left( \begin{array}{cc} 0 & -i\\ -i & 0 \end{array} \right) + d\left( \begin{array}{cc} i & 0\\ 0 & -i \end{array} \right) $$ or as $$a+b\ \textbf{i}+c\ \textbf{j}+d\ \textbf{k}$$ The pure imaginary quaternions are closed under addition; the rule for their products is interesting: $$ \begin{array}{cc} &(u_1\ \textbf{i}+u_2\ \textbf{j}+u_3\ \textbf{k})(v_1\ \textbf{i}+v_2\ \textbf{j}+v_3\ \textbf{k})\\ =&-(u_1, u_2, u_3)\cdot \left( \begin{array}{cc} v_1 \\ v_2 \\ v_3 \end{array} \right) + (u_1, u_2, u_3)\times(v_1, v_2, v_3) \left( \begin{array}{cc} \textbf{i} \\ \textbf{j} \\ \textbf{k} \end{array} \right) \end{array} $$ The transpose complex-conjugate of the matrix representation of a quaternion is $$ \overline{ \left( \begin{array}{cc} a+id & -b-ic\\ b-ic & a-id \end{array} \right)}^T = \left( \begin{array}{cc} a-id & b+ic\\ -b+ic & a+id \end{array} \right) $$ Written another way, $$ \begin{array}{ll} &a\left( \begin{array}{cc} 1 & 0\\ 0 & 1 \end{array} \right) + b\left( \begin{array}{cc} 0 & 1\\ -1 & 0 \end{array} \right) + c\left( \begin{array}{cc} 0 & i\\ i & 0 \end{array} \right) + d\left( \begin{array}{cc} -i & 0\\ 0 & i \end{array} \right) \\ =&a\left( \begin{array}{cc} 1 & 0\\ 0 & 1 \end{array} \right) - b\left( \begin{array}{cc} 0 & -1\\ 1 & 0 \end{array} \right) - c\left( \begin{array}{cc} 0 & -i\\ -i & 0 \end{array} \right) - d\left( \begin{array}{cc} i & 0\\ 0 & -i \end{array} \right) \end{array} $$ This is the definition of the quaternion conjugate.

Quaternions on the $S^3$ inside quaternion space, and rotations in three dimensions



Every quaternion induces a rotation of quaternion space, but this general rotation has no convenient 3-dimensional invariant subspace. Consider a different kind of transformation, whose distinguishing feature is that the pure-imaginary 3-dimensional subspace of quaternion space is invariant. Let $t$ be a unit quaternion; conjugation by t is the mapping $$q\mapsto t^{-1}qt$$ Theorem: $t$ can be written $t=\cos\theta+u\sin\theta$, where $u$ is a pure-imaginary unit quaternion. Then conjugation by t is an isometry of quaternion space, and on the pure-imaginary subspace it is a rotation about $u$ through the angle $2\theta$.

Conjugation by t and conjugation by -t are the same isometry. The inverse of the transformation is conjugation by $t^{-1}=\cos(-\theta)+u\sin(-\theta)$, i.e. the rotation of the pure-imaginary subspace about the same $u$ but through the angle $-2\theta$.

Multiplication of rotations is defined as follows: let $t_1,t_2$ be unit quaternions; rotating by $t_1$ and then by $t_2$ is expressed as $$q\mapsto t_2^{-1}(t_1^{-1}qt_1)t_2=(t_1t_2)^{-1}q(t_1t_2)$$ Since $(t_1t_2)$ is again a unit quaternion, the product of the rotations is conjugation by $(t_1t_2)$.
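The conjugation map can be checked numerically (a minimal sketch; function names are my own, and the sign conventions follow the $q\mapsto t^{-1}qt$ form used above):

```python
import math

def qmul(p, q):
    """Hamilton product of quaternions represented as (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def conjugation_by(t, q):
    """q -> t^-1 q t; for a unit quaternion t, t^-1 is its conjugate."""
    a, b, c, d = t
    t_inv = (a, -b, -c, -d)
    return qmul(qmul(t_inv, q), t)
```

With $t=\cos(\pi/4)+\textbf{k}\sin(\pi/4)$, conjugation sends the pure quaternion $\textbf{i}$ to $-\textbf{j}$: a rotation through $2\theta = 90°$ in the $ij$-plane, staying inside the pure-imaginary subspace as the theorem asserts.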

Projective space



To construct $\mathbb{RP}^1$: let $z_\alpha=\cos\alpha+i\sin\alpha$, a point of $S^1$. The set of all elements of the form $\{\pm z_\alpha\}$ forms a group under the multiplication $$ \{\pm z_\alpha\} \{\pm z_\beta\} = \{\pm (z_\alpha z_\beta)\} $$ Under this multiplication, the map $$\{\pm z_\alpha\} \mapsto (\pm z_\alpha)^2 $$ is an isomorphism from $\mathbb{RP}^1$ to $S^1$. $\pm z_\alpha$ corresponds to the two angles $\alpha, \pi+\alpha$, which the map sends to $2\alpha, 2\pi + 2\alpha$, the same point of $S^1$.

$SU(2)$ and $SO(3)$



Among the $S^n$, only $S^1$ and $S^3$ are Lie groups. By the result above, conjugation by t and conjugation by -t are exactly the same, written $$\{\pm t\}\in \mathbb{RP}^3$$ The above also establishes the one-to-one correspondence between $\mathbb{RP}^3$ and $SO(3)$.

$SU(2)$ (the special unitary group) consists of the quaternion matrices above with determinant 1. Since the determinant is also the squared modulus of the quaternion, this establishes a one-to-one correspondence between $SU(2)$ and $S^3$.

This is summarized in the diagram $$ \begin{array}[c]{ccc} S^3&\stackrel{t\mapsto \{\pm t\}}{\longrightarrow}&\mathbb{RP}^3\\ \updownarrow\scriptstyle{}&&\\ SU(2)&\stackrel{t\mapsto \text{conjugation by} \pm t}{\longrightarrow}&SO(3) \end{array} $$

2012/07/10

Classical Mechanics

To some extent, the selection of a group of basic concepts, in terms of which others are to be defined, is a matter of choice. We have chosen to regard position and time (relative to some frame of reference) as basic. From this point of view, Newton’s laws must be regarded as containing definitions in addition to physical laws. The first law contains the definition of an inertial frame, together with the physical assertion that such frames exist, while the second and third laws contain definitions of mass and force. These laws, supplemented by the laws of force, such as the law of universal gravitation, provide the equations from which we can determine the motion of any dynamical system.

One-dimensional motion where $F$ depends only on position



If $F=m\ddot{x}$ is the definition of force, other physical laws such as the law of universal gravitation supply a function $F(x)$, and any motion governed by such a law satisfies the equation $m\ddot{x}=F(x)$.

For any particular motion, $T$ is clearly a function of time. Consider how the kinetic energy $T=\frac{1}{2}m\dot{x}^2$ changes under such a force; differentiating with respect to time: $$\frac{dT}{dt}=m\dot{x}\ddot{x}=F(x)\dot{x}$$ Now pick an instant as the time origin 0 and compute the change in kinetic energy after time $t$: $$T_t-T_0=T|_0^t=\int_0^t \frac{dT}{dt}dt$$ Substituting the relation above between kinetic energy and the force function gives $$T_t-T_0=\int_0^t \frac{dT}{dt}dt=\int_0^t F(x)\dot{x}dt=\int_{x_0}^{x_t} F(x)dx$$ If we define the potential energy as $$V(y)=-\int_{x_0}^{y} F(x)dx$$ we obtain $$T_t+V(x_t)=T_0=\text{const}$$ Note that by this definition the potential energy at $x_0$ is 0; that is, at $x_0$ the energy is all kinetic, $T_0$.

This relation holds for any motion under the constraint of the $F(x)$ described above. The conservation law has some consequences: since kinetic energy cannot be negative, the potential energy of a motion cannot increase without bound. Potential energy is a function of position, so if a motion's kinetic energy is known at some instant or position, the motion is confined to a fixed region: the region where the potential energy is at most some value (the kinetic plus potential energy at any instant, i.e. the total energy). Since potential energy is a function of position, this region can be expressed by the inequality $$V(x) \leq E$$

In $m\ddot{x}=F$, the left side is the second law's definition of force, and the right side is the function determined by the other laws concerning force. This is a second-order differential equation; a first integral is an equation obtained from the original differential equation by integrating once to eliminate the second derivative.

Equilibrium positions



An equilibrium position is defined as a position where the force is 0 — by the definition of potential energy, a position where $\frac{dV(y)}{dy}=0$. Since the choice of $x_0$ in the definition of potential energy is arbitrary, we may choose the position under consideration as $x_0$, so the potential there is 0. By the definition of equilibrium, the first derivative with respect to position is also 0. A series expansion around the equilibrium position therefore looks like a good idea: $$V(x)=\frac{1}{2}x^2V''(0)+O(x^3)\tag{1}$$ Here $x$ is the displacement from the equilibrium position.

Ignoring the higher-order terms (the approximation holds only near the equilibrium), this is the equation of a parabola. If $V''(0) > 0$ it opens upward, corresponding to an attractive force near the equilibrium; the opposite sign gives a repulsive force.

The potential $V$ is defined as the integral of the force $F$ over position; conversely, knowing $V$ as a function of position, differentiate equation $(1)$ with respect to position to get the expression for $F$ near the equilibrium, $F=-kx$ with $k=V''(0)$. Adding the second law gives the equation of motion $$m\ddot{x}+kx=0\tag{2}$$ That is, once we know the potential as a function of position near some equilibrium, the fact of equilibrium implies the first derivative of the potential vanishes there; computing the second derivative of $V$ at that position gives $k$; and this information suffices to obtain the equation of motion $(2)$ near that point.

The equation can be written $$\ddot{x}+\frac{k}{m}x=0$$ Equilibria are clearly of two kinds, attractive and repulsive. For an attractive one, set $\frac{k}{m}=\omega^2$ to get $$\ddot{x}+\omega^2x=0$$ whose solution is $$x=a\cos(\omega t-\theta)$$ From the chain of reasoning above, the motion is periodic, with frequency determined solely by the second derivative of the potential at the equilibrium. Moreover, the reasoning placed no restriction on the potential function, so it applies to every attractive equilibrium.

The practical procedure: 1. obtain the potential as a function of position; 2. find the positions where its first derivative is 0 (the equilibria); 3. compute the second derivative of the potential there to obtain the period of oscillation.

Mathematics and physics



Describing a motion requires two pieces of data: the position and the velocity at some instant. Why the velocity? Isn't velocity just the time derivative of position? The reason is this: that velocity is the derivative of position with respect to time is a law, but the objects of study are not all differentiable functions; the form of the position-versus-time function is constrained by physical laws.

These physical constraints usually appear as conditions on the first or second derivative of position with respect to time. For example, if the force is a function of position, the second law relates acceleration to position, adding one constraint.

The first law tells us that although mathematically velocity is the derivative of the position-versus-time function, in the absence of external force this function can only be linear; that is, physics constrains mathematics by constraining the form of the position-versus-time function.

The physical constraints appear as first or second derivatives of the position-versus-time function and do not completely determine the equation of motion, which is why so many differential equations must be studied.

Example: the simple pendulum



Decompose gravity into the direction of the rod and the tangential direction. The rod component is balanced, leaving $F=-mg\sin{\theta}$. Let $x$ be the arc length; since the velocity is purely tangential, by the definition of the potential $V$, $$V(x)=-\int_{x_0}^{x}-mg\sin{(s/l)}\,ds=-mgl\cos{(s/l)}\Big|_{x_0}^{x}$$ Taking $x_0=0$ gives $$V(x)=mgl[1-\cos(x/l)]$$

Example: place a charge between two other charges, all three of the same polarity; the result is an oscillation.



2012/06/22

Android

Development environment

1. Use JDK 1.6
2. Remove the default Run/Debug configuration in the Eclipse project settings.
3. Install the USB driver for the target mobile device and enable USB debugging on the device.

Application components

An Android application is built from application components. An Activity corresponds to one screen of the program; Activities are independent of one another, so one program can, for instance, start an Activity belonging to another program. The SDK provides an Activity base class. A Service is a process that runs in the background. Content providers are the interface to a program's data; other modules access their own or another program's data through content providers. Broadcast receivers handle broadcasts of system events.

Activities

An Activity's UI is described in a layout XML file stored as a resource in the application's directories; the setContentView method associates the Activity with that XML. Only one Activity is active at any moment; the system maintains a stack of Activities to track the program's flow, e.g. pressing Back returns to the previous Activity.
Activities must be declared in the application's AndroidManifest.xml. One of them is the main Activity, the one executed when the user launches the program:
<action android:name="android.intent.action.MAIN" />
An Intent object encapsulates the source of the command that starts an Activity; when the system starts an Activity, it passes such an object to it.

2012/06/20

Beautiful Architecture

Common among the notions of architecture is the idea of structures, each defined by components of various sorts and their relations: how they fit together, invoke each other, communicate, synchronize, and otherwise interact.


Table 1-1. Structure summary

Structure          | Components                 | Relations                                                                    | Concerns
Information Hiding | Information Hiding Modules | Is a part of; Is contained in                                                | Changeability; Modularity; Buildability
Uses               | Programs                   | Uses                                                                         | Producibility; Ecosystem
Process            | Processes (tasks, threads) | Gives work to; Gets resources from; Shares resources with; Contained in; ... | Performance; Changeability; Capacity
Data Access        | Programs and Segments      | Has access to                                                                | Security; Ecosystem


The first step in deciding an architecture is to work with stakeholders to understand and prioritize quality concerns and constraints, rather than deriving the architecture from the functional requirements, because many architectures satisfy the functional requirements; choosing among them depends on the quality concerns, and not every candidate can meet them. For example, there are many architectures for serving web pages: CGI, static pages, ASP, JSP, Python.

Fred Brooks said that conceptual integrity is the most important attribute of an architecture: "It is better to have a system...reflect one set of design ideas, than to have one that contains many good but independent and uncoordinated ideas" (1995).

The first class of evaluation methods determines properties of the architecture, often by modeling or simulation of one or more aspects of the system.

A consistent architecture does not offer two (or more) ways to do the same thing, forcing the user to waste time choosing.

We would look for architectures with a well-defined Uses Structure that would support incremental construction, so that at each iteration of construction we would have a useful, testable system. We would also look for architectures that have well-defined module interfaces and that are inherently testable, so that the construction progress is transparent and visible.

2.2. Design Town

Step one, conceptual design: e.g. the division into modules, the layering of the structure, the business process model, and the choice of supporting libraries.

3.2 Project Darkstar

Massively Multiplayer Online games (MMOs) require an architecture that can scale; the evolution toward multi-core chips requires concurrency. But:

This context gave us our first goal for the architecture. The requirements for scaling dictated that the system be distributed and concurrent, but we needed to present a much simpler programming model to the game developer. The simple statement of the goal is that the game developer should see the system as a single machine running a single thread, and all of the mechanisms that would allow deployment on multiple threads and multiple machines should be taken care of by the Project Darkstar infrastructure.

The general programming model that Project Darkstar requires is a reactive one, in which the server side of the game is written as a listener for events generated by the clients. When an event is detected, the game server should generate a task, which is a short-lived sequence of computations that includes manipulation of information in the virtual world and communication with the client that generated the original event and possibly with other clients. Tasks can also be generated by the game server itself, either as a response to some internal change or on a periodic, timed basis. In this way, the game server can generate characters in the game or world that aren't controlled by some outside player.

Darkstar implements a set of independent services on the server side, much like a typical operating system, providing the server-side game with services to access persistent storage, schedule and run tasks, and perform communication. Service implementations are hidden behind the service interfaces, allowing the implementations to vary.

Characteristics of games: there are no complex queries over the data; each task queries roughly a dozen objects, and a queried object has about an even chance of being modified by the task.

Each task is wrapped in a transaction. If two tasks try to modify the same piece of data (through the interface the data service provides), the data service arbitrates, and the task denied access is rescheduled.

The Session Service manages client login, authentication and session establishment. Once a session is established, messages the client sends to the server are parsed by the Session Service, which decides what tasks to generate. The Session Service ensures that messages from the same client are processed in order, so the Task Service need not worry about task ordering.
The session hides the underlying communication mechanism, so the actual connection behind a session can be redirected or moved to a different server.

Rather advanced: all tasks are compiled to Java VM bytecodes and can run in any environment that meets the requirements.

Tasks are serialized; tasks lost during a server crash are recovered and rescheduled on other servers, turning the crash problem into a latency problem.

All data that lasts longer than a particular task is stored persistently. The read/write latency of this design can be optimized with lower-level caching techniques.

The service structure lets the server side monitor communication, so the sessions of clients that interact frequently can be freely moved to the same server for better performance.

4. Making Memories

As we look at each of these facets, keep in mind that they are different ways of looking at the overall system. For instance, we used a modular architecture to support different deployment scenarios. At the same time, each module is built in a layered architecture. These are orthogonal but intersecting concerns. Each set of modules follows the same layering, and each layer is found across all the modules.

5. Resource-Oriented Architectures

What the business really wants are easier ways to manage the data they have collected, build upon it, and reuse it to support their customers and core functions.





2012/06/16

Moment Generating Function

The moment generating function of a r.v. $X$ is a parameterized expectation: a parameterized function maps the r.v. $X$ to another r.v., whose expectation is then taken — the expectation of $f_t(X)=e^{tX}$: $$\phi(t)=E(f_t(X))=E(e^{tX})$$ From this one can derive $$\phi^{(n)}(0)=E(X^n)$$ So computing a moment generating function is just computing an expectation; the result is an expectation parameterized by $t$.
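As a worked example (the exponential distribution; not from the original notes): for $X$ with density $\lambda e^{-\lambda x}$ on $x\geq 0$, $$\phi(t)=E(e^{tX})=\int_0^\infty \lambda e^{-\lambda x}e^{tx}\,dx=\frac{\lambda}{\lambda-t}\quad (t<\lambda)$$ so $$\phi'(0)=\frac{1}{\lambda}=E(X),\qquad \phi''(0)=\frac{2}{\lambda^2}=E(X^2)$$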

2012/06/15

Book Highlight - I

Quantum

Einstein, Bohr, And The Great Debate About the Nature of Reality



In his quest for a new mechanics for the quantised world of the atom, Heisenberg concentrated on the frequencies and relative intensities of the spectral lines produced when an electron instantaneously jumped from one energy level to another. He had no other choice; it was the only available data about what was happening inside an atom. Despite the imagery conjured up by all the talk of quantum jumps and leaps, an electron did not 'jump' through space as it moved between energy levels like a boy jumping off a wall onto the pavement below. It was simply in one place and an instant later it popped up in another without being anywhere in between. Heisenberg accepted that all observables, or anything connected with them, were associated with the mystery and magic of the quantum jump of an electron between two energy levels. Lost forever was the picturesque miniature solar system in which each electron orbited a nuclear sun.

On the pollen-free haven of Helgoland, Heisenberg devised a method of book-keeping to track all possible electron jumps, or transitions, that could occur between the different energy levels of hydrogen. The only way he could think of recording each observable quantity, associated with a unique pair of energy levels, was to use an array: $$(v_{ij})_{m\times n}$$ This was the array for the entire set of possible frequencies of the spectral lines that could theoretically be emitted by an electron when it jumps between two different energy levels. If an electron quantum jumps from the energy level $E_2$ to the lower energy level $E_1$, a spectral line is emitted with a frequency designated by $v_{21}$ in the array. The spectral line of frequency $v_{12}$ would only be found in the absorption spectrum, since it is associated with an electron in energy level $E_1$ absorbing a quantum of energy sufficient to jump to energy level $E_2$. A spectral line of frequency $v_{mn}$ would be emitted when an electron jumps between any two levels whose energies are $E_m$ and $E_n$, where $m$ is greater than $n$. Not all the frequencies $v_{mn}$ are exactly observed. For example, measurement of $v_{11}$ is impossible, since it would be the frequency of the spectral line emitted in a 'transition' from energy level $E_1$ to energy level $E_1$ – a physical impossibility. Hence $v_{11}$ is zero, as are all potential frequencies when $m=n$. The collection of all non-zero frequencies, $v_{mn}$, would be the lines actually present in the emission spectrum of a particular element.

Another array could be formed from the calculation of transition rates between the various energy levels. If the probability for a particular transition, $a_{mn}$, from energy level $E_m$ to $E_n$, is high, then the transition is more likely than one with a lower probability. The resulting spectral line with frequency $v_{mn}$ would be more intense than for the less probable transition. Heisenberg realised that the transition probabilities $a_{mn}$ and the frequencies $v_{mn}$ could, after some deft theoretical manipulation, lead to a quantum counterpart for each observable quantity known in Newtonian mechanics such as position and momentum.




Schrödinger finally proposed that the wave function of an electron, for example, was intimately connected to the cloud-like distribution of its electric charge as it traveled through space. In wave mechanics the wave function was not a quantity that could be directly measured because it was what mathematicians call a complex number. 4+3i is one example of such a number, and it consists of two parts: one 'real' and the other 'imaginary'. 4 is an ordinary number and is the 'real' part of the complex number 4+3i. The 'imaginary' part, 3i, has no physical meaning because i is the square root of -1. The square root of a number is just another number that multiplied by itself will give the original number. The square root of 4 is 2 since 2×2 equals 4. There is no number that multiplied by itself equals -1. While 1×1=1, –1×–1 is also equal to 1, since by the laws of algebra, a minus times a minus generates a plus.

The wave function was unobservable; it was something intangible that could not be measured. However, the square of a complex number gives a real number that is associated with something that can actually be measured in the laboratory. The square of 4+3i is 25. Schrödinger believed that the square of the wave function of an electron, $|\Psi(x,t)|^2$ was a measure of the smeared-out density of electric charge at location x at time t.

As part of his interpretation of the wave function, Schrödinger introduced the concept of a 'wave packet' to represent the electron as he challenged the very idea that particles existed. He argued that an electron only 'appeared' to be particle-like but was not actually a particle, despite the overwhelming experimental evidence in favour of it being so. Schrödinger believed that a particle-like electron was an illusion. In reality there were only waves. Any manifestation of a particle electron was due to a group of matter waves being superimposed into a wave packet. An electron in motion would then be nothing more than a wave packet that moved like a pulse sent, with a flick of the wrist, travelling down the length of a taut rope tied at one end and held at the other. A wave packet that gave the appearance of a particle required a collection of waves of different wavelengths that interfered with one another in such a way that they cancelled each other out beyond the wave packet.

If giving up particles and reducing everything to waves rid physics of discontinuity and quantum jumps, then for Schrödinger it was a price worth paying. However, his interpretation soon ran into difficulties as it failed to make physical sense. Firstly, the wave packet representation of the electron began to unravel when it was discovered that the constituent waves would spread out across space to such a degree that they would have to travel faster than the speed of light if they were to be connected with the detection of a particle-like electron in an experiment.

Try as he might, there was no way for Schrödinger to prevent this dispersal of the wave packet. Since it was made up of waves that varied in wavelength and frequency, as the wave packet traveled through space it would soon begin to spread out as individual waves moved at different velocities. An almost instantaneous coming together, a localisation at one point in space, would have to take place every time an electron was detected as a particle. Secondly, when attempts were made to apply the wave equation to helium and other atoms, Schrödinger's vision of the reality that lay beneath his mathematics disappeared into an abstract, multi-dimensional space that was impossible to visualize.

Nor could Schrödinger's interpretation account for the photoelectric and Compton effects. There were unanswered questions: how could a wave packet possess electric charge? Could wave mechanics incorporate quantum spin? If Schrödinger's wave function did not represent real waves in everyday three-dimensional space, then what were they? It was Max Born who provided the answer.




The wave function itself has no physical reality; it exists in the mysterious, ghost-like realm of the possible. It deals with abstract possibilities, like all the angles by which an electron could be scattered following a collision with an atom. There is a real world of difference between the possible and the probable. Born argued that the square of the wave function, a real rather than a complex number, inhabits the world of the probable. Squaring the wave function, for example, does not give the actual position of an electron, only the probability, the odds that it will found here rather than there. For example, if the value of the wave function of an electron at X is double its value at Y, then the probability of it being found at X is four times greater than the probability of finding it at Y. The electron could be found at X, Y or somewhere else.

Niels Bohr would soon argue that until an observation or measurement is made, a microphysical object like an electron does not exist anywhere. Between one measurement and the next it has no existence outside the abstract possibilities of the wave function. It is only when an observation or measurement is made that the 'wave function collapses' as one of the 'possible' states of the electron becomes the 'actual' state and the probability of all the other possibilities becomes zero.

For Born, Schrödinger's equation described a probability wave. There were no real electron waves, only abstract waves of probability. 'From the point of view of our quantum mechanics there exists no quantity which in an individual case causally determines the effect of a collision', wrote Born. And he confessed, 'I myself tend to give up determinism in the atomic world.' Yet while the 'motion of particles follows probability rules', he pointed out, 'probability itself propagates according to the law of causality'.

It took Born the time between his two papers to fully grasp that he had introduced a new kind of probability into physics. 'Quantum probability', for want of a better term, was not the classical probability of ignorance that could in theory be eliminated. It was an inherent feature of atomic reality. For example, the fact that it was impossible to predict when an individual atom would decay in a radioactive sample, amid the certainty that one would do so, was not due to a lack of knowledge but was the result of the probabilistic nature of the quantum rules that dictate radioactive decay.




According to the uncertainty principle, a measurement that yields an exact value for the momentum of a microphysical object or system excludes even the possibility of simultaneously measuring its position. The question that Einstein wanted to answer was: Does the inability to measure its exact position directly mean that the electron does not have a definite position? The Copenhagen interpretation answered that in the absence of a measurement to determine its position, the electron has no position. EPR set out to demonstrate that there are elements of physical reality, such as an electron having a definite position, that quantum mechanics cannot accommodate – and therefore, it is incomplete.

EPR attempted to clinch their argument with a thought experiment. Two particles, A and B, interact briefly and then move off in opposite directions. The uncertainty principle forbids the exact measurement, at any given instant, of both the position and the momentum of either particle. However, it does allow an exact and simultaneous measurement of the total momentum of the two particles, A and B, and the relative distance between them.

The key to the EPR thought experiment is to leave particle B undisturbed by avoiding any direct observation of it. Even if A and B are light years apart, nothing within the mathematical structure of quantum mechanics prohibits a measurement of the momentum of A yielding information about the exact momentum of B without B being disturbed in the process. When the momentum of particle A is measured exactly, it indirectly but simultaneously allows, via the law of conservation of momentum, an exact determination of the momentum of B. Therefore, according to the EPR criterion of reality, the momentum of B must be an element of physical reality. Similarly, by measuring the exact position of A, it is possible, because the physical distance separating A and B is known, to deduce the position of B without directly measuring it. Hence, EPR argue, it too must be an element of physical reality. EPR appeared to have contrived a means to establish with certainty the exact values of either the momentum or the position of B due to measurements performed on particle A, without the slightest possibility of particle B being physically disturbed.

Given their reality criterion, EPR argued that they had thus proved that both the momentum and position of particle B are 'elements of reality', that B can have simultaneously exact values of position and momentum. Since quantum mechanics via the uncertainty principle rules out any possibility of a particle simultaneously possessing both these properties, these 'elements of reality' have no counterparts in the theory. Therefore the quantum mechanical description of physical reality, EPR conclude, is incomplete.

Einstein, Podolsky and Rosen had conjured up an imaginary experiment that involved a pair of correlated particles, A and B, so far apart that it should be impossible for them to physically interact with one another. EPR argued that a measurement carried out on particle A could not physically disturb particle B. Since any measurement is performed on only one of the particles, EPR believed they could cut off Bohr's counter-attack – an act of measurement causes a 'physical disturbance'. Since the properties of the two particles are correlated, they argued that by measuring a property of particle A, such as its position, it is possible to know the corresponding property of B without disturbing it. EPR's aim was to demonstrate that particle B possessed the property independently of being measured, and since this was something that quantum mechanics failed to describe, it was therefore incomplete. Bohr countered, never so succinctly, that the pair of particles were entangled and formed a single system no matter how far apart they were. Therefore, if you measured one, then you also measured the other.

The Copenhagen interpretation requires an observer outside the universe to observe it, but since there is none – leaving God aside – the universe should never come into existence but remain forever in a superposition of many possibilities. This is the long-standing measurement problem writ large. Schrödinger's equation that describes quantum reality as a superposition of possibilities, and attaches a range of probabilities to each possibility, does not include the act of measurement. There are no observers in the mathematics of quantum mechanics. The theory says nothing about the collapse of the wave function, the sudden and discontinuous change of the state of a quantum system upon observation or measurement, when the possible becomes the actual. In Everett's many worlds interpretation there was no need for an observation or measurement to collapse the wave function, since each and every quantum possibility coexists as an actual reality in an array of parallel universes.

2012/06/13

Best-of series

Teams A and B play a series of games; A wins each game independently with probability $p$, and the first team to win $c$ games wins the series. Find the expected number of games played.

At most $2(c-1)+1=2c-1$ games are played. The situations of A and B are completely symmetric, so consider the case where A wins; the case where B wins follows by symmetry.

Let $c\leq k\leq 2c-1$ be the number of games played when A wins the series. The number of possible orderings is $$N_k = {k\choose c}-{k-1\choose c}={k-1\choose c-1}$$ choose the $c$ games A wins among the $k$ games, then subtract the orderings in which A had already won within the first $k-1$ games.

Each such ordering has probability $$p^c(1-p)^{k-c}$$ so the contribution of the A-wins case to the expected number of games is $$N(p)=\sum_{k=c}^{2c-1}k{k-1\choose c-1}p^c(1-p)^{k-c}$$ By symmetry, the contribution of the B-wins case is $N(1-p)$, so $$N=N(p)+N(1-p)$$
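The formula can be checked numerically. A minimal sketch (the function name is illustrative), assuming games are independent with a fixed win probability $p$ for A:

```python
# Sketch: expected number of games in a first-to-c series, assuming team A
# wins each game independently with probability p.
import math

def expected_games(c, p):
    """N(p) + N(1-p): sum over the possible series lengths k for each winner."""
    def n(pw):
        # Winner takes the series in exactly k games with probability
        # C(k-1, c-1) * pw^c * (1-pw)^(k-c); weight each length k by k.
        return sum(k * math.comb(k - 1, c - 1) * pw**c * (1 - pw)**(k - c)
                   for k in range(c, 2 * c))
    return n(p) + n(1 - p)

# Fair best-of-3: P(2 games) = 1/2, P(3 games) = 1/2, so E = 2.5.
assert abs(expected_games(2, 0.5) - 2.5) < 1e-12
# Fair best-of-5: E = 3*(1/4) + 4*(3/8) + 5*(3/8) = 4.125.
assert abs(expected_games(3, 0.5) - 4.125) < 1e-12
```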

Set and Function

Maps from the point of view of sets



A set can be viewed as the collection of all elements having some property. Given two sets $A, B$, their intersection is the set of elements that cannot be distinguished by the two properties. For example, if $A$ is the set of elements with some red color and $B$ the set of elements with some black color, then $A\cap B$ may contain elements that are partly red and partly black: the two color properties fail to distinguish such elements.

Elements indistinguishable before the map: $A\cap B$

Elements indistinguishable after the map: $f(A)\cap f(B)$

The fate, after mapping, of the elements indistinguishable before the map: $f(A\cap B)$

1) No mapping increases information; after the map, $A\cap B$ must still lie inside $f(A)\cap f(B)$: $$f(A\cap B)\subseteq f(A)\cap f(B)$$ For if some element were in $f(A\cap B)$ but not in $f(A)\cap f(B)$, it would lie in one, and only one, of $f(A)$ and $f(B)$, and would thus have become distinguishable.

2) An injection does not lose information: elements distinguishable before the map cannot be made indistinguishable by an injection, so $f(A\cap B)$ contains all the elements that are indistinguishable after the map: $$f(A)\cap f(B) \subseteq f(A\cap B)$$ Hence for an injection $$f(A)\cap f(B) = f(A\cap B)$$ 3) An inverse map induces an equivalence relation on the domain: points are equivalent if they map to the same point. Taking the quotient space of this relation turns the original map into an injection from the quotient into the codomain. The quotient absorbs, inside the domain itself, exactly the information the map loses, so the induced map is an information-preserving injection.

A surjection induces a bijection from the quotient space onto the codomain, and a bijection preserves all set operations.

Equivalence classes have the property that if two of them intersect, they coincide.
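Points 1) and 2) can be sketched with finite sets, representing a hypothetical map `f` as a dictionary:

```python
# Sketch: f(A ∩ B) ⊆ f(A) ∩ f(B) always, with equality when f is injective.

def image(f, s):
    """Image of set s under the map f (given as a dict)."""
    return {f[x] for x in s}

A, B = {1, 2, 3}, {2, 3, 4}

# A non-injective map: 1 and 4 collapse to the same value 'a'.
f = {1: 'a', 2: 'b', 3: 'c', 4: 'a'}
assert image(f, A & B) <= image(f, A) & image(f, B)   # 1) always holds
assert image(f, A & B) != image(f, A) & image(f, B)   # strict here: 'a' lies in both images

# An injective map: distinct inputs keep distinct outputs.
g = {1: 'a', 2: 'b', 3: 'c', 4: 'd'}
assert image(g, A & B) == image(g, A) & image(g, B)   # 2) equality for injections
```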

2012/06/12

Useful commands

$ \newcommand{\infint}{{\int_{-\infty}^{\infty}}} \newcommand{\ie}{\textit{i.e.}\ } \newcommand{\eg}{\textit{e.g.}\ } \newcommand{\pfrac}[2]{{\frac{\partial #1}{\partial #2}}} \newcommand{\eqn}[1]{Eq.\ (\ref{#1})} $
Putty command line:
start d:\softwares\putty.exe -ssh -pw xxx -P 1234 user@localhost

Run VirtualBox headless:
VBoxHeadless.exe -s Sage-5.0

Sage interactive:

LaTeX newcommand command:

$$\infint f(x)dx$$ $$\pfrac{L}{q_n},\quad \pfrac{}{x_i}$$

2012/06/10

Techniques for computing probabilities

1) Use classical distributions. For example the binomial distribution: to find the probability that exactly $r$ of $n$ elements have some property, first find the probability that a single element has the property, then apply the distribution formula.

$d$ is the dominant gene and $r$ the recessive gene; both parents are $rd$. Find the probability that 3 of their 4 children display the dominant trait.

Solution: If we assume that each child is equally likely to inherit either of the two genes from each parent, the probabilities that a child of two hybrid parents will have the $dd, rr, rd$ pairs of genes are, respectively, $(1/4, 1/4, 1/2)$. Hence, because an offspring will have the outward appearance of the dominant gene if its gene pair is either $dd$ or $rd$, it follows that the number of such children is binomially distributed with parameters $(4, 3/4)$. (Note that the probability that a single child displays the dominant trait must itself be computed; it is not given directly.)

In effect, the binomial distribution gives the probabilities of $n$ (disjoint) subsets of the probability space; the event asked for in a typical problem can be expressed as a union of some of these subsets.
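The genetics computation above can be sketched in a few lines (the helper name `binom_pmf` is illustrative, not from the text):

```python
# Sketch: a single child shows the dominant trait with probability 3/4,
# so the count among 4 children is Binomial(4, 3/4).
from math import comb

def binom_pmf(n, r, p):
    """P(exactly r successes in n independent trials)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

# P(dd) + P(rd) = 1/4 + 1/2 = 3/4 for one child showing the dominant trait.
p_dominant = 1/4 + 1/2
answer = binom_pmf(4, 3, p_dominant)
assert abs(answer - 27/64) < 1e-12   # = 0.421875
```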

2) Uniform distribution on an interval: the probability that the random variable falls in a subinterval equals the ratio of the subinterval's length to the length of the whole interval.

3) Normal distribution: the density function can absorb a linear transformation of $x$ without changing its form $$f(x)=\frac{1}{\sqrt{2\pi}\sigma} \exp{ \left( -\frac{1}{2} \left( \frac{x-\mu}{\sigma} \right)^2 \right) }$$

4) When computing the expectation of a function $g(X)$ of a random variable $X$, $g$ can only be applied to the $x$ inside the integral $E(X)=\int_{-\infty}^{\infty}xf(x)dx$, not to the value of $E(X)$: $$E(g(X))=\int_{-\infty}^{\infty}g(x)f(x)dx$$ The $n$-th moment corresponds to $g(X)=X^n$.

The variance corresponds to $g(X)=(X-m)^2$ with $m=E(X)$.

In two dimensions, $$E(g(X, Y))=\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}g(x, y)f(x, y)dxdy$$ but if $g$ is a linear function, it clearly commutes with the integral, so the expectation is linear.

5) The distribution function $F(a)$ takes as its argument a single bound $a$; the joint distribution function $F(a, b)$ takes two bounds $a, b$. To recover a single (marginal) distribution, pass $+\infty$ as the other bound, e.g. $$F_X(a)=F(a, +\infty)$$ The joint mass function is $p(x, y)=P(X=x, Y=y)$; to obtain that of a single r.v., sum over the other $$p_X(x)=\sum_{y}p(x,y)$$ In the continuous case, integrate over the other $$f_X(x)=\int_{-\infty}^{\infty}f(x,y)dy$$ An example:

$E(X+Y)=\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}(x+y)f(x, y)dxdy\\ =\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}xf(x, y)dxdy + \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}yf(x, y)dxdy\\ =\int_{-\infty}^{\infty}x\int_{-\infty}^{\infty}f(x, y)dydx + \int_{-\infty}^{\infty}y\int_{-\infty}^{\infty}f(x, y)dxdy\\ =\int_{-\infty}^{\infty}xf_X(x)dx + \int_{-\infty}^{\infty}yf_Y(y)dy\\ =E(X) + E(Y)$

6) In practice the expectation is an easy parameter to obtain: run many trials of the r.v. and take the average. Several inequalities estimate the probabilities of certain sets in the probability space as functions of the expectation.

7) Use indicator functions

The key property of an indicator is that its expectation equals the probability of its taking the value 1 $$E(I)=P(I=1)=1-P(I=0)$$ and squaring this r.v. leaves its expectation unchanged $$E(I^2)=1^2\cdot P(I=1) + 0^2\cdot P(I=0)=E(I)$$

There are 25 types of coupons, each equally likely to be drawn. Find the expected number of distinct coupon types among 10 randomly drawn coupons.

An element $\omega$ of the probability space is a combination of 10 coupons. Consider the sets $A_i=\{\omega | \omega \text{ contains a type $i$ coupon }\}$ and let $I_i$ be the indicator of $A_i$. The r.v. in the problem can be expressed as $X(\omega)=\sum_iI_i(\omega)$, and by linearity of expectation $$E(X)=E(\sum_iI_i)=\sum_iE(I_i)=\sum_i[1-\left(\frac{24}{25}\right)^{10}]=25[1-\left(\frac{24}{25}\right)^{10}]$$ The conditions exploited: the elements of the probability space are sets, the r.v. is a count inside a set, and a count decomposes into indicators.
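The indicator-sum answer can be checked against a simulation. A sketch, assuming draws are independent and with replacement (as the $(24/25)^{10}$ term implies):

```python
# Sketch: indicator decomposition vs. Monte Carlo for the coupon example,
# 25 types, 10 independent draws with replacement.
import random

types, draws = 25, 10
formula = types * (1 - (1 - 1/types)**draws)   # 25 * [1 - (24/25)^10]

random.seed(0)
trials = 100_000
# Distinct types in one trial = size of the set of drawn types.
avg = sum(len({random.randrange(types) for _ in range(draws)})
          for _ in range(trials)) / trials

assert abs(formula - 25 * (1 - (24/25)**10)) < 1e-12
assert abs(formula - avg) < 0.05   # simulation agrees with the indicator sum
```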

8) The density function and the distribution function both express the probability measure of preimages of the r.v.; r.v.s are defined to be independent when all of their preimages, as subsets of the probability space, are independent. Independence of subsets is defined through the probability of their intersection, and the joint distribution is in fact a definition of the probabilities of intersections of preimages.

Independent r.v.s correspond to a product decomposition inside the probability space: the original space can be expressed as a product of two probability spaces.

A random variable corresponds to one dimension of the probability space (one measured attribute of a sample); when many attributes are measured, the independent ones correspond to a product structure inside the space.

9) Probability and combinatorics:
i) If the problem involves $N$ possible combinations, all equally likely, i.e. each with probability $p=\frac{1}{N}$, the probability of a set of combinations can be found by: 1. direct counting: count the number $M$ of combinations satisfying the condition; the probability is $\frac{M}{N}$; or 2. since each combination has probability $p=\frac{1}{N}$ and $M$ combinations qualify, simply add up the $M$ probabilities to get $M\cdot\frac{1}{N}$.
ii) Sometimes the qualifying combinations do not all have the same probability; then compute the probability and the count of each kind separately and add at the end. For example: toss a coin until heads appears; find the probability that the number of tosses is less than 5.
iii) Sometimes the qualifying elements can be divided into groups, with equal probabilities within each group but different probabilities across groups. The computation is as in ii): count within each group, obtain the group's probability and hence the probability of the subset formed by the group, then sum over all groups; see the best-of-series example above.

10) Probabilities, sums, and integrals

A sum or an integral splits into two parts: first, the range over which the variable or index runs; second, the function of the constrained variable, summed over all its distinct values.

The two parts can be varied independently, as long as the effective range and the values taken by the function are unchanged.

11) Conditioning. An indicator links the probability $P$ of a set to the expectation of its indicator r.v. (moreover, their conditional probability and conditional expectation also agree), and an expectation can be computed by conditioning. Example: suppose $X,Y$ are two independent r.v.s; compute $P\{X < Y\}$ $$P\{X < Y\}=\int_{-\infty}^{\infty}P\{ X < Y|Y=y \}f_Y(y)dy$$ The integral on the right-hand side comes from taking the expectation.
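A sketch under an assumed concrete choice of distributions: for independent exponentials with rates $\lambda$ and $\mu$, the conditioning integral evaluates in closed form, $P\{X<Y\}=\int_0^\infty (1-e^{-\lambda y})\,\mu e^{-\mu y}dy=\lambda/(\lambda+\mu)$, which a simulation confirms:

```python
# Sketch: P{X < Y} for independent exponentials via conditioning vs. simulation.
# With rates lam and mu, conditioning gives P{X < Y} = lam / (lam + mu).
import random

lam, mu = 1.0, 2.0
closed_form = lam / (lam + mu)   # = 1/3 here

random.seed(1)
n = 200_000
hits = sum(random.expovariate(lam) < random.expovariate(mu) for _ in range(n))
assert abs(hits / n - closed_form) < 0.01
```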

2012/06/09

Patterns in Permutations and Words

Chapter 1. What is a pattern?

Roughly speaking, for our purposes, patterns in permutations are permutations with additional restrictions, and patterns in words are certain restricted words (possibly permutations).

Reduced form of a permutation: rename the elements of an $r$-permutation so that they form a permutation of $\{1,\cdots, r\}$, preserving the order relations. For example: $$Red(3174)=2143$$ Reverse of a permutation: the reverse of $\sigma=\sigma_1\sigma_2\cdots\sigma_n$ is $r(\sigma)=\sigma_n\sigma_{n-1}\cdots\sigma_1$.
Complement of a permutation: replace the largest element of $\sigma=\sigma_1\sigma_2\cdots\sigma_n$ with the smallest element of $\{1,\cdots, n\}$, the second largest with the second smallest, and so on; written $c(\sigma)$.
Inverse of a permutation: the $\sigma_i$-th position of the inverse $i(\sigma)$ is occupied by $i$.

These three transformations are called the trivial bijections of $S_n$.

Permutation matrix: a permutation can be described by which value ($\sigma_i$) occupies position $i$; represented as a matrix, columns are indexed by value and rows by position.

An ascent is a letter of a permutation followed by a larger letter; a descent is a letter followed by a smaller one. In an $n$-permutation the number of ascents plus the number of descents is $n-1$. A permutation that begins with an ascent, then a descent, then an ascent, and so on, is called an alternating permutation.

An interval of a permutation is a factor whose values are consecutive (not necessarily arranged in order of size); for example, 423 is an interval in 1642375. A permutation none of whose proper initial factors is an interval…
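The trivial bijections can be sketched in a few lines (list-based, 1-indexed values; the function names are illustrative):

```python
# Sketch of the reduced form and the three trivial bijections.

def red(seq):
    """Reduced form: rename values to 1..r, preserving relative order."""
    ranks = {v: i + 1 for i, v in enumerate(sorted(seq))}
    return [ranks[v] for v in seq]

def reverse(sigma):
    return sigma[::-1]

def complement(sigma):
    # Largest value becomes smallest, second largest becomes second smallest, ...
    n = len(sigma)
    return [n + 1 - v for v in sigma]

def inverse(sigma):
    inv = [0] * len(sigma)
    for i, v in enumerate(sigma, start=1):
        inv[v - 1] = i          # the sigma_i-th position of the inverse holds i
    return inv

assert red([3, 1, 7, 4]) == [2, 1, 4, 3]        # Red(3174) = 2143
assert reverse([2, 1, 4, 3]) == [3, 4, 1, 2]
assert complement([2, 1, 4, 3]) == [3, 4, 1, 2]
assert inverse([3, 1, 4, 2]) == [2, 4, 1, 3]    # i(3142) = 2413
```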

Dirac: The Hamiltonian Method

I'm very happy to be here at Yeshiva and to have this chance to talk to you about some mathematical methods that I have been working on for a number of years. I would like first to describe in a few words the general object of these methods.

In atomic theory we have to deal with various fields. There are some fields which are very familiar, like the electromagnetic and the gravitational fields; but in recent times we have a number of other fields also to concern ourselves with, because according to the general ideas of De Broglie and Schrodinger every particle is associated with waves and these waves may be considered as a field. So we have in atomic physics the general problem of setting up a theory of various fields in interaction with each other. We need a theory conforming to the principles of quantum mechanics, but it is quite a difficult matter to get such a theory.

One can get a much simpler theory if one goes over to the corresponding classical mechanics, which is the form which quantum mechanics takes when one makes Planck's constant $\hbar$ tend to zero. It is very much easier to visualize what one is doing in terms of classical mechanics. It will be mainly about classical mechanics that I shall be talking in these lectures.

Now you may think that that is really not good enough, because classical mechanics is not good enough to describe Nature. Nature is described by quantum mechanics. Why should one, therefore, bother so much about classical mechanics? Well, the quantum field theories are, as I said, quite difficult and so far, people have been able to build up quantum field theories only for fairly simple kinds of fields with simple interactions between them. It is quite possible that these simple fields with the simple interactions between them are not adequate for a description of Nature. The successes which we get with quantum field theories are rather limited. One is continually running into difficulties and one would like to broaden one's basis and have some possibility of bringing more general fields into account. For example, one would like to take into account the possibility that Maxwell's equations are not accurately valid. When one goes to distances very close to the charges that are producing the fields, one may have to modify Maxwell's field theory so as to make it into a nonlinear electrodynamics. This is only one example of the kind of generalization which it is profitable to consider in our present state of ignorance of the basic ideas, the basic forces and the basic character of the fields of atomic theory.

In order to be able to start on this problem of dealing with more general fields, we must go over the classical theory. Now, if we can put the classical theory into the Hamiltonian form, then we can always apply certain standard rules so as to get a first approximation to a quantum theory. My talks will be mainly concerned with this problem of putting a general classical theory into the Hamiltonian form. When one has done that, one is well launched onto the path of getting an accurate quantum theory. One has, in any case, a first approximation.

Of course, this work is to be considered as a preliminary piece of work. The final conclusion of this piece of work must be to set up an accurate quantum theory, and that involves quite serious difficulties, difficulties of a fundamental character which people have been worrying over for quite a number of years. Some people are so much impressed by the difficulties of passing over from Hamiltonian classical mechanics to quantum mechanics that they think that maybe the whole method of working from Hamiltonian classical theory is a bad method. Particularly in the last few years people have been trying to set up alternative methods for getting quantum field theories. They have made quite considerable progress on these lines. They have obtained a number of conditions which have to be satisfied. Still I feel that these alternative methods, although they go quite a long way towards accounting for experimental results, will not lead to a final solution to the problem. I feel that there will always be something missing from them which we can only get by working from a Hamiltonian, or maybe from some generalization of the concept of a Hamiltonian. So I take the point of view that the Hamiltonian is really very important for quantum theory.

In fact, without using Hamiltonian methods one cannot solve some of the simplest problems in quantum theory, for example the problem of getting the Balmer formula for hydrogen, which was the very beginning of quantum mechanics. A Hamiltonian comes in therefore in very elementary ways and it seems to me that it is really quite crucial to work from a Hamiltonian; so I want to talk to you about how far one can develop Hamiltonian methods.

I would like to begin in an elementary way and I take as my starting point an action principle. That is to say, I assume that there is an action integral which depends on the motion, such that, when one varies the motion, and puts down the conditions for the action integral to be stationary, one gets the equations of motion. The method of starting from an action principle has the one great advantage, that one can easily make the theory conform to the principle of relativity. We need our atomic theory to conform to relativity because in general we are dealing with particles moving with high velocities.

If we want to bring in the gravitational field, then we have to make our theory conform to the general principle of relativity, which means working with a space-time which is not flat. Now the gravitational field is not very important in atomic physics, because gravitational forces are extremely weak compared with the other kinds of forces which are present in atomic processes, and for practical purposes one can neglect the gravitational field. People have in recent years worked to some extent on bringing the gravitational field into the quantum theory, but I think that the main object of this work was the hope that bringing in the gravitational field might help to solve some of the difficulties. As far as one can see at present, that hope is not realized, and bringing in the gravitational field seems to add to the difficulties rather than remove them. So that there is not very much point at present in bringing the gravitational field into atomic theory. However, the methods which I am going to describe are powerful mathematical methods which would be available whether one brings in the gravitational field or not. We start off with an action integral which I denote by $$I=\int L dt \tag{1-1}$$ It is expressed as a time integral, the integrand $L$ being the Lagrangian. So with an action principle we have a Lagrangian. We have to consider how to pass from that Lagrangian to a Hamiltonian. When we have got the Hamiltonian, we have made the first step toward getting a quantum theory.

You might wonder whether one could not take the Hamiltonian as the starting point and short-circuit this work of beginning with an action integral, getting a Lagrangian from it and passing from the Lagrangian to the Hamiltonian. The objection to trying to make this short-circuit is that it is not at all easy to formulate the conditions for a theory to be relativistic in terms of the Hamiltonian. In terms of the action integral, it is very easy to formulate the conditions for the theory to be relativistic: one simply has to require that the action integral shall be invariant. One can easily construct innumerable examples of action integrals which are invariant. They will automatically lead to equations of motion agreeing with relativity, and any developments from this action integral will therefore also be in agreement with relativity.

When we have the Hamiltonian, we can apply a standard method which gives us a first approximation to a quantum theory, and if we are lucky we might be able to go on and get an accurate quantum theory. You might again wonder whether one could not short-circuit that work to some extent. Could one not perhaps pass directly from the Lagrangian to the quantum theory, and short-circuit altogether the Hamiltonian? Well, for some simple examples one can do that. For some of the simple fields which are used in physics the Lagrangian is quadratic in the velocities, and is like the Lagrangian which one has in the non-relativistic dynamics of particles. For these examples for which the Lagrangian is quadratic in the velocities, people have devised some methods for passing directly from the Lagrangian to the quantum theory. Still, this limitation of the Lagrangians being quadratic in the velocities is quite a severe one. I want to avoid this limitation and to work with a Lagrangian which can be quite a general function of the velocities. To get a general formalism which will be applicable, for example, to the non-linear electrodynamics which I mentioned previously, I don't think one can in any way short-circuit the route of starting with an action integral, getting a Lagrangian, passing from the Lagrangian to the Hamiltonian, and then passing from the Hamiltonian to the quantum theory. That is the route which I want to discuss in this course of lectures.

In order to express things in a simple way to begin with, I would like to start with a dynamical theory involving only a finite number of degrees of freedom, such as you are familiar with in particle dynamics. It is then merely a formal matter to pass from this finite number of degrees of freedom to the infinite number of degrees of freedom which we need for a field theory. 

Starting with a finite number of degrees of freedom, we have dynamical coordinates which I denote by $q$. The general one is $q_n, n = 1,\cdots , N, N$ being the number of degrees of freedom. Then we have the velocities $dq_n/dt = \dot{q}_n$. The Lagrangian is a function $L = L(q, \dot{q})$ of the coordinates and the velocities.

You may be a little disturbed at this stage by the importance that the time variable plays in the formalism. We have a time variable $t$ occurring already as soon as we introduce the Lagrangian. It occurs again in the velocities, and all the work of passing from Lagrangian to Hamiltonian involves one particular time variable. From the relativistic point of view we are thus singling out one particular observer and making our whole formalism refer to the time for this observer. That, of course, is not really very pleasant to a relativist, who would like to treat all observers on the same footing. However, it is a feature of the present formalism which I do not see how one can avoid if one wants to keep to the generality of allowing the Lagrangian to be any function of the coordinates and velocities. We can be sure that the contents of the theory are relativistic, even though the form of the equations is not manifestly relativistic on account of the appearance of one particular time in a dominant place in the theory.

Let us now develop this Lagrangian dynamics and pass over to Hamiltonian dynamics, following as closely as we can the ideas which one learns about as soon as one deals with dynamics from the point of view of working with general coordinates. We have the Lagrangian equations of motion which follow from the variation of the action integral:  $$\frac{d}{dt}\frac{\partial L}{\partial \dot{q}_n}=\frac{\partial L}{\partial q_n} \tag{1-2}$$ To go over to the Hamiltonian formalism, we introduce the momentum variables $p_n$, which are defined by  $$p_n=\frac{\partial L}{\partial \dot{q}_n} \tag{1-3}$$ Now in the usual dynamical theory, one makes the assumption that the momenta are independent functions of the velocities, but that assumption is too restrictive for the applications which we are going to make. We want to allow for the possibility of these momenta not being independent functions of the velocities. In that case, there exist certain relations connecting the momentum variables, of the type $\phi(q,p)=0$

There may be several independent relations of this type, and if there are, we distinguish them one from another by a suffix $m=1, \cdots, M$, so we have  $$\phi_m(q,p) = 0 \tag{1-4}$$ The $q$'s and the $p$'s are the dynamical variables of the Hamiltonian theory. They are connected by the relations (1-4), which are called the primary constraints of the Hamiltonian formalism. This terminology is due to Bergmann, and I think it is a good one.
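As a sketch of how a primary constraint can arise (an assumed toy Lagrangian, not one from the lecture): for $L=\frac{1}{2}(\dot q_1-\dot q_2)^2$ the momenta are not independent functions of the velocities, and the relation $p_1+p_2=0$ appears as a constraint of the type (1-4).

```python
# Sketch (assumed toy example): a singular Lagrangian whose momenta are not
# independent functions of the velocities, producing a primary constraint.
import sympy as sp

q1d, q2d = sp.symbols('q1dot q2dot')
L = sp.Rational(1, 2) * (q1d - q2d)**2

# Momenta p_n = dL/d(qdot_n), Eq. (1-3).
p1 = sp.diff(L, q1d)   #  (q1dot - q2dot)
p2 = sp.diff(L, q2d)   # -(q1dot - q2dot)

# The velocities cannot be solved for: the Jacobian d(p)/d(qdot) is singular.
J = sp.Matrix([[sp.diff(p, v) for v in (q1d, q2d)] for p in (p1, p2)])
assert J.det() == 0

# The relation connecting the momenta, independent of the velocities:
phi = sp.simplify(p1 + p2)   # primary constraint  phi = p1 + p2 ≈ 0
assert phi == 0
```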

Let us now consider the quantity $p_n\dot{q}_n - L$. (Whenever there is a repeated suffix I assume a summation over all values of that suffix.) Let us make variations in the variables $q$ and $\dot{q}$, in the coordinates and the velocities. These variations will cause variations to occur in the momentum variables $p$. As a result of these variations, $$ \begin{eqnarray} \delta(p_n\dot{q}_n - L) &=& \delta p_n\dot{q}_n + p_n\delta \dot{q}_n -\left(\frac{\partial L}{\partial q_n}\right)\delta q_n - \left(\frac{\partial L}{\partial \dot{q}_n}\right)\delta \dot{q}_n\\ &=& \delta p_n\dot{q}_n -\left(\frac{\partial L}{\partial q_n}\right)\delta q_n \end{eqnarray} \tag{1-5} $$ by $(1\text{-}3)$.

Now you see that the variation of this quantity $p_n\dot{q}_n - L$ involves only the variation of the $q$'s and that of the $p$'s. It does not involve the variation of the velocities. That means that $p_n\dot{q}_n - L$ can be expressed in terms of the $q$'s and the $p$'s, independent of the velocities. Expressed in this way, it is called the Hamiltonian $H$.

However, the Hamiltonian defined in this way is not uniquely determined, because we may add to it any linear combination of the $\phi$'s, which are zero. Thus, we could go over to another Hamiltonian  $$H^* = H + c_m\phi_m \tag{1-6}$$ where the quantities $c_m$ are coefficients which can be any function of the $q$'s and the $p$'s. $H^*$ is then just as good as $H$; our theory cannot distinguish between $H$ and $H^*$. The Hamiltonian is not uniquely determined.

We have seen in $(1\text{-}5)$ that $$\delta H = \dot{q}_n\delta p_n -\left(\frac{\partial L}{\partial q_n}\right)\delta q_n$$ This equation holds for any variation of the $q$'s and the $p$'s subject to the condition that the constraints $(1\text{-}4)$ are preserved. The $q$'s and the $p$'s cannot be varied independently because they are restricted by $(1\text{-}4)$, but for any variation of the $q$'s and the $p$'s which preserves these conditions, we have this equation holding. From the general method of the calculus of variations applied to a variational equation with constraints of this kind, we deduce $$ \dot{q}_n =\frac{\partial H}{\partial p_n} + u_m\frac{\partial \phi_m}{\partial p_n}\tag{1-7} $$ and $$ -\frac{\partial L}{\partial q_n} = \frac{\partial H}{\partial q_n} + u_m\frac{\partial \phi_m}{\partial q_n} $$ or $$\dot{p}_n =- \frac{\partial H}{\partial q_n} - u_m\frac{\partial \phi_m}{\partial q_n}\tag{1-8}$$ with the help of $(1\text{-}2)$ and $(1\text{-}3)$, where the $u_m$ are unknown coefficients. We have here the Hamiltonian equations of motion, describing how the variables $q$ and $p$ vary in time, but these equations involve unknown coefficients $u_m$.

It is convenient to introduce a certain formalism which enables one to write these equations briefly, namely the Poisson bracket formalism. It consists of the following: If we have two functions of the $q$'s and the $p$'s, say $f(q, p)$ and $g(q, p)$, they have a Poisson bracket $[f, g]$ which is defined by $$[f, g] = \frac{\partial f}{\partial q_n} \frac{\partial g}{\partial p_n} - \frac{\partial f}{\partial p_n} \frac{\partial g}{\partial q_n} \tag{1-9}$$ The Poisson brackets have certain properties which follow from their definition, namely $[f,g]$ is anti-symmetric in $f$ and $g$: $$[f, g] = -[g, f], \tag{1-10}$$ it is linear in either member: $$[f_1 + f_2, g] = [f_1, g] + [f_2, g], \text{etc.;} \tag{1-11}$$ and we have the product law, $$[f_1f_2, g] = f_1[f_2, g] + [f_1, g]f_2. \tag{1-12}$$ Finally, there is the relationship, known as the Jacobi Identity, connecting three quantities: $$[f, [g, h]] + [g, [h, f]] + [h, [f, g]] = 0. \tag{1-13}$$ With the help of the Poisson bracket, one can rewrite the equations of motion. For any function $g$ of the $q$'s and the $p$'s, we have $$\dot{g}=\frac{\partial g}{\partial q_n}\dot{q}_n + \frac{\partial g}{\partial p_n}\dot{p}_n \tag{1-14}$$ If we substitute for $\dot{q}_n$ and $\dot{p}_n$ their values given by $(1\text{-}7)$ and $(1\text{-}8)$, we find that $(1\text{-}14)$ is just $$\dot{g} = [g, H] + u_m[g, \phi_m]\tag{1-15}$$ The equations of motion are thus all written concisely in the Poisson bracket formalism.
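The bracket laws (1-10)-(1-13) can be checked symbolically; a sketch with two degrees of freedom and arbitrarily chosen test functions:

```python
# Sketch: the Poisson bracket (1-9) for two degrees of freedom, checked
# against the laws (1-10)-(1-13) on arbitrarily chosen functions.
import sympy as sp

q1, q2, p1, p2 = sp.symbols('q1 q2 p1 p2')
qs, ps = (q1, q2), (p1, p2)

def pb(f, g):
    """Poisson bracket [f, g], Eq. (1-9), summed over the suffix n."""
    return sum(sp.diff(f, q) * sp.diff(g, p) - sp.diff(f, p) * sp.diff(g, q)
               for q, p in zip(qs, ps))

f, g, h = q1**2 * p2, q2 * p1 + p2, sp.sin(q1) * p1

assert pb(q1, p1) == 1 and pb(q1, p2) == 0            # fundamental brackets
assert sp.simplify(pb(f, g) + pb(g, f)) == 0          # antisymmetry (1-10)
assert sp.simplify(pb(f + g, h) - pb(f, h) - pb(g, h)) == 0   # linearity (1-11)
assert sp.simplify(pb(f * g, h) - f * pb(g, h) - pb(f, h) * g) == 0  # product law (1-12)
jacobi = pb(f, pb(g, h)) + pb(g, pb(h, f)) + pb(h, pb(f, g))
assert sp.simplify(jacobi) == 0                       # Jacobi identity (1-13)
```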

We can write them in a still more concise formalism if we extend the notion of Poisson bracket somewhat. As I have defined Poisson brackets, they have a meaning only for quantities $f$ and $g$ which can be expressed in terms of the $q$'s and the $p$'s. Something more general, such as a general velocity variable which is not expressible in terms of the $q$'s and the $p$'s, does not have a Poisson bracket with another quantity. Let us extend the meaning of Poisson brackets and suppose that they exist for any two quantities and that they satisfy the laws (1-10), (1-11), (1-12), and (1-13), but are otherwise undetermined when the quantities are not functions of the $q$'s and the $p$'s.

Then we may write (1-15) as $$\dot{g} = [g, H + u_m\phi_m]. \tag{1-16}$$ Here you see the coefficients $u$ occurring in one of the members of a Poisson bracket. The coefficients $u_m$ are not functions of the $q$'s and the $p$'s, so that we cannot use the definition (1-9) for determining the Poisson bracket in (1-16). However, we can proceed to work out this Poisson bracket using the laws (1-10), (1-11), (1-12) and (1-13). Using the summation law (1-11) we have: $$[g, H + u_m\phi_m] = [g, H] + [g, u_m\phi_m]. \tag{1-17}$$ and using the product law (1-12) $$[g, u_m\phi_m] = [g, u_m]\phi_m + u_m[g, \phi_m]. \tag{1-18}$$ The last bracket in (1-18) is well-defined, for $g$ and $\phi_m$ are both functions of the $q$'s and the $p$'s. The Poisson bracket $[g, u_m]$ is not defined, but it is multiplied by something that vanishes, $\phi_m$. So the first term on the right of (1-18) vanishes. The result is that $$[g, H + u_m\phi_m] = [g, H] + u_m[g, \phi_m], \tag{1-19}$$ making (1-16) agree with (1-15).

There is something that we have to be careful about in working with the Poisson bracket formalism: We have the constraints (1-4), but must not use one of these constraints before working out a Poisson bracket. If we did, we would get a wrong result. So we take it as a rule that Poisson brackets must all be worked out before we make use of the constraint equations. To remind us of this rule in the formalism, I write the constraints (1-4) as equations with a different equality sign $\approx$ from the usual. Thus they are written $$\phi_m \approx 0 \tag{1-20}$$ I call such equations weak equations, to distinguish them from the usual or strong equations.

One can make use of (1-20) only after one has worked out all the Poisson brackets which one is interested in. Subject to this rule, the Poisson bracket (1-19) is quite definite, and we have the possibility of writing our equations of motion (1-16) in a very concise form: $$\dot{g} \approx [g, H_T] \tag{1-21}$$ with a Hamiltonian I call the total Hamiltonian, $$H_T = H + u_m \phi_m \tag{1-22}$$ Now let us examine the consequences of these equations of motion. In the first place, there will be some consistency conditions. We have the quantities $\phi$ which have to be zero throughout all time. We can apply the equation of motion (1-21) or (1-15) taking $g$ to be one of the $\phi$'s. We know that $\dot{g}$ must be zero for consistency, and so we get some consistency conditions. Let us see what they are like. Putting $g = \phi_m$ and $\dot{g} = 0$ in (1-15), we have: $$[\phi_m, H] + u_{m'}[\phi_m, \phi_{m'}] \approx 0 \tag{1-23}$$ We have here a number of consistency conditions, one for each value of $m$. We must examine these conditions to see what they lead to. It is possible for them to lead directly to an inconsistency. They might lead to the inconsistency $1 = 0$. If that happens, it would mean that our original Lagrangian is such that the Lagrangian equations of motion are inconsistent. One can easily construct an example with just one degree of freedom. If we take $L = q$ then the Lagrangian equation of motion (1-2) gives immediately $1 = 0$. So you see, we cannot take the Lagrangian to be completely arbitrary. We must impose on it the condition that the Lagrangian equations of motion do not involve an inconsistency. With this restriction the equations (1-23) can be divided into three kinds.

One kind of equation reduces to $0 = 0$, i.e. it is identically satisfied, with the help of the primary constraints.

Another kind of equation reduces to an equation independent of the $u$'s, thus involving only the $q$'s and the $p$'s. Such an equation must be independent of the primary constraints, otherwise it is of the first kind. Thus it is of the form $$\chi(q, p) = 0 \tag{1-24}$$ Finally, an equation in (1-23) may not reduce in either of these ways; it then imposes a condition on the $u$'s.

The first kind we do not have to bother about any more. Each equation of the second kind means that we have another constraint on the Hamiltonian variables. Constraints which turn up in this way are called secondary constraints. They differ from the primary constraints in that the primary constraints are consequences merely of the equations (1-3) that defined the momentum variables, while for the secondary constraints, one has to make use of the Lagrangian equations of motion as well.

If we have a secondary constraint turning up in our theory, then we get yet another consistency condition, because we can work out $\dot{\chi}$ according to the equation of motion (1-15) and we require that $\dot{\chi} = 0$. So we get another equation $$[\chi, H] + u_m[\chi,\phi_m] \approx 0 \tag{1-25}$$ This equation has to be treated on the same footing as (1-23). One must again see which of the three kinds it is. If it is of the second kind, then we have to push the process one stage further because we have a further secondary constraint. We carry on like that until we have exhausted all the consistency conditions, and the final result will be that we are left with a number of secondary constraints of the type (1-24) together with a number of conditions on the coefficients $u$ of the type (1-23).
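As a concrete illustration of this chain of consistency conditions (an example of my own, not from the lecture), take the Lagrangian $L = \frac{1}{2}(\dot{q}_1 - q_2)^2$ for two degrees of freedom. The momenta are $$p_1 = \frac{\partial L}{\partial \dot{q}_1} = \dot{q}_1 - q_2, \qquad p_2 = \frac{\partial L}{\partial \dot{q}_2} = 0,$$ so $\phi_1 = p_2 \approx 0$ is a primary constraint. The Hamiltonian works out to $$H = p_n\dot{q}_n - L = \tfrac{1}{2}p_1^2 + p_1 q_2,$$ and consistency of $\phi_1$ gives $$[\phi_1, H] = -\frac{\partial H}{\partial q_2} = -p_1 \approx 0,$$ a secondary constraint $\chi = p_1 \approx 0$ of the type (1-24). One further step, $[\chi, H] = -\partial H/\partial q_1 = 0$ and $[\chi, \phi_1] = 0$, closes the chain: the new consistency condition is of the first kind, no condition is imposed on the coefficient $u_1$, and it survives as an arbitrary function of the time.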

The secondary constraints will for many purposes be treated on the same footing as the primary constraints. It is convenient to use the notation for them: $$\phi_k \approx 0,\ k = M + 1, \cdots, M+K \tag{1-26}$$ where $K$ is the total number of secondary constraints. They ought to be written as weak equations in the same way as primary constraints, as they are also equations which one must not make use of before one works out Poisson brackets. So all the constraints together may be written as $$\phi_j \approx 0,\ j = 1, \cdots, M+K \equiv \mathcal{J} \tag{1-27}$$ Let us now go over to the remaining equations of the third kind. We have to see what conditions they impose on the coefficients $u$. These equations are $$[\phi_j, H] + u_m[\phi_j, \phi_m] \approx 0 \tag{1-28}$$ where $m$ is summed from 1 to $M$ and $j$ takes on any of the values from 1 to $\mathcal{J}$. We have these equations involving conditions on the coefficients $u$, insofar as they do not reduce merely to the constraint equations.

Let us look at these equations from the following point of view. Let us suppose that the $u$'s are unknowns and that we have in (1-28) a number of non-homogeneous linear equations in these unknowns $u$, with coefficients which are functions of the $q$'s and the $p$'s. Let us look for a solution of these equations, which gives us the $u$'s as functions of the $q$'s and the $p$'s, say $$u_m = U_m(q, p) \tag{1-29}$$ There must exist a solution of this type, because if there were none it would mean that the Lagrangian equations of motion are inconsistent, and we are excluding that case.

The solution is not unique. If we have one solution, we may add to it any solution $V_m(q, p)$ of the homogeneous equations associated with (1-28): $$V_m[\phi_j, \phi_m]=0 \tag{1-30}$$ and that will give us another solution of the inhomogeneous equations (1-28). We want the most general solution of (1-28) and that means that we must consider all the independent solutions of (1-30), which we may denote by $V_{am}(q, p), a = 1,\cdots, A$. The general solution of (1-28) is then $$u_m = U_m + v_aV_{am} \tag{1-31}$$ in terms of coefficients $v_a$ which can be arbitrary.

Let us substitute these expressions for $u$ into the total Hamiltonian of the theory (1-22). That will give us the total Hamiltonian $$H_T = H + U_m\phi_m + v_aV_{am}\phi_m \tag{1-32}$$ We can write this as $$H_T= H' + v_a\phi_a \tag{1-33}$$ where $$H' = H + U_m\phi_m \tag{1-33'}$$ and $$\phi_a = V_{am}\phi_{m} \tag{1-34}$$ In terms of this total Hamiltonian (1-33) we still have the equations of motion (1-21).

As a result of carrying out this analysis, we have satisfied all the consistency requirements of the theory and we still have arbitrary coefficients $v$. The number of the coefficients $v$ will usually be less than the number of coefficients $u$. The $u$'s are not arbitrary but have to satisfy consistency conditions, while the $v$'s are arbitrary coefficients. We may take the $v$'s to be arbitrary functions of the time and we have still satisfied all the requirements of our dynamical theory.

This provides a difference of the generalized Hamiltonian formalism from what one is familiar with in elementary dynamics. We have arbitrary functions of the time occurring in the general solution of the equations of motion with given initial conditions. These arbitrary functions of the time must mean that we are using a mathematical framework containing arbitrary features, for example, a coordinate system which we can choose in some arbitrary way, or the gauge in electrodynamics. As a result of this arbitrariness in the mathematical framework, the dynamical variables at future times are not completely determined by the initial dynamical variables, and this shows itself up through arbitrary functions appearing in the general solution.

We require some terminology which will enable one to appreciate the relationships between the quantities which occur in the formalism. I find the following terminology useful. I define any dynamical variable, $R$, a function of the $q$'s and the $p$'s, to be first-class if it has zero Poisson brackets with all the $\phi$'s: $$[R, \phi_j] \approx 0,\ j=1,\cdots, \mathcal{J} \tag{1-35}$$ It is sufficient if these conditions hold weakly. Otherwise $R$ is second-class. If $R$ is first-class, then $[R, \phi_j]$ has to be strongly equal to some linear function of the $\phi$'s, as anything that is weakly zero in the present theory is strongly equal to some linear function of the $\phi$'s. The $\phi$'s are, by definition, the only independent quantities which are weakly zero. So we have the strong equations $$[R, \phi_j] = r_{jj'}\phi_{j'} \tag{1-36}$$ Before going on, I would like to prove a

Theorem: the Poisson bracket of two first-class quantities is also first-class. Proof. Let $R, S$ be first-class: then in addition to (1-36), we have $$[S, \phi_j] = s_{jj'}\phi_{j'} \tag{1-36'}$$ Let us form $[[R, S], \phi_j]$. We can work out this Poisson bracket using Jacobi's identity (1-13) $$[[R, S], \phi_j] = [[R, \phi_j], S] - [[S, \phi_j], R] \approx 0$$ by (1-36), (1-36'), the product law (1-12), and (1-20). The whole thing vanishes weakly. We have proved therefore that $[R, S]$ is first-class.

We have altogether four different kinds of constraints. We can divide constraints into first-class and second-class, which is quite independent of the division into primary and secondary.

I would like you to notice that $H'$ given by (1-33') and the $\phi_a$ given by (1-34) are first-class. Forming the Poisson bracket of $\phi_a$ with $\phi_j$, we get, by (1-34), $V_{am}[\phi_m, \phi_j]$ plus terms that vanish weakly. Since the $V_{am}$ are defined to satisfy (1-30), $\phi_a$ is first-class. Thus (1-28) with $U_m$ for $u_m$ shows that $H'$ is first-class. Thus (1-33) gives the total Hamiltonian in terms of a first-class Hamiltonian $H'$ together with some first-class $\phi$'s.

Any linear combination of the $\phi$'s is of course another constraint, and if we take a linear combination of the primary constraints we get another primary constraint. So each $\phi_a$ is a primary constraint; and it is first-class. So the final situation is that we have the total Hamiltonian expressed as the sum of a first-class Hamiltonian plus a linear combination of the primary, first-class constraints.

The number of independent arbitrary functions of the time occurring in the general solution of the equations of motion is equal to the number of values which the suffix $a$ takes on. That is equal to the number of independent primary first-class constraints, because all the independent primary first-class constraints are included in the sum (1-33).

That gives you then the general situation. We have deduced it by just starting from the Lagrangian equations of motion, passing to the Hamiltonian and working out consistency conditions.

From the practical point of view one can tell from the general transformation properties of the action integral what arbitrary functions of the time will occur in the general solution of the equations of motion. To each of these functions of the time there must correspond some primary first-class constraint. So we can tell which primary first-class constraints we are going to have without going through all the detailed calculation of working out Poisson brackets; in practical applications of this theory we can obviously save a lot of work by using that method.

I would like to go on a bit more and develop one further point of the theory. Let us try to get a physical understanding of the situation where we start with given initial variables and get a solution of the equations of motion containing arbitrary functions. The initial variables which we need are the $q$'s and the $p$'s. We don't need to be given initial values for the coefficients $v$. These initial conditions describe what physicists would call the initial physical state of the system. The physical state is determined only by the $q$'s and the $p$'s and not by the coefficients $v$.

Now the initial state must determine the state at later times. But the $q$'s and the $p$'s at later times are not uniquely determined by the initial state because we have the arbitrary functions $v$ coming in. That means that the state does not uniquely determine a set of $q$'s and $p$'s, even though a set of $q$'s and $p$'s uniquely determines a state. There must be several choices of $q$'s and $p$'s which correspond to the same state. So we have the problem of looking for all the sets of $q$'s and $p$'s that correspond to one particular physical state.

All those values for the $q$'s and $p$'s at a certain time which can evolve from one initial state must correspond to the same physical state at that time. Let us take particular initial values for the $q$'s and the $p$'s at time $t = 0$, and consider what the $q$'s and the $p$'s are after a short time interval $\delta t$. For a general dynamical variable $g$, with initial value $g_0$, its value at time $\delta t$ is $$ \begin{eqnarray} g(\delta t) &=& g_0 + \dot{g}\delta t\\ &=& g_0 + [g, H_T]\delta t \\ &=& g_0 + \delta t \{ [g, H'] + v_a[g, \phi_a] \} \end{eqnarray} \tag{1-37} $$ The coefficients $v$ are completely arbitrary and at our disposal. Suppose we take different values, $v'$, for these coefficients. That would give a different $g(\delta t)$, the difference being $$\Delta g(\delta t) = \delta t(v_a - v'_a)[g, \phi_a] \tag{1-38}$$ We may write this as $$\Delta g(\delta t) = \epsilon_a[g, \phi_a] \tag{1-39}$$ where $$\epsilon_a = \delta t(v_a - v'_a) \tag{1-40}$$

is a small arbitrary number, small because of the coefficient $\delta t$ and arbitrary because the $v$'s and the $v'$'s are arbitrary. We can change all our Hamiltonian variables in accordance with the rule (1-39) and the new Hamiltonian variables will describe the same state. This change in the Hamiltonian variables consists in applying an infinitesimal contact transformation with a generating function $\epsilon_a\phi_a$. We come to the conclusion that the $\phi_a$'s, which appeared in the theory in the first place as the primary first-class constraints, have this meaning: as generating functions of infinitesimal contact transformations, they lead to changes in the $q$'s and the $p$'s that do not affect the physical state.

However, that is not the end of the story. We can go on further in the same direction. Suppose we apply two of these contact transformations in succession. Apply first a contact transformation with generating function $\epsilon_a\phi_a$ and then apply a second contact transformation with generating function $\gamma_{a'}\phi_{a'}$ where the $\gamma$'s are some new small coefficients. We get finally $$g' = g_0 + \epsilon_a[g, \phi_a] + \gamma_{a'}[g + \epsilon_a[g, \phi_a], \phi_{a'}] \tag{1-41}$$ (I retain the second order terms involving products $\epsilon\gamma$, but I neglect the second order terms involving $\epsilon^2$ or involving $\gamma^2$. This is legitimate and sufficient. I do that because I do not want to write down more than I really need for getting the desired result.) If we apply the two transformations in succession in the reverse order, we get finally $$g'' = g_0 + \gamma_{a'}[g, \phi_{a'}] + \epsilon_a[g + \gamma_{a'}[g, \phi_{a'}], \phi_{a}] \tag{1-42}$$ Now let us subtract these two. The difference is $$\Delta g = \epsilon_{a}\gamma_{a'}\{ [[g, \phi_a], \phi_{a'}] - [[g, \phi_{a'}], \phi_{a}] \} \tag{1-43}$$ By Jacobi's identity (1-13) this reduces to $$\Delta g = \epsilon_{a}\gamma_{a'}[g, [\phi_a, \phi_{a'}]] \tag{1-44}$$ This $\Delta g$ must also correspond to a change in the $q$'s and the $p$'s which does not involve any change in the physical state, because it is made up by processes which individually don't involve any change in the physical state. Thus we see that we can use $$[\phi_a, \phi_{a'}] \tag{1-45}$$ as a generating function of an infinitesimal contact transformation and it will still cause no change in the physical state.

Now the $\phi_a$ are first-class: their Poisson brackets are weakly zero, and therefore strongly equal to some linear function of the $\phi$'s. This linear function of the $\phi$'s must be first-class because of the theorem I proved a little while back, that the Poisson bracket of two first-class quantities is first-class. So we see that the transformations which we get this way, corresponding to no change in the physical state, are transformations for which the generating function is a first-class constraint. The only way these transformations are more general than the ones we had before is that the generating functions which we had before are restricted to be first-class primary constraints. Those that we get now could be first-class secondary constraints. The result of this calculation is to show that we might have a first-class secondary constraint as a generating function of an infinitesimal contact transformation which leads to a change in the $q$'s and the $p$'s without changing the state.

For the sake of completeness, there is a little bit of further work one ought to do which shows that a Poisson bracket $[H', \phi_a]$ of the first-class Hamiltonian $H'$ with a first-class $\phi$ is again a linear function of first-class constraints. This can also be shown to be a possible generator for infinitesimal contact transformations which do not change the state.

The final result is that those transformations of the dynamical variables which do not change physical states are infinitesimal contact transformations in which the generating function is a primary first-class constraint or possibly a secondary first-class constraint. A good many of the secondary first-class constraints do turn up by the process (1-45) or as $[H', \phi_a]$. I think it may be that all the first-class secondary constraints should be included among the transformations which don't change the physical state, but I haven't been able to prove it. Also, I haven't found any example for which there exist first-class secondary constraints which do generate a change in the physical state.

2012/06/06

Information Channels

Forward probabilities: given that the input is $a_i$, the probability that the output is $b_j$ can be written as a matrix (this probability is a property of the channel, independent of the source distribution): $$P_{ij}=P(b=b_j|a=a_i)$$ If we feed $a_i$ into one end of the channel, the other end must output something, so $\sum_j P_{ij}=1$; the input source satisfies $\sum_ip_i=1$.

Backward probabilities: given that the output is $b_j$, the probability that the input was $a_i$ (unlike the forward probabilities, this depends on the source distribution as well as on the channel): $$Q_{ij}=P(a=a_i | b=b_j)$$ Joint probabilities: the probability that the output is $b_j$ and the input is $a_i$: $$R_{ij}=P(a=a_i, b=b_j)$$ Relation: $$p_iP_{ij}=P(a_i)P(b_j|a_i)=P(a_i, b_j)=P(b_j)P(a_i|b_j)=q_jQ_{ij}$$

2012/06/04

Bit Wizardry

1) Little endian vs. big endian can be distinguished by casting an int32* to a char*: the int32* points to a word, and when it is cast to char* and dereferenced, getting the least significant byte means the machine is little endian, while getting the most significant byte means big endian.

Getting the least significant byte can be viewed as a $\%\, 256$ operation; getting the most significant byte can be viewed as dividing by $2^{24}$.

2) Taking a value modulo a power of two is equivalent to a bit-and: int32 a = b % 32 // == b & (32-1), i.e. only the low bits are kept.

When dividing by a compile-time constant, the compiler will optimize the division into a combination of multiplications and shifts.

3) XOR means exactly one of the operand bits is 1. If two integers $x, y$ would overflow when added, we can still compute their average with bit operations. Adding two binary numbers can be split into two parts:

one part is the bits that generate a carry, the other is what is left of each bit after the carry is removed; since a carry is at most 1, the addition splits into the carries and the per-bit remainders. They are identified as follows: x & y gives the carry bits (1 where there is a carry, 0 where there is none), and x ^ y gives what remains of each bit after removing the carry effect (i.e., addition mod 2).

Because dividing the sum by 2 distributes over these two parts, each part can be shifted right separately. A carry itself amounts to a left shift by one; once the carries are identified, that left shift cancels the right shift, so x & y stays in place, x ^ y is shifted right, and the two parts are added to give the average. (Carry propagation depends on the relative positions of the two numbers, so it is deferred to the addition performed after the shift.)

4) XOR is associative, so
for any t, (a ^ t) ^ t = a ^ ( t ^ t ) = a ^ 0 = a
if we let a ^ t = b, then b ^ t = a
a ^ t = b is an equation we introduced; t can be solved for by XORing both sides with a, giving t = a ^ b

In words: if we know a, and we know the result b of XORing a with some number, then XORing the two recovers that unknown number. & and | do not have this property; for example, $2^{32} - 1$ OR-ed with any number gives $2^{32} - 1$, so knowing $2^{32} - 1$ and the result cannot recover that number.

5) For an integer x, the next even number is (x | 1)+1, and the previous even number is ( x - 1 )&~1.

6) Test bit i of x: x & ( 1UL << i ); this returns $x_i \cdot 2^{i}$, i.e. $2^i$ if bit $i$ is set and 0 otherwise.

7) Set bit i: x | ( 1UL << i ); clear bit i: x & ~( 1UL << i ); flip bit i: x ^ ( 1UL << i )

8)
x = a & b $\Longleftrightarrow$ (a * b) mod 2
x = a ^ b $\Longleftrightarrow$ (a + b) mod 2
x = a ^ b $\Longleftrightarrow$ if( b == 1 ){ x = ~a } else { x = a } // reflects on 1, passes through 0
x = a ^ b $\Longleftrightarrow$ if( a != b ){ x = 1 } else { x = 0 } // tests for inequality


9)
a, b // initially we know a and b
a, a^b // discard b, keep a^b
b, a^b // discard a, keep b
b, a // discard a^b, keep a

No information is ever lost: two numbers are maintained throughout, but by the end their positions have been swapped.

10) x &= -x; // isolate lowest bit

11)
N: 4 8 16 32 64 128 256 512 1024
m: 5 11 19 37 67 131 269 523 1061

The table above gives, for $N$-bit words, a modulus m with the property that $f(i) = 2^i \bmod m$ is injective, so one can build a lookup table and obtain the index of the lowest set bit quickly from the residue mod m.

12) Computing $\lfloor \log_2 x \rfloor$ of an integer $x$ with bit operations is natural:


13) To test whether a number is a power of two: subtract 1 to get m; if the number is $2^i$, then the low $i$ bits of m are all 1, so XORing m with the original number and shifting right one bit gives m back.

14) To get the nearest power of two greater than x: the desired number is $2^i$ where $i$ is the index of x's highest set bit plus 1; the method is to use x's highest bit to construct $2^i-1$ and then add 1.



2012/05/29

DNA

1) Sequence Alignment: given two character sequences, the allowed operation is inserting "-"; the goal is to align as many letters as possible, but each inserted "-" incurs a penalty.

The initial strings:

Inserting "-" aligns more of the characters:



This can be viewed as two players A and B playing cards: each turn, each player may play one card or pass, but both may not pass at once. If A passes, move one cell horizontally; if B passes, move one cell vertically; if both play, move to the adjacent cell diagonally down-right. The game ends when either player runs out of cards, which in the grid means reaching the rightmost column or the bottom row.

Modeling with GDPP:

a) The entries of $\Omega_{i,j}$ are defined as all paths from the start to the point where A has played $i$ cards and B has played $j$ cards.

b) The transition is defined as: entries of $\Omega_{i-1,j}$, $\Omega_{i,j-1}$, and $\Omega_{i-1,j-1}$ become entries of $\Omega_{i,j}$ by one more play.

c) The value of a transition: $-1$ if one player passes (coming from $\Omega_{i-1,j}$ or $\Omega_{i,j-1}$); $+0$ if both play but the cards differ; $+1$ if both play the same card (coming from $\Omega_{i-1,j-1}$).

d) The optimum function $\mathcal{F}(\Omega_{i,j})$ is defined as the maximum value over all paths; the stage is the two-dimensional index, and each stage outputs a single value.

e) The recursion is: $$\mathcal{F}(\Omega_{i,j})=\max[ \mathcal{F}(\Omega_{i-1,j}) - 1,\ \mathcal{F}(\Omega_{i,j-1}) - 1,\ \mathcal{F}(\Omega_{i-1,j-1})+ \delta( A_i = B_j ) ]$$ where $\delta(A_i = B_j)$ is 1 if the $i$-th character of A equals the $j$-th character of B and 0 otherwise.