
Commit 94924d2

Merge pull request #12 from RockySJ/master
def, mmd, wd
2 parents: ef74e32 + 80f3a28

3 files changed

Lines changed: 75 additions & 59 deletions


src/chaps/ch01_introduction.tex

Lines changed: 5 additions & 5 deletions
@@ -162,13 +162,13 @@ \subsection{Differences from and connections with existing concepts}
 \end{tabular}
 \end{table}
 
-\textbf{2. Transfer learning vs. lifelong learning}
+\textbf{2. Transfer learning vs. multi-task learning}
 
-Lifelong learning emphasizes continuously learning concepts and tasks one after another, with the model being optimized all along; transfer learning focuses instead on transferring models and on joint learning.
+Multi-task learning means that several related tasks are learned together cooperatively; transfer learning emphasizes the process of carrying knowledge over from one domain to another. Transfer is the underlying idea, and multi-task learning is one concrete form of it.
 
-\textbf{3. Transfer learning vs. multi-task learning}
+\textbf{3. Transfer learning vs. lifelong learning}
 
-Multi-task learning means that several related tasks are learned together cooperatively; transfer learning emphasizes the process of carrying knowledge over from one domain to another. Transfer is the underlying idea, and multi-task learning is one concrete form of it.
+Lifelong learning can be viewed as sequential multi-task learning: after a number of tasks have been learned, the model can go on learning new tasks without forgetting those learned before; transfer learning focuses instead on transferring models and on joint learning.
 
 \textbf{4. Transfer learning vs. domain adaptation:}
 
@@ -184,7 +184,7 @@ \subsection{Differences from and connections with existing concepts}
 
 \textbf{7. Transfer learning vs. covariate shift}
 
-Covariate shift is also one of the problems that transfer learning studies; it refers specifically to a change in the conditional probability distribution of the data.
+Covariate shift refers to a change in the marginal probability distribution of the data. Domain adaptation is precisely the line of research that addresses the covariate shift phenomenon.
 
 \subsection{Negative transfer}
 We all hope that transfer learning proceeds smoothly and that the results it produces meet our expectations. Things, however, do not always go that well. This brings us to a negative phenomenon in transfer learning, the so-called \textbf{negative transfer}.
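To make the covariate shift setting concrete, here is a small sketch of our own (not from the handbook; the labeling rule and the Gaussian parameters are invented for illustration): both domains share the same conditional distribution $Q(y|x)$, while only the marginal $P(x)$ changes between source and target.

```python
# Hypothetical covariate-shift illustration: identical P(y|x), different P(x).
import random

random.seed(0)

def label(x):
    """Shared labeling rule: the conditional P(y|x) is the same in both domains."""
    return 1 if x > 0.5 else 0

# Source marginal P_s(x) = N(0, 1); target marginal P_t(x) = N(2, 1).
source_x = [random.gauss(0.0, 1.0) for _ in range(1000)]
target_x = [random.gauss(2.0, 1.0) for _ in range(1000)]

# The labeling rule never changed, yet the class balance a learner observes
# differs sharply, because the inputs are drawn from different regions.
source_pos_rate = sum(label(x) for x in source_x) / len(source_x)
target_pos_rate = sum(label(x) for x in target_x) / len(target_x)
```

A classifier fit on `source_x` mostly sees inputs below the decision threshold, which is why correcting for the marginal shift (as domain adaptation does) matters even though $Q(y|x)$ is unchanged.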

src/chaps/ch04_basic.tex

Lines changed: 19 additions & 4 deletions
@@ -41,7 +41,7 @@ \subsubsection{Transfer learning}
 
 Building on the formalization above, we give the definition of \textbf{Domain Adaptation}, a popular research direction:
 
-\textbf{Domain Adaptation:} Given a labeled source domain $\mathcal{D}_s=\{\mathbf{x}_{i},y_{i}\}^n_{i=1}$ and an unlabeled target domain $\mathcal{D}_t=\{\mathbf{x}_{j}\}^{n+m}_{j=n+1}$, assume that their feature spaces are identical, i.e. $\mathcal{X}_s = \mathcal{X}_t$, and that their label spaces are identical as well, i.e. $\mathcal{Y}_s = \mathcal{Y}_t$. However, the marginal distributions of the two domains differ, i.e. $P_s(\mathbf{x}_s) \ne P_t(\mathbf{x}_t)$, and so do the conditional distributions, i.e. $Q_s(y_s|\mathbf{x}_s) \ne Q_t(y_t|\mathbf{x}_t)$. The goal of transfer learning is to use the labeled data $\mathcal{D}_s$ to learn a classifier $f:\mathbf{x}_t \mapsto \mathbf{y}_t$ that predicts the labels $\mathbf{y}_t \in \mathcal{Y}_t$ of the target domain $\mathcal{D}_t$.
+\textbf{Domain Adaptation:} Given a labeled source domain $\mathcal{D}_s=\{\mathbf{x}_{i},y_{i}\}^n_{i=1}$ and an unlabeled target domain $\mathcal{D}_t=\{\mathbf{x}_{j}\}^{n+m}_{j=n+1}$, assume that their feature spaces are identical, i.e. $\mathcal{X}_s = \mathcal{X}_t$, that their label spaces are identical as well, i.e. $\mathcal{Y}_s = \mathcal{Y}_t$, and that the conditional distributions are also identical, i.e. $Q_s(y_s|\mathbf{x}_s) = Q_t(y_t|\mathbf{x}_t)$. However, the marginal distributions of the two domains differ, i.e. $P_s(\mathbf{x}_s) \ne P_t(\mathbf{x}_t)$. The goal of transfer learning is to use the labeled data $\mathcal{D}_s$ to learn a classifier $f:\mathbf{x}_t \mapsto \mathbf{y}_t$ that predicts the labels $\mathbf{y}_t \in \mathcal{Y}_t$ of the target domain $\mathcal{D}_t$.
 
 In actual research and applications, readers can build on the statements above to flexibly derive formal definitions suited to their own tasks.
 
@@ -206,14 +206,14 @@ \subsubsection{KL divergence and JS distance}
 
 \subsubsection{Maximum mean discrepancy (MMD)}
 
-Maximum mean discrepancy is the most frequently used metric in transfer learning. It measures the distance between two distributions in a reproducing kernel Hilbert space and is a kernel-based learning method. The distance between two random variables is
+Maximum mean discrepancy is the most frequently used metric in transfer learning. It measures the distance between two distributions in a reproducing kernel Hilbert space and is a kernel-based learning method. The squared MMD distance between two random variables is
 
 \begin{equation}
 \label{eq-dist-mmd}
-MMD(X,Y)=\left \Vert \sum_{i=1}^{n_1}\phi(\mathbf{x}_i)- \sum_{j=1}^{n_2}\phi(\mathbf{y}_j) \right \Vert^2_\mathcal{H}
+MMD^2(X,Y)=\left \Vert \frac{1}{n_1}\sum_{i=1}^{n_1}\phi(\mathbf{x}_i)- \frac{1}{n_2}\sum_{j=1}^{n_2}\phi(\mathbf{y}_j) \right \Vert^2_\mathcal{H}
 \end{equation}
 
-where $\phi(\cdot)$ is a feature map that takes the original variables into the \textit{Reproducing Kernel Hilbert Space} (RKHS)~\cite{borgwardt2006integrating}. What is an RKHS? Its formal definition is too involved; roughly speaking, it is a space that is complete with respect to inner products of functions, a fancier version of Euclidean space.
+where $\phi(\cdot)$ is a feature map that takes the original variables into the \textit{Reproducing Kernel Hilbert Space} (RKHS)~\cite{borgwardt2006integrating}. What is an RKHS? Its formal definition is too involved; roughly speaking, a Hilbert space is a space that is complete with respect to inner products of functions, and an RKHS is a Hilbert space with the reproducing property $\langle K(x,\cdot),K(y,\cdot)\rangle_\mathcal{H}=K(x,y)$, a fancier version of Euclidean space. After expanding the square, the inner products in the RKHS become kernel evaluations, so in the end MMD can be computed directly through the kernel function.
 
 Intuition: it is simply the distance between the \textit{means} of the two data sets in the RKHS.
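The kernel-trick computation described above can be sketched in a few lines of Python (a minimal illustration with our own helper names, assuming an RBF kernel; this is not the handbook's code). Expanding the squared norm gives the mean of $K(x,x')$ plus the mean of $K(y,y')$ minus twice the mean of $K(x,y)$, so only kernel evaluations are needed:

```python
# Minimal (biased) estimate of squared MMD between two 1-D samples,
# computed purely through kernel evaluations (the kernel trick).
import math

def rbf(u, v, gamma=1.0):
    """RBF kernel K(u, v) = exp(-gamma * (u - v)^2) for scalar inputs."""
    return math.exp(-gamma * (u - v) ** 2)

def mmd2(xs, ys, gamma=1.0):
    """Biased estimate of MMD^2: mean K(x,x') + mean K(y,y') - 2 mean K(x,y)."""
    kxx = sum(rbf(a, b, gamma) for a in xs for b in xs) / (len(xs) ** 2)
    kyy = sum(rbf(a, b, gamma) for a in ys for b in ys) / (len(ys) ** 2)
    kxy = sum(rbf(a, b, gamma) for a in xs for b in ys) / (len(xs) * len(ys))
    return kxx + kyy - 2.0 * kxy
```

Identical samples give a squared distance of zero, and the value grows as the two samples drift apart, matching the "distance between RKHS means" intuition.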

@@ -255,6 +255,21 @@ \subsubsection{Hilbert-Schmidt Independence Criterion}
 \end{equation}
 where $X,Y$ are the kernel matrices of the two data sets.
 
+\subsubsection{Wasserstein Distance}
+
+The Wasserstein distance is a family of metrics for measuring the distance between two probability distributions. It is defined on a metric space $(M,\rho)$, where $\rho(x,y)$ is a distance function for two instances $x$ and $y$ in the set $M$, for example the Euclidean distance. The $p$-th Wasserstein distance between two probability distributions $\mathbb{P}$ and $\mathbb{Q}$ is defined as
+
+\begin{equation}
+W_p(\mathbb{P}, \mathbb{Q}) = \Big(\inf_{\mu \in \Gamma(\mathbb{P}, \mathbb{Q}) } \int \rho(x,y)^p d\mu(x,y) \Big)^{1/p},
+\end{equation}
+
+where $\Gamma(\mathbb{P}, \mathbb{Q})$ is the set of all joint distributions on $M\times M$ whose marginals are $\mathbb{P}$ and $\mathbb{Q}$. The celebrated Kantorovich-Rubinstein theorem shows that, when $M$ is separable, the first Wasserstein distance can equivalently be written as an integral probability metric:
+
+\begin{equation}
+W_1(\mathbb{P},\mathbb{Q})= \sup_{\left \| f \right \|_L \leq 1} \mathbb{E}_{x \sim \mathbb{P}}[f(x)] - \mathbb{E}_{x \sim \mathbb{Q}}[f(x)],
+\end{equation}
+where $\left \| f \right \|_L = \sup_{x \ne y} {|f(x) - f(y)|} / \rho(x,y)$, and $\left \| f \right \|_L \leq 1$ is called the $1$-Lipschitz condition.
+
 \subsection{Theoretical guarantees for transfer learning*}
 \textit{
 This part's title carries a * mark; it is somewhat difficult, and reading it is optional. It is most commonly useful as a reference when an algorithm you propose needs a theoretical proof.}
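As a small illustration of the definition above (our own sketch, not part of the handbook): on the real line with $\rho(x,y)=|x-y|$, the optimal coupling between two equal-size empirical samples simply matches sorted points, so $W_1$ reduces to the mean absolute difference of order statistics.

```python
# First Wasserstein distance between two equal-size empirical samples in 1-D.
# In one dimension the infimum over couplings is attained by pairing the
# sorted samples, so W_1 is the mean gap between matched order statistics.
def wasserstein1(xs, ys):
    """W_1 between two equal-size empirical distributions on the real line."""
    assert len(xs) == len(ys), "equal sizes keep the optimal coupling trivial"
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)
```

Shifting one sample by a constant $c$ moves the distance by exactly $|c|$, which reflects that $W_1$ respects the geometry of the underlying metric space rather than only comparing densities.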

src/main.toc

Lines changed: 51 additions & 50 deletions
@@ -31,53 +31,54 @@
 \contentsline {subsubsection}{\numberline {4.3.5}Principal Angle}{16}{subsubsection.4.3.5}
 \contentsline {subsubsection}{\numberline {4.3.6}A-distance}{16}{subsubsection.4.3.6}
 \contentsline {subsubsection}{\numberline {4.3.7}Hilbert-Schmidt Independence Criterion}{16}{subsubsection.4.3.7}
-\contentsline {subsection}{\numberline {4.4}Theoretical guarantees for transfer learning*}{16}{subsection.4.4}
-\contentsline {section}{\numberline {5}Basic methods of transfer learning}{18}{section.5}
-\contentsline {subsection}{\numberline {5.1}Instance-based transfer}{18}{subsection.5.1}
-\contentsline {subsection}{\numberline {5.2}Feature-based transfer}{19}{subsection.5.2}
-\contentsline {subsection}{\numberline {5.3}Model-based transfer}{19}{subsection.5.3}
-\contentsline {subsection}{\numberline {5.4}Relation-based transfer}{20}{subsection.5.4}
-\contentsline {section}{\numberline {6}Method class one: data distribution adaptation}{22}{section.6}
-\contentsline {subsection}{\numberline {6.1}Marginal distribution adaptation}{22}{subsection.6.1}
-\contentsline {subsubsection}{\numberline {6.1.1}Basic idea}{22}{subsubsection.6.1.1}
-\contentsline {subsubsection}{\numberline {6.1.2}Core methods}{22}{subsubsection.6.1.2}
-\contentsline {subsubsection}{\numberline {6.1.3}Extensions}{24}{subsubsection.6.1.3}
-\contentsline {subsection}{\numberline {6.2}Conditional distribution adaptation}{25}{subsection.6.2}
-\contentsline {subsection}{\numberline {6.3}Joint distribution adaptation}{26}{subsection.6.3}
-\contentsline {subsubsection}{\numberline {6.3.1}Basic idea}{26}{subsubsection.6.3.1}
-\contentsline {subsubsection}{\numberline {6.3.2}Core methods}{26}{subsubsection.6.3.2}
-\contentsline {subsubsection}{\numberline {6.3.3}Extensions}{28}{subsubsection.6.3.3}
-\contentsline {subsection}{\numberline {6.4}Summary}{29}{subsection.6.4}
-\contentsline {section}{\numberline {7}Method class two: feature selection}{30}{section.7}
-\contentsline {subsection}{\numberline {7.1}Core methods}{31}{subsection.7.1}
-\contentsline {subsection}{\numberline {7.2}Extensions}{31}{subsection.7.2}
-\contentsline {subsection}{\numberline {7.3}Summary}{31}{subsection.7.3}
-\contentsline {section}{\numberline {8}Method class three: subspace learning}{32}{section.8}
-\contentsline {subsection}{\numberline {8.1}Statistical feature alignment}{32}{subsection.8.1}
-\contentsline {subsection}{\numberline {8.2}Manifold learning}{34}{subsection.8.2}
-\contentsline {subsection}{\numberline {8.3}Extensions and summary}{36}{subsection.8.3}
-\contentsline {section}{\numberline {9}Deep transfer learning}{37}{section.9}
-\contentsline {subsection}{\numberline {9.1}Transferability of deep networks}{37}{subsection.9.1}
-\contentsline {subsection}{\numberline {9.2}The simplest deep transfer: finetune}{41}{subsection.9.2}
-\contentsline {subsection}{\numberline {9.3}Deep network adaptation}{42}{subsection.9.3}
-\contentsline {subsubsection}{\numberline {9.3.1}Basic idea}{42}{subsubsection.9.3.1}
-\contentsline {subsubsection}{\numberline {9.3.2}Core methods}{43}{subsubsection.9.3.2}
-\contentsline {subsubsection}{\numberline {9.3.3}Summary}{48}{subsubsection.9.3.3}
-\contentsline {subsection}{\numberline {9.4}Deep adversarial network transfer}{48}{subsection.9.4}
-\contentsline {subsubsection}{\numberline {9.4.1}Basic idea}{48}{subsubsection.9.4.1}
-\contentsline {subsubsection}{\numberline {9.4.2}Core methods}{48}{subsubsection.9.4.2}
-\contentsline {subsubsection}{\numberline {9.4.3}Summary}{51}{subsubsection.9.4.3}
-\contentsline {section}{\numberline {10}Hands-on practice}{52}{section.10}
-\contentsline {section}{\numberline {11}Frontiers of transfer learning}{58}{section.11}
-\contentsline {subsection}{\numberline {11.1}Transfer combining machine intelligence with human experience}{58}{subsection.11.1}
-\contentsline {subsection}{\numberline {11.2}Transitive transfer learning}{58}{subsection.11.2}
-\contentsline {subsection}{\numberline {11.3}Lifelong transfer learning}{59}{subsection.11.3}
-\contentsline {subsection}{\numberline {11.4}Online transfer learning}{60}{subsection.11.4}
-\contentsline {subsection}{\numberline {11.5}Transfer reinforcement learning}{61}{subsection.11.5}
-\contentsline {subsection}{\numberline {11.6}Interpretability of transfer learning}{61}{subsection.11.6}
-\contentsline {section}{\numberline {12}Concluding remarks}{62}{section.12}
-\contentsline {section}{\numberline {13}Appendix}{63}{section.13}
-\contentsline {subsection}{\numberline {13.1}Journals and conferences related to transfer learning}{63}{subsection.13.1}
-\contentsline {subsection}{\numberline {13.2}Transfer learning researchers}{63}{subsection.13.2}
-\contentsline {subsection}{\numberline {13.3}Collected transfer learning resources}{66}{subsection.13.3}
-\contentsline {subsection}{\numberline {13.4}Common transfer learning algorithms and data resources}{67}{subsection.13.4}
+\contentsline {subsubsection}{\numberline {4.3.8}Wasserstein Distance}{17}{subsubsection.4.3.8}
+\contentsline {subsection}{\numberline {4.4}Theoretical guarantees for transfer learning*}{17}{subsection.4.4}
+\contentsline {section}{\numberline {5}Basic methods of transfer learning}{19}{section.5}
+\contentsline {subsection}{\numberline {5.1}Instance-based transfer}{19}{subsection.5.1}
+\contentsline {subsection}{\numberline {5.2}Feature-based transfer}{20}{subsection.5.2}
+\contentsline {subsection}{\numberline {5.3}Model-based transfer}{20}{subsection.5.3}
+\contentsline {subsection}{\numberline {5.4}Relation-based transfer}{21}{subsection.5.4}
+\contentsline {section}{\numberline {6}Method class one: data distribution adaptation}{23}{section.6}
+\contentsline {subsection}{\numberline {6.1}Marginal distribution adaptation}{23}{subsection.6.1}
+\contentsline {subsubsection}{\numberline {6.1.1}Basic idea}{23}{subsubsection.6.1.1}
+\contentsline {subsubsection}{\numberline {6.1.2}Core methods}{23}{subsubsection.6.1.2}
+\contentsline {subsubsection}{\numberline {6.1.3}Extensions}{25}{subsubsection.6.1.3}
+\contentsline {subsection}{\numberline {6.2}Conditional distribution adaptation}{26}{subsection.6.2}
+\contentsline {subsection}{\numberline {6.3}Joint distribution adaptation}{27}{subsection.6.3}
+\contentsline {subsubsection}{\numberline {6.3.1}Basic idea}{27}{subsubsection.6.3.1}
+\contentsline {subsubsection}{\numberline {6.3.2}Core methods}{27}{subsubsection.6.3.2}
+\contentsline {subsubsection}{\numberline {6.3.3}Extensions}{29}{subsubsection.6.3.3}
+\contentsline {subsection}{\numberline {6.4}Summary}{30}{subsection.6.4}
+\contentsline {section}{\numberline {7}Method class two: feature selection}{31}{section.7}
+\contentsline {subsection}{\numberline {7.1}Core methods}{32}{subsection.7.1}
+\contentsline {subsection}{\numberline {7.2}Extensions}{32}{subsection.7.2}
+\contentsline {subsection}{\numberline {7.3}Summary}{32}{subsection.7.3}
+\contentsline {section}{\numberline {8}Method class three: subspace learning}{33}{section.8}
+\contentsline {subsection}{\numberline {8.1}Statistical feature alignment}{33}{subsection.8.1}
+\contentsline {subsection}{\numberline {8.2}Manifold learning}{35}{subsection.8.2}
+\contentsline {subsection}{\numberline {8.3}Extensions and summary}{37}{subsection.8.3}
+\contentsline {section}{\numberline {9}Deep transfer learning}{38}{section.9}
+\contentsline {subsection}{\numberline {9.1}Transferability of deep networks}{38}{subsection.9.1}
+\contentsline {subsection}{\numberline {9.2}The simplest deep transfer: finetune}{42}{subsection.9.2}
+\contentsline {subsection}{\numberline {9.3}Deep network adaptation}{43}{subsection.9.3}
+\contentsline {subsubsection}{\numberline {9.3.1}Basic idea}{43}{subsubsection.9.3.1}
+\contentsline {subsubsection}{\numberline {9.3.2}Core methods}{44}{subsubsection.9.3.2}
+\contentsline {subsubsection}{\numberline {9.3.3}Summary}{49}{subsubsection.9.3.3}
+\contentsline {subsection}{\numberline {9.4}Deep adversarial network transfer}{49}{subsection.9.4}
+\contentsline {subsubsection}{\numberline {9.4.1}Basic idea}{49}{subsubsection.9.4.1}
+\contentsline {subsubsection}{\numberline {9.4.2}Core methods}{49}{subsubsection.9.4.2}
+\contentsline {subsubsection}{\numberline {9.4.3}Summary}{52}{subsubsection.9.4.3}
+\contentsline {section}{\numberline {10}Hands-on practice}{53}{section.10}
+\contentsline {section}{\numberline {11}Frontiers of transfer learning}{59}{section.11}
+\contentsline {subsection}{\numberline {11.1}Transfer combining machine intelligence with human experience}{59}{subsection.11.1}
+\contentsline {subsection}{\numberline {11.2}Transitive transfer learning}{59}{subsection.11.2}
+\contentsline {subsection}{\numberline {11.3}Lifelong transfer learning}{60}{subsection.11.3}
+\contentsline {subsection}{\numberline {11.4}Online transfer learning}{61}{subsection.11.4}
+\contentsline {subsection}{\numberline {11.5}Transfer reinforcement learning}{62}{subsection.11.5}
+\contentsline {subsection}{\numberline {11.6}Interpretability of transfer learning}{62}{subsection.11.6}
+\contentsline {section}{\numberline {12}Concluding remarks}{63}{section.12}
+\contentsline {section}{\numberline {13}Appendix}{64}{section.13}
+\contentsline {subsection}{\numberline {13.1}Journals and conferences related to transfer learning}{64}{subsection.13.1}
+\contentsline {subsection}{\numberline {13.2}Transfer learning researchers}{64}{subsection.13.2}
+\contentsline {subsection}{\numberline {13.3}Collected transfer learning resources}{67}{subsection.13.3}
+\contentsline {subsection}{\numberline {13.4}Common transfer learning algorithms and data resources}{68}{subsection.13.4}
