<!DOCTYPE html>
<html class="theme-next gemini use-motion" lang="">
<head>
<meta charset="UTF-8"/>
<meta http-equiv="X-UA-Compatible" content="IE=edge" />
<meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=1"/>
<meta name="theme-color" content="#222">
<script src="//cdn.bootcss.com/pace/1.0.2/pace.min.js"></script>
<link href="//cdn.bootcss.com/pace/1.0.2/themes/pink/pace-theme-flash.css" rel="stylesheet">
<meta http-equiv="Cache-Control" content="no-transform" />
<meta http-equiv="Cache-Control" content="no-siteapp" />
<link href="/lib/fancybox/source/jquery.fancybox.css?v=2.1.5" rel="stylesheet" type="text/css" />
<link href="/lib/font-awesome/css/font-awesome.min.css?v=4.6.2" rel="stylesheet" type="text/css" />
<link href="/css/main.css?v=5.1.4" rel="stylesheet" type="text/css" />
<link rel="apple-touch-icon" sizes="180x180" href="/images/apple-touch-icon-next.png?v=5.1.4">
<link rel="icon" type="image/png" sizes="32x32" href="/images/favicon2.ico?v=5.1.4">
<link rel="icon" type="image/png" sizes="16x16" href="/images/favicon-16x16-next.png?v=5.1.4">
<link rel="mask-icon" href="/images/logo.svg?v=5.1.4" color="#222">
<meta name="keywords" content="Hexo, NexT" />
<meta property="og:type" content="website">
<meta property="og:title" content="格物 致知">
<meta property="og:url" content="http://yoursite.com/index.html">
<meta property="og:site_name" content="格物 致知">
<meta property="og:locale" content="default">
<meta name="twitter:card" content="summary">
<meta name="twitter:title" content="格物 致知">
<script type="text/javascript" id="hexo.configurations">
var NexT = window.NexT || {};
var CONFIG = {
root: '/',
scheme: 'Gemini',
version: '5.1.4',
sidebar: {"position":"left","display":"post","offset":12,"b2t":false,"scrollpercent":false,"onmobile":false},
fancybox: true,
tabs: true,
motion: {"enable":true,"async":false,"transition":{"post_block":"fadeIn","post_header":"slideDownIn","post_body":"slideDownIn","coll_header":"slideLeftIn","sidebar":"slideUpIn"}},
duoshuo: {
userId: '0',
author: 'Author'
},
algolia: {
applicationID: '',
apiKey: '',
indexName: '',
hits: {"per_page":10},
labels: {"input_placeholder":"Search for Posts","hits_empty":"We didn't find any results for the search: ${query}","hits_stats":"${hits} results found in ${time} ms"}
}
};
</script>
<link rel="canonical" href="http://yoursite.com/"/>
<title>格物 致知</title>
</head>
<body itemscope itemtype="http://schema.org/WebPage" lang="default">
<div class="container sidebar-position-left
page-home">
<div class="headband"></div>
<header id="header" class="header" itemscope itemtype="http://schema.org/WPHeader">
<div class="header-inner"><div class="site-brand-wrapper">
<div class="site-meta ">
<div class="custom-logo-site-title">
<a href="/" class="brand" rel="start">
<span class="logo-line-before"><i></i></span>
<span class="site-title">格物 致知</span>
<span class="logo-line-after"><i></i></span>
</a>
</div>
<p class="site-subtitle">子衿</p>
</div>
<div class="site-nav-toggle">
<button>
<span class="btn-bar"></span>
<span class="btn-bar"></span>
<span class="btn-bar"></span>
</button>
</div>
</div>
<nav class="site-nav">
<ul id="menu" class="menu">
<li class="menu-item menu-item-主页">
<a href="/" rel="section">
<i class="menu-item-icon fa fa-fw fa-home"></i> <br />
Home
</a>
</li>
<li class="menu-item menu-item-关于">
<a href="/about/" rel="section">
<i class="menu-item-icon fa fa-fw fa-user"></i> <br />
About
</a>
</li>
<li class="menu-item menu-item-标签">
<a href="/tags/" rel="section">
<i class="menu-item-icon fa fa-fw fa-tags"></i> <br />
Tags
</a>
</li>
<li class="menu-item menu-item-分类">
<a href="/categories/" rel="section">
<i class="menu-item-icon fa fa-fw fa-th"></i> <br />
Categories
</a>
</li>
<li class="menu-item menu-item-归档">
<a href="/archives/" rel="section">
<i class="menu-item-icon fa fa-fw fa-archive"></i> <br />
Archives
</a>
</li>
</ul>
</nav>
</div>
</header>
<main id="main" class="main">
<div class="main-inner">
<div class="content-wrap">
<div id="content" class="content">
<section id="posts" class="posts-expand">
<article class="post post-type-normal" itemscope itemtype="http://schema.org/Article">
<div class="post-block">
<link itemprop="mainEntityOfPage" href="http://yoursite.com/2018/06/09/pytorch0-4的概述/">
<span hidden itemprop="author" itemscope itemtype="http://schema.org/Person">
<meta itemprop="name" content="唐 赛">
<meta itemprop="description" content="">
<meta itemprop="image" content="/images/avatar.png">
</span>
<span hidden itemprop="publisher" itemscope itemtype="http://schema.org/Organization">
<meta itemprop="name" content="格物 致知">
</span>
<header class="post-header">
<h1 class="post-title" itemprop="name headline">
<a class="post-title-link" href="/2018/06/09/pytorch0-4的概述/" itemprop="url">An overview of PyTorch 0.4</a></h1>
<div class="post-meta">
<span class="post-time">
<span class="post-meta-item-icon">
<i class="fa fa-calendar-o"></i>
</span>
<span class="post-meta-item-text">Posted on</span>
<time title="Post created" itemprop="dateCreated datePublished" datetime="2018-06-09T16:13:20+08:00">
2018-06-09
</time>
</span>
<span class="post-category" >
<span class="post-meta-divider">|</span>
<span class="post-meta-item-icon">
<i class="fa fa-folder-o"></i>
</span>
<span class="post-meta-item-text">In</span>
<span itemprop="about" itemscope itemtype="http://schema.org/Thing">
<a href="/categories/pytorch/" itemprop="url" rel="index">
<span itemprop="name">pytorch</span>
</a>
</span>
</span>
<div class="post-wordcount">
<span class="post-meta-item-icon">
<i class="fa fa-file-word-o"></i>
</span>
<span class="post-meta-item-text">Words count in article:</span>
<span title="Words count in article">
1,877 words
</span>
<span class="post-meta-divider">|</span>
<span class="post-meta-item-icon">
<i class="fa fa-clock-o"></i>
</span>
<span class="post-meta-item-text">Reading time ≈</span>
<span title="Reading time">
9 minutes
</span>
</div>
</div>
</header>
<div class="post-body" itemprop="articleBody">
<p>PyTorch 0.4 adds support for development on Windows; pip commands for installing pytorch and torchvision are available right on the <a href="https://pytorch.org/" target="_blank" rel="noopener">homepage</a>.<br>Put plainly, the text below is a brief translation of the official <a href="https://pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html" target="_blank" rel="noopener">Deep Learning with PyTorch: A 60 Minute Blitz tutorial</a>.</p>
<h2 id="pytorch是啥"><a href="#pytorch是啥" class="headerlink" title="pytorch是啥"></a>What is PyTorch?</h2><p>A scientific computing package for Python that brings NumPy-style arrays to the GPU, together with a deep learning platform that provides maximum flexibility and speed.</p>
<h3 id="入门"><a href="#入门" class="headerlink" title="入门"></a>Getting started</h3><h4 id="Tensors-张量"><a href="#Tensors-张量" class="headerlink" title="Tensors(张量)"></a>Tensors</h4><p>Tensors are similar to NumPy's ndarrays, with the addition that they can also be used on a GPU to accelerate computing.</p>
<p>An uninitialized 5×3 matrix: <code>x = torch.empty(5, 3)</code><br>A randomly initialized matrix: <code>x = torch.rand(5, 3)</code><br>An all-zero matrix with an explicit dtype: <code>x = torch.zeros(5, 3, dtype=torch.long)</code><br>A tensor built directly from data: <code>x = torch.tensor([5.5, 3])</code><br>A tensor built from an existing tensor, inheriting its properties unless they are overridden:<br><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">x = x.new_ones(<span class="number">5</span>, <span class="number">3</span>, dtype=torch.double) </span><br><span class="line">x = torch.randn_like(x, dtype=torch.float)</span><br></pre></td></tr></table></figure></p>
<p>Get its size: <code>print(x.size())</code></p>
<h4 id="Operations"><a href="#Operations" class="headerlink" title="Operations"></a>Operations</h4><p>There are multiple syntaxes for operations; addition serves as the example here.</p>
<p>1.</p>
<figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">y = torch.rand(<span class="number">5</span>, <span class="number">3</span>)</span><br><span class="line">print(x + y)</span><br></pre></td></tr></table></figure>
<p>2.</p>
<figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">print(torch.add(x, y))</span><br></pre></td></tr></table></figure>
<p>3.</p>
<figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">result = torch.empty(<span class="number">5</span>, <span class="number">3</span>)</span><br><span class="line">torch.add(x, y, out=result)</span><br><span class="line">print(result)</span><br></pre></td></tr></table></figure>
<p>4.</p>
<figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># adds x to y</span></span><br><span class="line">y.add_(x)</span><br><span class="line">print(y)</span><br></pre></td></tr></table></figure>
<p>Any operation that mutates a tensor in place is post-fixed with an underscore, e.g. <code>x.copy_(y)</code>, <code>x.t_()</code>.</p>
<p>Indexing works as in NumPy: <code>print(x[:, 1])</code><br>Resizing/reshaping a tensor: <code>x.view(-1, 8)</code></p>
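<p>A small runnable sketch of indexing and <code>view</code> (the shapes below are chosen purely for illustration):</p>

```python
import torch

x = torch.rand(4, 4)   # 16 elements in total
col = x[:, 1]          # NumPy-style indexing: the second column, shape (4,)
y = x.view(16)         # flatten into a 1-D tensor
z = x.view(-1, 8)      # -1 lets torch infer the remaining dimension: (2, 8)
print(col.size(), y.size(), z.size())
```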
<h3 id="和Numpy的联系"><a href="#和Numpy的联系" class="headerlink" title="和Numpy的联系"></a>Ties to NumPy</h3><p>A torch Tensor and a NumPy array can be converted into each other. They share the underlying memory location, so changing one will change the other.</p>
<h4 id="tensor-to-array"><a href="#tensor-to-array" class="headerlink" title="tensor to array"></a>tensor to array</h4><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">a = torch.ones(<span class="number">5</span>)</span><br><span class="line">b = a.numpy()</span><br><span class="line">a.add_(<span class="number">1</span>)</span><br><span class="line">print(a,b)</span><br></pre></td></tr></table></figure>
<h4 id="array-to-tensor"><a href="#array-to-tensor" class="headerlink" title="array to tensor"></a>array to tensor</h4><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> numpy <span class="keyword">as</span> np</span><br><span class="line">a = np.ones(<span class="number">5</span>)</span><br><span class="line">b = torch.from_numpy(a)</span><br><span class="line">np.add(a, <span class="number">1</span>, out=a)</span><br><span class="line">print(a)</span><br><span class="line">print(b)</span><br></pre></td></tr></table></figure>
<h3 id="CUDA-Tensors"><a href="#CUDA-Tensors" class="headerlink" title="CUDA Tensors"></a>CUDA Tensors</h3><p>Tensors can be moved onto any device using the <code>.to</code> method.</p>
<figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># let us run this cell only if CUDA is available</span></span><br><span class="line"><span class="comment"># We will use ``torch.device`` objects to move tensors in and out of GPU</span></span><br><span class="line"><span class="keyword">if</span> torch.cuda.is_available():</span><br><span class="line"> device = torch.device(<span class="string">"cuda"</span>) <span class="comment"># a CUDA device object</span></span><br><span class="line"> y = torch.ones_like(x, device=device) <span class="comment"># directly create a tensor on GPU</span></span><br><span class="line"> x = x.to(device) <span class="comment"># or just use strings ``.to("cuda")``</span></span><br><span class="line"> z = x + y</span><br><span class="line"> print(z)</span><br><span class="line"> print(z.to(<span class="string">"cpu"</span>, torch.double)) <span class="comment"># ``.to`` can also change dtype together!</span></span><br></pre></td></tr></table></figure>
<h2 id="Autograd-自动求导"><a href="#Autograd-自动求导" class="headerlink" title="Autograd(自动求导)"></a>Autograd (automatic differentiation)</h2><p>Central to neural networks in PyTorch is the autograd package, which provides automatic differentiation for all operations on Tensors.</p>
<h3 id="Tensor"><a href="#Tensor" class="headerlink" title="Tensor"></a>Tensor</h3><p><code>torch.Tensor</code> is the central class of the package. If you set its attribute <code>.requires_grad = True</code>, it starts to track all operations on it. Call <code>.backward()</code> to have all gradients computed automatically; the gradient of each tensor is accumulated into its <code>.grad</code> attribute.<br><code>.detach()</code> stops a tensor from being tracked, detaching it from gradient computation. When evaluating a model it is common to wrap the code in a <code>with torch.no_grad():</code> block instead, because trained parameters usually have <code>.requires_grad = True</code> and this block disables gradient tracking for every tensor inside it.<br>The <code>Function</code> class is the other half of autograd: <code>Tensor</code> and <code>Function</code> are interconnected and build up an acyclic graph that encodes the complete history of computation. Each tensor has a <code>.grad_fn</code> attribute that references the <code>Function</code> that created it.<br>As noted above, gradients are computed with <code>.backward()</code>. If the tensor is a scalar (holds a single element), <code>.backward()</code> needs no arguments; otherwise you must pass a <code>gradient</code> argument whose shape matches the tensor.</p>
<figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> torch</span><br><span class="line">x = torch.ones(<span class="number">2</span>, <span class="number">2</span>, requires_grad=<span class="keyword">True</span>)</span><br><span class="line">print(x)</span><br><span class="line">y = x + <span class="number">2</span></span><br><span class="line">print(y)</span><br><span class="line">print(y.grad_fn)</span><br><span class="line">z = y * y * <span class="number">3</span></span><br><span class="line">out = z.mean()</span><br><span class="line">print(z, out)</span><br><span class="line"></span><br><span class="line">a = torch.randn(<span class="number">2</span>, <span class="number">2</span>)</span><br><span class="line">a = ((a * <span class="number">3</span>) / (a - <span class="number">1</span>))</span><br><span class="line">print(a.requires_grad)</span><br><span class="line">a.requires_grad_(<span class="keyword">True</span>) <span class="comment"># change the requires_grad flag of a in place</span></span><br><span class="line">print(a.requires_grad)</span><br><span class="line">b = (a * a).sum()</span><br><span class="line">print(b.grad_fn)</span><br></pre></td></tr></table></figure>
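<p>The <code>.detach()</code> and <code>torch.no_grad()</code> behaviour described above can be checked with a few lines (the tensors here are arbitrary examples):</p>

```python
import torch

x = torch.ones(2, 2, requires_grad=True)
y = (x * 3).sum()
print(y.requires_grad)             # True: y is part of the tracked graph

d = y.detach()                     # same value, but cut out of the graph
print(d.requires_grad, d.grad_fn)  # False None

with torch.no_grad():              # tracking is disabled inside this block
    print((x * 2).requires_grad)   # False
```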
<h3 id="Gradients"><a href="#Gradients" class="headerlink" title="Gradients"></a>Gradients</h3><p>Now backpropagate: because <code>out</code> is a scalar, <code>out.backward()</code> is equivalent to <code>out.backward(torch.tensor(1))</code>.</p>
<figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br></pre></td><td class="code"><pre><span class="line">out.backward()</span><br><span class="line">print(x.grad)</span><br><span class="line"></span><br><span class="line">x = torch.randn(<span class="number">3</span>, requires_grad=<span class="keyword">True</span>)</span><br><span class="line"></span><br><span class="line">y = x * <span class="number">2</span></span><br><span class="line"><span class="keyword">while</span> y.data.norm() < <span class="number">1000</span>:</span><br><span class="line"> y = y * <span class="number">2</span></span><br><span class="line"></span><br><span class="line">print(y)</span><br><span class="line"></span><br><span class="line">gradients = torch.tensor([<span class="number">0.1</span>, <span class="number">1.0</span>, <span class="number">0.0001</span>], dtype=torch.float)</span><br><span class="line">y.backward(gradients)</span><br><span class="line"></span><br><span class="line">print(x.grad)</span><br><span class="line"></span><br><span class="line">print(x.requires_grad)</span><br><span class="line">print((x ** <span class="number">2</span>).requires_grad)</span><br><span class="line"></span><br><span class="line"><span class="keyword">with</span> torch.no_grad():</span><br><span class="line"> print((x ** <span class="number">2</span>).requires_grad)</span><br></pre></td></tr></table></figure>
<h2 id="神经网络"><a href="#神经网络" class="headerlink" title="神经网络"></a>Neural networks</h2><p>Neural networks can be constructed with the <code>torch.nn</code> package. <code>nn</code> depends on <code>autograd</code> to define models and differentiate them; an <code>nn.Module</code> contains the network layers, and its method <code>forward(input)</code> returns the output.</p>
<p>As an example, the network below is a convolutional architecture for classifying digit images.<br><img src="https://pytorch.org/tutorials/_images/mnist.png" alt=""><br>It is a simple feed-forward network: the input is passed through several layers one after another to finally produce the output.<br>A typical training procedure for a neural network is as follows:</p>
<ul>
<li>Define the neural network with some learnable parameters (weights)</li>
<li>Iterate over a dataset of inputs</li>
<li>Process each input through the network</li>
<li>Compute the loss</li>
<li>Propagate the error backward through the network</li>
<li>Update the network weights: <code>weight = weight - learning_rate * gradient</code></li>
</ul>
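<p>The steps above can be condensed into a toy training loop. This is only a sketch: the one-weight "network", the synthetic data, and the squared-error loss are stand-ins for the real pieces developed in the sections that follow.</p>

```python
import torch

torch.manual_seed(0)                         # reproducible toy example
weight = torch.randn(1, requires_grad=True)  # a one-parameter "network"
learning_rate = 0.1

for step in range(300):
    x = torch.rand(1)                 # iterate over input data
    output = weight * x               # run the input through the network
    loss = (output - 2 * x) ** 2      # loss against the target 2*x
    loss.backward()                   # propagate the error backward
    with torch.no_grad():
        weight -= learning_rate * weight.grad  # weight = weight - lr * gradient
        weight.grad.zero_()           # gradients accumulate, so reset them

print(weight.item())                  # approaches 2.0
```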
<h3 id="定义网络"><a href="#定义网络" class="headerlink" title="定义网络"></a>Defining the network</h3><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> torch</span><br><span class="line"><span class="keyword">import</span> torch.nn <span class="keyword">as</span> nn</span><br><span class="line"><span class="keyword">import</span> torch.nn.functional <span class="keyword">as</span> F</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"><span class="class"><span class="keyword">class</span> <span class="title">Net</span><span class="params">(nn.Module)</span>:</span></span><br><span class="line"></span><br><span class="line"> <span 
class="function"><span class="keyword">def</span> <span class="title">__init__</span><span class="params">(self)</span>:</span></span><br><span class="line"> super(Net, self).__init__()</span><br><span class="line"> <span class="comment"># 1 input image channel, 6 output channels, 5x5 square convolution</span></span><br><span class="line"> <span class="comment"># kernel</span></span><br><span class="line"> self.conv1 = nn.Conv2d(<span class="number">1</span>, <span class="number">6</span>, <span class="number">5</span>)</span><br><span class="line"> self.conv2 = nn.Conv2d(<span class="number">6</span>, <span class="number">16</span>, <span class="number">5</span>)</span><br><span class="line"> <span class="comment"># an affine operation: y = Wx + b</span></span><br><span class="line"> self.fc1 = nn.Linear(<span class="number">16</span> * <span class="number">5</span> * <span class="number">5</span>, <span class="number">120</span>)</span><br><span class="line"> self.fc2 = nn.Linear(<span class="number">120</span>, <span class="number">84</span>)</span><br><span class="line"> self.fc3 = nn.Linear(<span class="number">84</span>, <span class="number">10</span>)</span><br><span class="line"></span><br><span class="line"> <span class="function"><span class="keyword">def</span> <span class="title">forward</span><span class="params">(self, x)</span>:</span></span><br><span class="line"> <span class="comment"># Max pooling over a (2, 2) window</span></span><br><span class="line"> x = F.max_pool2d(F.relu(self.conv1(x)), (<span class="number">2</span>, <span class="number">2</span>))</span><br><span class="line"> <span class="comment"># If the size is a square you can only specify a single number</span></span><br><span class="line"> x = F.max_pool2d(F.relu(self.conv2(x)), <span class="number">2</span>)</span><br><span class="line"> x = x.view(<span class="number">-1</span>, self.num_flat_features(x))</span><br><span class="line"> x = F.relu(self.fc1(x))</span><br><span 
class="line"> x = F.relu(self.fc2(x))</span><br><span class="line"> x = self.fc3(x)</span><br><span class="line"> <span class="keyword">return</span> x</span><br><span class="line"></span><br><span class="line"> <span class="function"><span class="keyword">def</span> <span class="title">num_flat_features</span><span class="params">(self, x)</span>:</span></span><br><span class="line"> size = x.size()[<span class="number">1</span>:] <span class="comment"># all dimensions except the batch dimension</span></span><br><span class="line"> num_features = <span class="number">1</span></span><br><span class="line"> <span class="keyword">for</span> s <span class="keyword">in</span> size:</span><br><span class="line"> num_features *= s</span><br><span class="line"> <span class="keyword">return</span> num_features</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">net = Net()</span><br><span class="line">print(net)</span><br></pre></td></tr></table></figure>
<p>out:<br><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">Net(</span><br><span class="line"> (conv1): Conv2d(<span class="number">1</span>, <span class="number">6</span>, kernel_size=(<span class="number">5</span>, <span class="number">5</span>), stride=(<span class="number">1</span>, <span class="number">1</span>))</span><br><span class="line"> (conv2): Conv2d(<span class="number">6</span>, <span class="number">16</span>, kernel_size=(<span class="number">5</span>, <span class="number">5</span>), stride=(<span class="number">1</span>, <span class="number">1</span>))</span><br><span class="line"> (fc1): Linear(in_features=<span class="number">400</span>, out_features=<span class="number">120</span>, bias=<span class="keyword">True</span>)</span><br><span class="line"> (fc2): Linear(in_features=<span class="number">120</span>, out_features=<span class="number">84</span>, bias=<span class="keyword">True</span>)</span><br><span class="line"> (fc3): Linear(in_features=<span class="number">84</span>, out_features=<span class="number">10</span>, bias=<span class="keyword">True</span>)</span><br><span class="line">)</span><br></pre></td></tr></table></figure></p>
<p>You only have to define the <code>forward()</code> function; <code>backward()</code> is automatically defined for you when using <code>autograd</code>. Any tensor operation can be used inside <code>forward()</code>.<br><code>net.parameters()</code> returns the learnable parameters of the model.</p>
<figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">params = list(net.parameters())</span><br><span class="line">print(len(params))</span><br><span class="line">print(params[<span class="number">0</span>].size()) <span class="comment"># conv1's .weight</span></span><br></pre></td></tr></table></figure>
<p>Zero the gradient buffers of all parameters, then backprop with random gradients:</p>
<figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">net.zero_grad()</span><br><span class="line">out.backward(torch.randn(<span class="number">1</span>, <span class="number">10</span>))</span><br></pre></td></tr></table></figure>
<h3 id="代价函数"><a href="#代价函数" class="headerlink" title="代价函数"></a>Loss function</h3><p>A loss function takes the (output, target) pair as input and computes a value estimating how far the output is from the target.<br>The nn package ships several different loss functions; one of the simplest is <code>nn.MSELoss</code>, which computes the mean squared error.<br>For example:<br><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">output = net(input)</span><br><span class="line">target = torch.arange(<span class="number">1</span>, <span class="number">11</span>) <span class="comment"># a dummy target, for example</span></span><br><span class="line">target = target.view(<span class="number">1</span>, <span class="number">-1</span>) <span class="comment"># make it the same shape as output</span></span><br><span class="line">criterion = nn.MSELoss()</span><br><span class="line"></span><br><span class="line">loss = criterion(output, target)</span><br><span class="line">print(loss)</span><br></pre></td></tr></table></figure></p>
<p>If you follow <code>loss</code> in the backward direction, its <code>grad_fn</code> attribute reveals the whole graph of computations:<br><figure class="highlight plain"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">input -> conv2d -> relu -> maxpool2d -> conv2d -> relu -> maxpool2d</span><br><span class="line"> -> view -> linear -> relu -> linear -> relu -> linear</span><br><span class="line"> -> MSELoss</span><br><span class="line"> -> loss</span><br></pre></td></tr></table></figure></p>
<p>When <code>loss.backward()</code> is called, every tensor in the graph with <code>requires_grad=True</code> accumulates its gradient into <code>.grad</code>:<br><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">print(loss.grad_fn) <span class="comment"># MSELoss</span></span><br><span class="line">print(loss.grad_fn.next_functions[<span class="number">0</span>][<span class="number">0</span>]) <span class="comment"># Linear</span></span><br><span class="line">print(loss.grad_fn.next_functions[<span class="number">0</span>][<span class="number">0</span>].next_functions[<span class="number">0</span>][<span class="number">0</span>]) <span class="comment"># ReLU</span></span><br></pre></td></tr></table></figure></p>
<h3 id="向后传播"><a href="#向后传播" class="headerlink" title="向后传播"></a>Backpropagation</h3><p>To backpropagate the error, call <code>loss.backward()</code>:<br><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">net.zero_grad() <span class="comment"># zeroes the gradient buffers of all parameters</span></span><br><span class="line"></span><br><span class="line">print(<span class="string">'conv1.bias.grad before backward'</span>)</span><br><span class="line">print(net.conv1.bias.grad)</span><br><span class="line"></span><br><span class="line">loss.backward()</span><br><span class="line"></span><br><span class="line">print(<span class="string">'conv1.bias.grad after backward'</span>)</span><br><span class="line">print(net.conv1.bias.grad)</span><br></pre></td></tr></table></figure></p>
<h3 id="更新权重"><a href="#更新权重" class="headerlink" title="更新权重"></a>Updating the Weights</h3><p>After each backward pass the weights must be updated. The simplest update rule (plain SGD) looks like this:<br><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">learning_rate = <span class="number">0.01</span></span><br><span class="line"><span class="keyword">for</span> f <span class="keyword">in</span> net.parameters():</span><br><span class="line"> f.data.sub_(f.grad.data * learning_rate)</span><br></pre></td></tr></table></figure></p>
<p><code>torch.optim</code> implements this update step and provides a range of update rules such as SGD, Nesterov-SGD, Adam, and RMSProp:</p>
<figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> torch.optim <span class="keyword">as</span> optim</span><br><span class="line"></span><br><span class="line"><span class="comment"># create your optimizer</span></span><br><span class="line">optimizer = optim.SGD(net.parameters(), lr=<span class="number">0.01</span>)</span><br><span class="line"></span><br><span class="line"><span class="comment"># in your training loop:</span></span><br><span class="line">optimizer.zero_grad() <span class="comment"># zero the gradient buffers</span></span><br><span class="line">output = net(input)</span><br><span class="line">loss = criterion(output, target)</span><br><span class="line">loss.backward()</span><br><span class="line">optimizer.step() <span class="comment"># Does the update</span></span><br></pre></td></tr></table></figure>
<p>Note: because gradients accumulate across iterations, they must be manually zeroed with <code>optimizer.zero_grad()</code> before each backward pass.</p>
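<p>The note above can be made concrete with a small pure-Python sketch (no PyTorch involved; the names <code>params</code> and <code>grads</code> are illustrative): the gradient buffer keeps growing across "backward passes" until it is explicitly cleared, which is what <code>optimizer.zero_grad()</code> does.</p>

```python
# A minimal pure-Python sketch (no PyTorch; `params`/`grads` are illustrative
# names) of why zeroing matters: gradient buffers accumulate across backward
# passes until explicitly cleared.

def sgd_step(params, grads, lr=0.01):
    """In-place SGD update, p = p - lr * grad (mirrors f.data.sub_(...))."""
    for i in range(len(params)):
        params[i] -= lr * grads[i]

def accumulate(grads, new_grads):
    """Each 'backward pass' adds into the existing gradient buffer."""
    for i in range(len(grads)):
        grads[i] += new_grads[i]

params = [1.0, 2.0]
grads = [0.0, 0.0]

accumulate(grads, [0.5, -0.5])   # first backward pass
accumulate(grads, [0.5, -0.5])   # second pass without zeroing: grads double
sgd_step(params, grads)          # update uses the accumulated [1.0, -1.0]

grads = [0.0 for _ in grads]     # the analogue of optimizer.zero_grad()
```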
</div>
<footer class="post-footer">
<div class="post-eof"></div>
</footer>
</div>
</article>
<article class="post post-type-normal" itemscope itemtype="http://schema.org/Article">
<div class="post-block">
<link itemprop="mainEntityOfPage" href="http://yoursite.com/2018/05/30/HMM模型和他的python应用/">
<span hidden itemprop="author" itemscope itemtype="http://schema.org/Person">
<meta itemprop="name" content="唐 赛">
<meta itemprop="description" content="">
<meta itemprop="image" content="/images/avatar.png">
</span>
<span hidden itemprop="publisher" itemscope itemtype="http://schema.org/Organization">
<meta itemprop="name" content="格物 致知">
</span>
<header class="post-header">
<h1 class="post-title" itemprop="name headline">
<a class="post-title-link" href="/2018/05/30/HMM模型和他的python应用/" itemprop="url">HMM Models and Their Python Applications</a></h1>
<div class="post-meta">
<span class="post-time">
<span class="post-meta-item-icon">
<i class="fa fa-calendar-o"></i>
</span>
<span class="post-meta-item-text">Posted on</span>
<time title="Post created" itemprop="dateCreated datePublished" datetime="2018-05-30T15:34:19+08:00">
2018-05-30
</time>
</span>
<span class="post-category" >
<span class="post-meta-divider">|</span>
<span class="post-meta-item-icon">
<i class="fa fa-folder-o"></i>
</span>
<span class="post-meta-item-text">In</span>
<span itemprop="about" itemscope itemtype="http://schema.org/Thing">
<a href="/categories/NLP/" itemprop="url" rel="index">
<span itemprop="name">NLP</span>
</a>
</span>
</span>
<div class="post-wordcount">
<span class="post-meta-item-icon">
<i class="fa fa-file-word-o"></i>
</span>
<span class="post-meta-item-text">Words count in article:</span>
<span title="Words count in article">
2,380 字
</span>
<span class="post-meta-divider">|</span>
<span class="post-meta-item-icon">
<i class="fa fa-clock-o"></i>
</span>
<span class="post-meta-item-text">Reading time ≈</span>
<span title="Reading time">
9 分钟
</span>
</div>
</div>
</header>
<div class="post-body" itemprop="articleBody">
<p>Hidden Markov models and their applications.<br>For sequence labeling, the mainstream methods today are conditional random fields and long short-term memory networks, but for simpler tasks such as word segmentation, part-of-speech tagging, and named entity recognition, a hidden Markov model can be trained quickly and efficiently.</p>
<h2 id="模型简介"><a href="#模型简介" class="headerlink" title="模型简介"></a>Model Overview</h2><p>The HMM is considered one of the fastest and most effective methods for a wide range of NLP problems. Language modeling can be framed as a communication problem: the essence of communication is encoding, transmission, and decoding. A typical communication system consists of a source, a channel, a receiver, the message, context, and a code. In communication, how do we infer the transmitted message s<sub>1</sub>,s<sub>2</sub>,s<sub>3</sub>,… from the signals o<sub>1</sub>,o<sub>2</sub>,o<sub>3</sub>,… observed at the receiver? By maximizing the conditional probability.</p>
<p>$$<br>s_1,s_2,s_3,…=\arg\max_{s_1,s_2,s_3,…}{P(s_1,s_2,s_3,…|o_1,o_2,o_3,…)}<br>$$<br>By Bayes' rule, this is equivalent to maximizing</p>
<p>$$<br>\frac{P(o_1,o_2,o_3,…|s_1,s_2,s_3,…)\cdot{P(s_1,s_2,s_3,…)}}{P(o_1,o_2,o_3,…)}<br>$$<br>Here the denominator does not depend on the state sequence and can be treated as a constant, so recovering the source message reduces to maximizing the joint probability $P(o_1,o_2,o_3,…|s_1,s_2,s_3,…)\cdot{P(s_1,s_2,s_3,…)}$; the HMM computes the source probability by simplifying this model.</p>
<p>The HMM rests on two assumptions: the Markov assumption and the independent-output assumption.</p>
<p>A stochastic process satisfying the Markov assumption is called a Markov process, or Markov chain: the probability distribution of each state depends only on the immediately preceding state. The figure below shows a discrete Markov process.<br><img src="http://pa51v0mwk.bkt.clouddn.com/%E9%A9%AC%E5%B0%94%E5%8F%AF%E5%A4%AB%E9%93%BE.png" alt="Markov chain"><br>Its parameters are the transition probabilities, i.e. the probability of moving from one state to the next. After choosing an initial state at random and running for T steps, the chain generates a state sequence $s_1,s_2,s_3,…,s_T$. Conversely, given an existing state sequence, each transition probability can be estimated as the number of transitions from one state to another divided by the number of occurrences of the first state.<br>The HMM extends the Markov model: at each time step it emits an output o<sub>t</sub> that depends only on s<sub>t</sub>; s<sub>t</sub> itself is hidden and can only be inferred from the observations o<sub>t</sub>.<br>Under the Markov and independent-output assumptions, the joint probability above can be written as<br>$$<br>P(s_1,s_2,s_3,…,o_1,o_2,o_3,…)=\prod_tP(s_t|s_{t-1})\cdot{P(o_t|s_t)}<br>$$<br>where $P(s_t|s_{t-1})$ is the transition probability and $P(o_t|s_t)$ is the emission probability.<br>The parameters of an HMM are $<S,O,A,B,\Pi>$</p>
<ul>
<li>S is the set of model states, with N the number of states: $S={S_1,S_2,…,S_N}$</li>
<li>O is the set of possible observations per state, with M the number of distinct observation values: $O={O_1,O_2,…,O_M}$</li>
<li>$A={a_{ij}}\in{R^{N\times N}}$ is the state transition probability matrix</li>
<li>$B={b_j(k)}\in{R^{N\times M}}$ is the emission probability matrix</li>
<li>$\Pi\in{R^N}$ is the initial state distribution</li>
</ul>
<p>Thus $\Pi$ and A generate the hidden state sequence, and the emission probabilities B then produce the observation sequence at each step. Writing the model parameters as $\lambda$, for a given model we have</p>
<p>$$<br>P(s|\lambda)=\pi_{i_1}a_{i_1i_2}a_{i_2i_3}…a_{i_{T-1}i_T}<br>$$</p>
<p>$$<br>P(o|s,\lambda)=b_{i_1o_1}b_{i_2o_2}…b_{i_To_T}<br>$$</p>
<p>$$<br>P(o,s|\lambda)=\pi_{i_1}b_{i_1o_1}a_{i_1i_2}b_{i_2o_2}…a_{i_{T-1}i_T}b_{i_To_T}<br>$$</p>
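<p>The generative process just described is easy to make concrete: draw $s_1$ from $\Pi$, emit each $o_t$ from row $s_t$ of B, and step to the next state via row $s_t$ of A. A minimal NumPy sketch, with made-up toy matrices (not taken from the text):</p>

```python
# Sampling a state/observation sequence from (pi, A, B).
# pi, A, B here are made-up toy values for a 2-state, 2-symbol HMM.
import numpy as np

rng = np.random.default_rng(0)

pi = np.array([0.6, 0.4])                # initial state distribution Pi
A  = np.array([[0.7, 0.3], [0.4, 0.6]])  # transition matrix a_ij
B  = np.array([[0.9, 0.1], [0.2, 0.8]])  # emission matrix b_j(k)

def sample_hmm(T):
    states, obs = [], []
    s = rng.choice(2, p=pi)               # draw s_1 from pi
    for _ in range(T):
        obs.append(rng.choice(2, p=B[s])) # emit o_t from b_{s_t}
        states.append(s)
        s = rng.choice(2, p=A[s])         # transition via a_{s_t, s_{t+1}}
    return states, obs

states, obs = sample_hmm(10)
```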
<p>Three fundamental problems surround the HMM, each solved by a different algorithm.</p>
<h3 id="给定模型计算输出序列概率"><a href="#给定模型计算输出序列概率" class="headerlink" title="给定模型计算输出序列概率"></a>Computing the probability of an output sequence given the model</h3><h4 id="暴力计算"><a href="#暴力计算" class="headerlink" title="暴力计算"></a>Brute force</h4><p>Enumerate every possible state sequence, compute the joint probability $P(o,s|\lambda)$ of each, and sum them to obtain the probability of the observation sequence, $\sum_s{P(o,s|\lambda)}$. This has exponential time complexity.</p>
<h4 id="前向算法"><a href="#前向算法" class="headerlink" title="前向算法"></a>Forward algorithm</h4><p>Introduce the forward variable $\alpha_t(i)$: the probability of emitting the prefix $o_1,o_2,…,o_t$ and being in state i at time t. Then $P(o)=\sum_{i=1}^{N}\alpha_T(i)$. The forward variables at time t can be computed from those at time t-1, so after initializing the variables at the first step, iterating yields the probability of the whole observation sequence.</p>
<p>$$<br>\alpha_{t+1}(j)=\left(\sum_{i=1}^{N}{\alpha_t(i)a_{ij}}\right)b_j{(o_{t+1})},\quad1\leq{t}\leq{T-1}<br>$$</p>
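<p>The forward recursion fits in a few lines of NumPy. A hedged sketch, where <code>pi</code>, <code>A</code>, <code>B</code> are made-up toy parameters:</p>

```python
# Forward algorithm: P(o) via the forward variables alpha.
# pi, A, B are made-up toy parameters for a 2-state, 2-symbol HMM.
import numpy as np

pi = np.array([0.5, 0.5])
A  = np.array([[0.7, 0.3], [0.4, 0.6]])
B  = np.array([[0.9, 0.1], [0.2, 0.8]])

def forward(obs):
    alpha = pi * B[:, obs[0]]         # alpha_1(i) = pi_i * b_i(o_1)
    for o in obs[1:]:
        # alpha_{t+1}(j) = (sum_i alpha_t(i) a_ij) * b_j(o_{t+1})
        alpha = (alpha @ A) * B[:, o]
    return float(alpha.sum())         # P(o) = sum_i alpha_T(i)

p = forward([0, 1, 0])
```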
<h4 id="后向算法"><a href="#后向算法" class="headerlink" title="后向算法"></a>Backward algorithm</h4><p>Analogous to the forward algorithm, but using the backward variable $\beta_t(i)$: the probability of emitting the suffix $o_{t+1},…,o_T$ given state i at time t.</p>
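<p>The backward recursion can be sketched the same way, with made-up toy parameters: $\beta_T(i)=1$, $\beta_t(i)=\sum_j a_{ij}b_j(o_{t+1})\beta_{t+1}(j)$, and finally $P(o)=\sum_i \pi_i b_i(o_1)\beta_1(i)$.</p>

```python
# Backward algorithm: P(o) via the backward variables beta.
# pi, A, B are made-up toy parameters for a 2-state, 2-symbol HMM.
import numpy as np

pi = np.array([0.5, 0.5])
A  = np.array([[0.7, 0.3], [0.4, 0.6]])
B  = np.array([[0.9, 0.1], [0.2, 0.8]])

def backward(obs):
    beta = np.ones(2)                 # beta_T(i) = 1
    for o in reversed(obs[1:]):
        # beta_t(i) = sum_j a_ij * b_j(o_{t+1}) * beta_{t+1}(j)
        beta = A @ (B[:, o] * beta)
    # P(o) = sum_i pi_i * b_i(o_1) * beta_1(i)
    return float((pi * B[:, obs[0]] * beta).sum())

p = backward([0, 1, 0])
```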
<h3 id="给定模型和特定观察序列求概率最大的状态序列"><a href="#给定模型和特定观察序列求概率最大的状态序列" class="headerlink" title="给定模型和特定观察序列求概率最大的状态序列"></a>Finding the most probable state sequence given the model and an observation sequence</h3><h4 id="维特比算法"><a href="#维特比算法" class="headerlink" title="维特比算法"></a>Viterbi algorithm</h4><p>This problem can be treated as dynamic programming and solved with the Viterbi algorithm.<br>Each of the T positions in the state sequence can take N values, so there are $N^T$ possible state sequences; for long sequences, brute-force search for the most probable path is infeasible.<br>The key observation is that if the most probable path P passes through some state at a given time, then its prefix from the start to that state must itself be the most probable path reaching that state. Viterbi therefore scans the sequence from the first time step, at each step keeping, for every state, the most probable path reaching it. The cost of each step is proportional to the product of the node counts $n_i$ and $n_{i+1}$ of the adjacent time steps $S_i$ and $S_{i+1}$, i.e. $O(n_i\cdot{n_{i+1}})$.</p>
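<p>The Viterbi recursion replaces the forward algorithm's sum with a max and keeps backpointers for recovering the path. A hedged NumPy sketch with made-up toy parameters:</p>

```python
# Viterbi algorithm: most probable state path and its probability.
# pi, A, B are made-up toy parameters for a 2-state, 2-symbol HMM.
import numpy as np

pi = np.array([0.5, 0.5])
A  = np.array([[0.7, 0.3], [0.4, 0.6]])
B  = np.array([[0.9, 0.1], [0.2, 0.8]])

def viterbi(obs):
    delta = pi * B[:, obs[0]]       # best-path prob ending in each state at t=1
    back = []                       # backpointers, one array per transition
    for o in obs[1:]:
        trans = delta[:, None] * A  # trans[i, j] = delta_t(i) * a_ij
        back.append(trans.argmax(axis=0))
        delta = trans.max(axis=0) * B[:, o]
    # backtrack from the best final state
    path = [int(delta.argmax())]
    for bp in reversed(back):
        path.append(int(bp[path[-1]]))
    return path[::-1], float(delta.max())

path, prob = viterbi([0, 0, 1, 1])
```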
<h3 id="估计模型参数"><a href="#估计模型参数" class="headerlink" title="估计模型参数"></a>Estimating the model parameters</h3><h4 id="已知观察序列和对应的状态序列"><a href="#已知观察序列和对应的状态序列" class="headerlink" title="已知观察序列和对应的状态序列"></a>Both the observation and state sequences are known</h4><p>When both the observation sequence and the state sequence are known, the problem is easy: rough estimates of the model parameters follow from the counts of each value in the sample.</p>
<p>$$<br>P(o_t|s_t)\approx\frac{\#(o_t,s_t)}{\#(s_t)}<br>$$</p>
<p>$$<br>P(s_t|s_{t-1})\approx\frac{\#(s_t,s_{t-1})}{\#(s_{t-1})}<br>$$</p>
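<p>These count-ratio estimates can be sketched directly; the sequences below are made-up toy data, and <code>collections.Counter</code> does the counting:</p>

```python
# Supervised HMM parameter estimation by relative frequencies:
# transition and emission probabilities are ratios of counts.
# `states` and `obs` are made-up toy sequences.
from collections import Counter

states = [0, 0, 1, 1, 0, 1, 1, 1, 0, 0]
obs    = [0, 1, 1, 0, 0, 1, 1, 0, 0, 1]

trans = Counter(zip(states, states[1:]))  # counts of (s_{t-1}, s_t)
emit  = Counter(zip(states, obs))         # counts of (s_t, o_t)
n_from = Counter(states[:-1])             # occurrences of s_{t-1}
n_state = Counter(states)                 # occurrences of s_t

def p_trans(i, j):
    # P(s_t = j | s_{t-1} = i)
    return trans[(i, j)] / n_from[i]

def p_emit(i, k):
    # P(o_t = k | s_t = i)
    return emit[(i, k)] / n_state[i]
```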
<h4 id="已知观察序列"><a href="#已知观察序列" class="headerlink" title="已知观察序列"></a>Only the observation sequence is known</h4><p>When the observation sequence is known but the state sequence is not, the Baum-Welch algorithm applies, iterating via expectation-maximization (EM).<br>First find any set of model parameters capable of producing the output sequence O. Using those parameters, compute the most probable state sequence and treat it as labeled data; re-estimate the parameters as in 3.1; find the most probable state sequence again; and iterate until the model converges.</p>
<h2 id="hmmlearn"><a href="#hmmlearn" class="headerlink" title="hmmlearn"></a>hmmlearn</h2><p>hmmlearn is a Python library for learning HMMs. Its API resembles scikit-learn's, and it depends on scikit-learn, NumPy, SciPy, matplotlib, and similar libraries. The <a href="http://hmmlearn.readthedocs.io/en/latest/tutorial.html" target="_blank" rel="noopener">official documentation</a> describes the hmmlearn API in detail with examples.<br>hmmlearn installs like any other module:<br><code>pip install hmmlearn</code><br>hmmlearn implements three HMM variants:</p>
<table>
<thead>
<tr>
<th style="text-align:left">Class</th>
<th style="text-align:left">Description</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align:left">hmm.GaussianHMM</td>
<td style="text-align:left">Assumes Gaussian-distributed emissions</td>
</tr>
<tr>
<td style="text-align:left">hmm.GMMHMM</td>
<td style="text-align:left">Assumes Gaussian-mixture emissions</td>
</tr>
<tr>
<td style="text-align:left">hmm.MultinomialHMM</td>
<td style="text-align:left">Discrete emissions</td>
</tr>
</tbody>
</table>
<p>For continuous observations, use the first two classes, which assume Gaussian or Gaussian-mixture emissions; when the model is not complex, the first is usually sufficient.</p>
<h3 id="构建HMM,产生样本"><a href="#构建HMM,产生样本" class="headerlink" title="构建HMM,产生样本"></a>Building an HMM and generating samples</h3><p>An HMM object is constructed by passing parameters.<br>The main parameters of the MultinomialHMM class are:</p>
<p><code>class hmmlearn.hmm.MultinomialHMM(n_components=1, startprob_prior=1.0, transmat_prior=1.0, algorithm='viterbi', random_state=None, n_iter=10, tol=0.01, verbose=False, params='ste', init_params='ste')</code></p>
<blockquote>
<p>n_components: number of states N<br>algorithm: 'viterbi' or 'map'<br>n_iter: maximum number of EM iterations<br>tol: convergence threshold; EM stops when the likelihood gain falls below tol<br>verbose: when True, print the convergence status of each iteration<br>params: which parameters are updated during training; a combination of 's', 't', 'e', standing for the start probabilities, the transition matrix, and the emission matrix<br>init_params: which parameters are initialized</p>
</blockquote>
<p>Attributes of the MultinomialHMM class</p>
<table>
<thead>
<tr>
<th style="text-align:left">Attribute</th>
<th style="text-align:left">Description</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align:left">n_features</td>
<td style="text-align:left">Number of possible observation symbols</td>
</tr>
<tr>
<td style="text-align:left">monitor_</td>
<td style="text-align:left">Monitor object that checks EM convergence</td>
</tr>
<tr>
<td style="text-align:left">transmat_</td>
<td style="text-align:left">State transition probability matrix</td>
</tr>
<tr>
<td style="text-align:left">startprob_</td>
<td style="text-align:left">Initial state distribution</td>
</tr>
<tr>
<td style="text-align:left">emissionprob_</td>
<td style="text-align:left">Emission probability matrix</td>
</tr>
</tbody>
</table>
<p>The main parameters of the GaussianHMM class are:</p>
<p><code>class hmmlearn.hmm.GaussianHMM(n_components=1, covariance_type='diag', min_covar=0.001, startprob_prior=1.0, transmat_prior=1.0, means_prior=0, means_weight=0, covars_prior=0.01, covars_weight=1, algorithm='viterbi', random_state=None, n_iter=10, tol=0.01, verbose=False, params='stmc', init_params='stmc')</code><br>In GaussianHMM the parameters differ slightly from the discrete HMM:</p>
<blockquote>
<p>covariance_type: the covariance type; must be one of "spherical", "diag", "full", "tied" (<a href="https://www.zhihu.com/question/33467075" target="_blank" rel="noopener">details</a>)<br>params: 'cmte', where c and m stand for the covariances and means of the Gaussians<br>init_params: 'cmte'</p>
</blockquote>
<p>Among the attributes, because the observations are continuous there is no emissionprob_; it is replaced by means_ and covars_.</p>
<p>Below, a GaussianHMM instance is constructed:<br><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">>>> </span><span class="keyword">import</span> numpy <span class="keyword">as</span> np</span><br><span class="line"><span class="meta">>>> </span><span class="keyword">from</span> hmmlearn <span class="keyword">import</span> hmm</span><br><span class="line"><span class="meta">>>> </span>np.random.seed(<span class="number">42</span>)</span><br><span class="line"></span><br><span class="line"><span class="meta">>>> </span>model = hmm.GaussianHMM(n_components=<span class="number">3</span>, covariance_type=<span class="string">"full"</span>)</span><br><span class="line"><span class="meta">>>> </span>model.startprob_ = np.array([<span class="number">0.6</span>, <span class="number">0.3</span>, <span class="number">0.1</span>])</span><br><span class="line"><span class="meta">>>> </span>model.transmat_ = np.array([[<span class="number">0.7</span>, <span class="number">0.2</span>, <span class="number">0.1</span>],</span><br><span class="line"><span class="meta">... </span> [<span class="number">0.3</span>, <span class="number">0.5</span>, <span class="number">0.2</span>],</span><br><span class="line"><span class="meta">... </span> [<span class="number">0.3</span>, <span class="number">0.3</span>, <span class="number">0.4</span>]])</span><br><span class="line"><span class="meta">>>> </span>model.means_ = np.array([[<span class="number">0.0</span>, <span class="number">0.0</span>], [<span class="number">3.0</span>, <span class="number">-3.0</span>], [<span class="number">5.0</span>, <span class="number">10.0</span>]])</span><br><span class="line"><span class="meta">>>> </span>model.covars_ = np.tile(np.identity(<span class="number">2</span>), (<span class="number">3</span>, <span class="number">1</span>, <span class="number">1</span>))</span><br><span class="line"><span class="meta">>>> </span>X, Z = model.sample(<span class="number">100</span>)</span><br></pre></td></tr></table></figure></p>
<p>When building an HMM with known, fixed parameters, pass them in after constructing the instance: 'cmte' for the continuous model, 'ste' for the discrete one.<br>In the last line, <code>model.sample(100)</code> draws a sample of length 100; X is the observation sequence and Z the hidden state sequence.</p>
<h3 id="训练HMM参数,估计状态序列"><a href="#训练HMM参数,估计状态序列" class="headerlink" title="训练HMM参数,估计状态序列"></a>Training HMM parameters and inferring the state sequence</h3><p>The <code>fit</code> method trains the HMM parameters; its input is the concatenated observation sequences together with their lengths.<br>The <code>score</code> method computes the log probability of an observation sequence under the model.<br>The <code>predict</code> method infers the hidden state sequence.</p>
<h3 id="保存和加载模型"><a href="#保存和加载模型" class="headerlink" title="保存和加载模型"></a>Saving and loading models</h3><p>There are two options: the standard pickle module and joblib from scikit-learn<br><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">>>> </span><span class="keyword">from</span> sklearn.externals <span class="keyword">import</span> joblib</span><br><span class="line"><span class="meta">>>> </span>joblib.dump(remodel, <span class="string">"filename.pkl"</span>)</span><br><span class="line">[<span class="string">"filename.pkl"</span>]</span><br><span class="line"><span class="meta">>>> </span>joblib.load(<span class="string">"filename.pkl"</span>) </span><br><span class="line">GaussianHMM(algorithm=<span class="string">'viterbi'</span>,...</span><br></pre></td></tr></table></figure></p>
<p>References:</p>
<ol>
<li><a href="https://www.cnblogs.com/pinard/p/7001397.html" target="_blank" rel="noopener">https://www.cnblogs.com/pinard/p/7001397.html</a></li>
<li><a href="http://hmmlearn.readthedocs.io/en/latest/auto_examples/plot_hmm_stock_analysis.html" target="_blank" rel="noopener">http://hmmlearn.readthedocs.io/en/latest/auto_examples/plot_hmm_stock_analysis.html</a></li>
<li>数学之美.吴军</li>
</ol>
</div>
<footer class="post-footer">
<div class="post-eof"></div>
</footer>
</div>
</article>
<article class="post post-type-normal" itemscope itemtype="http://schema.org/Article">
<div class="post-block">
<link itemprop="mainEntityOfPage" href="http://yoursite.com/2018/05/29/给25岁的自己/">
<span hidden itemprop="author" itemscope itemtype="http://schema.org/Person">
<meta itemprop="name" content="唐 赛">
<meta itemprop="description" content="">
<meta itemprop="image" content="/images/avatar.png">
</span>
<span hidden itemprop="publisher" itemscope itemtype="http://schema.org/Organization">
<meta itemprop="name" content="格物 致知">
</span>
<header class="post-header">
<h1 class="post-title" itemprop="name headline">
<a class="post-title-link" href="/2018/05/29/给25岁的自己/" itemprop="url">To My 25-Year-Old Self</a></h1>
<div class="post-meta">
<span class="post-time">
<span class="post-meta-item-icon">
<i class="fa fa-calendar-o"></i>
</span>
<span class="post-meta-item-text">Posted on</span>
<time title="Post created" itemprop="dateCreated datePublished" datetime="2018-05-29T01:44:12+08:00">
2018-05-29
</time>
</span>
<span class="post-category" >
<span class="post-meta-divider">|</span>
<span class="post-meta-item-icon">
<i class="fa fa-folder-o"></i>
</span>
<span class="post-meta-item-text">In</span>
<span itemprop="about" itemscope itemtype="http://schema.org/Thing">
<a href="/categories/日记/" itemprop="url" rel="index">
<span itemprop="name">Diary</span>
</a>
</span>
</span>
<div class="post-wordcount">
<span class="post-meta-item-icon">
<i class="fa fa-file-word-o"></i>
</span>
<span class="post-meta-item-text">Words count in article:</span>
<span title="Words count in article">
1,129 字
</span>
<span class="post-meta-divider">|</span>
<span class="post-meta-item-icon">
<i class="fa fa-clock-o"></i>
</span>
<span class="post-meta-item-text">Reading time ≈</span>
<span title="Reading time">
4 分钟
</span>
</div>
</div>
</header>
<div class="post-body" itemprop="articleBody">
<h1 id="给25岁的自己"><a href="#给25岁的自己" class="headerlink" title="给25岁的自己"></a>To My 25-Year-Old Self</h1><p>It is 1 a.m. on May 29, 2018, and I am writing this for the 25-year-old me who is about to graduate. I hope I will find time to write a few lines here every year, for the later me to admire. Well, "admire" is probably not the right word.</p>
<p>Everything that follows is improvised; you could almost call it rambling.</p>
<h4 id="不迷茫"><a href="#不迷茫" class="headerlink" title="不迷茫"></a>Not lost</h4><p>Through three years of graduate school I studied and felt lost in equal measure, thinking that once I graduated the feeling would pass. Well, no chance. Even now I still don't know what I should be doing. But does that really matter? I rarely dwell on such big questions; the days pass one at a time.</p>
<p>As an undergraduate I knew clearly that I wanted to go to graduate school, because I kept thinking: I've learned so many fun things, and it would be such a waste if I could never use them once I started working. And I was still young. When you're young, why burden yourself with all the things adults worry about? Just do what you want to do. So I studied for three more years. Honestly, I'm grateful for those years; they gave me a great deal of time to think. I never used to sit at the computer past midnight reading and writing code; now, if I don't learn something in the small hours, I feel restless. I've always believed graduate school is the process of learning to live with solitude: everyone has their own work, kindred spirits are hard to find, and your research direction is hard for others to grasp, so self-discipline matters. Ha, not that I'm especially disciplined. Still, I pushed myself to finish something in each phase, and looking back the harvest has been rich. All told, I've learned across many fields: hardware, software, control, image processing, text, data mining. Every one of them is interesting; someday, with some idle time, I'll use all this to build something fun.</p>
<p>Student life was a search for one's own worth through study. Regrettably, student life is about to end and I still don't know what my worth is; ten years from now, where I'll be and what I'll be doing, I can't even imagine. Chatting with friends, I once summed up a view: there are two kinds of people. One kind knows clearly what to do at every moment, holds a beautiful blueprint of the future, and strives steadily toward it. The other kind doesn't know what comes next and is largely pushed along by society, like a character in a game: at each checkpoint, before they can react, a new quest is assigned, and after finishing it another arrives. Most people are the second kind; so am I, and I don't think that's a bad thing. Why let a blueprint constrain you? My age, background, and experience weren't enough for me to grasp the rules of this society at twenty, so I don't need one grand goal right now. What I need is a series of small goals: each time I finish one, I assign myself the next according to my interests. Settled at thirty, free of doubts at forty; when I slowly reach that stage, whatever is, is. After all, I will have tried; how could I regret it?</p>
<p>So: whatever I feel like doing, I'll do. Ha, very zen. No grasping, no striving.</p>
<h4 id="迷茫"><a href="#迷茫" class="headerlink" title="迷茫"></a>Lost</h4><p>My present temperament means the furthest future I can picture is half a year out; beyond that I can't see, and don't need to. Whether that will change later, I don't know. After all, in a society full of competition, living this way puts you at a disadvantage.</p>
<p>I've talked about this with friends before too: I'm 25, not thinking about a house, not thinking about marriage, only thinking, huh, this algorithm is impressive, let me try it out; oh, that framework looks slick, let me download it right now. That attitude doesn't win favor in the workplace. During a three-month internship I came to feel deeply that if you want promotions and raises you really have to be attentive everywhere and seize every opportunity, while I used to look down on all that, feeling it was pointless. After staying a while I even found that kind of company truly dull: office politics to navigate, every word at every meeting weighed carefully. I know these will all be necessary skills later, so I'm conflicted. One step at a time.</p>
</div>
<footer class="post-footer">
<div class="post-eof"></div>
</footer>
</div>
</article>
</section>
</div>
</div>
<div class="sidebar-toggle">
<div class="sidebar-toggle-line-wrap">
<span class="sidebar-toggle-line sidebar-toggle-line-first"></span>
<span class="sidebar-toggle-line sidebar-toggle-line-middle"></span>
<span class="sidebar-toggle-line sidebar-toggle-line-last"></span>
</div>
</div>
<aside id="sidebar" class="sidebar">
<div class="sidebar-inner">
<section class="site-overview-wrap sidebar-panel sidebar-panel-active">
<div class="site-overview">
<div class="site-author motion-element" itemprop="author" itemscope itemtype="http://schema.org/Person">
<img class="site-author-image" itemprop="image"
src="/images/avatar.png"
alt="唐 赛" />
<p class="site-author-name" itemprop="name">唐 赛</p>
<p class="site-description motion-element" itemprop="description"></p>
</div>
<nav class="site-state motion-element">
<div class="site-state-item site-state-posts">
<a href="/archives">
<span class="site-state-item-count">3</span>
<span class="site-state-item-name">posts</span>
</a>
</div>
<div class="site-state-item site-state-categories">
<a href="/categories/index.html">
<span class="site-state-item-count">3</span>
<span class="site-state-item-name">categories</span>
</a>
</div>
<div class="site-state-item site-state-tags">
<a href="/tags/index.html">
<span class="site-state-item-count">3</span>
<span class="site-state-item-name">tags</span>