
Commit c5685c9

remove app.labml.ai links

1 parent 97e53c0

154 files changed: 4,116 additions & 4,285 deletions

File tree


docs/adaptive_computation/ponder_net/index.html

Lines changed: 69 additions & 70 deletions
Large diffs are not rendered by default.

docs/capsule_networks/index.html

Lines changed: 49 additions & 49 deletions
Large diffs are not rendered by default.

docs/capsule_networks/readme.html

Lines changed: 1 addition & 1 deletion
@@ -77,7 +77,7 @@ <h1><a href="https://nn.labml.ai/capsule_networks/index.html">Capsule Networks</
 <p>This file holds the implementations of the core modules of Capsule Networks.</p>
 <p>I used <a href="https://github.com/jindongwang/Pytorch-CapsuleNet">jindongwang/Pytorch-CapsuleNet</a> to clarify some confusions I had with the paper.</p>
 <p>Here&#x27;s a notebook for training a Capsule Network on MNIST dataset.</p>
-<p><a href="https://colab.research.google.com/github/labmlai/annotated_deep_learning_paper_implementations/blob/master/labml_nn/capsule_networks/mnist.ipynb"><img alt="Open In Colab" src="https://colab.research.google.com/assets/colab-badge.svg"></a> <a href="https://app.labml.ai/run/e7c08e08586711ebb3e30242ac1c0002"><img alt="View Run" src="https://img.shields.io/badge/labml-experiment-brightgreen"></a> </p>
+<p><a href="https://colab.research.google.com/github/labmlai/annotated_deep_learning_paper_implementations/blob/master/labml_nn/capsule_networks/mnist.ipynb"><img alt="Open In Colab" src="https://colab.research.google.com/assets/colab-badge.svg"></a> </p>
 
 </div>
 <div class='code'>

docs/cfr/kuhn/index.html

Lines changed: 90 additions & 90 deletions
Large diffs are not rendered by default.

docs/conv_mixer/experiment.html

Lines changed: 31 additions & 32 deletions
Large diffs are not rendered by default.

docs/conv_mixer/index.html

Lines changed: 62 additions & 63 deletions
Large diffs are not rendered by default.

docs/conv_mixer/readme.html

Lines changed: 1 addition & 2 deletions
@@ -75,8 +75,7 @@ <h1><a href="https://nn.labml.ai/conv_mixer/index.html">Patches Are All You Need
 <p>ConvMixer is Similar to <a href="https://nn.labml.ai/transformers/mlp_mixer/index.html">MLP-Mixer</a>. MLP-Mixer separates mixing of spatial and channel dimensions, by applying an MLP across spatial dimension and then an MLP across the channel dimension (spatial MLP replaces the <a href="https://nn.labml.ai/transformers/vit/index.html">ViT</a> attention and channel MLP is the <a href="https://nn.labml.ai/transformers/feed_forward.html">FFN</a> of ViT).</p>
 <p>ConvMixer uses a 1x1 convolution for channel mixing and a depth-wise convolution for spatial mixing. Since it&#x27;s a convolution instead of a full MLP across the space, it mixes only the nearby batches in contrast to ViT or MLP-Mixer. Also, the MLP-mixer uses MLPs of two layers for each mixing and ConvMixer uses a single layer for each mixing.</p>
 <p>The paper recommends removing the residual connection across the channel mixing (point-wise convolution) and having only a residual connection over the spatial mixing (depth-wise convolution). They also use <a href="https://nn.labml.ai/normalization/batch_norm/index.html">Batch normalization</a> instead of <a href="../normalization/layer_norm/index.html">Layer normalization</a>.</p>
-<p>Here&#x27;s <a href="https://nn.labml.ai/conv_mixer/experiment.html">an experiment</a> that trains ConvMixer on CIFAR-10.</p>
-<p><a href="https://app.labml.ai/run/0fc344da2cd011ecb0bc3fdb2e774a3d"><img alt="View Run" src="https://img.shields.io/badge/labml-experiment-brightgreen"></a></p>
+<p>Here&#x27;s <a href="https://nn.labml.ai/conv_mixer/experiment.html">an experiment</a> that trains ConvMixer on CIFAR-10. </p>
 
 </div>
 <div class='code'>
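The mixing scheme described in the diff above (depth-wise convolution for spatial mixing with a residual connection, 1x1 point-wise convolution for channel mixing without one, BatchNorm instead of LayerNorm) can be sketched in PyTorch. This is a minimal illustration, not the repo's own code; the class and hyperparameter names (`ConvMixerLayer`, `dim`, `kernel_size`) are assumptions.

```python
import torch
import torch.nn as nn


class ConvMixerLayer(nn.Module):
    """A sketch of one ConvMixer layer, following the description above:
    spatial mixing via a depth-wise convolution (with residual connection),
    then channel mixing via a 1x1 point-wise convolution (no residual)."""

    def __init__(self, dim: int, kernel_size: int = 9):
        super().__init__()
        # Spatial mixing: groups=dim makes the convolution depth-wise,
        # so each channel is mixed only with its spatial neighbors.
        self.spatial_mixing = nn.Sequential(
            nn.Conv2d(dim, dim, kernel_size, groups=dim, padding='same'),
            nn.GELU(),
            nn.BatchNorm2d(dim),  # BatchNorm rather than LayerNorm, per the paper
        )
        # Channel mixing: a 1x1 convolution mixes channels at each position.
        self.channel_mixing = nn.Sequential(
            nn.Conv2d(dim, dim, kernel_size=1),
            nn.GELU(),
            nn.BatchNorm2d(dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual connection only over the spatial (depth-wise) mixing.
        x = x + self.spatial_mixing(x)
        # No residual connection over the channel (point-wise) mixing.
        return self.channel_mixing(x)
```

A full ConvMixer would stack several of these layers after a patch-embedding convolution; the single-layer sketch is just to show where the residual connection sits.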

docs/distillation/index.html

Lines changed: 75 additions & 76 deletions
Large diffs are not rendered by default.

docs/distillation/large.html

Lines changed: 36 additions & 37 deletions
Large diffs are not rendered by default.

docs/distillation/readme.html

Lines changed: 1 addition & 2 deletions
@@ -74,8 +74,7 @@ <h1><a href="https://nn.labml.ai/distillation/index.html">Distilling the Knowled
 <p>This is a <a href="https://pytorch.org">PyTorch</a> implementation/tutorial of the paper <a href="https://papers.labml.ai/paper/1503.02531">Distilling the Knowledge in a Neural Network</a>.</p>
 <p>It&#x27;s a way of training a small network using the knowledge in a trained larger network; i.e. distilling the knowledge from the large network.</p>
 <p>A large model with regularization or an ensemble of models (using dropout) generalizes better than a small model when trained directly on the data and labels. However, a small model can be trained to generalize better with help of a large model. Smaller models are better in production: faster, less compute, less memory.</p>
-<p>The output probabilities of a trained model give more information than the labels because it assigns non-zero probabilities to incorrect classes as well. These probabilities tell us that a sample has a chance of belonging to certain classes. For instance, when classifying digits, when given an image of digit <em>7</em>, a generalized model will give a high probability to 7 and a small but non-zero probability to 2, while assigning almost zero probability to other digits. Distillation uses this information to train a small model better.</p>
-<p><a href="https://app.labml.ai/run/d6182e2adaf011eb927c91a2a1710932"><img alt="View Run" src="https://img.shields.io/badge/labml-experiment-brightgreen"></a> </p>
+<p>The output probabilities of a trained model give more information than the labels because it assigns non-zero probabilities to incorrect classes as well. These probabilities tell us that a sample has a chance of belonging to certain classes. For instance, when classifying digits, when given an image of digit <em>7</em>, a generalized model will give a high probability to 7 and a small but non-zero probability to 2, while assigning almost zero probability to other digits. Distillation uses this information to train a small model better. </p>
 
 </div>
 <div class='code'>
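The soft-label idea in the diff above (training the small model on the teacher's full output distribution, not just the hard labels) is usually implemented as a weighted sum of a softened KL-divergence term and ordinary cross-entropy. A minimal sketch, not the repo's implementation; the function name and the `T` and `alpha` defaults are illustrative assumptions.

```python
import torch
import torch.nn.functional as F


def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      labels: torch.Tensor,
                      T: float = 4.0,
                      alpha: float = 0.5) -> torch.Tensor:
    """Sketch of the distillation objective (Hinton et al., 2015).

    Dividing logits by a temperature T > 1 before the softmax exposes the
    small but non-zero probabilities the teacher assigns to incorrect
    classes -- the extra information the paragraph above describes.
    """
    # Soft-target term: KL divergence between temperature-softened
    # student and teacher distributions. kl_div expects the first
    # argument in log-space. The T^2 factor keeps gradient magnitudes
    # comparable across temperatures, as recommended in the paper.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction='batchmean',
    ) * (T * T)
    # Hard-target term: ordinary cross-entropy against the true labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```

In training, the teacher's logits are computed with `torch.no_grad()` and only the student's parameters are updated.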

0 commit comments