
Commit e8a5feb

Commit message: docs
1 parent: 689842a

10 files changed

Lines changed: 512 additions & 500 deletions


docs/diffusion/stable_diffusion/model/unet.html

Lines changed: 141 additions & 137 deletions
Large diffs are not rendered by default.

docs/ja/diffusion/stable_diffusion/model/unet.html

Lines changed: 151 additions & 147 deletions
Large diffs are not rendered by default.

docs/ja/transformers/rope/index.html

Lines changed: 21 additions & 21 deletions
Large diffs are not rendered by default.

docs/neox/utils/llm_int8.html

Lines changed: 2 additions & 2 deletions
@@ -74,7 +74,7 @@
 <h1>LLM.int8() on GPT-NeoX</h1>
 <p>This implements a utility function to transform a <code class="highlight"><span></span><span class="n">nn</span><span class="o">.</span><span class="n">Linear</span></code>
 layer to an LLM.int8() linear layer.</p>
-<p><a href="https://papers.labml.ai/paper/eb2bcaee1d0011edaa66a71c10a887e7">LLM.int8() paper</a> shows you can use int8 quantization while handling outliers to reduce memory footprint without performance degradation in large language models. They convert weights and inputs to scaled 8-bit integers and does matrix multiplication producing int32 results which is then converted back to float16 and rescaled. They show that in large language models, some features can give extreme values (outliers) that dominate the model&#x27;s output. These features get clamped in 8-bit integer space which causes the model performance to degrade. As a solution they pick these outliers (greater than a specified threshold) and compute their multiplications separately in float16 space. Since the percentage of outliers is around 0.01% this doesn&#x27;t increase memory usage, and prevents the model from degrading performance.</p>
+<p><a href="https://papers.labml.ai/paper/eb2bcaee1d0011edaa66a71c10a887e7">The LLM.int8() paper</a> shows you can use int8 quantization while handling outliers to reduce the memory footprint without performance degradation in large language models. They convert weights and inputs to scaled 8-bit integers and do matrix multiplication, producing int32 results which are then converted back to float16 and rescaled. They show that in large language models, some features can take extreme values (outliers) that dominate the model&#x27;s output. These features get clamped in 8-bit integer space, which degrades model performance. As a solution they pick these outliers (greater than a specified threshold) and compute their multiplications separately in float16 space. Since the percentage of outliers is around 0.01%, this doesn&#x27;t increase memory usage and prevents the performance degradation.</p>
 <p>The code to transform GPT-NeoX layers is defined in <a href="../model.html#post_load_prepare">model.py</a>.</p>
 <p>Here are example uses of GPT-NeoX with int8 quantization.</p>
 <ul><li><a href="../samples/llm_int8.html">Generate Text</a> </li>
@@ -240,4 +240,4 @@ <h2>Transform a <code class="highlight"><span></span><span class="n">nn</span><
 handleImages()
 </script>
 </body>
-</html>
+</html>
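
The paragraph changed in this hunk describes the LLM.int8() scheme in prose. For orientation only, here is a minimal PyTorch sketch of that idea; it is not the repository's implementation (that sits behind the model.py link above). The function name, the default outlier threshold of 6.0, and the use of float32 in place of the paper's float16 are assumptions made for the sketch:

```python
import torch

@torch.no_grad()
def int8_matmul_with_outliers(x: torch.Tensor, w: torch.Tensor,
                              threshold: float = 6.0) -> torch.Tensor:
    """x: [n, d] inputs, w: [d, m] weights, both floating point."""
    # Feature columns whose magnitude exceeds the threshold anywhere in
    # the batch are treated as outliers and kept in floating point.
    outliers = (x.abs() > threshold).any(dim=0)                      # [d]
    x_out, w_out = x[:, outliers], w[outliers, :]

    # The remaining features are absmax-scaled to 8-bit integers:
    # per row of the input, per column of the weight.
    x_in, w_in = x[:, ~outliers], w[~outliers, :]
    x_scale = x_in.abs().amax(dim=1, keepdim=True).clamp(min=1e-8) / 127.0
    w_scale = w_in.abs().amax(dim=0, keepdim=True).clamp(min=1e-8) / 127.0
    x_q = torch.round(x_in / x_scale).to(torch.int8)
    w_q = torch.round(w_in / w_scale).to(torch.int8)

    # int8 x int8 matmul accumulated in int32, then rescaled back to
    # floating point. (Real implementations use fused int8 kernels;
    # plain `@` on int32 tensors runs on CPU only.)
    y = (x_q.to(torch.int32) @ w_q.to(torch.int32)).float() * x_scale * w_scale

    # The outlier slice is multiplied in full precision and added back.
    return y + x_out @ w_out

# e.g. y = int8_matmul_with_outliers(torch.randn(4, 64), torch.randn(64, 32))
```

Since the outlier columns are a tiny fraction of the features, the extra full-precision matmul adds almost nothing to memory use, which is the paper's point.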

docs/sitemap.xml

Lines changed: 3 additions & 3 deletions
@@ -484,7 +484,7 @@
 
 <url>
 <loc>https://nn.labml.ai/index.html</loc>
-<lastmod>2023-04-02T16:30:00+00:00</lastmod>
+<lastmod>2023-06-30T16:30:00+00:00</lastmod>
 <priority>1.00</priority>
 </url>
 
@@ -610,7 +610,7 @@
 
 <url>
 <loc>https://nn.labml.ai/diffusion/stable_diffusion/model/unet.html</loc>
-<lastmod>2023-01-19T16:30:00+00:00</lastmod>
+<lastmod>2023-06-30T16:30:00+00:00</lastmod>
 <priority>1.00</priority>
 </url>
 
@@ -932,7 +932,7 @@
 
 <url>
 <loc>https://nn.labml.ai/transformers/rope/index.html</loc>
-<lastmod>2023-04-02T16:30:00+00:00</lastmod>
+<lastmod>2023-06-28T16:30:00+00:00</lastmod>
 <priority>1.00</priority>
 </url>
 

docs/transformers/rope/index.html

Lines changed: 21 additions & 21 deletions
Large diffs are not rendered by default.

docs/zh/diffusion/stable_diffusion/model/unet.html

Lines changed: 148 additions & 144 deletions
Large diffs are not rendered by default.

docs/zh/transformers/rope/index.html

Lines changed: 21 additions & 21 deletions
Large diffs are not rendered by default.

translate_cache/diffusion/stable_diffusion/model/unet.ja.json

Lines changed: 2 additions & 2 deletions
@@ -6,7 +6,7 @@
 "<h2>U-Net model</h2>\n": "<h2>U-\u30cd\u30c3\u30c8\u30e2\u30c7\u30eb</h2>\n",
 "<h3>Group normalization with float32 casting</h3>\n": "<h3>float32 \u30ad\u30e3\u30b9\u30c6\u30a3\u30f3\u30b0\u306b\u3088\u308b\u30b0\u30eb\u30fc\u30d7\u6b63\u898f\u5316</h3>\n",
 "<h3>Group normalization</h3>\n<p>This is a helper function, with fixed number of groups..</p>\n": "<h3>\u30b0\u30eb\u30fc\u30d7\u6b63\u898f\u5316</h3>\n<p>\u3053\u308c\u306f\u30b0\u30eb\u30fc\u30d7\u6570\u304c\u56fa\u5b9a\u3055\u308c\u305f\u30d8\u30eb\u30d1\u30fc\u95a2\u6570\u3067\u3059\u3002</p>\n",
-"<h3>Sequential block for modules with different inputs</h3>\n<p>This sequential module can compose of different modules suck as <span translate=no>_^_0_^_</span>, <span translate=no>_^_1_^_</span> and <span translate=no>_^_2_^_</span> and calls them with the matching signatures</p>\n": "<h3>\u5165\u529b\u304c\u7570\u306a\u308b\u30e2\u30b8\u30e5\u30fc\u30eb\u7528\u306e\u30b7\u30fc\u30b1\u30f3\u30b7\u30e3\u30eb\u30d6\u30ed\u30c3\u30af</h3>\n<p>\u3053\u306e\u30b7\u30fc\u30b1\u30f3\u30b7\u30e3\u30eb\u30e2\u30b8\u30e5\u30fc\u30eb\u306f\u3001\u8907\u6570\u306e\u30e2\u30b8\u30e5\u30fc\u30eb\u304b\u3089\u69cb\u6210\u3067\u304d<span translate=no>_^_0_^_</span>\u3001<span translate=no>_^_1_^_</span>\u305d\u308c\u305e\u308c\u3092\u4e00\u81f4\u3059\u308b\u30b7\u30b0\u30cd\u30c1\u30e3\u3067\u547c\u3073\u51fa\u3059\u3053\u3068\u304c\u3067\u304d\u307e\u3059\u3002<span translate=no>_^_2_^_</span></p>\n",
+"<h3>Sequential block for modules with different inputs</h3>\n<p>This sequential module can compose of different modules such as <span translate=no>_^_0_^_</span>, <span translate=no>_^_1_^_</span> and <span translate=no>_^_2_^_</span> and calls them with the matching signatures</p>\n": "<h3>\u5165\u529b\u306e\u7570\u306a\u308b\u30e2\u30b8\u30e5\u30fc\u30eb\u7528\u306e\u30b7\u30fc\u30b1\u30f3\u30b7\u30e3\u30eb\u30d6\u30ed\u30c3\u30af</h3>\n<p>\u3053\u306e\u30b7\u30fc\u30b1\u30f3\u30b7\u30e3\u30eb\u30e2\u30b8\u30e5\u30fc\u30eb\u306f\u3001\u3001<span translate=no>_^_1_^_</span><span translate=no>_^_2_^_</span>\u306a\u3069\u306e\u3055\u307e\u3056\u307e\u306a\u30e2\u30b8\u30e5\u30fc\u30eb\u3067\u69cb\u6210\u3067\u304d<span translate=no>_^_0_^_</span>\u3001\u305d\u308c\u3089\u3092\u5bfe\u5fdc\u3059\u308b\u30b7\u30b0\u30cd\u30c1\u30e3\u3067\u547c\u3073\u51fa\u3059\u3053\u3068\u304c\u3067\u304d\u307e\u3059\u3002</p>\n",
 "<h3>Up-sampling layer</h3>\n": "<h3>\u30a2\u30c3\u30d7\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0\u30ec\u30a4\u30e4\u30fc</h3>\n",
 "<p> </p>\n": "<p></p>\n",
 "<p> Test sinusoidal time step embeddings</p>\n": "<p>\u6b63\u5f26\u6ce2\u30bf\u30a4\u30e0\u30b9\u30c6\u30c3\u30d7\u57cb\u3081\u8fbc\u307f\u306e\u30c6\u30b9\u30c8</p>\n",
@@ -52,7 +52,7 @@
 "<ul><li><span translate=no>_^_0_^_</span> is the input feature map of shape <span translate=no>_^_1_^_</span> </li>\n<li><span translate=no>_^_2_^_</span> are the time steps of shape <span translate=no>_^_3_^_</span> </li>\n<li><span translate=no>_^_4_^_</span> conditioning of shape <span translate=no>_^_5_^_</span></li></ul>\n": "<ul><li><span translate=no>_^_0_^_</span>\u306f\u30b7\u30a7\u30a4\u30d7\u306e\u5165\u529b\u30d5\u30a3\u30fc\u30c1\u30e3\u30de\u30c3\u30d7\u3067\u3059 <span translate=no>_^_1_^_</span></li>\n<li><span translate=no>_^_2_^_</span>\u5f62\u72b6\u306e\u30bf\u30a4\u30e0\u30b9\u30c6\u30c3\u30d7\u3067\u3059 <span translate=no>_^_3_^_</span></li>\n<li><span translate=no>_^_4_^_</span>\u5f62\u72b6\u306e\u30b3\u30f3\u30c7\u30a3\u30b7\u30e7\u30cb\u30f3\u30b0 <span translate=no>_^_5_^_</span></li></ul>\n",
 "<ul><li><span translate=no>_^_0_^_</span> is the input feature map with shape <span translate=no>_^_1_^_</span> </li>\n<li><span translate=no>_^_2_^_</span> is the time step embeddings of shape <span translate=no>_^_3_^_</span></li></ul>\n": "<ul><li><span translate=no>_^_0_^_</span>\u5f62\u72b6\u4ed8\u304d\u306e\u5165\u529b\u30d5\u30a3\u30fc\u30c1\u30e3\u30de\u30c3\u30d7\u3067\u3059 <span translate=no>_^_1_^_</span></li>\n<li><span translate=no>_^_2_^_</span>\u5f62\u72b6\u306e\u30bf\u30a4\u30e0\u30b9\u30c6\u30c3\u30d7\u57cb\u3081\u8fbc\u307f\u3067\u3059 <span translate=no>_^_3_^_</span></li></ul>\n",
 "<ul><li><span translate=no>_^_0_^_</span> is the input feature map with shape <span translate=no>_^_1_^_</span></li></ul>\n": "<ul><li><span translate=no>_^_0_^_</span>\u5f62\u72b6\u4ed8\u304d\u306e\u5165\u529b\u30d5\u30a3\u30fc\u30c1\u30e3\u30de\u30c3\u30d7\u3067\u3059 <span translate=no>_^_1_^_</span></li></ul>\n",
-"<ul><li><span translate=no>_^_0_^_</span> is the number of channels in the input feature map </li>\n<li><span translate=no>_^_1_^_</span> is the number of channels in the output feature map </li>\n<li><span translate=no>_^_2_^_</span> is the base channel count for the model </li>\n<li><span translate=no>_^_3_^_</span> number of residual blocks at each level </li>\n<li><span translate=no>_^_4_^_</span> are the levels at which attention should be performed </li>\n<li><span translate=no>_^_5_^_</span> are the multiplicative factors for number of channels for each level </li>\n<li><span translate=no>_^_6_^_</span> the number of attention heads in the transformers</li></ul>\n": "<ul><li><span translate=no>_^_0_^_</span>\u306f\u5165\u529b\u30d5\u30a3\u30fc\u30c1\u30e3\u30de\u30c3\u30d7\u306e\u30c1\u30e3\u30cd\u30eb\u6570\u3067\u3059</li>\n<li><span translate=no>_^_1_^_</span>\u306f\u51fa\u529b\u30d5\u30a3\u30fc\u30c1\u30e3\u30de\u30c3\u30d7\u306e\u30c1\u30e3\u30cd\u30eb\u6570\u3067\u3059</li>\n<li><span translate=no>_^_2_^_</span>\u30e2\u30c7\u30eb\u306e\u30d9\u30fc\u30b9\u30c1\u30e3\u30f3\u30cd\u30eb\u6570\u3067\u3059</li>\n<li><span translate=no>_^_3_^_</span>\u5404\u30ec\u30d9\u30eb\u306e\u6b8b\u7559\u30d6\u30ed\u30c3\u30af\u6570</li>\n<li><span translate=no>_^_4_^_</span>\u6ce8\u610f\u3092\u5411\u3051\u308b\u3079\u304d\u30ec\u30d9\u30eb\u306f</li>\n<li><span translate=no>_^_5_^_</span>\u306f\u5404\u30ec\u30d9\u30eb\u306e\u30c1\u30e3\u30f3\u30cd\u30eb\u6570\u306e\u4e57\u7b97\u4fc2\u6570\u3067\u3059</li>\n<li><span translate=no>_^_6_^_</span>\u30c8\u30e9\u30f3\u30b9\u30d5\u30a9\u30fc\u30de\u30fc\u306e\u30a2\u30c6\u30f3\u30b7\u30e7\u30f3\u30d8\u30c3\u30c9\u306e\u6570</li></ul>\n",
+"<ul><li><span translate=no>_^_0_^_</span> is the number of channels in the input feature map </li>\n<li><span translate=no>_^_1_^_</span> is the number of channels in the output feature map </li>\n<li><span translate=no>_^_2_^_</span> is the base channel count for the model </li>\n<li><span translate=no>_^_3_^_</span> number of residual blocks at each level </li>\n<li><span translate=no>_^_4_^_</span> are the levels at which attention should be performed </li>\n<li><span translate=no>_^_5_^_</span> are the multiplicative factors for number of channels for each level </li>\n<li><span translate=no>_^_6_^_</span> is the number of attention heads in the transformers </li>\n<li><span translate=no>_^_7_^_</span> is the number of transformer layers in the transformers </li>\n<li><span translate=no>_^_8_^_</span> is the size of the conditional embedding in the transformers</li></ul>\n": "<ul><li><span translate=no>_^_0_^_</span>\u306f\u3001\u5165\u529b\u30d5\u30a3\u30fc\u30c1\u30e3\u30de\u30c3\u30d7\u306e\u30c1\u30e3\u30cd\u30eb\u6570\u3067\u3059</li>\n<li><span translate=no>_^_1_^_</span>\u306f\u51fa\u529b\u30d5\u30a3\u30fc\u30c1\u30e3\u30de\u30c3\u30d7\u306e\u30c1\u30e3\u30cd\u30eb\u6570\u3067\u3059\u3002</li>\n<li><span translate=no>_^_2_^_</span>\u306f\u30e2\u30c7\u30eb\u306e\u30d9\u30fc\u30b9\u30c1\u30e3\u30f3\u30cd\u30eb\u6570</li>\n<li><span translate=no>_^_3_^_</span>\u5404\u30ec\u30d9\u30eb\u306e\u6b8b\u5dee\u30d6\u30ed\u30c3\u30af\u6570</li>\n<li><span translate=no>_^_4_^_</span>\u6ce8\u610f\u3059\u3079\u304d\u30ec\u30d9\u30eb\u306f\u3069\u308c\u3050\u3089\u3044\u306e\u30ec\u30d9\u30eb\u304b</li>\n<li><span translate=no>_^_5_^_</span>\u306f\u5404\u30ec\u30d9\u30eb\u306e\u30c1\u30e3\u30f3\u30cd\u30eb\u6570\u306e\u4e57\u6cd5\u4fc2\u6570</li>\n<li><span translate=no>_^_6_^_</span>\u306f\u5909\u5727\u5668\u5185\u306e\u30a2\u30c6\u30f3\u30b7\u30e7\u30f3\u30d8\u30c3\u30c9\u306e\u6570\u3067\u3059</li>\n<li><span translate=no>_^_7_^_</span>\u306f\u5909\u5727\u5668\u5185\u306e\u5909\u5727\u5668\u5c64\u306e\u6570\u3067\u3059\u3002</li>\n<li><span translate=no>_^_8_^_</span>\u306f\u30c8\u30e9\u30f3\u30b9\u30d5\u30a9\u30fc\u30de\u30fc\u5185\u306e\u6761\u4ef6\u4ed8\u304d\u57cb\u3081\u8fbc\u307f\u306e\u30b5\u30a4\u30ba\u3067\u3059</li></ul>\n",
 "<ul><li><span translate=no>_^_0_^_</span> is the number of channels</li></ul>\n": "<ul><li><span translate=no>_^_0_^_</span>\u306f\u30c1\u30e3\u30cd\u30eb\u6570</li></ul>\n",
 "<ul><li><span translate=no>_^_0_^_</span> the number of input channels </li>\n<li><span translate=no>_^_1_^_</span> the size of timestep embeddings </li>\n<li><span translate=no>_^_2_^_</span> is the number of out channels. defaults to `channels.</li></ul>\n": "<ul><li><span translate=no>_^_0_^_</span>\u5165\u529b\u30c1\u30e3\u30f3\u30cd\u30eb\u6570</li>\n<li><span translate=no>_^_1_^_</span>\u30bf\u30a4\u30e0\u30b9\u30c6\u30c3\u30d7\u57cb\u3081\u8fbc\u307f\u306e\u30b5\u30a4\u30ba</li>\n</ul><li><span translate=no>_^_2_^_</span>\u306f\u51fa\u529b\u30c1\u30e3\u30f3\u30cd\u30eb\u306e\u6570\u3067\u3059\u3002\u30c7\u30d5\u30a9\u30eb\u30c8\u306f `channels\u3067\u3059\u3002</li>\n",
 "Annotated PyTorch implementation/tutorial of the U-Net in stable diffusion.": "\u5b89\u5b9a\u7248\u62e1\u6563\u306b\u304a\u3051\u308bU-Net\u306e\u6ce8\u91c8\u4ed8\u304dPyTorch\u5b9f\u88c5/\u30c1\u30e5\u30fc\u30c8\u30ea\u30a2\u30eb\u3002",
