
Commit e8a5feb

Commit message: docs
1 parent: 689842a

10 files changed

Lines changed: 512 additions & 500 deletions


docs/diffusion/stable_diffusion/model/unet.html

Lines changed: 141 additions & 137 deletions
Large diffs are not rendered by default.

docs/ja/diffusion/stable_diffusion/model/unet.html

Lines changed: 151 additions & 147 deletions
Large diffs are not rendered by default.

docs/ja/transformers/rope/index.html

Lines changed: 21 additions & 21 deletions
Large diffs are not rendered by default.

docs/neox/utils/llm_int8.html

Lines changed: 2 additions & 2 deletions
@@ -74,7 +74,7 @@
 <h1>LLM.int8() on GPT-NeoX</h1>
 <p>This implements a utility function to transform a <code class="highlight"><span></span><span class="n">nn</span><span class="o">.</span><span class="n">Linear</span></code>
 layer to an LLM.int8() linear layer.</p>
-<p><a href="https://papers.labml.ai/paper/eb2bcaee1d0011edaa66a71c10a887e7">LLM.int8() paper</a> shows you can use int8 quantization while handling outliers to reduce memory footprint without performance degradation in large language models. They convert weights and inputs to scaled 8-bit integers and does matrix multiplication producing int32 results which is then converted back to float16 and rescaled. They show that in large language models, some features can give extreme values (outliers) that dominate the model&#x27;s output. These features get clamped in 8-bit integer space which causes the model performance to degrade. As a solution they pick these outliers (greater than a specified threshold) and compute their multiplications separately in float16 space. Since the percentage of outliers is around 0.01% this doesn&#x27;t increase memory usage, and prevents the model from degrading performance.</p>
+<p><a href="https://papers.labml.ai/paper/eb2bcaee1d0011edaa66a71c10a887e7">The LLM.int8() paper</a> shows you can use int8 quantization while handling outliers to reduce the memory footprint without performance degradation in large language models. They convert weights and inputs to scaled 8-bit integers and do matrix multiplication, producing int32 results which are then converted back to float16 and rescaled. They show that in large language models, some features can take extreme values (outliers) that dominate the model&#x27;s output. These features get clamped in 8-bit integer space, which degrades model performance. As a solution they pick these outliers (greater than a specified threshold) and compute their multiplications separately in float16 space. Since the percentage of outliers is around 0.01%, this doesn&#x27;t increase memory usage and prevents the performance degradation.</p>
 <p>The code to transform GPT-NeoX layers is defined in <a href="../model.html#post_load_prepare">model.py</a>.</p>
 <p>Here are example uses of GPT-NeoX with int8 quantization.</p>
 <ul><li><a href="../samples/llm_int8.html">Generate Text</a> </li>
@@ -240,4 +240,4 @@ <h2>Transform a <code class="highlight"><span></span><span class="n">nn</span><
 handleImages()
 </script>
 </body>
-</html>
+</html>
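
The paragraph changed in this hunk describes the LLM.int8() scheme in prose. For orientation only, here is a minimal PyTorch sketch of that idea; it is not the repository's implementation (that sits behind the model.py link above). The function name, the default outlier threshold of 6.0, and the use of float32 in place of the paper's float16 are assumptions made for the sketch:

```python
import torch

@torch.no_grad()
def int8_matmul_with_outliers(x: torch.Tensor, w: torch.Tensor,
                              threshold: float = 6.0) -> torch.Tensor:
    """x: [n, d] inputs, w: [d, m] weights, both floating point."""
    # Feature columns whose magnitude exceeds the threshold anywhere in
    # the batch are treated as outliers and kept in floating point.
    outliers = (x.abs() > threshold).any(dim=0)                      # [d]
    x_out, w_out = x[:, outliers], w[outliers, :]

    # The remaining features are absmax-scaled to 8-bit integers:
    # per row of the input, per column of the weight.
    x_in, w_in = x[:, ~outliers], w[~outliers, :]
    x_scale = x_in.abs().amax(dim=1, keepdim=True).clamp(min=1e-8) / 127.0
    w_scale = w_in.abs().amax(dim=0, keepdim=True).clamp(min=1e-8) / 127.0
    x_q = torch.round(x_in / x_scale).to(torch.int8)
    w_q = torch.round(w_in / w_scale).to(torch.int8)

    # int8 x int8 matmul accumulated in int32, then rescaled back to
    # floating point. (Real implementations use fused int8 kernels;
    # plain `@` on int32 tensors runs on CPU only.)
    y = (x_q.to(torch.int32) @ w_q.to(torch.int32)).float() * x_scale * w_scale

    # The outlier slice is multiplied in full precision and added back.
    return y + x_out @ w_out

# e.g. y = int8_matmul_with_outliers(torch.randn(4, 64), torch.randn(64, 32))
```

Since the outlier columns are a tiny fraction of the features, the extra full-precision matmul adds almost nothing to memory use, which is the paper's point.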

docs/sitemap.xml

Lines changed: 3 additions & 3 deletions
@@ -484,7 +484,7 @@
 
 <url>
 <loc>https://nn.labml.ai/index.html</loc>
-<lastmod>2023-04-02T16:30:00+00:00</lastmod>
+<lastmod>2023-06-30T16:30:00+00:00</lastmod>
 <priority>1.00</priority>
 </url>
 
@@ -610,7 +610,7 @@
 
 <url>
 <loc>https://nn.labml.ai/diffusion/stable_diffusion/model/unet.html</loc>
-<lastmod>2023-01-19T16:30:00+00:00</lastmod>
+<lastmod>2023-06-30T16:30:00+00:00</lastmod>
 <priority>1.00</priority>
 </url>
 
@@ -932,7 +932,7 @@
 
 <url>
 <loc>https://nn.labml.ai/transformers/rope/index.html</loc>
-<lastmod>2023-04-02T16:30:00+00:00</lastmod>
+<lastmod>2023-06-28T16:30:00+00:00</lastmod>
 <priority>1.00</priority>
 </url>
 

docs/transformers/rope/index.html

Lines changed: 21 additions & 21 deletions
Large diffs are not rendered by default.

docs/zh/diffusion/stable_diffusion/model/unet.html

Lines changed: 148 additions & 144 deletions
Large diffs are not rendered by default.

docs/zh/transformers/rope/index.html

Lines changed: 21 additions & 21 deletions
Large diffs are not rendered by default.

translate_cache/diffusion/stable_diffusion/model/unet.ja.json

Lines changed: 2 additions & 2 deletions
@@ -6,7 +6,7 @@
 "<h2>U-Net model</h2>\n": "<h2>U-\u30cd\u30c3\u30c8\u30e2\u30c7\u30eb</h2>\n",
 "<h3>Group normalization with float32 casting</h3>\n": "<h3>float32 \u30ad\u30e3\u30b9\u30c6\u30a3\u30f3\u30b0\u306b\u3088\u308b\u30b0\u30eb\u30fc\u30d7\u6b63\u898f\u5316</h3>\n",
 "<h3>Group normalization</h3>\n<p>This is a helper function, with fixed number of groups..</p>\n": "<h3>\u30b0\u30eb\u30fc\u30d7\u6b63\u898f\u5316</h3>\n<p>\u3053\u308c\u306f\u30b0\u30eb\u30fc\u30d7\u6570\u304c\u56fa\u5b9a\u3055\u308c\u305f\u30d8\u30eb\u30d1\u30fc\u95a2\u6570\u3067\u3059\u3002</p>\n",
-"<h3>Sequential block for modules with different inputs</h3>\n<p>This sequential module can compose of different modules suck as <span translate=no>_^_0_^_</span>, <span translate=no>_^_1_^_</span> and <span translate=no>_^_2_^_</span> and calls them with the matching signatures</p>\n": "<h3>\u5165\u529b\u304c\u7570\u306a\u308b\u30e2\u30b8\u30e5\u30fc\u30eb\u7528\u306e\u30b7\u30fc\u30b1\u30f3\u30b7\u30e3\u30eb\u30d6\u30ed\u30c3\u30af</h3>\n<p>\u3053\u306e\u30b7\u30fc\u30b1\u30f3\u30b7\u30e3\u30eb\u30e2\u30b8\u30e5\u30fc\u30eb\u306f\u3001\u8907\u6570\u306e\u30e2\u30b8\u30e5\u30fc\u30eb\u304b\u3089\u69cb\u6210\u3067\u304d<span translate=no>_^_0_^_</span>\u3001<span translate=no>_^_1_^_</span>\u305d\u308c\u305e\u308c\u3092\u4e00\u81f4\u3059\u308b\u30b7\u30b0\u30cd\u30c1\u30e3\u3067\u547c\u3073\u51fa\u3059\u3053\u3068\u304c\u3067\u304d\u307e\u3059\u3002<span translate=no>_^_2_^_</span></p>\n",
+"<h3>Sequential block for modules with different inputs</h3>\n<p>This sequential module can compose of different modules such as <span translate=no>_^_0_^_</span>, <span translate=no>_^_1_^_</span> and <span translate=no>_^_2_^_</span> and calls them with the matching signatures</p>\n": "<h3>\u5165\u529b\u306e\u7570\u306a\u308b\u30e2\u30b8\u30e5\u30fc\u30eb\u7528\u306e\u30b7\u30fc\u30b1\u30f3\u30b7\u30e3\u30eb\u30d6\u30ed\u30c3\u30af</h3>\n<p>\u3053\u306e\u30b7\u30fc\u30b1\u30f3\u30b7\u30e3\u30eb\u30e2\u30b8\u30e5\u30fc\u30eb\u306f\u3001\u3001<span translate=no>_^_1_^_</span><span translate=no>_^_2_^_</span>\u306a\u3069\u306e\u3055\u307e\u3056\u307e\u306a\u30e2\u30b8\u30e5\u30fc\u30eb\u3067\u69cb\u6210\u3067\u304d<span translate=no>_^_0_^_</span>\u3001\u305d\u308c\u3089\u3092\u5bfe\u5fdc\u3059\u308b\u30b7\u30b0\u30cd\u30c1\u30e3\u3067\u547c\u3073\u51fa\u3059\u3053\u3068\u304c\u3067\u304d\u307e\u3059\u3002</p>\n",
 "<h3>Up-sampling layer</h3>\n": "<h3>\u30a2\u30c3\u30d7\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0\u30ec\u30a4\u30e4\u30fc</h3>\n",
 "<p> </p>\n": "<p></p>\n",
 "<p> Test sinusoidal time step embeddings</p>\n": "<p>\u6b63\u5f26\u6ce2\u30bf\u30a4\u30e0\u30b9\u30c6\u30c3\u30d7\u57cb\u3081\u8fbc\u307f\u306e\u30c6\u30b9\u30c8</p>\n",
@@ -52,7 +52,7 @@
 "<ul><li><span translate=no>_^_0_^_</span> is the input feature map of shape <span translate=no>_^_1_^_</span> </li>\n<li><span translate=no>_^_2_^_</span> are the time steps of shape <span translate=no>_^_3_^_</span> </li>\n<li><span translate=no>_^_4_^_</span> conditioning of shape <span translate=no>_^_5_^_</span></li></ul>\n": "<ul><li><span translate=no>_^_0_^_</span>\u306f\u30b7\u30a7\u30a4\u30d7\u306e\u5165\u529b\u30d5\u30a3\u30fc\u30c1\u30e3\u30de\u30c3\u30d7\u3067\u3059 <span translate=no>_^_1_^_</span></li>\n<li><span translate=no>_^_2_^_</span>\u5f62\u72b6\u306e\u30bf\u30a4\u30e0\u30b9\u30c6\u30c3\u30d7\u3067\u3059 <span translate=no>_^_3_^_</span></li>\n<li><span translate=no>_^_4_^_</span>\u5f62\u72b6\u306e\u30b3\u30f3\u30c7\u30a3\u30b7\u30e7\u30cb\u30f3\u30b0 <span translate=no>_^_5_^_</span></li></ul>\n",
 "<ul><li><span translate=no>_^_0_^_</span> is the input feature map with shape <span translate=no>_^_1_^_</span> </li>\n<li><span translate=no>_^_2_^_</span> is the time step embeddings of shape <span translate=no>_^_3_^_</span></li></ul>\n": "<ul><li><span translate=no>_^_0_^_</span>\u5f62\u72b6\u4ed8\u304d\u306e\u5165\u529b\u30d5\u30a3\u30fc\u30c1\u30e3\u30de\u30c3\u30d7\u3067\u3059 <span translate=no>_^_1_^_</span></li>\n<li><span translate=no>_^_2_^_</span>\u5f62\u72b6\u306e\u30bf\u30a4\u30e0\u30b9\u30c6\u30c3\u30d7\u57cb\u3081\u8fbc\u307f\u3067\u3059 <span translate=no>_^_3_^_</span></li></ul>\n",
 "<ul><li><span translate=no>_^_0_^_</span> is the input feature map with shape <span translate=no>_^_1_^_</span></li></ul>\n": "<ul><li><span translate=no>_^_0_^_</span>\u5f62\u72b6\u4ed8\u304d\u306e\u5165\u529b\u30d5\u30a3\u30fc\u30c1\u30e3\u30de\u30c3\u30d7\u3067\u3059 <span translate=no>_^_1_^_</span></li></ul>\n",
-"<ul><li><span translate=no>_^_0_^_</span> is the number of channels in the input feature map </li>\n<li><span translate=no>_^_1_^_</span> is the number of channels in the output feature map </li>\n<li><span translate=no>_^_2_^_</span> is the base channel count for the model </li>\n<li><span translate=no>_^_3_^_</span> number of residual blocks at each level </li>\n<li><span translate=no>_^_4_^_</span> are the levels at which attention should be performed </li>\n<li><span translate=no>_^_5_^_</span> are the multiplicative factors for number of channels for each level </li>\n<li><span translate=no>_^_6_^_</span> the number of attention heads in the transformers</li></ul>\n": "<ul><li><span translate=no>_^_0_^_</span>\u306f\u5165\u529b\u30d5\u30a3\u30fc\u30c1\u30e3\u30de\u30c3\u30d7\u306e\u30c1\u30e3\u30cd\u30eb\u6570\u3067\u3059</li>\n<li><span translate=no>_^_1_^_</span>\u306f\u51fa\u529b\u30d5\u30a3\u30fc\u30c1\u30e3\u30de\u30c3\u30d7\u306e\u30c1\u30e3\u30cd\u30eb\u6570\u3067\u3059</li>\n<li><span translate=no>_^_2_^_</span>\u30e2\u30c7\u30eb\u306e\u30d9\u30fc\u30b9\u30c1\u30e3\u30f3\u30cd\u30eb\u6570\u3067\u3059</li>\n<li><span translate=no>_^_3_^_</span>\u5404\u30ec\u30d9\u30eb\u306e\u6b8b\u7559\u30d6\u30ed\u30c3\u30af\u6570</li>\n<li><span translate=no>_^_4_^_</span>\u6ce8\u610f\u3092\u5411\u3051\u308b\u3079\u304d\u30ec\u30d9\u30eb\u306f</li>\n<li><span translate=no>_^_5_^_</span>\u306f\u5404\u30ec\u30d9\u30eb\u306e\u30c1\u30e3\u30f3\u30cd\u30eb\u6570\u306e\u4e57\u7b97\u4fc2\u6570\u3067\u3059</li>\n<li><span translate=no>_^_6_^_</span>\u30c8\u30e9\u30f3\u30b9\u30d5\u30a9\u30fc\u30de\u30fc\u306e\u30a2\u30c6\u30f3\u30b7\u30e7\u30f3\u30d8\u30c3\u30c9\u306e\u6570</li></ul>\n",
+"<ul><li><span translate=no>_^_0_^_</span> is the number of channels in the input feature map </li>\n<li><span translate=no>_^_1_^_</span> is the number of channels in the output feature map </li>\n<li><span translate=no>_^_2_^_</span> is the base channel count for the model </li>\n<li><span translate=no>_^_3_^_</span> number of residual blocks at each level </li>\n<li><span translate=no>_^_4_^_</span> are the levels at which attention should be performed </li>\n<li><span translate=no>_^_5_^_</span> are the multiplicative factors for number of channels for each level </li>\n<li><span translate=no>_^_6_^_</span> is the number of attention heads in the transformers </li>\n<li><span translate=no>_^_7_^_</span> is the number of transformer layers in the transformers </li>\n<li><span translate=no>_^_8_^_</span> is the size of the conditional embedding in the transformers</li></ul>\n": "<ul><li><span translate=no>_^_0_^_</span>\u306f\u3001\u5165\u529b\u30d5\u30a3\u30fc\u30c1\u30e3\u30de\u30c3\u30d7\u306e\u30c1\u30e3\u30cd\u30eb\u6570\u3067\u3059</li>\n<li><span translate=no>_^_1_^_</span>\u306f\u51fa\u529b\u30d5\u30a3\u30fc\u30c1\u30e3\u30de\u30c3\u30d7\u306e\u30c1\u30e3\u30cd\u30eb\u6570\u3067\u3059\u3002</li>\n<li><span translate=no>_^_2_^_</span>\u306f\u30e2\u30c7\u30eb\u306e\u30d9\u30fc\u30b9\u30c1\u30e3\u30f3\u30cd\u30eb\u6570</li>\n<li><span translate=no>_^_3_^_</span>\u5404\u30ec\u30d9\u30eb\u306e\u6b8b\u5dee\u30d6\u30ed\u30c3\u30af\u6570</li>\n<li><span translate=no>_^_4_^_</span>\u6ce8\u610f\u3059\u3079\u304d\u30ec\u30d9\u30eb\u306f\u3069\u308c\u3050\u3089\u3044\u306e\u30ec\u30d9\u30eb\u304b</li>\n<li><span translate=no>_^_5_^_</span>\u306f\u5404\u30ec\u30d9\u30eb\u306e\u30c1\u30e3\u30f3\u30cd\u30eb\u6570\u306e\u4e57\u6cd5\u4fc2\u6570</li>\n<li><span translate=no>_^_6_^_</span>\u306f\u5909\u5727\u5668\u5185\u306e\u30a2\u30c6\u30f3\u30b7\u30e7\u30f3\u30d8\u30c3\u30c9\u306e\u6570\u3067\u3059</li>\n<li><span translate=no>_^_7_^_</span>\u306f\u5909\u5727\u5668\u5185\u306e\u5909\u5727\u5668\u5c64\u306e\u6570\u3067\u3059\u3002</li>\n<li><span translate=no>_^_8_^_</span>\u306f\u30c8\u30e9\u30f3\u30b9\u30d5\u30a9\u30fc\u30de\u30fc\u5185\u306e\u6761\u4ef6\u4ed8\u304d\u57cb\u3081\u8fbc\u307f\u306e\u30b5\u30a4\u30ba\u3067\u3059</li></ul>\n",
 "<ul><li><span translate=no>_^_0_^_</span> is the number of channels</li></ul>\n": "<ul><li><span translate=no>_^_0_^_</span>\u306f\u30c1\u30e3\u30cd\u30eb\u6570</li></ul>\n",
 "<ul><li><span translate=no>_^_0_^_</span> the number of input channels </li>\n<li><span translate=no>_^_1_^_</span> the size of timestep embeddings </li>\n<li><span translate=no>_^_2_^_</span> is the number of out channels. defaults to `channels.</li></ul>\n": "<ul><li><span translate=no>_^_0_^_</span>\u5165\u529b\u30c1\u30e3\u30f3\u30cd\u30eb\u6570</li>\n<li><span translate=no>_^_1_^_</span>\u30bf\u30a4\u30e0\u30b9\u30c6\u30c3\u30d7\u57cb\u3081\u8fbc\u307f\u306e\u30b5\u30a4\u30ba</li>\n</ul><li><span translate=no>_^_2_^_</span>\u306f\u51fa\u529b\u30c1\u30e3\u30f3\u30cd\u30eb\u306e\u6570\u3067\u3059\u3002\u30c7\u30d5\u30a9\u30eb\u30c8\u306f `channels\u3067\u3059\u3002</li>\n",
 "Annotated PyTorch implementation/tutorial of the U-Net in stable diffusion.": "\u5b89\u5b9a\u7248\u62e1\u6563\u306b\u304a\u3051\u308bU-Net\u306e\u6ce8\u91c8\u4ed8\u304dPyTorch\u5b9f\u88c5/\u30c1\u30e5\u30fc\u30c8\u30ea\u30a2\u30eb\u3002",
