|
6 | 6 | "<h2>U-Net model</h2>\n": "<h2>U-\u30cd\u30c3\u30c8\u30e2\u30c7\u30eb</h2>\n", |
7 | 7 | "<h3>Group normalization with float32 casting</h3>\n": "<h3>float32 \u30ad\u30e3\u30b9\u30c6\u30a3\u30f3\u30b0\u306b\u3088\u308b\u30b0\u30eb\u30fc\u30d7\u6b63\u898f\u5316</h3>\n", |
8 | 8 | "<h3>Group normalization</h3>\n<p>This is a helper function, with fixed number of groups..</p>\n": "<h3>\u30b0\u30eb\u30fc\u30d7\u6b63\u898f\u5316</h3>\n<p>\u3053\u308c\u306f\u30b0\u30eb\u30fc\u30d7\u6570\u304c\u56fa\u5b9a\u3055\u308c\u305f\u30d8\u30eb\u30d1\u30fc\u95a2\u6570\u3067\u3059\u3002</p>\n", |
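The two strings above document a group-normalization helper with a fixed number of groups and a float32-casting variant. As a hedged NumPy sketch (not the repository's actual PyTorch code, which likely wraps `nn.GroupNorm`), the combined behavior might look like this, assuming 32 groups as the fixed default:

```python
import numpy as np

def normalization(x, groups=32, eps=1e-5):
    """Sketch of group normalization: split channels into `groups`
    and normalize each group over (channels-in-group, height, width).
    Casting to float32 first mirrors the float32-casting variant,
    which keeps the statistics numerically stable for float16 inputs."""
    n, c, h, w = x.shape
    orig_dtype = x.dtype
    x32 = x.astype(np.float32).reshape(n, groups, c // groups, h, w)
    mean = x32.mean(axis=(2, 3, 4), keepdims=True)
    var = x32.var(axis=(2, 3, 4), keepdims=True)
    out = (x32 - mean) / np.sqrt(var + eps)
    return out.reshape(n, c, h, w).astype(orig_dtype)
```

The learned scale and shift parameters of a real `GroupNorm` layer are omitted here for brevity.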
9 | | - "<h3>Sequential block for modules with different inputs</h3>\n<p>This sequential module can compose of different modules suck as <span translate=no>_^_0_^_</span>, <span translate=no>_^_1_^_</span> and <span translate=no>_^_2_^_</span> and calls them with the matching signatures</p>\n": "<h3>\u5165\u529b\u304c\u7570\u306a\u308b\u30e2\u30b8\u30e5\u30fc\u30eb\u7528\u306e\u30b7\u30fc\u30b1\u30f3\u30b7\u30e3\u30eb\u30d6\u30ed\u30c3\u30af</h3>\n<p>\u3053\u306e\u30b7\u30fc\u30b1\u30f3\u30b7\u30e3\u30eb\u30e2\u30b8\u30e5\u30fc\u30eb\u306f\u3001\u8907\u6570\u306e\u30e2\u30b8\u30e5\u30fc\u30eb\u304b\u3089\u69cb\u6210\u3067\u304d<span translate=no>_^_0_^_</span>\u3001<span translate=no>_^_1_^_</span>\u305d\u308c\u305e\u308c\u3092\u4e00\u81f4\u3059\u308b\u30b7\u30b0\u30cd\u30c1\u30e3\u3067\u547c\u3073\u51fa\u3059\u3053\u3068\u304c\u3067\u304d\u307e\u3059\u3002<span translate=no>_^_2_^_</span></p>\n", |
| 9 | + "<h3>Sequential block for modules with different inputs</h3>\n<p>This sequential module can compose of different modules such as <span translate=no>_^_0_^_</span>, <span translate=no>_^_1_^_</span> and <span translate=no>_^_2_^_</span> and calls them with the matching signatures</p>\n": "<h3>\u5165\u529b\u306e\u7570\u306a\u308b\u30e2\u30b8\u30e5\u30fc\u30eb\u7528\u306e\u30b7\u30fc\u30b1\u30f3\u30b7\u30e3\u30eb\u30d6\u30ed\u30c3\u30af</h3>\n<p>\u3053\u306e\u30b7\u30fc\u30b1\u30f3\u30b7\u30e3\u30eb\u30e2\u30b8\u30e5\u30fc\u30eb\u306f\u3001<span translate=no>_^_0_^_</span>\u3001<span translate=no>_^_1_^_</span>\u3001<span translate=no>_^_2_^_</span>\u306a\u3069\u306e\u3055\u307e\u3056\u307e\u306a\u30e2\u30b8\u30e5\u30fc\u30eb\u3067\u69cb\u6210\u3067\u304d\u3001\u305d\u308c\u3089\u3092\u5bfe\u5fdc\u3059\u308b\u30b7\u30b0\u30cd\u30c1\u30e3\u3067\u547c\u3073\u51fa\u3059\u3053\u3068\u304c\u3067\u304d\u307e\u3059\u3002</p>\n", |
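The string above describes a sequential block whose sub-modules take different inputs (e.g. residual blocks take time embeddings, spatial transformers take conditioning, convolutions take only the feature map). The real implementation most likely dispatches by `isinstance` checks on known module classes; this hedged sketch uses signature inspection instead, and the class and argument names are illustrative, not the repository's:

```python
import inspect

class SequentialWithArgs:
    """Sketch of a sequential block for modules with different inputs:
    each module is called with the subset of (x, t_emb, cond)
    that its call signature actually accepts."""
    def __init__(self, *modules):
        self.modules = modules

    def __call__(self, x, t_emb=None, cond=None):
        for m in self.modules:
            params = inspect.signature(m).parameters
            kwargs = {}
            if 't_emb' in params:
                kwargs['t_emb'] = t_emb   # e.g. a residual block
            if 'cond' in params:
                kwargs['cond'] = cond     # e.g. a spatial transformer
            x = m(x, **kwargs)            # plain layers get only x
        return x
```

For example, composing a residual-block-like callable, an attention-like callable, and a plain layer lets one `__call__` thread the right arguments to each.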
10 | 10 | "<h3>Up-sampling layer</h3>\n": "<h3>\u30a2\u30c3\u30d7\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0\u30ec\u30a4\u30e4\u30fc</h3>\n", |
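The up-sampling layer in such U-Nets typically doubles the spatial resolution with nearest-neighbor interpolation and then applies a 3x3 convolution. A minimal NumPy sketch of the interpolation step (the convolution is omitted), assuming the usual `[batch, channels, height, width]` layout:

```python
import numpy as np

def upsample_nearest_2x(x):
    """Nearest-neighbor 2x up-sampling: each pixel is repeated
    twice along both spatial axes of a [N, C, H, W] array."""
    return x.repeat(2, axis=2).repeat(2, axis=3)
```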
11 | 11 | "<p> </p>\n": "<p></p>\n", |
12 | 12 | "<p> Test sinusoidal time step embeddings</p>\n": "<p>\u6b63\u5f26\u6ce2\u30bf\u30a4\u30e0\u30b9\u30c6\u30c3\u30d7\u57cb\u3081\u8fbc\u307f\u306e\u30c6\u30b9\u30c8</p>\n", |
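The test string above refers to sinusoidal time-step embeddings, the same family as transformer positional encodings. A hedged NumPy sketch of what is being tested; `max_period=10000` is the conventional constant and the cos/sin ordering is an assumption, not necessarily the repository's exact choice:

```python
import numpy as np

def time_step_embedding(t, d_model, max_period=10_000):
    """Sinusoidal time-step embeddings: `t` is a 1-D array of time
    steps, output has shape [len(t), d_model]. Half the dimensions
    hold cosines, half sines, at geometrically spaced frequencies."""
    half = d_model // 2
    freqs = np.exp(-np.log(max_period) * np.arange(half) / half)
    args = t[:, None].astype(np.float32) * freqs[None, :]
    return np.concatenate([np.cos(args), np.sin(args)], axis=-1)
```

A quick sanity check mirrors the "test sinusoidal time step embeddings" idea: at `t = 0` every cosine dimension is 1 and every sine dimension is 0.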
|
52 | 52 | "<ul><li><span translate=no>_^_0_^_</span> is the input feature map of shape <span translate=no>_^_1_^_</span> </li>\n<li><span translate=no>_^_2_^_</span> are the time steps of shape <span translate=no>_^_3_^_</span> </li>\n<li><span translate=no>_^_4_^_</span> conditioning of shape <span translate=no>_^_5_^_</span></li></ul>\n": "<ul><li><span translate=no>_^_0_^_</span>\u306f\u30b7\u30a7\u30a4\u30d7\u306e\u5165\u529b\u30d5\u30a3\u30fc\u30c1\u30e3\u30de\u30c3\u30d7\u3067\u3059 <span translate=no>_^_1_^_</span></li>\n<li><span translate=no>_^_2_^_</span>\u5f62\u72b6\u306e\u30bf\u30a4\u30e0\u30b9\u30c6\u30c3\u30d7\u3067\u3059 <span translate=no>_^_3_^_</span></li>\n<li><span translate=no>_^_4_^_</span>\u5f62\u72b6\u306e\u30b3\u30f3\u30c7\u30a3\u30b7\u30e7\u30cb\u30f3\u30b0 <span translate=no>_^_5_^_</span></li></ul>\n", |
53 | 53 | "<ul><li><span translate=no>_^_0_^_</span> is the input feature map with shape <span translate=no>_^_1_^_</span> </li>\n<li><span translate=no>_^_2_^_</span> is the time step embeddings of shape <span translate=no>_^_3_^_</span></li></ul>\n": "<ul><li><span translate=no>_^_0_^_</span>\u5f62\u72b6\u4ed8\u304d\u306e\u5165\u529b\u30d5\u30a3\u30fc\u30c1\u30e3\u30de\u30c3\u30d7\u3067\u3059 <span translate=no>_^_1_^_</span></li>\n<li><span translate=no>_^_2_^_</span>\u5f62\u72b6\u306e\u30bf\u30a4\u30e0\u30b9\u30c6\u30c3\u30d7\u57cb\u3081\u8fbc\u307f\u3067\u3059 <span translate=no>_^_3_^_</span></li></ul>\n", |
54 | 54 | "<ul><li><span translate=no>_^_0_^_</span> is the input feature map with shape <span translate=no>_^_1_^_</span></li></ul>\n": "<ul><li><span translate=no>_^_0_^_</span>\u5f62\u72b6\u4ed8\u304d\u306e\u5165\u529b\u30d5\u30a3\u30fc\u30c1\u30e3\u30de\u30c3\u30d7\u3067\u3059 <span translate=no>_^_1_^_</span></li></ul>\n", |
55 | | - "<ul><li><span translate=no>_^_0_^_</span> is the number of channels in the input feature map </li>\n<li><span translate=no>_^_1_^_</span> is the number of channels in the output feature map </li>\n<li><span translate=no>_^_2_^_</span> is the base channel count for the model </li>\n<li><span translate=no>_^_3_^_</span> number of residual blocks at each level </li>\n<li><span translate=no>_^_4_^_</span> are the levels at which attention should be performed </li>\n<li><span translate=no>_^_5_^_</span> are the multiplicative factors for number of channels for each level </li>\n<li><span translate=no>_^_6_^_</span> the number of attention heads in the transformers</li></ul>\n": "<ul><li><span translate=no>_^_0_^_</span>\u306f\u5165\u529b\u30d5\u30a3\u30fc\u30c1\u30e3\u30de\u30c3\u30d7\u306e\u30c1\u30e3\u30cd\u30eb\u6570\u3067\u3059</li>\n<li><span translate=no>_^_1_^_</span>\u306f\u51fa\u529b\u30d5\u30a3\u30fc\u30c1\u30e3\u30de\u30c3\u30d7\u306e\u30c1\u30e3\u30cd\u30eb\u6570\u3067\u3059</li>\n<li><span translate=no>_^_2_^_</span>\u30e2\u30c7\u30eb\u306e\u30d9\u30fc\u30b9\u30c1\u30e3\u30f3\u30cd\u30eb\u6570\u3067\u3059</li>\n<li><span translate=no>_^_3_^_</span>\u5404\u30ec\u30d9\u30eb\u306e\u6b8b\u7559\u30d6\u30ed\u30c3\u30af\u6570</li>\n<li><span translate=no>_^_4_^_</span>\u6ce8\u610f\u3092\u5411\u3051\u308b\u3079\u304d\u30ec\u30d9\u30eb\u306f</li>\n<li><span translate=no>_^_5_^_</span>\u306f\u5404\u30ec\u30d9\u30eb\u306e\u30c1\u30e3\u30f3\u30cd\u30eb\u6570\u306e\u4e57\u7b97\u4fc2\u6570\u3067\u3059</li>\n<li><span translate=no>_^_6_^_</span>\u30c8\u30e9\u30f3\u30b9\u30d5\u30a9\u30fc\u30de\u30fc\u306e\u30a2\u30c6\u30f3\u30b7\u30e7\u30f3\u30d8\u30c3\u30c9\u306e\u6570</li></ul>\n", |
| 55 | + "<ul><li><span translate=no>_^_0_^_</span> is the number of channels in the input feature map </li>\n<li><span translate=no>_^_1_^_</span> is the number of channels in the output feature map </li>\n<li><span translate=no>_^_2_^_</span> is the base channel count for the model </li>\n<li><span translate=no>_^_3_^_</span> number of residual blocks at each level </li>\n<li><span translate=no>_^_4_^_</span> are the levels at which attention should be performed </li>\n<li><span translate=no>_^_5_^_</span> are the multiplicative factors for number of channels for each level </li>\n<li><span translate=no>_^_6_^_</span> is the number of attention heads in the transformers </li>\n<li><span translate=no>_^_7_^_</span> is the number of transformer layers in the transformers </li>\n<li><span translate=no>_^_8_^_</span> is the size of the conditional embedding in the transformers</li></ul>\n": "<ul><li><span translate=no>_^_0_^_</span>\u306f\u3001\u5165\u529b\u30d5\u30a3\u30fc\u30c1\u30e3\u30de\u30c3\u30d7\u306e\u30c1\u30e3\u30cd\u30eb\u6570\u3067\u3059</li>\n<li><span translate=no>_^_1_^_</span>\u306f\u51fa\u529b\u30d5\u30a3\u30fc\u30c1\u30e3\u30de\u30c3\u30d7\u306e\u30c1\u30e3\u30cd\u30eb\u6570\u3067\u3059</li>\n<li><span translate=no>_^_2_^_</span>\u306f\u30e2\u30c7\u30eb\u306e\u30d9\u30fc\u30b9\u30c1\u30e3\u30f3\u30cd\u30eb\u6570</li>\n<li><span translate=no>_^_3_^_</span>\u5404\u30ec\u30d9\u30eb\u306e\u6b8b\u5dee\u30d6\u30ed\u30c3\u30af\u6570</li>\n<li><span translate=no>_^_4_^_</span>\u306f\u30a2\u30c6\u30f3\u30b7\u30e7\u30f3\u3092\u9069\u7528\u3059\u308b\u30ec\u30d9\u30eb\u3067\u3059</li>\n<li><span translate=no>_^_5_^_</span>\u306f\u5404\u30ec\u30d9\u30eb\u306e\u30c1\u30e3\u30f3\u30cd\u30eb\u6570\u306e\u4e57\u6cd5\u4fc2\u6570</li>\n<li><span translate=no>_^_6_^_</span>\u306f\u30c8\u30e9\u30f3\u30b9\u30d5\u30a9\u30fc\u30de\u30fc\u5185\u306e\u30a2\u30c6\u30f3\u30b7\u30e7\u30f3\u30d8\u30c3\u30c9\u306e\u6570\u3067\u3059</li>\n<li><span translate=no>_^_7_^_</span>\u306f\u30c8\u30e9\u30f3\u30b9\u30d5\u30a9\u30fc\u30de\u30fc\u5185\u306e\u30c8\u30e9\u30f3\u30b9\u30d5\u30a9\u30fc\u30de\u30fc\u5c64\u306e\u6570\u3067\u3059</li>\n<li><span translate=no>_^_8_^_</span>\u306f\u30c8\u30e9\u30f3\u30b9\u30d5\u30a9\u30fc\u30de\u30fc\u5185\u306e\u6761\u4ef6\u4ed8\u304d\u57cb\u3081\u8fbc\u307f\u306e\u30b5\u30a4\u30ba\u3067\u3059</li></ul>\n", |
56 | 56 | "<ul><li><span translate=no>_^_0_^_</span> is the number of channels</li></ul>\n": "<ul><li><span translate=no>_^_0_^_</span>\u306f\u30c1\u30e3\u30cd\u30eb\u6570</li></ul>\n", |
57 | 57 | "<ul><li><span translate=no>_^_0_^_</span> the number of input channels </li>\n<li><span translate=no>_^_1_^_</span> the size of timestep embeddings </li>\n<li><span translate=no>_^_2_^_</span> is the number of out channels. defaults to `channels.</li></ul>\n": "<ul><li><span translate=no>_^_0_^_</span>\u5165\u529b\u30c1\u30e3\u30f3\u30cd\u30eb\u6570</li>\n<li><span translate=no>_^_1_^_</span>\u30bf\u30a4\u30e0\u30b9\u30c6\u30c3\u30d7\u57cb\u3081\u8fbc\u307f\u306e\u30b5\u30a4\u30ba</li>\n<li><span translate=no>_^_2_^_</span>\u306f\u51fa\u529b\u30c1\u30e3\u30f3\u30cd\u30eb\u306e\u6570\u3067\u3059\u3002\u30c7\u30d5\u30a9\u30eb\u30c8\u306f `channels\u3067\u3059\u3002</li></ul>\n", |
58 | 58 | "Annotated PyTorch implementation/tutorial of the U-Net in stable diffusion.": "\u5b89\u5b9a\u7248\u62e1\u6563\u306b\u304a\u3051\u308bU-Net\u306e\u6ce8\u91c8\u4ed8\u304dPyTorch\u5b9f\u88c5/\u30c1\u30e5\u30fc\u30c8\u30ea\u30a2\u30eb\u3002", |
|