Skip to content

Commit 510e8d2

Browse files
vertex-mg-botcopybara-github
authored andcommitted
Update link to Cloud Quotas page to correct location
PiperOrigin-RevId: 840478912
1 parent 4ab197a commit 510e8d2

60 files changed

Lines changed: 77 additions & 77 deletions

File tree

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

notebooks/community/model_garden/model_garden_advanced_features.ipynb

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -109,9 +109,9 @@
109109
"- Custom model serving TPU v5e cores per region\n",
110110
"- Custom model serving Nvidia A100 80GB GPUs per region\n",
111111
"\n",
112-
"By default, the quota for TPU deployment `Custom model serving TPU v5e cores per region` is 4, which is sufficient for serving the Llama 3.1 8B model. The Llama 3.1 70B model requires 16 TPU v5e cores. TPU quota is only available in `us-west1`. You can request for higher TPU quota following the instructions at [\"Request a higher quota\"](https://cloud.google.com/docs/quota/view-manage#requesting_higher_quota).\n",
112+
"By default, the quota for TPU deployment `Custom model serving TPU v5e cores per region` is 4, which is sufficient for serving the Llama 3.1 8B model. The Llama 3.1 70B model requires 16 TPU v5e cores. TPU quota is only available in `us-west1`. You can request for higher TPU quota following the instructions at [\"Request a quota adjustment\"](https://cloud.google.com/docs/quotas/view-manage#requesting_higher_quota).\n",
113113
"\n",
114-
"The quota for A100_80GB deployment `Custom model serving Nvidia A100 80GB GPUs per region` is 0. You need to request at least 4 for 70B model and 1 for 8B model following the instructions at [\"Request a higher quota\"](https://cloud.google.com/docs/quota/view-manage#requesting_higher_quota)."
114+
"The quota for A100_80GB deployment `Custom model serving Nvidia A100 80GB GPUs per region` is 0. You need to request at least 4 for 70B model and 1 for 8B model following the instructions at [\"Request a quota adjustment\"](https://cloud.google.com/docs/quotas/view-manage#requesting_higher_quota)."
115115
]
116116
},
117117
{
@@ -136,7 +136,7 @@
136136
"\n",
137137
"REGION = \"\" # @param {type:\"string\"}\n",
138138
"\n",
139-
"# @markdown 4. If you want to run predictions with A100 80GB or H100 GPUs, we recommend using the regions listed below. **NOTE:** Make sure you have associated quota in selected regions. Click the links to see your current quota for each GPU type: [Nvidia A100 80GB](https://console.cloud.google.com/iam-admin/quotas?metric=aiplatform.googleapis.com%2Fcustom_model_serving_nvidia_a100_80gb_gpus), [Nvidia H100 80GB](https://console.cloud.google.com/iam-admin/quotas?metric=aiplatform.googleapis.com%2Fcustom_model_serving_nvidia_h100_gpus). You can request for quota following the instructions at [\"Request a higher quota\"](https://cloud.google.com/docs/quota/view-manage#requesting_higher_quota).\n",
139+
"# @markdown 4. If you want to run predictions with A100 80GB or H100 GPUs, we recommend using the regions listed below. **NOTE:** Make sure you have associated quota in selected regions. Click the links to see your current quota for each GPU type: [Nvidia A100 80GB](https://console.cloud.google.com/iam-admin/quotas?metric=aiplatform.googleapis.com%2Fcustom_model_serving_nvidia_a100_80gb_gpus), [Nvidia H100 80GB](https://console.cloud.google.com/iam-admin/quotas?metric=aiplatform.googleapis.com%2Fcustom_model_serving_nvidia_h100_gpus). You can request for quota following the instructions at [\"Request a quota adjustment\"](https://cloud.google.com/docs/quotas/view-manage#requesting_higher_quota).\n",
140140
"\n",
141141
"# @markdown | Machine Type | Accelerator Type | Recommended Regions |\n",
142142
"# @markdown | ----------- | ----------- | ----------- |\n",

notebooks/community/model_garden/model_garden_codegemma_deployment_on_vertex.ipynb

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -95,7 +95,7 @@
9595
"\n",
9696
"# @markdown 1. [Make sure that billing is enabled for your project](https://cloud.google.com/billing/docs/how-to/modify-project).\n",
9797
"\n",
98-
"# @markdown 2. By default, the quota for TPU deployment `Custom model serving TPU v5e cores per region` is 4. TPU quota is only available in `us-west1`. You can request for higher TPU quota following the instructions at [\"Request a higher quota\"](https://cloud.google.com/docs/quota/view-manage#requesting_higher_quota).\n",
98+
"# @markdown 2. By default, the quota for TPU deployment `Custom model serving TPU v5e cores per region` is 4. TPU quota is only available in `us-west1`. You can request for higher TPU quota following the instructions at [\"Request a quota adjustment\"](https://cloud.google.com/docs/quotas/view-manage#requesting_higher_quota).\n",
9999
"\n",
100100
"# @markdown 3. **[Optional]** [Create a Cloud Storage bucket](https://cloud.google.com/storage/docs/creating-buckets) for storing experiment outputs. Set the BUCKET_URI for the experiment environment. The specified Cloud Storage bucket (`BUCKET_URI`) should be located in the same region as where the notebook was launched. Note that a multi-region bucket (eg. \"us\") is not considered a match for a single region covered by the multi-region range (eg. \"us-central1\"). If not set, a unique GCS bucket will be created instead.\n",
101101
"\n",

notebooks/community/model_garden/model_garden_deployment_tutorial.ipynb

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -97,7 +97,7 @@
9797
"\n",
9898
"REGION = \"\" # @param {type:\"string\"}\n",
9999
"\n",
100-
"# @markdown 3. If you want to run predictions with A100 80GB or H100 GPUs, we recommend using the regions listed below. **NOTE:** Make sure you have associated quota in selected regions. Click the links to see your current quota for each GPU type: [Nvidia A100 80GB](https://console.cloud.google.com/iam-admin/quotas?metric=aiplatform.googleapis.com%2Fcustom_model_serving_nvidia_a100_80gb_gpus), [Nvidia H100 80GB](https://console.cloud.google.com/iam-admin/quotas?metric=aiplatform.googleapis.com%2Fcustom_model_serving_nvidia_h100_gpus). You can request for quota following the instructions at [\"Request a higher quota\"](https://cloud.google.com/docs/quota/view-manage#requesting_higher_quota).\n",
100+
"# @markdown 3. If you want to run predictions with A100 80GB or H100 GPUs, we recommend using the regions listed below. **NOTE:** Make sure you have associated quota in selected regions. Click the links to see your current quota for each GPU type: [Nvidia A100 80GB](https://console.cloud.google.com/iam-admin/quotas?metric=aiplatform.googleapis.com%2Fcustom_model_serving_nvidia_a100_80gb_gpus), [Nvidia H100 80GB](https://console.cloud.google.com/iam-admin/quotas?metric=aiplatform.googleapis.com%2Fcustom_model_serving_nvidia_h100_gpus). You can request for quota following the instructions at [\"Request a quota adjustment\"](https://cloud.google.com/docs/quotas/view-manage#requesting_higher_quota).\n",
101101
"\n",
102102
"# @markdown | Machine Type | Accelerator Type | Recommended Regions |\n",
103103
"# @markdown | ----------- | ----------- | ----------- |\n",

notebooks/community/model_garden/model_garden_gemma2_deployment_on_vertex.ipynb

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -104,7 +104,7 @@
104104
"source": [
105105
"# @title Request for TPU quota\n",
106106
"\n",
107-
"# @markdown By default, the quota for TPU deployment `Custom model serving TPU v5e cores per region` is 4. TPU quota is only available in `us-west1`. You can request for higher TPU quota following the instructions at [\"Request a higher quota\"](https://cloud.google.com/docs/quota/view-manage#requesting_higher_quota)."
107+
"# @markdown By default, the quota for TPU deployment `Custom model serving TPU v5e cores per region` is 4. TPU quota is only available in `us-west1`. You can request for higher TPU quota following the instructions at [\"Request a quota adjustment\"](https://cloud.google.com/docs/quotas/view-manage#requesting_higher_quota)."
108108
]
109109
},
110110
{
@@ -128,7 +128,7 @@
128128
"\n",
129129
"REGION = \"\" # @param {type:\"string\"}\n",
130130
"\n",
131-
"# @markdown 4. If you want to run predictions with A100 80GB or H100 GPUs, we recommend using the regions listed below. **NOTE:** Make sure you have associated quota in selected regions. Click the links to see your current quota for each GPU type: [Nvidia A100 80GB](https://console.cloud.google.com/iam-admin/quotas?metric=aiplatform.googleapis.com%2Fcustom_model_serving_nvidia_a100_80gb_gpus), [Nvidia H100 80GB](https://console.cloud.google.com/iam-admin/quotas?metric=aiplatform.googleapis.com%2Fcustom_model_serving_nvidia_h100_gpus). You can request for quota following the instructions at [\"Request a higher quota\"](https://cloud.google.com/docs/quota/view-manage#requesting_higher_quota).\n",
131+
"# @markdown 4. If you want to run predictions with A100 80GB or H100 GPUs, we recommend using the regions listed below. **NOTE:** Make sure you have associated quota in selected regions. Click the links to see your current quota for each GPU type: [Nvidia A100 80GB](https://console.cloud.google.com/iam-admin/quotas?metric=aiplatform.googleapis.com%2Fcustom_model_serving_nvidia_a100_80gb_gpus), [Nvidia H100 80GB](https://console.cloud.google.com/iam-admin/quotas?metric=aiplatform.googleapis.com%2Fcustom_model_serving_nvidia_h100_gpus). You can request for quota following the instructions at [\"Request a quota adjustment\"](https://cloud.google.com/docs/quotas/view-manage#requesting_higher_quota).\n",
132132
"\n",
133133
"# @markdown | Machine Type | Accelerator Type | Recommended Regions |\n",
134134
"# @markdown | ----------- | ----------- | ----------- |\n",

notebooks/community/model_garden/model_garden_gemma_deployment_on_vertex.ipynb

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -96,7 +96,7 @@
9696
"\n",
9797
"# @markdown 1. [Make sure that billing is enabled for your project](https://cloud.google.com/billing/docs/how-to/modify-project).\n",
9898
"\n",
99-
"# @markdown 2. By default, the quota for TPU deployment `Custom model serving TPU v5e cores per region` is 4. TPU quota is only available in `us-west1`. You can request for higher TPU quota following the instructions at [\"Request a higher quota\"](https://cloud.google.com/docs/quota/view-manage#requesting_higher_quota).\n",
99+
"# @markdown 2. By default, the quota for TPU deployment `Custom model serving TPU v5e cores per region` is 4. TPU quota is only available in `us-west1`. You can request for higher TPU quota following the instructions at [\"Request a quota adjustment\"](https://cloud.google.com/docs/quotas/view-manage#requesting_higher_quota).\n",
100100
"\n",
101101
"# @markdown 3. **[Optional]** [Create a Cloud Storage bucket](https://cloud.google.com/storage/docs/creating-buckets) for storing experiment outputs. Set the BUCKET_URI for the experiment environment. The specified Cloud Storage bucket (`BUCKET_URI`) should be located in the same region as where the notebook was launched. Note that a multi-region bucket (eg. \"us\") is not considered a match for a single region covered by the multi-region range (eg. \"us-central1\"). If not set, a unique GCS bucket will be created instead.\n",
102102
"\n",

notebooks/community/model_garden/model_garden_hexllm_deep_dive_tutorial.ipynb

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -106,7 +106,7 @@
106106
"# @markdown\n",
107107
"# @markdown By default, the quota for TPU deployment `Custom model serving TPU v5e cores per region` is 4. This will be lifted to 16 in the future.\n",
108108
"# @markdown Verify that you have the appropriate TPU quota for your chosen configuration (e.g., 1, 4, 8, or 16 cores) in the selected region.\n",
109-
"# @markdown You can request for higher TPU quota following the instructions at [\"Request a higher quota\"](https://cloud.google.com/docs/quota/view-manage#requesting_higher_quota)."
109+
"# @markdown You can request for higher TPU quota following the instructions at [\"Request a quota adjustment\"](https://cloud.google.com/docs/quotas/view-manage#requesting_higher_quota)."
110110
]
111111
},
112112
{

notebooks/community/model_garden/model_garden_huggingface_pytorch_inference_deployment.ipynb

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -101,7 +101,7 @@
101101
"\n",
102102
"REGION = \"\" # @param {type:\"string\"}\n",
103103
"\n",
104-
"# @markdown 3. If you want to run predictions with A100 80GB or H100 GPUs, we recommend using the regions listed below. **NOTE:** Make sure you have associated quota in selected regions. Click the links to see your current quota for each GPU type: [Nvidia A100 80GB](https://console.cloud.google.com/iam-admin/quotas?metric=aiplatform.googleapis.com%2Fcustom_model_serving_nvidia_a100_80gb_gpus), [Nvidia H100 80GB](https://console.cloud.google.com/iam-admin/quotas?metric=aiplatform.googleapis.com%2Fcustom_model_serving_nvidia_h100_gpus). You can request for quota following the instructions at [\"Request a higher quota\"](https://cloud.google.com/docs/quota/view-manage#requesting_higher_quota).\n",
104+
"# @markdown 3. If you want to run predictions with A100 80GB or H100 GPUs, we recommend using the regions listed below. **NOTE:** Make sure you have associated quota in selected regions. Click the links to see your current quota for each GPU type: [Nvidia A100 80GB](https://console.cloud.google.com/iam-admin/quotas?metric=aiplatform.googleapis.com%2Fcustom_model_serving_nvidia_a100_80gb_gpus), [Nvidia H100 80GB](https://console.cloud.google.com/iam-admin/quotas?metric=aiplatform.googleapis.com%2Fcustom_model_serving_nvidia_h100_gpus). You can request for quota following the instructions at [\"Request a quota adjustment\"](https://cloud.google.com/docs/quotas/view-manage#requesting_higher_quota).\n",
105105
"\n",
106106
"# @markdown | Machine Type | Accelerator Type | Recommended Regions |\n",
107107
"# @markdown | ----------- | ----------- | ----------- |\n",

notebooks/community/model_garden/model_garden_huggingface_tei_deployment.ipynb

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -110,7 +110,7 @@
110110
"\n",
111111
"REGION = \"\" # @param {type:\"string\"}\n",
112112
"\n",
113-
"# @markdown 4. If you want to run predictions with A100 80GB or H100 GPUs, we recommend using the regions listed below. **NOTE:** Make sure you have associated quota in selected regions. Click the links to see your current quota for each GPU type: [Nvidia A100 80GB](https://console.cloud.google.com/iam-admin/quotas?metric=aiplatform.googleapis.com%2Fcustom_model_serving_nvidia_a100_80gb_gpus), [Nvidia H100 80GB](https://console.cloud.google.com/iam-admin/quotas?metric=aiplatform.googleapis.com%2Fcustom_model_serving_nvidia_h100_gpus). You can request for quota following the instructions at [\"Request a higher quota\"](https://cloud.google.com/docs/quota/view-manage#requesting_higher_quota).\n",
113+
"# @markdown 4. If you want to run predictions with A100 80GB or H100 GPUs, we recommend using the regions listed below. **NOTE:** Make sure you have associated quota in selected regions. Click the links to see your current quota for each GPU type: [Nvidia A100 80GB](https://console.cloud.google.com/iam-admin/quotas?metric=aiplatform.googleapis.com%2Fcustom_model_serving_nvidia_a100_80gb_gpus), [Nvidia H100 80GB](https://console.cloud.google.com/iam-admin/quotas?metric=aiplatform.googleapis.com%2Fcustom_model_serving_nvidia_h100_gpus). You can request for quota following the instructions at [\"Request a quota adjustment\"](https://cloud.google.com/docs/quotas/view-manage#requesting_higher_quota).\n",
114114
"\n",
115115
"# @markdown | Machine Type | Accelerator Type | Recommended Regions |\n",
116116
"# @markdown | ----------- | ----------- | ----------- |\n",

notebooks/community/model_garden/model_garden_huggingface_tgi_deployment.ipynb

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -102,7 +102,7 @@
102102
"\n",
103103
"REGION = \"\" # @param {type:\"string\"}\n",
104104
"\n",
105-
"# @markdown 3. If you want to run predictions with A100 80GB or H100 GPUs, we recommend using the regions listed below. **NOTE:** Make sure you have associated quota in selected regions. Click the links to see your current quota for each GPU type: [Nvidia A100 80GB](https://console.cloud.google.com/iam-admin/quotas?metric=aiplatform.googleapis.com%2Fcustom_model_serving_nvidia_a100_80gb_gpus), [Nvidia H100 80GB](https://console.cloud.google.com/iam-admin/quotas?metric=aiplatform.googleapis.com%2Fcustom_model_serving_nvidia_h100_gpus). You can request for quota following the instructions at [\"Request a higher quota\"](https://cloud.google.com/docs/quota/view-manage#requesting_higher_quota).\n",
105+
"# @markdown 3. If you want to run predictions with A100 80GB or H100 GPUs, we recommend using the regions listed below. **NOTE:** Make sure you have associated quota in selected regions. Click the links to see your current quota for each GPU type: [Nvidia A100 80GB](https://console.cloud.google.com/iam-admin/quotas?metric=aiplatform.googleapis.com%2Fcustom_model_serving_nvidia_a100_80gb_gpus), [Nvidia H100 80GB](https://console.cloud.google.com/iam-admin/quotas?metric=aiplatform.googleapis.com%2Fcustom_model_serving_nvidia_h100_gpus). You can request for quota following the instructions at [\"Request a quota adjustment\"](https://cloud.google.com/docs/quotas/view-manage#requesting_higher_quota).\n",
106106
"\n",
107107
"# @markdown | Machine Type | Accelerator Type | Recommended Regions |\n",
108108
"# @markdown | ----------- | ----------- | ----------- |\n",

notebooks/community/model_garden/model_garden_huggingface_vllm_deployment.ipynb

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -102,7 +102,7 @@
102102
"\n",
103103
"REGION = \"\" # @param {type:\"string\"}\n",
104104
"\n",
105-
"# @markdown 3. If you want to run predictions with A100 80GB or H100 GPUs, we recommend using the regions listed below. **NOTE:** Make sure you have associated quota in selected regions. Click the links to see your current quota for each GPU type: [Nvidia A100 80GB](https://console.cloud.google.com/iam-admin/quotas?metric=aiplatform.googleapis.com%2Fcustom_model_serving_nvidia_a100_80gb_gpus), [Nvidia H100 80GB](https://console.cloud.google.com/iam-admin/quotas?metric=aiplatform.googleapis.com%2Fcustom_model_serving_nvidia_h100_gpus). You can request for quota following the instructions at [\"Request a higher quota\"](https://cloud.google.com/docs/quota/view-manage#requesting_higher_quota).\n",
105+
"# @markdown 3. If you want to run predictions with A100 80GB or H100 GPUs, we recommend using the regions listed below. **NOTE:** Make sure you have associated quota in selected regions. Click the links to see your current quota for each GPU type: [Nvidia A100 80GB](https://console.cloud.google.com/iam-admin/quotas?metric=aiplatform.googleapis.com%2Fcustom_model_serving_nvidia_a100_80gb_gpus), [Nvidia H100 80GB](https://console.cloud.google.com/iam-admin/quotas?metric=aiplatform.googleapis.com%2Fcustom_model_serving_nvidia_h100_gpus). You can request for quota following the instructions at [\"Request a quota adjustment\"](https://cloud.google.com/docs/quotas/view-manage#requesting_higher_quota).\n",
106106
"\n",
107107
"# @markdown | Machine Type | Accelerator Type | Recommended Regions |\n",
108108
"# @markdown | ----------- | ----------- | ----------- |\n",

0 commit comments

Comments
 (0)