Docs: add obstore tutorial by aboydnw · Pull Request #527 · microsoft/PlanetaryComputerDataCatalog

aboydnw · 2026-05-22T19:43:36Z

Tutorial for obstore, based on the outline here: https://docs.google.com/document/d/1LIf6SvMHK3Gr8gSG8eqmjwyROeeoQl3Odg1CVLqp1AI/edit?usp=sharing

Adds a new tutorial walking through reading Planetary Computer data with obstore (auto-refreshing SAS tokens, range reads, async, library composability). Companion notebook lives in PlanetaryComputerExamples at quickstarts/obstore.ipynb and is wired in via external_docs_config. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Drops the Colab badge (off-brand for PC; Hub is the canonical JupyterLab environment) and replaces the TODO placeholders with real URLs: nbgitpuller deep link to PC Hub and a github.com blob link to the companion notebook. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Inlines the Hub and GitHub URLs on the badge line and drops the reference-style defs at the bottom. Also picks up the inline copy edits across the body. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Hub link is the canonical way to open the notebook; the GitHub view duplicates what the docs site already renders. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Drops Lonboard reference (no obstore integration in Lonboard) and notes that zarr-python access goes through the zarr.storage.ObjectStore adapter rather than direct hand-off. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

kylebarron · 2026-05-26T20:46:15Z

@@ -0,0 +1,164 @@
+# Reading Planetary Computer data with obstore
+
+[obstore](https://developmentseed.org/obstore/) is a Python library for reading and writing cloud object stores (Azure Blob, Amazon S3, Google Cloud Storage) directly through their native APIs. Using obstore, SAS tokens refresh automatically, async I/O is built in, and the same store you build for reading bytes can be handed to higher-level libraries like [async-geotiff](https://github.com/developmentseed/async-geotiff), [Lonboard](https://developmentseed.org/lonboard/), and [zarr-python](https://zarr.dev/) without re-authenticating.


directly through their native APIs

I think this is a bit misleading. I think most users would understand "native" to mean "the raw underlying API specific to each cloud storage provider". That's not what obstore does; if a user wants to use the Azure API directly, they'll use azure.storage directly.

Obstore presents one, unified, abstracted API that is the same across Azure, S3, and GCS. That's the selling point.

kylebarron · 2026-05-26T20:55:11Z

+   ```python
+   import pystac_client
+   from obstore.auth.planetary_computer import PlanetaryComputerCredentialProvider
+
+   catalog = pystac_client.Client.open(
+       "https://planetarycomputer.microsoft.com/api/stac/v1"
+   )
+   item = next(catalog.search(collections=["naip"], max_items=1).items())
+   asset = item.assets["image"]
+   ```
+
+2. Build a credential provider from the asset.
+
+   ```python
+   provider = PlanetaryComputerCredentialProvider.from_asset(asset)
+   ```


This makes me realize that from_asset is a bit annoying if you want to work with a collection instead of an item.

I see that the NAIP Collection JSON defines

"msft:storage_account": "naipeuwest"

so we could potentially have a from_collection constructor too.

Or maybe from_asset should really be renamed to from_stac, and support both Item and Collection? Thoughts?
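
A rough sketch of what a combined constructor could dispatch on, assuming a Collection carries `msft:storage_account` as in the NAIP JSON above and an asset href follows the usual `https://<account>.blob.core.windows.net/<container>/<path>` layout. `storage_account_from_stac` is a hypothetical helper for illustration, not an existing obstore function. It works on plain dicts rather than pystac objects, and the asset path below is made up.

```python
from urllib.parse import urlparse


def storage_account_from_stac(obj: dict) -> str:
    """Hypothetical helper: find the Azure storage account for a STAC dict.

    Accepts either a Collection-like dict (with "msft:storage_account")
    or an Asset-like dict (with an "href" pointing at Azure Blob Storage).
    """
    # Collection case: the account name is stated directly.
    account = obj.get("msft:storage_account")
    if account:
        return account

    # Asset case: the account is the first label of the blob hostname.
    href = obj.get("href", "")
    host = urlparse(href).hostname or ""
    if host.endswith(".blob.core.windows.net"):
        return host.split(".", 1)[0]

    raise ValueError("could not determine a storage account from this object")


collection = {"id": "naip", "msft:storage_account": "naipeuwest"}
# Illustrative href only; the path is not a real NAIP asset.
asset = {"href": "https://naipeuwest.blob.core.windows.net/naip/example.tif"}
```

Both inputs resolve to the same account here, which is the property a single `from_stac` entry point would rely on.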

kylebarron · 2026-05-26T20:56:07Z

+   provider = PlanetaryComputerCredentialProvider.from_asset(asset)
+   ```
+
+3. Build a store using that provider. The store is your reusable connection to that asset.


It's important to note that the store doesn't just connect to one asset; it provides the auth to access anything in that bucket (or "container" in Azure terminology). The exception, as mentioned below, is that the prefix on the store is currently mounted to this specific file.

kylebarron · 2026-05-26T21:04:25Z

+2. **Read multiple byte ranges in a single request.** Cuts round-trip latency when you need several non-contiguous slices of the same file (e.g. multiple COG tiles).
+
+   ```python
+   ranges = obstore.get_ranges(


Ditto: use store.get_ranges here.

kylebarron · 2026-05-26T21:07:59Z

+async def fetch(start, end):
+    return await obstore.get_range_async(async_store, "", start=start, end=end)
+
+results = await asyncio.gather(*[fetch(i * 4096, (i + 1) * 4096) for i in range(8)])


This is a bad example, because it's making several independent requests for different parts of a file.

For this use case we should be pointing users towards store.get_ranges_async, because under the hood that will combine adjacent ranges into a single network request.

For example, this snippet makes independent requests for 0-4096, 4096-8192, and so on. get_ranges_async would instead make just a single request under the hood for 0-32768, rather than 8 concurrent requests, and that would be a lot faster.
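
To make the coalescing concrete, here is a small standalone sketch of the idea. This is not obstore's implementation; `coalesce_ranges` and `max_gap` are names invented for illustration, and the real merging rules may differ.

```python
def coalesce_ranges(ranges, max_gap=0):
    """Merge (start, end) byte ranges whose gap is at most max_gap bytes.

    Ends are exclusive. Returns the merged ranges sorted by start.
    """
    merged = []
    for start, end in sorted(ranges):
        if merged and start - merged[-1][1] <= max_gap:
            # Touches or overlaps the previous range: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# The eight adjacent 4096-byte ranges from the tutorial example.
requested = [(i * 4096, (i + 1) * 4096) for i in range(8)]
fetches = coalesce_ranges(requested)
```

With the tutorial's ranges, `fetches` holds one span covering 0-32768, so one network request can serve all eight slices.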

kylebarron · 2026-05-26T21:11:48Z

+from obstore.store import S3Store
+
+s3_store = S3Store(bucket="my-bucket", region="us-west-2")
+buf = obstore.get(s3_store, "path/to/object").bytes()


Actually this doesn't work... obstore.get won't work against the obspec protocol... The obspec protocol is defined in terms of the methods on the class. That's part of why I want to nudge people to use store.get instead of obstore.get
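
A minimal sketch of why a method-based protocol favors `store.get`, using only the standard library. `SupportsGet`, `DictStore`, and `read_header` are hypothetical names, not obspec's real definitions, and the toy `get` returns raw bytes rather than a result object.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SupportsGet(Protocol):
    """Hypothetical stand-in for a protocol defined by its methods."""

    def get(self, path: str) -> bytes: ...


class DictStore:
    """Toy in-memory store; it satisfies SupportsGet just by having .get()."""

    def __init__(self, objects: dict[str, bytes]) -> None:
        self._objects = objects

    def get(self, path: str) -> bytes:
        return self._objects[path]


def read_header(store: SupportsGet, path: str, n: int = 4) -> bytes:
    # Calls the method on the store, so any conforming backend works here.
    return store.get(path)[:n]


store = DictStore({"path/to/object": b"II*\x00rest-of-file"})
header = read_header(store, "path/to/object")
```

Code written this way accepts any object with a matching `get` method, whereas a module-level function tied to specific store classes has no way to accept `DictStore`.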

Co-authored-by: Kyle Barron <kylebarron2@gmail.com>

aboydnw and others added 5 commits May 20, 2026 19:41

docs: inline notebook badge URLs and tighten copy

6d4c423

docs: drop GitHub badge from obstore tutorial

bee26a1

aboydnw mentioned this pull request May 22, 2026

Docs: add obstore quickstart microsoft/PlanetaryComputerExamples#313

Open

zacdezgeo approved these changes May 25, 2026

aboydnw marked this pull request as ready for review May 26, 2026 15:19

kylebarron reviewed May 26, 2026

aboydnw and others added 4 commits May 26, 2026 15:01

Update docs/overview/obstore.md

086d0ec

Co-authored-by: Kyle Barron <kylebarron2@gmail.com>

Update docs/overview/obstore.md

6f44f0f

Co-authored-by: Kyle Barron <kylebarron2@gmail.com>

Update docs/overview/obstore.md

0fcdbea

Co-authored-by: Kyle Barron <kylebarron2@gmail.com>

Update docs/overview/obstore.md

338133c

Co-authored-by: Kyle Barron <kylebarron2@gmail.com>

aboydnw wants to merge 9 commits into microsoft:develop from aboydnw:docs-add-obstore-tutorial

3 participants
