Data
Purpose
The data module is an optional shortcut for easily downloading the official EasyIDP demo datasets. It is not required for normal EasyIDP workflows. Most EasyIDP APIs accept ordinary file paths directly, so users can pass their own .shp, .tif, Pix4D, Metashape, or point-cloud paths without constructing an idp.data object.
The main purpose of this module is to keep example paths readable:
import easyidp as idp
lotus = idp.data.Lotus()
roi = idp.ROI(lotus.shp)
ms = idp.Metashape(lotus.metashape.project)
Construction is lightweight. It does not download or extract data. Call download() explicitly when needed:
lotus = idp.data.Lotus()
if not lotus.is_ready():
    lotus.download()
Dependencies
To download demo datasets, you need to manually install the optional data backend:
pip install "easyidp[data]"
When developing EasyIDP from the source tree, install all development groups and package extras together:
uv sync --all-groups --all-extras
Configuration
The default data directory is the operating system's default application storage path:
On Windows, it is %APPDATA%/easyidp.data.
On Linux, it is ~/.local/share/easyidp.data.
On macOS, it is ~/Library/Application Support/easyidp.data.
Users can change the default data directory by updating the data_dir key in the global configuration. The example below sets the data directory to /path/to/easyidp.data.
import easyidp as idp
idp.config.set(data_dir="/path/to/easyidp.data")
lotus = idp.data.Lotus()
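The platform defaults listed above can be expressed as a small lookup. The sketch below is only an illustration built on the standard library: default_data_dir is a hypothetical helper, not an EasyIDP function, and EasyIDP may resolve these paths differently internally.

```python
import os
import sys
from pathlib import Path


def default_data_dir(platform=sys.platform, env=None, home=None):
    """Return the documented per-OS default (hypothetical helper, not EasyIDP API)."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    if platform.startswith("win"):
        # %APPDATA% normally points at the roaming profile directory.
        base = Path(env.get("APPDATA", home / "AppData" / "Roaming"))
    elif platform == "darwin":
        base = home / "Library" / "Application Support"
    else:
        # Linux and other POSIX systems.
        base = home / ".local" / "share"
    return base / "easyidp.data"


print(default_data_dir())
```

The platform, env, and home parameters exist only so the lookup can be exercised for every operating system from a single machine.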
REPL Representation
Constructing a dataset remains lightweight, but displaying the object in a REPL
prints status information. The repr output includes the
dataset title, size, download status, and the cache directory path:
>>> import easyidp as idp
>>> fb = idp.data.ForestBirds()
>>> fb
<easyidp.data.dataset.ForestBirds object at 0x...>
Official EasyIDP forest birds demo dataset from Florida.
Size: 1.97 GB
Status: not downloaded. call .download() to save at
/home/user/.local/share/easyidp.data/2022_florida_forestbirds
You can change the download location with:
idp.config.set(data_dir="/path/to/easyidp.data")
When the dataset is fully available on disk, the Status line changes to:
Status: available at
/home/user/.local/share/easyidp.data/2022_florida_forestbirds
Note
Changing data_dir through idp.config.set(data_dir=...) does
not migrate or move already-cached datasets to the new location.
New Dataset objects constructed after the change will use the new
path, but existing objects retain the root they were created with.
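The behaviour described in this note follows from resolving the root once, at construction time. The toy model below illustrates that idea under stated assumptions: _config and DemoDataset are hypothetical stand-ins written for this example and are not EasyIDP's real configuration store or Dataset class.

```python
from pathlib import Path

# Hypothetical stand-in for the global configuration (not idp.config).
_config = {"data_dir": Path("/old/easyidp.data")}


class DemoDataset:
    """Toy dataset that captures its root when it is constructed."""

    def __init__(self, name):
        # The root is read from the configuration once and then stored.
        self.root = _config["data_dir"] / name


before = DemoDataset("lotus")

# Changing the configured directory does not touch existing objects.
_config["data_dir"] = Path("/new/easyidp.data")
after = DemoDataset("lotus")

print(before.root)  # still under the old directory
print(after.root)   # under the new directory
```

In practice this means a dataset object created before the configuration change should be constructed again if it is meant to use the new location.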
Mirrors
By default, easyidp tries to download datasets from a shared Google Drive using the gdown package. Users in mainland China should use the OpenXLab mirror for a better download experience. Currently, easyidp uses anonymous public dataset CDN URLs, which do not require the OpenXLab SDK, a login, an Access Key, or a Secret Key:
import easyidp as idp
lotus = idp.data.Lotus()
lotus.download(mirror="openxlab")
Classes
Dataset for the lotus plot in Tanashi, Tokyo.
Dataset for the forest ecology survey in Florida.
Developer and package test dataset.
Functions
Return the names of available EasyIDP demo datasets.
Advanced API
easyidp.data builds short demo-data attributes from JSON manifest keys.
Dotted keys such as metashape.project and metashape.outputs.dom are
expanded into runtime namespaces so users can write lotus.metashape.project
or lotus.metashape.outputs.dom.
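The dotted-key expansion can be pictured with the following minimal sketch. It is not the _PathNamespace implementation: expand_dotted is a hypothetical function, the file names are made up for illustration, and it does not handle conflicts such as a key that is both a file and a parent of other keys.

```python
from pathlib import Path
from types import SimpleNamespace


def expand_dotted(files, root):
    """Turn {"a.b.c": "rel/path"} into nested namespaces of full paths."""
    tree = {}
    for key, rel in files.items():
        *parents, leaf = key.split(".")
        node = tree
        for part in parents:
            # Create intermediate levels on demand.
            node = node.setdefault(part, {})
        node[leaf] = Path(root) / rel

    def to_namespace(node):
        # Convert nested dicts into attribute-style namespaces.
        if isinstance(node, dict):
            return SimpleNamespace(**{k: to_namespace(v) for k, v in node.items()})
        return node

    return to_namespace(tree)


# Illustrative manifest entries; the relative file names are assumptions.
files = {
    "shp": "plots.shp",
    "metashape.project": "metashape/project.psx",
    "metashape.outputs.dom": "metashape/dom.tif",
}
ds = expand_dotted(files, "/data/lotus")
print(ds.metashape.outputs.dom)
```

Building the namespaces eagerly keeps attribute access cheap and lets typos fail with an ordinary AttributeError.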
The same module also contains explicit downloader helpers for Google Drive, anonymous OpenXLab mirrors, verified streaming downloads, and zip extraction.
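A verified streaming download generally means hashing the bytes while they are written and keeping the file only if the digest matches. The sketch below shows that pattern with the standard library, assuming SHA-256 as the checksum; stream_with_sha256 is a hypothetical name, and an in-memory stream stands in for a network response so the example runs offline. It does not reproduce EasyIDP's downloader.

```python
import hashlib
import io
import os
import tempfile


def stream_with_sha256(source, output_path, expected_sha256, chunk_size=1 << 20):
    """Copy a binary stream to disk in chunks and verify its SHA-256 digest."""
    digest = hashlib.sha256()
    tmp_path = output_path + ".part"
    with open(tmp_path, "wb") as out:
        # Read fixed-size chunks until the stream is exhausted.
        for chunk in iter(lambda: source.read(chunk_size), b""):
            digest.update(chunk)
            out.write(chunk)
    if digest.hexdigest() != expected_sha256.lower():
        os.remove(tmp_path)  # do not leave a corrupt file behind
        raise ValueError("SHA-256 mismatch for " + output_path)
    os.replace(tmp_path, output_path)  # promote only after verification
    return output_path


# Usage: an in-memory stream stands in for an HTTP response body.
payload = b"demo bytes"
target = os.path.join(tempfile.mkdtemp(), "demo.bin")
stream_with_sha256(io.BytesIO(payload), target, hashlib.sha256(payload).hexdigest())
```

Writing to a temporary .part file first means an interrupted or corrupted transfer never appears under the final file name.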
The recursive namespace object is implemented as
easyidp.data.dataset._PathNamespace. The objects below are intended for
advanced users and contributors who need to understand manifest parsing,
runtime path expansion, dataset validation, and downloader internals. They are
not exported from easyidp.data unless shown in the public sections above.
Classes
EasyIDP dataset backed by a JSON manifest.
Recursively expose nested file mappings as path attributes.
Functions
Load and parse a JSON manifest file.
Check that a dotted file key does not conflict with reserved Dataset attributes.
Validate manifest structure and raise
Build
Download and extract a dataset.
Extract a zip archive, rejecting path-traversal members.
Select a mirror key.
Download a file from Google Drive via gdown.
Download a file from OpenXLab anonymously via the v3 API.
Resolve CDN download URL and metadata for an OpenXLab file.
Extract a 64-hex SHA256 from an OpenXLab CDN objects URL path.
Stream a file from url to output with verification.
Build a JSON-friendly download result dict.
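Rejecting path-traversal members, as mentioned for the zip extraction helper above, usually means checking every member's resolved destination before anything is written. The following is a minimal sketch of that check using only the standard library; safe_extract is a hypothetical name and the code is not taken from EasyIDP.

```python
import os
import zipfile


def safe_extract(zip_path, dest_dir):
    """Extract zip_path into dest_dir, refusing members that would escape it."""
    dest_root = os.path.realpath(dest_dir)
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.namelist():
            target = os.path.realpath(os.path.join(dest_root, member))
            # commonpath equals dest_root only when target stays inside it.
            if os.path.commonpath([dest_root, target]) != dest_root:
                raise ValueError("unsafe path in archive: " + member)
        # All members were checked, so extraction can proceed.
        archive.extractall(dest_root)
```

Validating all members before extracting any of them ensures that a malicious archive leaves no partially extracted files behind.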