DATASET_INFO
Extracts detailed metadata from a LAS/LAZ point cloud using PDAL. The step runs pdal info on the input dataset and produces a JSON document describing the structure and statistics of the point cloud.
This is a pure inspection step: it does not transform the data. It extracts structured metadata and publishes it as an artifact that can be consumed by downstream steps or external systems.
Typical use: dataset inspection, metadata extraction, quality control, and feeding metadata to CALL_WEBHOOK so external systems receive point cloud statistics at job completion.
Contract
| Type | DATASET_INFO |
| Accepts | input_las: las |
| Produces | metadata: json |
| Params | none |
Inputs
| Slot | Type | Description |
|---|---|---|
| input_las | las | Point cloud dataset to inspect |
Outputs
| Slot | Type | Description |
|---|---|---|
| metadata | json | PDAL metadata report describing the dataset |
What it does internally
- Downloads the LAS artifact from MinIO
- Runs `pdal info input.las`
- Saves the JSON output to `info.json`
- Uploads `info.json` to MinIO as an artifact
- Returns `metadata` pointing to the uploaded JSON
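The middle of that sequence (run `pdal info`, save the JSON) can be sketched roughly as follows. This is a minimal illustration, not the step's actual implementation: the function name `run_dataset_info` and the `runner` parameter are hypothetical, and the MinIO download and upload are left out because the source does not show how they are done.

```python
import json
import subprocess
from pathlib import Path


def run_dataset_info(las_path, out_path="info.json", runner=subprocess.run):
    """Run `pdal info` on las_path and save the JSON report to out_path.

    `runner` defaults to subprocess.run; it is a parameter (hypothetical,
    not part of the real step) so the function can be tried without PDAL.
    """
    result = runner(
        ["pdal", "info", str(las_path)],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if pdal exits non-zero
    )
    report = json.loads(result.stdout)  # fail early on malformed output
    Path(out_path).write_text(json.dumps(report, indent=2))
    return report
```

Parsing the output before writing it means a truncated or non-JSON report fails in this step rather than in whatever consumes `info.json` later.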
The PDAL metadata report includes:
- Bounding boxes (native CRS and EPSG:4326)
- Total point count
- Per-dimension statistics (min, max, mean, stddev) for X, Y, Z, Intensity, etc.
- File size
- PDAL reader information
- Coordinate system information (when available)
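A consumer of `info.json` might pull out a few of these values as sketched below. The key names (`stats`, `statistic`, `bbox`, `native`, `file_size`, `reader`) are assumptions based on the layout `pdal info` typically emits and should be checked against a real report. The helper `summarize_info` is hypothetical, and the sample values are made up purely for illustration.

```python
def summarize_info(report):
    """Pull a few headline values out of a parsed `pdal info` report."""
    stats = report.get("stats", {})
    # Index the per-dimension entries by dimension name (X, Y, Z, ...)
    dims = {d["name"]: d for d in stats.get("statistic", [])}
    x = dims.get("X", {})
    z = dims.get("Z", {})
    return {
        "file_size": report.get("file_size"),
        "reader": report.get("reader"),
        # Assumption: every point has an X value, so X's count is the total
        "point_count": x.get("count"),
        "z_range": (z.get("minimum"), z.get("maximum")),
        "bbox_native": stats.get("bbox", {}).get("native", {}).get("bbox"),
    }


# Made-up sample, for illustration only (not real dataset output)
sample = {
    "file_size": 1024,
    "reader": "readers.las",
    "stats": {
        "bbox": {"native": {"bbox": {"minx": 0.0, "maxx": 10.0}}},
        "statistic": [
            {"name": "X", "count": 100, "minimum": 0.0, "maximum": 10.0},
            {"name": "Z", "count": 100, "minimum": 1.5, "maximum": 42.0},
        ],
    },
}

summary = summarize_info(sample)
```

Using `.get` throughout keeps the helper from raising when a section is absent, for example when coordinate system information is not available.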
Role in standard pipelines
All current MapPrism preset pipelines run DATASET_INFO in parallel with the main conversion step:
```json
"recipe": [
  {
    "id": "build_copc",
    "type": "BUILD_COPC",
    "inputs": { "input_las": "job:input_las" },
    "outputs": { "output_copc": "step:build_copc.output_copc" }
  },
  {
    "id": "dataset_info",
    "type": "DATASET_INFO",
    "inputs": { "input_las": "job:input_las" },
    "outputs": { "metadata": "step:dataset_info.metadata" }
  }
]
```

The CALL_WEBHOOK in on_exit waits for step:dataset_info.metadata, ensuring the full job object (including metadata) is available when the webhook fires.
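One way to see why the two steps can run in parallel is to look at what each step's inputs refer to. The sketch below assumes that a `step:<id>.<slot>` reference means "depends on that step" and that `job:` references carry no step dependency. That reading is inferred from the recipe above, not from a documented scheduler contract, and `step_dependencies` is a hypothetical helper.

```python
def step_dependencies(recipe):
    """Map each step id to the set of step ids its inputs refer to."""
    deps = {}
    for step in recipe:
        needed = set()
        for ref in step.get("inputs", {}).values():
            if ref.startswith("step:"):
                # "step:build_copc.output_copc" -> "build_copc"
                needed.add(ref[len("step:"):].split(".", 1)[0])
        deps[step["id"]] = needed
    return deps


recipe = [
    {
        "id": "build_copc",
        "type": "BUILD_COPC",
        "inputs": {"input_las": "job:input_las"},
        "outputs": {"output_copc": "step:build_copc.output_copc"},
    },
    {
        "id": "dataset_info",
        "type": "DATASET_INFO",
        "inputs": {"input_las": "job:input_las"},
        "outputs": {"metadata": "step:dataset_info.metadata"},
    },
]

deps = step_dependencies(recipe)
# Steps with no step-level dependencies can start as soon as the job input exists
independent = [step_id for step_id, needed in deps.items() if not needed]
```

Both steps read only `job:input_las`, so under this reading neither has to wait for the other.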
Recipe usage
```json
{
  "id": "dataset_info",
  "type": "DATASET_INFO",
  "inputs": { "input_las": "job:input_las" },
  "outputs": { "metadata": "step:dataset_info.metadata" }
}
```

Artifact storage path
artifacts/job_{id}/dataset_info/info.json