Memory Management

The BUILD_EPT executor includes adaptive memory management to handle very large point clouds without running out of RAM.

The problem

Entwine's entwine build command loads the entire dataset into memory during indexing. For large MBES (multibeam echosounder) or LiDAR surveys, this can easily exceed available RAM, causing the operating system to kill the process.

The solution: adaptive subset sizing

Before building, the BUILD_EPT workflow:

  1. Estimates the point count using pdal info input.las
  2. Computes a safe subset count based on available RAM and estimated memory per point
  3. Rounds up to the nearest power of 4 — Entwine requires subset counts to be powers of 4 (1, 4, 16, 64, ...)
  4. Builds each subset independently using entwine build -s {id}
  5. Merges all subsets into a single EPT tileset using entwine merge
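The steps above can be sketched as a small command planner. This is illustrative only: it assumes the `-s <id> <of>` subset flag and the `entwine merge` subcommand named in the workflow, and the function name is hypothetical.

```python
def build_commands(input_path, output_dir, subset_count):
    """Return the entwine invocations for a build with the given subset count.

    Sketch only: assumes the `-s <id> <of>` subset flag and the
    `entwine merge` subcommand described in the steps above.
    """
    if subset_count == 1:
        # Small dataset: single-pass build, no merge step needed.
        return [["entwine", "build", "-i", input_path, "-o", output_dir]]
    cmds = [
        ["entwine", "build", "-i", input_path, "-o", output_dir,
         "-s", str(i), str(subset_count)]
        for i in range(1, subset_count + 1)  # subset ids are 1-based
    ]
    # All subsets share one output directory; merge them into a single EPT tileset.
    cmds.append(["entwine", "merge", output_dir])
    return cmds
```

Each subset build is independent, so the per-subset commands can also be run in parallel if RAM allows.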

Subset sizing algorithm

The algorithm estimates:

points_per_gib = estimated_points / available_ram_gib
subset_count   = ceil(points_per_gib / baseline_points_per_gib)

Where baseline_points_per_gib is a tuning constant derived from empirical testing: the number of points Entwine can comfortably index per GiB of RAM. The result is then rounded up to the next power of 4.

An imbalance factor is applied to account for uneven point distribution across the dataset — subsets that happen to cover dense areas need more memory than average.
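The estimation step, including the imbalance factor, can be sketched as follows. The constants here are illustrative defaults, not the tuned values from the actual workflow.

```python
import math

def estimate_subset_count(estimated_points, available_ram_gib,
                          baseline_points_per_gib=30_000_000,
                          imbalance_factor=1.5):
    """Estimate the subset count before power-of-4 rounding.

    `baseline_points_per_gib` and `imbalance_factor` are placeholder
    values for illustration, not the workflow's empirical constants.
    """
    # How many points each GiB of available RAM must absorb for this dataset.
    points_per_gib = estimated_points / available_ram_gib
    # Inflate by the imbalance factor so subsets covering dense areas still fit.
    raw = points_per_gib * imbalance_factor / baseline_points_per_gib
    return max(math.ceil(raw), 1)
```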

When subsets are used

For small datasets (subset count = 1), Entwine runs a single-pass build. For larger datasets, the workflow automatically switches to multi-subset mode. No configuration is needed.

Why powers of 4?

Entwine partitions subsets spatially by recursively splitting the dataset's extent into quadrants, four per level, so valid subset counts are powers of 4. A non-power-of-4 count would leave the partitioning unbalanced and the merge incomplete.
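The round-up step can be sketched with a simple loop (the function name is hypothetical; a loop also avoids the floating-point error that `4 ** ceil(log(n, 4))` can hit at exact powers):

```python
def next_power_of_4(n):
    """Round a subset count up to the nearest power of 4 (1, 4, 16, 64, ...)."""
    p = 1
    while p < n:
        p *= 4
    return p
```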

Effect on output

The final merged EPT dataset is identical to what a single-pass build would produce. Subsetting is purely an implementation detail of the build process.