Python API reference

User API reference

This is the complete reference of the Python API for the py4dgeo package. It focuses on those aspects relevant to end users who are not interested in algorithm development.

class py4dgeo.Epoch(cloud: ndarray, normals: ndarray = None, additional_dimensions: ndarray = None, timestamp=None, scanpos_info: dict = None)

build_kdtree(leaf_size=10, force_rebuild=False)

Build the search tree index

Parameters:

leaf_size (int) – An internal optimization parameter of the search tree data structure. The algorithm uses a brute-force search on subtrees whose size is below the given threshold. Increasing this value speeds up search tree build time, but slows down query times.
force_rebuild (bool) – Rebuild the search tree even if it was already built before.
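The leaf_size trade-off can be illustrated with a toy k-d tree. This is a minimal pure-Python sketch and not py4dgeo's implementation; build_tree and count_nodes are hypothetical helpers. Larger leaves mean fewer nodes to build, while each leaf then holds more points that a query would have to scan by brute force.

```python
import random


def build_tree(points, leaf_size=10, depth=0):
    """Split points recursively until a subtree holds at most leaf_size points."""
    if len(points) <= leaf_size:
        # Leaves keep their points; a query would scan them by brute force.
        return {"leaf": True, "points": points}
    axis = depth % 3  # cycle through x, y, z
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "leaf": False,
        "axis": axis,
        "split": points[mid][axis],
        "left": build_tree(points[:mid], leaf_size, depth + 1),
        "right": build_tree(points[mid:], leaf_size, depth + 1),
    }


def count_nodes(node):
    """Count all nodes (inner nodes and leaves) of the toy tree."""
    if node["leaf"]:
        return 1
    return 1 + count_nodes(node["left"]) + count_nodes(node["right"])


random.seed(0)
cloud = [(random.random(), random.random(), random.random()) for _ in range(1000)]
nodes_small_leaves = count_nodes(build_tree(cloud, leaf_size=10))
nodes_large_leaves = count_nodes(build_tree(cloud, leaf_size=100))
```

The tree built with leaf_size=100 has far fewer nodes than the one built with leaf_size=10, which is why it is cheaper to construct.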

build_octree(): Build the search octree index

calculate_normals(radius=1.0, orientation_vector: ndarray = array([0, 0, 1]))

Calculate point cloud normals

Parameters:

radius – The radius used to determine the neighborhood of a point.
orientation_vector – A vector to determine orientation of the normals. It should point “up”.
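One common way to use such an orientation vector is to flip every normal that points away from it. The sketch below assumes this convention; the reference does not state how py4dgeo applies the vector internally, and orient_normal is a hypothetical helper.

```python
def orient_normal(normal, orientation_vector=(0.0, 0.0, 1.0)):
    """Flip the normal if it points away from the orientation vector."""
    dot = sum(n * o for n, o in zip(normal, orientation_vector))
    if dot < 0.0:
        return tuple(-n for n in normal)
    return tuple(normal)


# A normal pointing downwards is flipped, one pointing upwards is kept.
flipped = orient_normal((0.0, 0.6, -0.8))
kept = orient_normal((0.0, 0.6, 0.8))
```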

copy(): Copy the epoch object

static load(filename)

Construct an Epoch instance by loading it from a file

Parameters:: filename (str) – The filename to load the epoch from.

property metadata

Provide the metadata of this epoch as a Python dictionary

The return value of this property only makes use of Python built-in data structures such that it can e.g. be serialized using the JSON module. Also, the returned values are understood by Epoch.__init__ such that you can do Epoch(cloud, **other.metadata).
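The round trip described above can be sketched with a stand-in class. MiniEpoch is hypothetical and only mimics the metadata behaviour; which keys the real metadata dictionary contains is an assumption here (timestamp and scanpos_info, taken from the constructor signature).

```python
import json


class MiniEpoch:
    """Hypothetical stand-in for py4dgeo.Epoch, reduced to metadata handling."""

    def __init__(self, cloud, timestamp=None, scanpos_info=None):
        self.cloud = cloud
        self.timestamp = timestamp
        self.scanpos_info = scanpos_info

    @property
    def metadata(self):
        # Only built-in types, so the dictionary can be serialized as JSON.
        return {"timestamp": self.timestamp, "scanpos_info": self.scanpos_info}


original = MiniEpoch(
    [(0.0, 0.0, 0.0)],
    timestamp="2022-01-01T12:00:00",
    scanpos_info={"origin": [0.0, 0.0, 1.5]},  # hypothetical layout
)
serialized = json.dumps(original.metadata)

# The deserialized metadata can be passed back as keyword arguments.
restored = MiniEpoch([(1.0, 1.0, 1.0)], **json.loads(serialized))
```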

normals_attachment(normals_array)

Attach normals to the epoch object

Parameters:: normals_array – The point cloud normals of shape (n, 3) where n is the same as the number of points in the point cloud.

save(filename)

Save this epoch to a file

Parameters:: filename (str) – The filename to save the epoch in.

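transform(transformation=None, affine_transformation=None, rotation=None, translation=None, reduction_point=None)

Transform the epoch with an affine transformation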

Parameters:

transformation (Transformation) – A Transformation object that describes the transformation to apply. If this argument is given, the other arguments are ignored. This parameter is typically used if the transformation was calculated by py4dgeo itself.
affine_transformation – A 4x4 or 3x4 matrix representing the affine transformation. Given as a numpy array. If this argument is given, the rotation and translation arguments are ignored.
rotation (np.ndarray) – A 3x3 matrix specifying the rotation to apply
translation (np.ndarray) – A vector specifying the translation to apply
reduction_point (np.ndarray) – A translation vector to apply before applying rotation and scaling. This is used to increase the numerical accuracy of transformation. If a transformation is given, this argument is ignored.
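The role of the reduction point can be sketched as follows. The sketch assumes the reduction point is subtracted before the rotation and added back afterwards; the exact order of operations inside py4dgeo is not stated in this reference, and apply_affine is a hypothetical helper.

```python
def apply_affine(points, rotation, translation, reduction_point=(0.0, 0.0, 0.0)):
    """Apply x' = R (x - r) + t + r to every point, with r the reduction point."""
    transformed = []
    for p in points:
        # Shifting by the reduction point keeps the rotated coordinates small.
        shifted = [p[i] - reduction_point[i] for i in range(3)]
        rotated = [sum(rotation[i][j] * shifted[j] for j in range(3)) for i in range(3)]
        transformed.append(
            tuple(rotated[i] + translation[i] + reduction_point[i] for i in range(3))
        )
    return transformed


# Rotation by 90 degrees around the z axis, plus a shift of 1 in z.
rot_z_90 = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
result = apply_affine(
    [(101.0, 200.0, 5.0)],
    rot_z_90,
    (0.0, 0.0, 1.0),
    reduction_point=(100.0, 200.0, 0.0),
)
```

With large georeferenced coordinates, rotating around a nearby reduction point avoids multiplying large numbers and losing precision.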

property transformation

Access the affine transformations that were applied to this epoch

In order to set this property please use the transform method instead, which will make sure to also apply the transformation.

Returns:: A list of applied transformations. Each entry is a tuple of a 4x4 matrix defining the affine transformation and the reduction point used when applying it.

py4dgeo.read_from_las(*filenames, normal_columns=[], additional_dimensions={})

Create an epoch from a LAS/LAZ file

Parameters:

filename (str) – The filename to read from. It is expected to be in LAS/LAZ format and will be processed using laspy.
normal_columns (list) – The column names of the normal vector components, e.g. “NormalX”, “nx” or “normal_x”. Exactly three columns must be given. Leave empty if your data file does not contain normals.
additional_dimensions (dict) – A dictionary, mapping names of additional data dimensions in the input dataset to additional data dimensions in our epoch data structure.

py4dgeo.read_from_xyz(*filenames, xyz_columns=[0, 1, 2], normal_columns=[], additional_dimensions={}, additional_dimensions_dtypes={}, **parse_opts)

Create an epoch from an xyz file

Parameters:

filename (str) – The filename to read from. Each line in the input file is expected to contain three space-separated numbers.
xyz_columns (list) – The column indices of X, Y and Z coordinates. Defaults to [0, 1, 2].
normal_columns (list) – The column indices of the normal vector components. Leave empty, if your data file does not contain normals, otherwise exactly three indices for the x, y and z components need to be given.
parse_opts (dict) – Additional options forwarded to numpy.genfromtxt. This can be used to e.g. change the delimiter character, remove header_lines or manually specify which columns of the input contain the XYZ coordinates.
additional_dimensions (dict) – A dictionary, mapping column indices to names of additional data dimensions. They will be read from the file and are accessible under their names from the created Epoch objects. Additional column indexes start with 3.
additional_dimensions_dtypes (dict) – A dictionary, mapping column names to numpy dtypes which should be used in parsing the data.
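How xyz_columns and additional_dimensions select columns can be sketched without the library. This is a simplified pure-Python illustration, not the actual reader (which forwards to numpy.genfromtxt); parse_xyz_lines is a hypothetical helper.

```python
def parse_xyz_lines(lines, xyz_columns=(0, 1, 2), additional_dimensions=None, delimiter=None):
    """Pick coordinate columns and named extra columns from text lines."""
    additional_dimensions = additional_dimensions or {}
    cloud = []
    extras = {name: [] for name in additional_dimensions.values()}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip empty lines and comments
        fields = line.split(delimiter)
        cloud.append(tuple(float(fields[c]) for c in xyz_columns))
        # additional_dimensions maps a column index to a dimension name.
        for column, name in additional_dimensions.items():
            extras[name].append(float(fields[column]))
    return cloud, extras


sample = ["# x y z intensity", "1.0 2.0 3.0 0.5", "4.0 5.0 6.0 0.25"]
cloud, extras = parse_xyz_lines(sample, additional_dimensions={3: "intensity"})
```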

py4dgeo.save_epoch(epoch, filename)

Save an epoch to a given filename

Parameters:

epoch (Epoch) – The epoch that should be saved.
filename (str) – The filename where to save the epoch

py4dgeo.load_epoch(filename)

Load an epoch from a given filename

Parameters:: filename (str) – The filename to load the epoch from.

class py4dgeo.C2C(epochs: Tuple[Epoch, Epoch] | None = None, corepoints: ndarray | None = None, max_distance: float = inf, correspondence_filter: str = 'none')

Cloud-to-cloud distance based on nearest neighbors.

Parameters

epochs: Pair of point clouds or Epoch objects.
corepoints: Optional source points. If given, distances are computed for these points instead of points from epochs[0].
max_distance: Maximum nearest-neighbor search distance. Distances beyond this threshold are returned as np.nan.
correspondence_filter: Optional correspondence filter. "none" keeps all valid nearest-neighbor matches (default). "mutual_nearest_neighbors" accepts a match only if nearest-neighbor relation is mutual.
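The effect of max_distance and the mutual-nearest-neighbor filter can be sketched with a brute-force search. This is an illustration under assumptions, not the library's code: c2c_distances and nearest are hypothetical helpers, and reporting rejected mutual matches as NaN is an assumption.

```python
import math


def nearest(point, candidates):
    """Return (index, distance) of the candidate closest to point."""
    best_index, best_dist = -1, math.inf
    for i, c in enumerate(candidates):
        d = math.dist(point, c)
        if d < best_dist:
            best_index, best_dist = i, d
    return best_index, best_dist


def c2c_distances(source, target, max_distance=math.inf, correspondence_filter="none"):
    """Nearest-neighbor distance from every source point to the target cloud."""
    distances = []
    for i, p in enumerate(source):
        j, d = nearest(p, target)
        if j < 0 or d > max_distance:
            distances.append(math.nan)  # no neighbor within the search distance
            continue
        if correspondence_filter == "mutual_nearest_neighbors":
            back, _ = nearest(target[j], source)
            if back != i:
                distances.append(math.nan)  # match is not mutual
                continue
        distances.append(d)
    return distances


source = [(0.0, 0.0, 0.0), (0.4, 0.0, 0.0), (10.0, 0.0, 0.0)]
target = [(0.5, 0.0, 0.0)]
plain = c2c_distances(source, target, max_distance=2.0)
mutual = c2c_distances(
    source, target, max_distance=2.0, correspondence_filter="mutual_nearest_neighbors"
)
```

In this example the first source point has a valid nearest neighbor, but the target point is closer to the second source point, so the mutual filter rejects the first match.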

run(): Main entry point for running the algorithm.

py4dgeo.statistical_outlier_removal(epoch: Epoch, k: int = 8, std_dev_multiplier: float = 1.0, remove_points: bool = False) → tuple[Epoch, ndarray]

Statistical Outlier Removal (SOR) filter.

Parameters

epoch (py4dgeo.Epoch) – The point cloud to filter.
k (int) – Number of nearest neighbors to consider.
std_dev_multiplier (float) – Standard deviation multiplier for the outlier threshold.
remove_points (bool) – If True, return an Epoch containing only inlier points.

Returns

epoch (py4dgeo.Epoch) – Original epoch, or filtered epoch if remove_points is True.
inlier_outlier (np.ndarray) – Array with 0 for inliers and 1 for outliers. If remove_points is True, this array is aligned with the filtered epoch.
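A textbook formulation of statistical outlier removal looks like the sketch below: a point is an outlier if its mean distance to its k nearest neighbors exceeds the global mean plus std_dev_multiplier times the standard deviation. Whether py4dgeo uses exactly this threshold (for example population versus sample standard deviation) is an assumption; sor_labels is a hypothetical helper.

```python
import math
import statistics


def sor_labels(points, k=8, std_dev_multiplier=1.0):
    """Return 0 for inliers and 1 for outliers, one label per point."""
    mean_knn = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        neighbors = dists[:k]
        mean_knn.append(sum(neighbors) / len(neighbors))
    # Global threshold derived from the distribution of mean neighbor distances.
    threshold = statistics.mean(mean_knn) + std_dev_multiplier * statistics.pstdev(mean_knn)
    return [1 if d > threshold else 0 for d in mean_knn]


# A regular 4x4 grid plus one point far away from it.
grid = [(float(x), float(y), 0.0) for x in range(4) for y in range(4)]
labels = sor_labels(grid + [(50.0, 50.0, 50.0)], k=4)
```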

class py4dgeo.M3C2(normal_radii: List[float] = None, orientation_vector: ndarray = array([0, 0, 1]), corepoint_normals: ndarray = None, cloud_for_normals: Epoch = None, **kwargs)

Bases: M3C2LikeAlgorithm

calculate_distances(epoch1, epoch2, searchtree: str | None = None): Calculate the distances between two epochs

callback_distance_calculation(): The callback used to calculate the distance between two point clouds

callback_workingset_finder(): The callback used to determine the point cloud subset around a corepoint

directions(): The normal direction(s) to use for this algorithm.

run(): Main entry point for running the algorithm

class py4dgeo.CloudCompareM3C2(**params)

Bases: M3C2

calculate_distances(epoch1, epoch2, searchtree: str | None = None): Calculate the distances between two epochs

callback_distance_calculation(): The callback used to calculate the distance between two point clouds

callback_workingset_finder(): The callback used to determine the point cloud subset around a corepoint

directions(): The normal direction(s) to use for this algorithm.

run(): Main entry point for running the algorithm

class py4dgeo.SpatiotemporalAnalysis(filename, compress=True, allow_pickle=True, force=False)

add_epochs(*epochs): Add a number of epochs to the existing analysis

property corepoints: Access the corepoints of this analysis

property distances: Access the M3C2 distances of this analysis

property distances_for_compute

Retrieve the distance array used for computation

This might be the raw data or smoothed data, based on whether a smoothing was provided by the user.

invalidate_results(seeds=True, objects=True, smoothed_distances=False)

Invalidate (and remove) calculated results

This is automatically called when new epochs are added or when an algorithm sets the force option.

property m3c2: Access the M3C2 algorithm of this analysis

property objects: The list of objects by change for this analysis

property reference_epoch: Access the reference epoch of this analysis

property seeds: The list of seed candidates for this analysis

property timedeltas: Access the sequence of time stamp deltas for the time series

property uncertainties: Access the M3C2 uncertainties of this analysis

class py4dgeo.RegionGrowingAlgorithm(seed_subsampling=1, seed_candidates=None, window_width=24, window_min_size=12, window_jump=1, window_penalty=1.0, minperiod=24, height_threshold=0.0, use_unfinished=True, intermediate_saving=0, resume_from_seed=0, stop_at_seed=inf, write_nr_seeds=False, **kwargs)

Bases: RegionGrowingAlgorithmBase

property analysis

Access the analysis object that the algorithm operates on

This is only available after run has been called.

distance_measure()

Distance measure between two time series

Expected to return a function that accepts two time series and returns the distance.

filter_objects(obj): A filter for objects produced by the region growing algorithm

find_seedpoints(): Calculate seedpoints for the region growing algorithm

run(analysis, force=False)

Calculate the segmentation

Parameters:

analysis (py4dgeo.segmentation.SpatiotemporalAnalysis) – The analysis object we are working with.
force – Force recalculation of results. If false, some intermediate results will be restored from the analysis object instead of being recalculated.

seed_sorting_scorefunction(): Neighborhood similarity sorting function

py4dgeo.regular_corepoint_grid(lowerleft, upperright, num_points, zval=0.0)

A helper function to create a regularly spaced grid for the analysis

Parameters:

lowerleft (np.ndarray) – The lower left corner of the grid. Given as a 2D coordinate.
upperright (np.ndarray) – The upper right corner of the grid. Given as a 2D coordinate.
num_points (tuple) – A tuple with two entries denoting the number of points to be used in x and y direction
zval (double) – The value to fill for the z-direction.
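The grid layout can be sketched in pure Python. The sketch assumes that both corners are included and that the points are returned as (x, y, zval) triples; the ordering of the points in the real function may differ. regular_grid is a hypothetical helper.

```python
def regular_grid(lowerleft, upperright, num_points, zval=0.0):
    """Create evenly spaced (x, y, zval) points between two 2D corners."""

    def linspace(start, stop, n):
        if n == 1:
            return [start]
        step = (stop - start) / (n - 1)
        return [start + i * step for i in range(n)]

    xs = linspace(lowerleft[0], upperright[0], num_points[0])
    ys = linspace(lowerleft[1], upperright[1], num_points[1])
    return [(x, y, zval) for x in xs for y in ys]


# 3 points in x direction and 2 points in y direction give 6 grid points.
grid = regular_grid((0.0, 0.0), (2.0, 1.0), (3, 2))
```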

py4dgeo.set_py4dgeo_logfile(filename)

Set the logfile used by py4dgeo

All log messages produced by py4dgeo are logged into this file in addition to being logged to stdout/stderr. By default, that file is called ‘py4dgeo.log’.

Parameters:: filename (str) – The name of the logfile to use

py4dgeo.set_memory_policy(policy: MemoryPolicy)

Globally set py4dgeo’s memory policy

For details about the memory policy, see MemoryPolicy. Use this once before performing any operations. Changing the memory policy in the middle of the computation results in undefined behaviour.

Parameters:: policy (MemoryPolicy) – The policy value to globally set

class py4dgeo.MemoryPolicy

A descriptor for py4dgeo’s memory usage policy

This can be used to describe the memory usage policy that py4dgeo should follow. The implementation of py4dgeo checks the currently set policy whenever it would make a memory allocation of the same order of magnitude as the input pointcloud or the set of corepoints. To globally set the policy, use set_memory_policy().

Currently the following policies are available:

STRICT: py4dgeo is not allowed to do additional memory allocations. If such an allocation would be required, an error is thrown.
MINIMAL: py4dgeo is allowed to do additional memory allocations if and only if they are necessary for a seamless operation of the library.
COREPOINTS: py4dgeo is allowed to do additional memory allocations as part of performance trade-off considerations (e.g. precompute vs. recompute), but only if the allocation is on the order of the number of corepoints. This is the default behaviour of py4dgeo.
RELAXED: py4dgeo is allowed to do additional memory allocations as part of performance trade-off considerations (e.g. precompute vs. recompute).
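The policies listed above form an ordering from least to most permissive. The sketch below models that ordering with an IntEnum and a guard function. PolicySketch, policy_is_minimum and guarded_copy are hypothetical stand-ins, and the numeric values are an assumption.

```python
from enum import IntEnum


class PolicySketch(IntEnum):
    """Hypothetical stand-in for MemoryPolicy, ordered by permissiveness."""

    STRICT = 0
    MINIMAL = 1
    COREPOINTS = 2
    RELAXED = 3


class PolicyError(RuntimeError):
    """Raised when an allocation is not permitted by the current policy."""


# COREPOINTS is documented as the default behaviour.
current_policy = PolicySketch.COREPOINTS


def policy_is_minimum(required, current=None):
    """Check whether the current policy is at least as permissive as required."""
    current = current_policy if current is None else current
    return current >= required


def guarded_copy(values, required=PolicySketch.MINIMAL, current=None):
    """Copy the data only if the policy allows the extra allocation."""
    if not policy_is_minimum(required, current):
        raise PolicyError("memory policy forbids this allocation")
    return list(values)
```

Under the default policy a copy that requires MINIMAL is allowed, whereas the same call under STRICT raises an error.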

py4dgeo.get_num_threads()

Get the number of threads currently used by py4dgeo

Returns:: The number of threads
Return type:: int

py4dgeo.set_num_threads(num_threads: int)

Set the number of threads to use in py4dgeo

Parameters:: num_threads (int) – The number of threads to use

class py4dgeo.PBM3C2(registration_error=0.0)

Correspondence-driven plane-based M3C2 for lower uncertainty in 3D topographic change quantification.

This class implements the PBM3C2 algorithm as described in Zahs et al. (2022).

apply(apply_ids, search_radius=3.0): Apply trained classifier to find correspondences.

apply_nearest_neighbor(apply_ids, search_radius=10.0, correspondence_filter='none')

Find correspondences by nearest-neighbor matching of segment CoGs.

This mirrors the C2C nearest-neighbor logic with optional mutual-nearest-neighbor filtering.

get_new_ids(orig_epoch0_id=None, orig_epoch1_id=None): Retrieve new internal segment IDs from original IDs.

get_original_ids(new_epoch0_id=None, new_epoch1_id=None): Retrieve original segment IDs from new internal IDs.

static preprocess_epochs(epoch0, epoch1, correspondences_file=None)

Assign globally unique segment IDs using independent sequential numbering. Optionally map correspondence file IDs to the new ID scheme.

Parameters

epoch0, epoch1 (Epoch) – Input epochs with segment_id in additional_dimensions
correspondences_file (str, optional) – Path to CSV file with correspondence data

Returns

tuple: (processed_epoch0, processed_epoch1, remapped_correspondences, epoch0_id_mapping, epoch1_id_mapping, epoch0_reverse_mapping, epoch1_reverse_mapping)

run(epoch0, epoch1, correspondences_file=None, apply_ids=None, search_radius=3.0, correspondence_method='random_forest', correspondence_filter='none')

Execute complete PBM3C2 workflow.

Parameters

epoch0, epoch1 (Epoch) – Input point cloud epochs with segment_id
correspondences_file (str, optional) – Path to CSV with training correspondences. Required for correspondence_method=“random_forest”.
apply_ids (array-like) – Segment IDs to find correspondences for (using original IDs). If None, all segment IDs from epoch0 are used.
search_radius (float) – Spatial search radius in meters
correspondence_method (str) – Correspondence strategy. One of: “random_forest” (default), “nearest_neighbor”.
correspondence_filter (str) – Used when correspondence_method=“nearest_neighbor”. One of: “none”, “mutual_nearest_neighbors”.

Returns

DataFrame: Correspondences with distances and uncertainties (using original IDs)

train(correspondences): Train Random Forest classifier on labeled correspondences.

visualize_correspondences(epoch0_segment_id=None, use_original_ids=True, show_all=False, num_samples=10, figsize=(12, 10), elev=30, azim=45)

Visualize matched plane segments and their correspondences.

Parameters

epoch0_segment_id (int, optional) – Specific segment ID to visualize (original or new ID based on use_original_ids)
use_original_ids (bool, optional) – If True, interpret epoch0_segment_id as original ID (default: True)
show_all (bool, optional) – Plot all correspondences (default: False)
num_samples (int, optional) – Number of random samples if show_all=False (default: 10)
figsize (tuple, optional) – Figure size in inches
elev (float, optional) – Elevation angle for 3D view in degrees
azim (float, optional) – Azimuth angle for 3D view in degrees

Returns

tuple: (fig, ax) matplotlib figure and axis objects

class py4dgeo.Vapc(epoch: Epoch, voxel_size: float, origin: list = None)

compute_bitemporal_mahalanobis(other: Vapc, alpha: float = 0.999, min_points: int = 30)

Compute per-voxel bitemporal Mahalanobis distance.

Parameters:

other (Vapc) – Other Vapc with the same voxel_size and origin.
alpha (float) – Significance level for the chi-square test.
min_points (int) – Minimum points per voxel required to compute the statistic.

Returns:

Vapc containing the union of voxels with the following outputs: change_type (1=only self, 2=only other, 3=shared), mahalanobis (distance for shared voxels, NaN otherwise), p_value (NaN for non-shared voxels), significance (1 if significant change on shared voxels, else 0), and changed (1 if occupancy change, significant change, or too few points).

Raises:

ValueError – If voxel_size or origin differ.

compute_centroids()

Compute the centroid of each voxel.

Returns:: Array of voxel centroids with shape (M, 3).

compute_closest_to_centroids()

Compute the point closest to each voxel centroid.

Returns:: Array of representative points with shape (M, 3).

compute_closest_to_voxel_centers()

Compute the point closest to each voxel center.

Returns:: Array of representative points with shape (M, 3).

compute_covariance()

Compute the covariance matrix per voxel.

Returns:: Array of covariance matrices with shape (M, 3, 3).

compute_eigenvalues()

Compute eigenvalues of the per-voxel covariance matrix.

Returns:: Array of eigenvalues with shape (M, 3), sorted in descending order.

compute_features(feature_names: list)

Compute voxel-wise features by name.

Parameters:: feature_names (list[str]) – List of feature names from AVAILABLE_COMPUTATIONS.
Returns:: Dictionary mapping feature names to arrays with per-voxel values.

compute_local_centroids()

Compute centroids relative to voxel centers.

Returns:: Array of local centroids with shape (M, 3).

compute_voxel_centers() → ndarray

Compute the center coordinates of each unique voxel.

Returns:: Array of voxel centers with shape (M, 3).

compute_voxel_indices() → ndarray

Compute voxel indices for each point.

Returns:: Integer voxel indices of shape (N, 3). The result is cached in self.voxel_indices.
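Integer voxel indices are typically obtained by flooring the coordinates relative to the origin, divided by the voxel size. The sketch assumes this convention; voxel_indices is a hypothetical helper and not the Vapc method itself.

```python
import math


def voxel_indices(points, voxel_size, origin=(0.0, 0.0, 0.0)):
    """Map each point to the integer index of the voxel that contains it."""
    return [
        tuple(math.floor((p[i] - origin[i]) / voxel_size) for i in range(3))
        for p in points
    ]


# Negative coordinates fall into voxels with negative indices.
indices = voxel_indices([(0.2, 0.9, 1.1), (-0.1, 2.0, 0.0)], voxel_size=1.0)
```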

delta_vapc(other: Vapc)

Compare voxel occupancy between two Vapc objects.

Parameters:: other (Vapc) – The other Vapc instance to compare against.
Returns:: Vapc containing the union of voxels with out["delta_vapc"] labels (1=only self, 2=only other, 3=shared).
Raises:: ValueError – If voxel_size or origin differ.
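The occupancy labels can be sketched as a set comparison over voxel indices. occupancy_labels is a hypothetical helper that works on plain tuples rather than Vapc objects.

```python
def occupancy_labels(voxels_self, voxels_other):
    """Label every voxel of the union: 1=only self, 2=only other, 3=shared."""
    occupied_self, occupied_other = set(voxels_self), set(voxels_other)
    labels = {}
    for voxel in sorted(occupied_self | occupied_other):
        if voxel in occupied_self and voxel in occupied_other:
            labels[voxel] = 3
        elif voxel in occupied_self:
            labels[voxel] = 1
        else:
            labels[voxel] = 2
    return labels


labels = occupancy_labels([(0, 0, 0), (1, 0, 0)], [(1, 0, 0), (2, 0, 0)])
```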

filter(mask, overwrite=False)

Filter points by a boolean mask.

Parameters:

mask – Boolean mask of length N indicating which points to keep.
overwrite (bool) – If True, update this Vapc in place and return it.

Returns:

A new Vapc containing the filtered points, or self if overwriting.

group(): Group points into voxels.

group_voxels()

Group points into voxels using integer voxel indices.

Returns:: A tuple (unique_voxels, inverse, counts) where unique_voxels is an array of voxel indices (M, 3), inverse maps each point to its voxel (N,), and counts contains the number of points per voxel (M,).
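The three return values have the same meaning as the result of a unique-with-inverse-and-counts operation on the voxel indices. A minimal pure-Python sketch, with a hypothetical stand-alone function and lexicographic ordering of the unique voxels as an assumption:

```python
def group_voxels(indices):
    """Return unique voxel indices, the point-to-voxel mapping and counts."""
    unique_voxels = sorted(set(indices))
    position = {voxel: i for i, voxel in enumerate(unique_voxels)}
    # inverse[n] is the row of unique_voxels that point n belongs to.
    inverse = [position[voxel] for voxel in indices]
    counts = [0] * len(unique_voxels)
    for i in inverse:
        counts[i] += 1
    return unique_voxels, inverse, counts


unique_voxels, inverse, counts = group_voxels(
    [(0, 0, 1), (2, 0, 0), (0, 0, 1), (0, 0, 1)]
)
```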

map_features_to_points() → dict

Map voxel-wise features back to the original point cloud.

Returns:: Dictionary mapping feature names to per-point arrays. The result is stored in self.out and self.mapped is set to True.

reduce_to_feature(feature_name: str) → Vapc

Reduce to one representative point per voxel.

Parameters:: feature_name (str) – Reduction mode: closest_to_centroids, closest_to_voxel_centers, centroid, or voxel_center.
Returns:: A new Vapc with one point per voxel and carried or aggregated attributes.
Raises:: ValueError – If the feature name is unsupported.

save_as_las(outfile: str, las_point_format=7, las_version='1.4', las_scales=None, las_offset=None, skip={})

Save the point cloud and attributes to a LAS/LAZ file.

Parameters:

outfile (str) – Path where the LAS or LAZ file will be stored.
las_point_format – LAS point format number.
las_version – LAS file version string.
las_scales – Scale factors for X, Y, Z. Defaults to [0.00025, 0.00025, 0.00025].
las_offset – Offset values for X, Y, Z. Defaults to the minimum coordinate values.
skip – Dictionary of attribute names to skip when writing.

save_as_ply(outfile: str, mode: str | ndarray = 'grid', features: list[str] = None, shift_to_center: bool = False)

Export occupied voxels as cubes in a PLY mesh.

Parameters:

outfile (str) – Output PLY path.
mode (str or np.ndarray) – Either a mode string (“grid”, “voxel_center”, “centroid”, “closest_to_centroids”, “closest_to_voxel_centers”) or an array of custom centers.
features (list[str] or None) – Feature names to store per vertex.
shift_to_center (bool) – If True, recenter vertices around the mesh centroid.

select_by_mask(vapc_mask: Vapc, segment_in_or_out: str = 'in', overwrite: bool = False) → Vapc

Select points based on overlap with another Vapc mask.

Parameters:

vapc_mask (Vapc) – Vapc whose occupied voxels define the mask.
segment_in_or_out (str) – “in” to keep points in voxels that overlap with the mask, “out” to keep points in voxels that do not overlap with the mask.
overwrite (bool) – If True, overwrite this Vapc’s data with the selection.

Returns:

A tuple (selected_vapc, selection_mask) where selection_mask is a boolean array of length N.
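The selection logic can be sketched on plain point lists. The sketch assumes floor-based voxel indices with the origin at zero; select_by_mask_sketch and voxel_of are hypothetical helpers.

```python
import math


def voxel_of(point, voxel_size):
    """Integer voxel index of a point, assuming the origin at (0, 0, 0)."""
    return tuple(math.floor(c / voxel_size) for c in point)


def select_by_mask_sketch(points, mask_points, voxel_size, segment_in_or_out="in"):
    """Keep points inside ("in") or outside ("out") the mask's occupied voxels."""
    if segment_in_or_out not in ("in", "out"):
        raise ValueError("segment_in_or_out must be 'in' or 'out'")
    occupied = {voxel_of(p, voxel_size) for p in mask_points}
    keep_inside = segment_in_or_out == "in"
    selection_mask = [(voxel_of(p, voxel_size) in occupied) == keep_inside for p in points]
    selected = [p for p, keep in zip(points, selection_mask) if keep]
    return selected, selection_mask


points = [(0.1, 0.1, 0.1), (5.5, 0.0, 0.0)]
mask_points = [(0.9, 0.9, 0.9)]
inside, inside_mask = select_by_mask_sketch(points, mask_points, 1.0, "in")
outside, outside_mask = select_by_mask_sketch(points, mask_points, 1.0, "out")
```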

py4dgeo.enable_trace(enable=True)

Enable or disable trace output for decorated functions.

Parameters:: enable (bool) – If True, emit trace output. If False, suppress it.

py4dgeo.enable_timeit(enable=True)

Enable or disable timing output for decorated functions.

Parameters:: enable (bool) – If True, emit timing output. If False, suppress it.

Developer API reference

py4dgeo.epoch.as_epoch(cloud)

Create an epoch from a cloud

Idempotent operation to create an epoch from a cloud.

py4dgeo.epoch.normalize_timestamp(timestamp): Bring a given timestamp into a standardized Python format

class py4dgeo.m3c2.M3C2LikeAlgorithm(epochs: Tuple[Epoch, ...] | None = None, corepoints: ndarray | None = None, cyl_radii: List | None = None, cyl_radius: float | None = None, max_distance: float = 0.0, registration_error: float = 0.0, robust_aggr: bool = False)

calculate_distances(epoch1, epoch2, searchtree: str | None = None): Calculate the distances between two epochs

callback_distance_calculation(): The callback used to calculate the distance between two point clouds

callback_workingset_finder(): The callback used to determine the point cloud subset around a corepoint

directions(): The normal direction(s) to use for this algorithm.

run(): Main entry point for running the algorithm

class py4dgeo.fallback.PythonFallbackM3C2(normal_radii: List[float] = None, orientation_vector: ndarray = array([0, 0, 1]), corepoint_normals: ndarray = None, cloud_for_normals: Epoch = None, **kwargs)

Bases: M3C2

An implementation of M3C2 that makes use of Python fallback implementations

calculate_distances(epoch1, epoch2, searchtree: str | None = None): Calculate the distances between two epochs

callback_distance_calculation(): The callback used to calculate the distance between two point clouds

callback_workingset_finder(): The callback used to determine the point cloud subset around a corepoint

directions(): The normal direction(s) to use for this algorithm.

run(): Main entry point for running the algorithm

class py4dgeo.segmentation.RegionGrowingAlgorithmBase(neighborhood_radius=1.0, thresholds=[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9], min_segments=20, max_segments=None)

property analysis

Access the analysis object that the algorithm operates on

This is only available after run has been called.

distance_measure()

Distance measure between two time series

Expected to return a function that accepts two time series and returns the distance.

filter_objects(obj)

A filter for objects produced by the region growing algorithm

Objects are discarded if this method returns False.

find_seedpoints(): Calculate seedpoints for the region growing algorithm

run(analysis, force=False)

Calculate the segmentation

Parameters:

analysis (py4dgeo.segmentation.SpatiotemporalAnalysis) – The analysis object we are working with.
force – Force recalculation of results. If false, some intermediate results will be restored from the analysis object instead of being recalculated.

seed_sorting_scorefunction()

A function that computes a score for a seed candidate

This function is used to prioritize seed candidates.

class py4dgeo.segmentation.RegionGrowingSeed(index, start_epoch, end_epoch)

class py4dgeo.segmentation.ObjectByChange(data, seed, analysis=None)

Representation of a change object in the spatiotemporal domain

property end_epoch: The index of the end epoch of the change object

property indices: The set of corepoint indices that compose the object by change

plot(filename=None)

Create an informative visualization of the Object By Change

Parameters:: filename (str) – The filename to use to store the plot. Can be omitted to only show the plot in a Jupyter notebook session.

property start_epoch: The index of the start epoch of the change object

property threshold: The distance threshold that produced this object

py4dgeo.segmentation.check_epoch_timestamp(epoch): Validate an epoch to be used with SpatiotemporalSegmentation

class py4dgeo.util.Py4DGeoError(msg, loggername='py4dgeo')

py4dgeo.util.find_file(filename, fatal=True)

Find a file of given name on the file system.

This function is intended for use in tests and demo applications to locate data files without resorting to absolute paths. You may use it for your code as well.

It looks in the following locations:

If an absolute filename is given, it is used
Check whether the given relative path exists with respect to the current working directory
Check whether the given relative path exists with respect to the specified XDG data directory (e.g. through the environment variable XDG_DATA_DIRS).
Check whether the given relative path exists in downloaded test data.
If still not found, attempt to download test data and search again.

Parameters:

filename (str) – The (relative) filename to search for
fatal (bool) – Whether not finding the file should be a fatal error

Returns:

An absolute filename
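The first steps of the lookup order can be sketched as follows. This simplified version only checks an explicit list of directories and omits the test-data download. find_file_sketch is a hypothetical helper, and returning None for a missing file with fatal=False is an assumption.

```python
import os
import tempfile
from pathlib import Path


def find_file_sketch(filename, search_dirs, fatal=True):
    """Return an absolute path for filename, searching the given directories."""
    path = Path(filename)
    if path.is_absolute():
        return str(path)  # absolute filenames are used as given
    for directory in search_dirs:
        candidate = Path(directory) / path
        if candidate.exists():
            return str(candidate.resolve())
    if fatal:
        raise FileNotFoundError(f"Cannot find file {filename}")
    return None  # assumption: the behaviour of the real function may differ


with tempfile.TemporaryDirectory() as data_dir:
    (Path(data_dir) / "plane.xyz").write_text("0 0 0\n")
    # Search the working directory first, then the data directory.
    found = find_file_sketch("plane.xyz", [os.getcwd(), data_dir])
    missing = find_file_sketch("does_not_exist_4dgeo.xyz", [data_dir], fatal=False)
```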

py4dgeo.util.as_double_precision(arr: ndarray, policy_check=True)

Ensure that a numpy array is double precision

This is a no-op if the array is already double precision and makes a copy if it is not. It checks py4dgeo’s memory policy before copying.

Parameters:: arr (np.ndarray) – The numpy array

py4dgeo.util.make_contiguous(arr: ndarray)

Make a numpy array contiguous

This is a no-op if the array is already contiguous and makes a copy if it is not. It checks py4dgeo’s memory policy before copying.

Parameters:: arr (np.ndarray) – The numpy array

py4dgeo.util.memory_policy_is_minimum(policy: MemoryPolicy)

Whether or not the globally set memory policy is at least the given one

Parameters:: policy (MemoryPolicy) – The policy value to compare against
Returns:: Whether the globally set policy is at least the given one
Return type:: bool

py4dgeo.util.append_file_extension(filename, extension): Append a file extension if and only if the original filename has none
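The "if and only if" behaviour can be sketched with os.path.splitext. append_extension is a hypothetical helper; whether the real function expects the extension with or without a leading dot is not stated here, so the sketch accepts both.

```python
import os


def append_extension(filename, extension):
    """Append the extension only if filename does not already have one."""
    _, existing = os.path.splitext(filename)
    if existing:
        return filename
    return f"{filename}.{extension.lstrip('.')}"


with_added = append_extension("epoch", "zip")
unchanged = append_extension("epoch.las", "zip")
```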

py4dgeo.util.is_iterable(obj): Whether the object is an iterable (excluding a string)
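A check of this kind can be written with collections.abc.Iterable. The following is a minimal sketch of the described behaviour, not the library's code; is_iterable_sketch is a hypothetical name.

```python
from collections.abc import Iterable


def is_iterable_sketch(obj):
    """Return True for iterables, but treat strings as scalar values."""
    return isinstance(obj, Iterable) and not isinstance(obj, str)


list_result = is_iterable_sketch([1, 2, 3])
string_result = is_iterable_sketch("epoch.las")
number_result = is_iterable_sketch(3)
```

Excluding strings is useful when an argument may be either a single filename or a collection of filenames.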

py4dgeo.logger.create_default_logger(filename=None)

py4dgeo.logger.logger_context(msg, level=20)