Python API reference

User API reference

This is the complete reference of the Python API for the py4dgeo package. It focuses on those aspects relevant to end users who are not interested in algorithm development.

class py4dgeo.Epoch(cloud: ndarray, normals: ndarray | None = None, additional_dimensions: ndarray | None = None, timestamp=None, scanpos_info: dict | None = None)

build_kdtree(leaf_size=10, force_rebuild=False)

Build the search tree index

Parameters:

leaf_size (int) – An internal optimization parameter of the search tree data structure. The algorithm uses a brute-force search on subtrees whose size is below the given threshold. Increasing this value speeds up search tree build time, but slows down query times.
force_rebuild (bool) – Rebuild the search tree even if it was already built before.
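
The trade-off behind leaf_size can be illustrated with a toy k-d tree. This is a minimal pure-Python sketch and not py4dgeo's actual search tree: build_tree, nearest and count_nodes are hypothetical helpers written only for this illustration. Subtrees holding at most leaf_size points become leaves that are scanned by brute force, so a larger leaf_size yields fewer nodes to build but more points to scan per query.

```python
import math
import random


def build_tree(points, leaf_size=10, depth=0):
    """Split points recursively; stop once a node holds <= leaf_size points."""
    leaf_size = max(1, leaf_size)  # guard against endless splitting
    if len(points) <= leaf_size:
        return {"leaf": True, "points": list(points)}
    axis = depth % 3  # cycle through x, y, z
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return {
        "leaf": False,
        "axis": axis,
        "split": ordered[mid][axis],
        "left": build_tree(ordered[:mid], leaf_size, depth + 1),
        "right": build_tree(ordered[mid:], leaf_size, depth + 1),
    }


def nearest(node, query, best=None):
    """Return (distance, point) of the nearest neighbor of query."""
    if node["leaf"]:
        # Brute-force scan of the leaf
        for p in node["points"]:
            d = math.dist(p, query)
            if best is None or d < best[0]:
                best = (d, p)
        return best
    diff = query[node["axis"]] - node["split"]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    best = nearest(near, query, best)
    # Visit the other side only if the splitting plane is closer than the best hit
    if best is None or abs(diff) < best[0]:
        best = nearest(far, query, best)
    return best


def count_nodes(node):
    """Count all nodes (inner nodes and leaves) of the tree."""
    if node["leaf"]:
        return 1
    return 1 + count_nodes(node["left"]) + count_nodes(node["right"])


random.seed(0)
cloud = [(random.random(), random.random(), random.random()) for _ in range(500)]
query = (0.5, 0.5, 0.5)
fine_tree = build_tree(cloud, leaf_size=1)     # many nodes, tiny leaves
coarse_tree = build_tree(cloud, leaf_size=64)  # few nodes, large leaves
```

Both trees return the same nearest neighbor; they differ only in how the work is divided between tree construction and the per-query scan.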

calculate_normals(radius=1.0, orientation_vector: ndarray = array([0, 0, 1]))

Calculate point cloud normals

Parameters:

radius – The radius used to determine the neighborhood of a point.
orientation_vector – A vector to determine orientation of the normals. It should point “up”.

static load(filename)

Construct an Epoch instance by loading it from a file

Parameters: filename (str) – The filename to load the epoch from.

property metadata

Provide the metadata of this epoch as a Python dictionary

The return value of this property only makes use of Python built-in data structures such that it can e.g. be serialized using the JSON module. Also, the returned values are understood by Epoch.__init__ such that you can do Epoch(cloud, **other.metadata).
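
The round trip described above can be sketched with a stand-in class. ToyEpoch is hypothetical and only mirrors the timestamp and scanpos_info keyword arguments from the Epoch signature; the exact keys of the real metadata dictionary and the string form of the timestamp are assumptions of this sketch.

```python
import json


class ToyEpoch:
    """Hypothetical stand-in for py4dgeo.Epoch, reduced to the metadata pattern."""

    def __init__(self, cloud, timestamp=None, scanpos_info=None):
        self.cloud = cloud
        self.timestamp = timestamp
        self.scanpos_info = scanpos_info

    @property
    def metadata(self):
        # Only built-in types are used, so the result can be passed to json.dumps
        return {"timestamp": self.timestamp, "scanpos_info": self.scanpos_info}


first = ToyEpoch([(0.0, 0.0, 0.0)], timestamp="2024-01-01T00:00:00", scanpos_info={"id": 1})

# Serialize the metadata, then reuse it as keyword arguments for a new instance
serialized = json.dumps(first.metadata)
second = ToyEpoch([(1.0, 2.0, 3.0)], **json.loads(serialized))
```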

save(filename)

Save this epoch to a file

Parameters: filename (str) – The filename to save the epoch in.

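transform(transformation=None, affine_transformation=None, rotation=None, translation=None, reduction_point=None)

Transform the epoch with an affine transformation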

Parameters:

transformation (np.ndarray) – A Transformation object that describes the transformation to apply. If this argument is given, the other arguments are ignored. This parameter is typically used if the transformation was calculated by py4dgeo itself.
affine_transformation – A 4x4 or 3x4 matrix representing the affine transformation. Given as a numpy array. If this argument is given, the rotation and translation arguments are ignored.
rotation (np.ndarray) – A 3x3 matrix specifying the rotation to apply
translation (np.ndarray) – A vector specifying the translation to apply
reduction_point (np.ndarray) – A translation vector to apply before applying rotation and scaling. This is used to increase the numerical accuracy of transformation. If a transformation is given, this argument is ignored.
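
The role of the reduction point can be sketched in a few lines. The helper apply_affine below is hypothetical, and the convention it uses (subtract the reduction point, rotate, translate, add the reduction point back) is an assumption about how such a parameter is commonly applied, not a statement about py4dgeo's internals.

```python
def apply_affine(points, rotation, translation, reduction_point=(0.0, 0.0, 0.0)):
    """Apply p' = R (p - c) + t + c to every point, with c the reduction point."""
    transformed = []
    for p in points:
        # Shift towards the origin so that the rotation acts on small coordinates
        shifted = [p[i] - reduction_point[i] for i in range(3)]
        rotated = [sum(rotation[r][c] * shifted[c] for c in range(3)) for r in range(3)]
        transformed.append(
            tuple(rotated[i] + translation[i] + reduction_point[i] for i in range(3))
        )
    return transformed


# 90 degree rotation about the z axis
rot_z = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
moved = apply_affine(
    [(101.0, 100.0, 0.0)],
    rotation=rot_z,
    translation=(0.0, 0.0, 5.0),
    reduction_point=(100.0, 100.0, 0.0),
)
```

With large georeferenced coordinates, rotating the shifted values rather than the raw ones keeps the products small, which is where the gain in numerical accuracy comes from.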

property transformation

Access the affine transformations that were applied to this epoch

In order to set this property please use the transform method instead, which will make sure to also apply the transformation.

Returns: A list of applied transformations. Each is given as a tuple of a 4x4 matrix defining the affine transformation and the reduction point used when applying it.

py4dgeo.read_from_las(*filenames, normal_columns=[], additional_dimensions={})

Create an epoch from a LAS/LAZ file

Parameters:

filename (str) – The filename to read from. It is expected to be in LAS/LAZ format and will be processed using laspy.
normal_columns (list) – The column names of the normal vector components, e.g. “NormalX”, “nx”, “normal_x” etc. There must be exactly three columns. Leave empty if your data file does not contain normals.
additional_dimensions (dict) – A dictionary, mapping names of additional data dimensions in the input dataset to additional data dimensions in our epoch data structure.

py4dgeo.read_from_xyz(*filenames, xyz_columns=[0, 1, 2], normal_columns=[], additional_dimensions={}, **parse_opts)

Create an epoch from an xyz file

Parameters:

filename (str) – The filename to read from. Each line in the input file is expected to contain three space-separated numbers.
xyz_columns (list) – The column indices of X, Y and Z coordinates. Defaults to [0, 1, 2].
normal_columns (list) – The column indices of the normal vector components. Leave empty, if your data file does not contain normals, otherwise exactly three indices for the x, y and z components need to be given.
parse_opts (dict) – Additional options forwarded to numpy.genfromtxt. This can be used to e.g. change the delimiter character, remove header_lines or manually specify which columns of the input contain the XYZ coordinates.
additional_dimensions – A dictionary, mapping column indices to names of additional data dimensions. They will be read from the file and are accessible under their names from the created Epoch objects. Additional column indexes start with 3.
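
How the column arguments select data can be sketched with a small text parser. parse_xyz is a hypothetical helper that works on an in-memory string and plain Python lists; it is not how py4dgeo reads files (the reference above states that numpy.genfromtxt is used) and only mirrors the meaning of xyz_columns, normal_columns and additional_dimensions.

```python
def parse_xyz(text, xyz_columns=(0, 1, 2), normal_columns=(), additional_dimensions=None,
              delimiter=None):
    """Split each line into numbers and pick the requested columns."""
    additional_dimensions = additional_dimensions or {}
    if normal_columns and len(normal_columns) != 3:
        raise ValueError("normal_columns must be empty or contain exactly three indices")

    cloud, normals = [], []
    extra = {name: [] for name in additional_dimensions.values()}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip empty lines
        fields = [float(value) for value in line.split(delimiter)]
        cloud.append(tuple(fields[i] for i in xyz_columns))
        if normal_columns:
            normals.append(tuple(fields[i] for i in normal_columns))
        # additional_dimensions maps a column index to the name of the dimension
        for column, name in additional_dimensions.items():
            extra[name].append(fields[column])
    return cloud, (normals or None), extra


sample = "1 2 3 0 0 1 7\n4 5 6 0 1 0 9\n"
cloud, normals, extra = parse_xyz(
    sample, normal_columns=(3, 4, 5), additional_dimensions={6: "intensity"}
)
```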

py4dgeo.save_epoch(epoch, filename)

Save an epoch to a given filename

Parameters:

epoch (Epoch) – The epoch that should be saved.
filename (str) – The filename where to save the epoch

py4dgeo.load_epoch(filename)

Load an epoch from a given filename

Parameters: filename (str) – The filename to load the epoch from.

class py4dgeo.M3C2(normal_radii: List[float] | None = None, orientation_vector: ndarray = array([0, 0, 1]), corepoint_normals: ndarray | None = None, cloud_for_normals: Epoch | None = None, **kwargs)

Bases: M3C2LikeAlgorithm

calculate_distances(epoch1, epoch2): Calculate the distances between two epochs

callback_distance_calculation(): The callback used to calculate the distance between two point clouds

callback_workingset_finder(): The callback used to determine the point cloud subset around a corepoint

directions(): The normal direction(s) to use for this algorithm.

run(): Main entry point for running the algorithm

class py4dgeo.CloudCompareM3C2(**params)

Bases: M3C2

calculate_distances(epoch1, epoch2): Calculate the distances between two epochs

callback_distance_calculation(): The callback used to calculate the distance between two point clouds

callback_workingset_finder(): The callback used to determine the point cloud subset around a corepoint

directions(): The normal direction(s) to use for this algorithm.

run(): Main entry point for running the algorithm

class py4dgeo.SpatiotemporalAnalysis(filename, compress=True, allow_pickle=True, force=False)

add_epochs(*epochs): Add a number of epochs to the existing analysis

property corepoints: Access the corepoints of this analysis

property distances: Access the M3C2 distances of this analysis

property distances_for_compute

Retrieve the distance array used for computation

This might be the raw data or smoothed data, based on whether a smoothing was provided by the user.

invalidate_results(seeds=True, objects=True, smoothed_distances=False)

Invalidate (and remove) calculated results

This is automatically called when new epochs are added or when an algorithm sets the force option.

property m3c2: Access the M3C2 algorithm of this analysis

property objects: The list of objects by change for this analysis

property reference_epoch: Access the reference epoch of this analysis

property seeds: The list of seed candidates for this analysis

property timedeltas: Access the sequence of time stamp deltas for the time series

property uncertainties: Access the M3C2 uncertainties of this analysis

class py4dgeo.RegionGrowingAlgorithm(seed_subsampling=1, seed_candidates=None, window_width=24, window_min_size=12, window_jump=1, window_penalty=1.0, minperiod=24, height_threshold=0.0, **kwargs)

Bases: RegionGrowingAlgorithmBase

property analysis

Access the analysis object that the algorithm operates on

This is only available after run has been called.

distance_measure()

Distance measure between two time series

Expected to return a function that accepts two time series and returns the distance.
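
The "function that returns a function" contract can be sketched as follows. The Euclidean distance used here is only a placeholder chosen for this illustration; which measure py4dgeo actually uses is not stated in this reference.

```python
import math


def distance_measure():
    """Return a callable that maps two time series to a distance."""

    def euclidean(series_a, series_b):
        # Ignore time steps where either series has no valid value
        pairs = [
            (a, b)
            for a, b in zip(series_a, series_b)
            if not (math.isnan(a) or math.isnan(b))
        ]
        return math.sqrt(sum((a - b) ** 2 for a, b in pairs))

    return euclidean


measure = distance_measure()
plain = measure([0.0, 0.0, 0.0], [3.0, 4.0, 0.0])
with_gap = measure([0.0, float("nan")], [1.0, 2.0])
```

Returning a callable rather than a value means a subclass can swap in a different measure by overriding a single method.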

filter_objects(obj): A filter for objects produced by the region growing algorithm

find_seedpoints(): Calculate seedpoints for the region growing algorithm

run(analysis, force=False)

Calculate the segmentation

Parameters:

analysis (py4dgeo.segmentation.SpatiotemporalAnalysis) – The analysis object we are working with.
force – Force recalculation of results. If false, some intermediate results will be restored from the analysis object instead of being recalculated.

seed_sorting_scorefunction(): Neighborhood similarity sorting function

py4dgeo.regular_corepoint_grid(lowerleft, upperright, num_points, zval=0.0)

A helper function to create a regularly spaced grid for the analysis

Parameters:

lowerleft (np.ndarray) – The lower left corner of the grid. Given as a 2D coordinate.
upperright (np.ndarray) – The upper right corner of the grid. Given as a 2D coordinate.
num_points (tuple) – A tuple with two entries denoting the number of points to be used in x and y direction
zval (double) – The value to fill for the z-direction.
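
A regularly spaced grid of this kind can be sketched in pure Python. regular_grid and linspace are hypothetical helpers; that both corners are included and that the points are ordered with x varying slowest are assumptions of this sketch, not documented behaviour.

```python
def linspace(start, stop, num):
    """Return num evenly spaced values from start to stop, both inclusive."""
    if num == 1:
        return [float(start)]
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


def regular_grid(lowerleft, upperright, num_points, zval=0.0):
    """Build a list of (x, y, z) points on a regular grid with constant z."""
    xs = linspace(lowerleft[0], upperright[0], num_points[0])
    ys = linspace(lowerleft[1], upperright[1], num_points[1])
    return [(x, y, zval) for x in xs for y in ys]


# 3 points in x direction, 2 points in y direction
grid = regular_grid((0.0, 0.0), (2.0, 1.0), (3, 2))
```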

py4dgeo.set_py4dgeo_logfile(filename)

Set the logfile used by py4dgeo

All log messages produced by py4dgeo are logged into this file in addition to being logged to stdout/stderr. By default, that file is called ‘py4dgeo.log’.

Parameters: filename (str) – The name of the logfile to use

py4dgeo.set_memory_policy(policy: MemoryPolicy)

Globally set py4dgeo’s memory policy

For details about the memory policy, see MemoryPolicy. Use this once before performing any operations. Changing the memory policy in the middle of the computation results in undefined behaviour.

Parameters: policy (MemoryPolicy) – The policy value to globally set

class py4dgeo.MemoryPolicy

A descriptor for py4dgeo’s memory usage policy

This can be used to describe the memory usage policy that py4dgeo should follow. The implementation of py4dgeo checks the currently set policy whenever it would make a memory allocation of the same order of magnitude as the input point cloud or the set of corepoints. To globally set the policy, use set_memory_policy().

Currently the following policies are available:

STRICT: py4dgeo is not allowed to do additional memory allocations. If such an allocation would be required, an error is thrown.
MINIMAL: py4dgeo is allowed to do additional memory allocations if and only if they are necessary for a seamless operation of the library.
COREPOINTS: py4dgeo is allowed to do additional memory allocations as part of performance trade-off considerations (e.g. precompute vs. recompute), but only if the allocation is on the order of the number of corepoints. This is the default behaviour of py4dgeo.
RELAXED: py4dgeo is allowed to do additional memory allocations as part of performance trade-off considerations (e.g. precompute vs. recompute).
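
The four policies read as an ordered scale from most to least restrictive, which can be sketched with an ordered enum and an "at least" check. Everything below is hypothetical: ToyMemoryPolicy, its numeric values, ToyPolicyError and the helper functions only mimic the described behaviour and are not py4dgeo's implementation.

```python
from enum import IntEnum


class ToyMemoryPolicy(IntEnum):
    """Hypothetical stand-in for MemoryPolicy; the numeric values are assumed."""

    STRICT = 0
    MINIMAL = 1
    COREPOINTS = 2
    RELAXED = 3


class ToyPolicyError(Exception):
    """Raised when an allocation is not permitted by the active policy."""


_active_policy = ToyMemoryPolicy.COREPOINTS  # the documented default


def set_policy(policy):
    """Globally set the active policy."""
    global _active_policy
    _active_policy = policy


def policy_is_minimum(policy):
    """Whether the active policy is at least as permissive as the given one."""
    return _active_policy >= policy


def guarded_copy(values, required_policy):
    """Copy values only if the active policy permits the allocation."""
    if not policy_is_minimum(required_policy):
        raise ToyPolicyError(f"copy requires at least {required_policy.name}")
    return list(values)
```

A library function that wants to precompute a corepoint-sized buffer would call guarded_copy(values, ToyMemoryPolicy.COREPOINTS) in this sketch and fail early under STRICT or MINIMAL.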

py4dgeo.get_num_threads()

Get the number of threads currently used by py4dgeo

Returns: The number of threads
Return type: int

py4dgeo.set_num_threads(num_threads: int)

Set the number of threads to use in py4dgeo

Parameters: num_threads (int) – The number of threads to use

class py4dgeo.PBM3C2(per_point_computation=PerPointComputation(), segmentation=Segmentation(), second_segmentation=Segmentation(), extract_segments=ExtractSegments(), classifier=ClassifierWrapper())

compute_distances(epoch0: Epoch | ndarray | None = None, epoch1: Epoch | None = None, alignment_error: float = 1.1, **kwargs) → Tuple[ndarray, ndarray] | None

Compute the distance between two epochs. At the end of the computation, it also adds the following properties: distances, corepoints (corepoints of epoch0), epochs (epoch0, epoch1), uncertainties

Parameters:

epoch0 – Epoch object or np.ndarray.
epoch1 – Epoch object.
alignment_error – The alignment (registration) error between the point clouds.
kwargs –
Used to customize the default pipeline parameters.

To get the default parameters, pass e.g. “get_pipeline_options”: if this parameter is True, the method prints the pipeline options as kwargs.

e.g. “output_file_name” (of a specific step in the pipeline): the default value is “None”. If it is set, the result of the computation at that step is dumped as an xyz file.

e.g. “distance_3D_threshold” (part of Segmentation Transform)

This process is stateless.

Returns:

tuple [distances, uncertainties] | None

‘distances’ is an np.array of shape (nr_similar_pairs, 1). ‘uncertainties’ is an np.array of shape (nr_similar_pairs, 1) with the following structure:

dtype=np.dtype([("lodetection", "<f8"), ("spread1", "<f8"), ("num_samples1", "<i8"), ("spread2", "<f8"), ("num_samples2", "<i8")])

export_segmented_point_cloud_and_segments(epoch0: Epoch | None = None, epoch1: Epoch | None = None, x_y_z_id_epoch0_file_name: str | None = 'x_y_z_id_epoch0.xyz', x_y_z_id_epoch1_file_name: str | None = 'x_y_z_id_epoch1.xyz', extracted_segments_file_name: str | None = 'extracted_segments.seg', concatenate_name='', **kwargs) → Tuple[ndarray, ndarray | None, ndarray] | None

For each epoch, it returns the segmentation of the point cloud as a numpy array of shape (n_points, 4), where each row has the structure x, y, z, segment_id. It also serializes these arrays using the provided file names.

It also generates a numpy array of segments with the following columns:

X_COLUMN, Y_COLUMN, Z_COLUMN -> Center of Gravity
EPOCH_ID_COLUMN -> 0/1
EIGENVALUE0_COLUMN, EIGENVALUE1_COLUMN, EIGENVALUE2_COLUMN
EIGENVECTOR_0_X_COLUMN, EIGENVECTOR_0_Y_COLUMN, EIGENVECTOR_0_Z_COLUMN
EIGENVECTOR_1_X_COLUMN, EIGENVECTOR_1_Y_COLUMN, EIGENVECTOR_1_Z_COLUMN
EIGENVECTOR_2_X_COLUMN, EIGENVECTOR_2_Y_COLUMN, EIGENVECTOR_2_Z_COLUMN -> Normal vector
LLSV_COLUMN -> lowest local surface variation
SEGMENT_ID_COLUMN
STANDARD_DEVIATION_COLUMN
NR_POINTS_PER_SEG_COLUMN

Parameters:

epoch0 – Epoch object | None
epoch1 – Epoch object | None
x_y_z_id_epoch0_file_name – The output file name for epoch0, point cloud segmentation, saved as a numpy array (n_points, 4) (x,y,z, segment_id) | None
x_y_z_id_epoch1_file_name – The output file name for epoch1, point cloud segmentation, saved as a numpy array (n_points, 4) (x,y,z, segment_id) | None
extracted_segments_file_name – The output file name for the file containing the segments, saved as a numpy array containing the column structure introduced above. | None
concatenate_name – String that is utilized to uniquely identify the same transformer between multiple pipelines.
kwargs –
Used to customize the default pipeline parameters.

To get the default parameters, pass e.g. “get_pipeline_options”: if this parameter is True, the method prints the pipeline options as kwargs.

e.g. “output_file_name” (of a specific step in the pipeline): the default value is “None”. If it is set, the result of the computation at that step is dumped as an xyz file.

e.g. “distance_3D_threshold” (part of Segmentation Transform)

This process is stateless.

Returns:

tuple [ x_y_z_id_epoch0, x_y_z_id_epoch1 | None, extracted_segments ] | None

generate_extended_labels_interactively(epoch0: ~py4dgeo.epoch.Epoch | None = None, epoch1: ~py4dgeo.epoch.Epoch | None = None, builder_extended_y: ~py4dgeo.pbm3c2.BuilderExtended_y_Visually = <py4dgeo.pbm3c2.BuilderExtended_y_Visually object>, **kwargs) → Tuple[ndarray, ndarray] | None

Given 2 Epochs, it builds a pair of (segments and ‘extended y’).

Parameters:

epoch0 – Epoch object.
epoch1 – Epoch object.
builder_extended_y – The object is used for generating ‘extended y’, visually.
kwargs –
Used to customize the default pipeline parameters.

To get the default parameters, pass e.g. “get_pipeline_options”: if this parameter is True, the method prints the pipeline options as kwargs.

e.g. “output_file_name” (of a specific step in the pipeline): the default value is “None”. If it is set, the result of the computation at that step is dumped as an xyz file.

e.g. “distance_3D_threshold” (part of Segmentation Transform)

This process is stateless.

Returns:

tuple [Segments, ‘extended y’] | None

where:

‘Segments’ has the following column structure:

X_COLUMN, Y_COLUMN, Z_COLUMN -> Center of Gravity
EPOCH_ID_COLUMN -> 0/1
EIGENVALUE0_COLUMN, EIGENVALUE1_COLUMN, EIGENVALUE2_COLUMN
EIGENVECTOR_0_X_COLUMN, EIGENVECTOR_0_Y_COLUMN, EIGENVECTOR_0_Z_COLUMN
EIGENVECTOR_1_X_COLUMN, EIGENVECTOR_1_Y_COLUMN, EIGENVECTOR_1_Z_COLUMN
EIGENVECTOR_2_X_COLUMN, EIGENVECTOR_2_Y_COLUMN, EIGENVECTOR_2_Z_COLUMN -> Normal vector
LLSV_COLUMN -> lowest local surface variation
SEGMENT_ID_COLUMN
STANDARD_DEVIATION_COLUMN
NR_POINTS_PER_SEG_COLUMN

’extended y’ has the following structure: (tuples of index segment from epoch0, index segment from epoch1, label(0/1)) used for learning.

predict(epoch0: Epoch | None = None, epoch1: Epoch | None = None, epoch_additional_dimensions_lookup: Dict[str, str] | None = None, **kwargs) → ndarray | None

After extracting the segments from epoch0 and epoch1, it returns a numpy array of corresponding pairs of segments between epoch 0 and epoch 1.

Parameters:

epoch0 – Epoch object.
epoch1 – Epoch object.
epoch_additional_dimensions_lookup –
A dictionary that maps between the names of the columns used internally and the names of the columns used by both epoch0 and epoch1.

No additional column is used.
kwargs –
Used to customize the default pipeline parameters.

To get the default parameters, pass e.g. “get_pipeline_options”: if this parameter is True, the method prints the pipeline options as kwargs.

e.g. “output_file_name” (of a specific step in the pipeline): the default value is “None”. If it is set, the result of the computation at that step is dumped as an xyz file.

e.g. “distance_3D_threshold” (part of Segmentation Transform)

This process is stateless.

Returns:

A numpy array ( n_pairs, segment_features_size*2 ) where each row contains a pair of segments. | None

training(segments: ndarray | None = None, extended_y: ndarray | None = None, extracted_segments_file_name: str = 'extracted_segments.seg', extended_y_file_name: str = 'extended_y.csv') → None

It applies the training algorithm for the input pairs of Segments ‘segments’ and extended labels ‘extended_y’.

Parameters:

segments –
‘Segments’ numpy array of shape (n_segments, segment_size)

It has the following column structure:

X_COLUMN, Y_COLUMN, Z_COLUMN -> Center of Gravity
EPOCH_ID_COLUMN -> 0/1
EIGENVALUE0_COLUMN, EIGENVALUE1_COLUMN, EIGENVALUE2_COLUMN
EIGENVECTOR_0_X_COLUMN, EIGENVECTOR_0_Y_COLUMN, EIGENVECTOR_0_Z_COLUMN
EIGENVECTOR_1_X_COLUMN, EIGENVECTOR_1_Y_COLUMN, EIGENVECTOR_1_Z_COLUMN
EIGENVECTOR_2_X_COLUMN, EIGENVECTOR_2_Y_COLUMN, EIGENVECTOR_2_Z_COLUMN -> Normal vector
LLSV_COLUMN -> lowest local surface variation
SEGMENT_ID_COLUMN
STANDARD_DEVIATION_COLUMN
NR_POINTS_PER_SEG_COLUMN
extended_y – numpy array of shape (m_labels, 3) has the following structure: (tuples of index segment from epoch0, index segment from epoch1, label(0/1))
extracted_segments_file_name – In case ‘segments’ is None, the segments are loaded from ‘extracted_segments_file_name’.
extended_y_file_name – In case ‘extended_y’ is None, this file is used as input fallback.

Developer API reference

py4dgeo.epoch.as_epoch(cloud)

Create an epoch from a cloud

Idempotent operation to create an epoch from a cloud.

py4dgeo.epoch.normalize_timestamp(timestamp): Bring a given timestamp into a standardized Python format

class py4dgeo.m3c2.M3C2LikeAlgorithm(epochs: Tuple[Epoch, ...] | None = None, corepoints: ndarray | None = None, cyl_radii: List[float] | None = None, max_distance: float = 0.0, registration_error: float = 0.0, robust_aggr: bool = False)

calculate_distances(epoch1, epoch2): Calculate the distances between two epochs

callback_distance_calculation(): The callback used to calculate the distance between two point clouds

callback_workingset_finder(): The callback used to determine the point cloud subset around a corepoint

directions(): The normal direction(s) to use for this algorithm.

run(): Main entry point for running the algorithm

class py4dgeo.fallback.PythonFallbackM3C2(normal_radii: List[float] | None = None, orientation_vector: ndarray = array([0, 0, 1]), corepoint_normals: ndarray | None = None, cloud_for_normals: Epoch | None = None, **kwargs)

Bases: M3C2

An implementation of M3C2 that makes use of Python fallback implementations

calculate_distances(epoch1, epoch2): Calculate the distances between two epochs

callback_distance_calculation(): The callback used to calculate the distance between two point clouds

callback_workingset_finder(): The callback used to determine the point cloud subset around a corepoint

directions(): The normal direction(s) to use for this algorithm.

run(): Main entry point for running the algorithm

class py4dgeo.segmentation.RegionGrowingAlgorithmBase(neighborhood_radius=1.0, thresholds=[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9], min_segments=20, max_segments=None)

property analysis

Access the analysis object that the algorithm operates on

This is only available after run has been called.

distance_measure()

Distance measure between two time series

Expected to return a function that accepts two time series and returns the distance.

filter_objects(obj)

A filter for objects produced by the region growing algorithm

Objects are discarded if this method returns False.

find_seedpoints(): Calculate seedpoints for the region growing algorithm

run(analysis, force=False)

Calculate the segmentation

Parameters:

analysis (py4dgeo.segmentation.SpatiotemporalAnalysis) – The analysis object we are working with.
force – Force recalculation of results. If false, some intermediate results will be restored from the analysis object instead of being recalculated.

seed_sorting_scorefunction()

A function that computes a score for a seed candidate

This function is used to prioritize seed candidates.

class py4dgeo.segmentation.RegionGrowingSeed(index, start_epoch, end_epoch)

class py4dgeo.segmentation.ObjectByChange(data, seed, analysis=None)

Representation of a change object in the spatiotemporal domain

property end_epoch: The index of the end epoch of the change object

property indices: The set of corepoint indices that compose the object by change

plot(filename=None)

Create an informative visualization of the Object By Change

Parameters: filename (str) – The filename to use to store the plot. Can be omitted to only show the plot in a Jupyter notebook session.

property start_epoch: The index of the start epoch of the change object

property threshold: The distance threshold that produced this object

py4dgeo.segmentation.check_epoch_timestamp(epoch): Validate an epoch to be used with SpatiotemporalSegmentation

class py4dgeo.util.Py4DGeoError(msg, loggername='py4dgeo')

py4dgeo.util.find_file(filename, fatal=True)

Find a file of given name on the file system.

This function is intended for use in tests and demo applications to locate data files without resorting to absolute paths. You may use it in your own code as well.

It looks in the following locations:

If an absolute filename is given, it is used
Check whether the given relative path exists with respect to the current working directory
Check whether the given relative path exists with respect to the specified XDG data directory (e.g. through the environment variable XDG_DATA_DIRS).
Check whether the given relative path exists in downloaded test data.

Parameters:

filename (str) – The (relative) filename to search for
fatal (bool) – Whether not finding the file should be a fatal error

Returns:

An absolute filename
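
The search order listed above can be sketched with pathlib. find_file_sketch is a hypothetical re-implementation for illustration only: the extra_dirs argument stands in for the location of downloaded test data, and returning the unchanged name when fatal is False is a choice of this sketch, since the reference does not say what happens in that case.

```python
import os
import tempfile
from pathlib import Path


def find_file_sketch(filename, extra_dirs=(), fatal=True):
    """Look up filename in a fixed sequence of locations."""
    path = Path(filename)
    # 1. An absolute filename is used as given
    if path.is_absolute():
        return str(path)

    # 2. Current working directory, 3. XDG data directories, 4. extra directories
    search_dirs = [Path.cwd()]
    xdg_dirs = os.environ.get("XDG_DATA_DIRS", "")
    search_dirs += [Path(d) for d in xdg_dirs.split(os.pathsep) if d]
    search_dirs += [Path(d) for d in extra_dirs]

    for directory in search_dirs:
        candidate = directory / path
        if candidate.exists():
            return str(candidate.resolve())

    if fatal:
        raise FileNotFoundError(f"Cannot locate file {filename}")
    return filename  # assumption of this sketch


with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "py4dgeo_sketch_cloud.xyz").write_text("0 0 0\n")
    found = find_file_sketch("py4dgeo_sketch_cloud.xyz", extra_dirs=[tmp])
    missing = find_file_sketch("py4dgeo_sketch_missing.xyz", extra_dirs=[tmp], fatal=False)
```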

py4dgeo.util.as_double_precision(arr: ndarray, policy_check=True)

Ensure that a numpy array is double precision

This is a no-op if the array is already double precision and makes a copy if it is not. It checks py4dgeo’s memory policy before copying.

Parameters: arr (np.ndarray) – The numpy array

py4dgeo.util.make_contiguous(arr: ndarray)

Make a numpy array contiguous

This is a no-op if the array is already contiguous and makes a copy if it is not. It checks py4dgeo’s memory policy before copying.

Parameters: arr (np.ndarray) – The numpy array

py4dgeo.util.memory_policy_is_minimum(policy: MemoryPolicy)

Whether or not the globally set memory policy is at least the given one

Parameters: policy (MemoryPolicy) – The policy value to compare against
Returns: Whether the globally set policy is at least the given one
Return type: bool

py4dgeo.util.append_file_extension(filename, extension): Append a file extension if and only if the original filename has none
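
The "if and only if" rule can be sketched with os.path.splitext. append_extension_sketch is a hypothetical helper; accepting the extension both with and without a leading dot is an assumption made here for convenience.

```python
import os


def append_extension_sketch(filename, extension):
    """Append extension only when filename does not already have one."""
    _, existing = os.path.splitext(filename)
    if existing:
        return filename  # keep the extension chosen by the caller
    return f"{filename}.{extension.lstrip('.')}"


bare = append_extension_sketch("epoch", "zip")
kept = append_extension_sketch("epoch.npz", "zip")
dotted = append_extension_sketch("epoch", ".zip")
```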

py4dgeo.util.is_iterable(obj): Whether the object is an iterable (excluding a string)
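
A check of this kind can be sketched as follows. is_iterable_sketch is a hypothetical helper that treats strings as scalars, which is the usual reason for excluding them: a single filename should not be unpacked into characters.

```python
def is_iterable_sketch(obj):
    """Return True for iterables other than str."""
    if isinstance(obj, str):
        return False
    try:
        iter(obj)
    except TypeError:
        return False
    return True


checks = {
    "list": is_iterable_sketch(["a.xyz", "b.xyz"]),
    "string": is_iterable_sketch("a.xyz"),
    "number": is_iterable_sketch(5),
    "generator": is_iterable_sketch(name for name in ()),
}
```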

py4dgeo.logger.create_default_logger(filename=None)

py4dgeo.logger.logger_context(msg, level=20)