Correspondence-driven plane-based M3C2 (PBM3C2)

[1]:

import py4dgeo

In this notebook, we present how the Correspondence-driven plane-based M3C2 (PB-M3C2, Zahs et al., 2022) algorithm for point cloud distance computation using the py4dgeo package.

As PB-M3C2 is a learning algorithm, it requires user-labelled input data in the process. This input can either be provided through external tools or be generated using a simple graphical user interface. For the graphical user interface to work best from Jupyter notebooks, we select the vtk backend.

[2]:

py4dgeo.set_interactive_backend("vtk")

We will work on the same demonstrator data we used in the explanation of the M3C2 algorithm:

[3]:

epoch0, epoch1 = py4dgeo.read_from_xyz(
    "plane_horizontal_t1.xyz", "plane_horizontal_t2.xyz"
)

[2024-05-14 13:04:09][INFO] Reading point cloud from file '/home/docs/.cache/py4dgeo/./plane_horizontal_t1.xyz'
[2024-05-14 13:04:09][INFO] Reading point cloud from file '/home/docs/.cache/py4dgeo/./plane_horizontal_t2.xyz'

Again, we instantiate an instance of the algorithm class. For now, we use only the defaults for its parameters and leave explanation of customization aspects for later.

[4]:

alg = py4dgeo.PBM3C2()

In a first step, PB-M3C2 will run a plane segmentation algorithm on the provided input point clouds. As a learning algorithm, it then requires user input about corresponding planes. py4dgeo offers two ways of doing this: * You can export the segmentation data in XYZ format with four columns: x, y and z of the point cloud, as well as the segment_id of the segment the point is associated with. Using that data, you can determine correspondance using your favorite tools or existing workflows. Your input is again expected in a comma-separated text file (CSV). It should contain three columns: The segment_id from the first point cloud, the segment_id from the second point cloud and a value of 0 or 1 depending on whether the two segments matched. The APIs for this case are shown in this notebook. * You can interactively build the correspondence information in an interactive session. For this, you can call alg.build_labelled_similarity_features_interactively().

Here, we use the first method of using an external tool for labelling. This call will write a total of three files: The above mentioned XYZ files for both epochs, as well as a third file that contains the entire results of the segmentation process. This will allow you to start computation later on without rerunning the segmentation part of the algorithm. You can modify the default file names by passing them to the respective arguments.

[5]:

xyz_epoch0, xyz_epoch1, segment_id = alg.export_segmented_point_cloud_and_segments(
    epoch0=epoch0,
    epoch1=epoch1,
)

[2024-05-14 13:04:09][INFO] No pipeline parameter is overwritten
[2024-05-14 13:04:09][INFO] No pipeline parameter is overwritten
[2024-05-14 13:04:09][INFO] Transformer Fit
[2024-05-14 13:04:09][INFO] Transformer Transform
[2024-05-14 13:04:09][INFO] Building KDTree structure with leaf parameter 10
[2024-05-14 13:04:09][INFO] Building KDTree structure with leaf parameter 10
[2024-05-14 13:04:09][INFO] Transformer Fit
[2024-05-14 13:04:09][INFO] Transformer Transform
[2024-05-14 13:04:09][INFO] Building KDTree structure with leaf parameter 10
[2024-05-14 13:04:09][INFO] Building KDTree structure with leaf parameter 10
[2024-05-14 13:04:09][INFO] Transformer Fit
[2024-05-14 13:04:09][INFO] Transformer Transform
[2024-05-14 13:04:09][INFO] Building KDTree structure with leaf parameter 10
[2024-05-14 13:04:09][INFO] Building KDTree structure with leaf parameter 10
[2024-05-14 13:04:09][INFO] Transformer Transform
[2024-05-14 13:04:09][INFO] Building KDTree structure with leaf parameter 10
[2024-05-14 13:04:09][INFO] Building KDTree structure with leaf parameter 10
[2024-05-14 13:04:09][INFO] Transformer Transform
[2024-05-14 13:04:10][INFO] Building KDTree structure with leaf parameter 10
[2024-05-14 13:04:10][INFO] Building KDTree structure with leaf parameter 10
[2024-05-14 13:04:10][INFO] Transformer Fit
[2024-05-14 13:04:10][INFO] Transformer Transform
[2024-05-14 13:04:10][INFO] ----
 The pipeline parameters after restoration are:
{'_Transform_PerPointComputation': PerPointComputation(),
 '_Transform_PerPointComputation__columns': <class 'py4dgeo.pbm3c2.LLSV_PCA_COLUMNS'>,
 '_Transform_PerPointComputation__output_file_name': None,
 '_Transform_PerPointComputation__radius': 10,
 '_Transform_PerPointComputation__skip': False,
 '_Transform_Second_Segmentation': Segmentation(with_previously_computed_segments=True),
 '_Transform_Second_Segmentation__angle_diff_threshold': 1,
 '_Transform_Second_Segmentation__columns': <class 'py4dgeo.pbm3c2.SEGMENTED_POINT_CLOUD_COLUMNS'>,
 '_Transform_Second_Segmentation__distance_3D_threshold': 1.5,
 '_Transform_Second_Segmentation__distance_orthogonal_threshold': 1.5,
 '_Transform_Second_Segmentation__llsv_threshold': 1,
 '_Transform_Second_Segmentation__max_nr_points_neighborhood': 100,
 '_Transform_Second_Segmentation__min_nr_points_per_segment': 5,
 '_Transform_Second_Segmentation__output_file_name': None,
 '_Transform_Second_Segmentation__radius': 2,
 '_Transform_Second_Segmentation__roughness_threshold': 5,
 '_Transform_Second_Segmentation__skip': False,
 '_Transform_Second_Segmentation__with_previously_computed_segments': True,
 '_Transform_Segmentation': Segmentation(),
 '_Transform_Segmentation__angle_diff_threshold': 1,
 '_Transform_Segmentation__columns': <class 'py4dgeo.pbm3c2.SEGMENTED_POINT_CLOUD_COLUMNS'>,
 '_Transform_Segmentation__distance_3D_threshold': 1.5,
 '_Transform_Segmentation__distance_orthogonal_threshold': 1.5,
 '_Transform_Segmentation__llsv_threshold': 1,
 '_Transform_Segmentation__max_nr_points_neighborhood': 100,
 '_Transform_Segmentation__min_nr_points_per_segment': 5,
 '_Transform_Segmentation__output_file_name': None,
 '_Transform_Segmentation__radius': 2,
 '_Transform_Segmentation__roughness_threshold': 5,
 '_Transform_Segmentation__skip': False,
 '_Transform_Segmentation__with_previously_computed_segments': False,
 'memory': None,
 'steps': [('_Transform_PerPointComputation', PerPointComputation()),
           ('_Transform_Segmentation', Segmentation()),
           ('_Transform_Second_Segmentation',
            Segmentation(with_previously_computed_segments=True))],
 'verbose': False}
----

[2024-05-14 13:04:10][INFO] ----
 The pipeline parameters after restoration are:
{'_Transform_ExtractSegments': ExtractSegments(),
 '_Transform_ExtractSegments__columns': <class 'py4dgeo.pbm3c2.SEGMENT_COLUMNS'>,
 '_Transform_ExtractSegments__output_file_name': None,
 '_Transform_ExtractSegments__skip': False,
 'memory': None,
 'steps': [('_Transform_ExtractSegments', ExtractSegments())],
 'verbose': False}
----

After doing the labelling using your preferred method, you can read it into py4dgeo. We pass the previously exported segmentation information and the externally produced CSV file to the traingin procedure. In this test case, we are distributing the labelled data with the test data:

[6]:

alg.training(
    extracted_segments_file_name="extracted_segments.seg",
    extended_y_file_name="testdata-labelling.csv",
)

[2024-05-14 13:04:10][INFO] Reading segments from file '/home/docs/checkouts/readthedocs.org/user_builds/py4dgeo/checkouts/latest/doc/extracted_segments.seg'
[2024-05-14 13:04:10][INFO] Reading tuples of (segment epoch0, segment epoch1, label) from file '/home/docs/.cache/py4dgeo/./testdata-labelling.csv'
[2024-05-14 13:04:10][INFO] Fit ClassifierWrapper

We have now trained the algorithm using a scikit-learn classifier. By default, this is a random forest tree. We are now ready to compute the distances analoguous to how distances in standard M3C2 are calculated. This will run the prediction with the trained model and derive distance and uncertainty information from the results:

[7]:

distances, uncertainties = alg.compute_distances(epoch0=epoch0, epoch1=epoch1)

[2024-05-14 13:04:10][INFO] PBM3C2.compute_distances(...)
[2024-05-14 13:04:10][INFO] PBM3C2._compute_distances(...)
[2024-05-14 13:04:10][INFO] No pipeline parameter is overwritten
[2024-05-14 13:04:10][INFO] No pipeline parameter is overwritten
[2024-05-14 13:04:10][INFO] Transformer Fit
[2024-05-14 13:04:10][INFO] Transformer Transform
[2024-05-14 13:04:10][INFO] Building KDTree structure with leaf parameter 10
[2024-05-14 13:04:10][INFO] Transformer Fit
[2024-05-14 13:04:10][INFO] Transformer Transform
[2024-05-14 13:04:10][INFO] Building KDTree structure with leaf parameter 10
[2024-05-14 13:04:10][INFO] Transformer Fit
[2024-05-14 13:04:10][INFO] Transformer Transform
[2024-05-14 13:04:10][INFO] Building KDTree structure with leaf parameter 10
[2024-05-14 13:04:10][INFO] Transformer Transform
[2024-05-14 13:04:10][INFO] Building KDTree structure with leaf parameter 10
[2024-05-14 13:04:10][INFO] Transformer Transform
[2024-05-14 13:04:10][INFO] Building KDTree structure with leaf parameter 10
[2024-05-14 13:04:10][INFO] Transformer Fit
[2024-05-14 13:04:10][INFO] Transformer Transform
[2024-05-14 13:04:10][INFO] ----
 The pipeline parameters after restoration are:
{'epoch0_Transform_PerPointComputation': PerPointComputation(),
 'epoch0_Transform_PerPointComputation__columns': <class 'py4dgeo.pbm3c2.LLSV_PCA_COLUMNS'>,
 'epoch0_Transform_PerPointComputation__output_file_name': None,
 'epoch0_Transform_PerPointComputation__radius': 10,
 'epoch0_Transform_PerPointComputation__skip': False,
 'epoch0_Transform_Second_Segmentation': Segmentation(with_previously_computed_segments=True),
 'epoch0_Transform_Second_Segmentation__angle_diff_threshold': 1,
 'epoch0_Transform_Second_Segmentation__columns': <class 'py4dgeo.pbm3c2.SEGMENTED_POINT_CLOUD_COLUMNS'>,
 'epoch0_Transform_Second_Segmentation__distance_3D_threshold': 1.5,
 'epoch0_Transform_Second_Segmentation__distance_orthogonal_threshold': 1.5,
 'epoch0_Transform_Second_Segmentation__llsv_threshold': 1,
 'epoch0_Transform_Second_Segmentation__max_nr_points_neighborhood': 100,
 'epoch0_Transform_Second_Segmentation__min_nr_points_per_segment': 5,
 'epoch0_Transform_Second_Segmentation__output_file_name': None,
 'epoch0_Transform_Second_Segmentation__radius': 2,
 'epoch0_Transform_Second_Segmentation__roughness_threshold': 5,
 'epoch0_Transform_Second_Segmentation__skip': False,
 'epoch0_Transform_Second_Segmentation__with_previously_computed_segments': True,
 'epoch0_Transform_Segmentation': Segmentation(),
 'epoch0_Transform_Segmentation__angle_diff_threshold': 1,
 'epoch0_Transform_Segmentation__columns': <class 'py4dgeo.pbm3c2.SEGMENTED_POINT_CLOUD_COLUMNS'>,
 'epoch0_Transform_Segmentation__distance_3D_threshold': 1.5,
 'epoch0_Transform_Segmentation__distance_orthogonal_threshold': 1.5,
 'epoch0_Transform_Segmentation__llsv_threshold': 1,
 'epoch0_Transform_Segmentation__max_nr_points_neighborhood': 100,
 'epoch0_Transform_Segmentation__min_nr_points_per_segment': 5,
 'epoch0_Transform_Segmentation__output_file_name': None,
 'epoch0_Transform_Segmentation__radius': 2,
 'epoch0_Transform_Segmentation__roughness_threshold': 5,
 'epoch0_Transform_Segmentation__skip': False,
 'epoch0_Transform_Segmentation__with_previously_computed_segments': False,
 'memory': None,
 'steps': [('epoch0_Transform_PerPointComputation', PerPointComputation()),
           ('epoch0_Transform_Segmentation', Segmentation()),
           ('epoch0_Transform_Second_Segmentation',
            Segmentation(with_previously_computed_segments=True))],
 'verbose': False}
----

[2024-05-14 13:04:10][INFO] ----
 The pipeline parameters after restoration are:
{'epoch0_Transform_ExtractSegments': ExtractSegments(),
 'epoch0_Transform_ExtractSegments__columns': <class 'py4dgeo.pbm3c2.SEGMENT_COLUMNS'>,
 'epoch0_Transform_ExtractSegments__output_file_name': None,
 'epoch0_Transform_ExtractSegments__skip': False,
 'memory': None,
 'steps': [('epoch0_Transform_ExtractSegments', ExtractSegments())],
 'verbose': False}
----

[2024-05-14 13:04:10][INFO] No pipeline parameter is overwritten
[2024-05-14 13:04:10][INFO] No pipeline parameter is overwritten
[2024-05-14 13:04:10][INFO] Transformer Fit
[2024-05-14 13:04:10][INFO] Transformer Transform
[2024-05-14 13:04:10][INFO] Building KDTree structure with leaf parameter 10
[2024-05-14 13:04:10][INFO] Transformer Fit
[2024-05-14 13:04:10][INFO] Transformer Transform
[2024-05-14 13:04:10][INFO] Building KDTree structure with leaf parameter 10
[2024-05-14 13:04:10][INFO] Transformer Fit
[2024-05-14 13:04:10][INFO] Transformer Transform
[2024-05-14 13:04:10][INFO] Building KDTree structure with leaf parameter 10
[2024-05-14 13:04:10][INFO] Transformer Transform
[2024-05-14 13:04:10][INFO] Building KDTree structure with leaf parameter 10
[2024-05-14 13:04:10][INFO] Transformer Transform
[2024-05-14 13:04:10][INFO] Building KDTree structure with leaf parameter 10
[2024-05-14 13:04:10][INFO] Transformer Fit
[2024-05-14 13:04:10][INFO] Transformer Transform
[2024-05-14 13:04:10][INFO] ----
 The pipeline parameters after restoration are:
{'epoch1_Transform_PerPointComputation': PerPointComputation(),
 'epoch1_Transform_PerPointComputation__columns': <class 'py4dgeo.pbm3c2.LLSV_PCA_COLUMNS'>,
 'epoch1_Transform_PerPointComputation__output_file_name': None,
 'epoch1_Transform_PerPointComputation__radius': 10,
 'epoch1_Transform_PerPointComputation__skip': False,
 'epoch1_Transform_Second_Segmentation': Segmentation(with_previously_computed_segments=True),
 'epoch1_Transform_Second_Segmentation__angle_diff_threshold': 1,
 'epoch1_Transform_Second_Segmentation__columns': <class 'py4dgeo.pbm3c2.SEGMENTED_POINT_CLOUD_COLUMNS'>,
 'epoch1_Transform_Second_Segmentation__distance_3D_threshold': 1.5,
 'epoch1_Transform_Second_Segmentation__distance_orthogonal_threshold': 1.5,
 'epoch1_Transform_Second_Segmentation__llsv_threshold': 1,
 'epoch1_Transform_Second_Segmentation__max_nr_points_neighborhood': 100,
 'epoch1_Transform_Second_Segmentation__min_nr_points_per_segment': 5,
 'epoch1_Transform_Second_Segmentation__output_file_name': None,
 'epoch1_Transform_Second_Segmentation__radius': 2,
 'epoch1_Transform_Second_Segmentation__roughness_threshold': 5,
 'epoch1_Transform_Second_Segmentation__skip': False,
 'epoch1_Transform_Second_Segmentation__with_previously_computed_segments': True,
 'epoch1_Transform_Segmentation': Segmentation(),
 'epoch1_Transform_Segmentation__angle_diff_threshold': 1,
 'epoch1_Transform_Segmentation__columns': <class 'py4dgeo.pbm3c2.SEGMENTED_POINT_CLOUD_COLUMNS'>,
 'epoch1_Transform_Segmentation__distance_3D_threshold': 1.5,
 'epoch1_Transform_Segmentation__distance_orthogonal_threshold': 1.5,
 'epoch1_Transform_Segmentation__llsv_threshold': 1,
 'epoch1_Transform_Segmentation__max_nr_points_neighborhood': 100,
 'epoch1_Transform_Segmentation__min_nr_points_per_segment': 5,
 'epoch1_Transform_Segmentation__output_file_name': None,
 'epoch1_Transform_Segmentation__radius': 2,
 'epoch1_Transform_Segmentation__roughness_threshold': 5,
 'epoch1_Transform_Segmentation__skip': False,
 'epoch1_Transform_Segmentation__with_previously_computed_segments': False,
 'memory': None,
 'steps': [('epoch1_Transform_PerPointComputation', PerPointComputation()),
           ('epoch1_Transform_Segmentation', Segmentation()),
           ('epoch1_Transform_Second_Segmentation',
            Segmentation(with_previously_computed_segments=True))],
 'verbose': False}
----

[2024-05-14 13:04:10][INFO] ----
 The pipeline parameters after restoration are:
{'epoch1_Transform_ExtractSegments': ExtractSegments(),
 'epoch1_Transform_ExtractSegments__columns': <class 'py4dgeo.pbm3c2.SEGMENT_COLUMNS'>,
 'epoch1_Transform_ExtractSegments__output_file_name': None,
 'epoch1_Transform_ExtractSegments__skip': False,
 'memory': None,
 'steps': [('epoch1_Transform_ExtractSegments', ExtractSegments())],
 'verbose': False}
----

[2024-05-14 13:04:10][INFO] No pipeline parameter is overwritten
[2024-05-14 13:04:11][INFO] Building KDTree structure with leaf parameter 10
[2024-05-14 13:04:12][INFO] ----
 The pipeline parameters after restoration are:
{'Classifier': ClassifierWrapper(),
 'Classifier__classifier': RandomForestClassifier(),
 'Classifier__classifier__bootstrap': True,
 'Classifier__classifier__ccp_alpha': 0.0,
 'Classifier__classifier__class_weight': None,
 'Classifier__classifier__criterion': 'gini',
 'Classifier__classifier__max_depth': None,
 'Classifier__classifier__max_features': 'sqrt',
 'Classifier__classifier__max_leaf_nodes': None,
 'Classifier__classifier__max_samples': None,
 'Classifier__classifier__min_impurity_decrease': 0.0,
 'Classifier__classifier__min_samples_leaf': 1,
 'Classifier__classifier__min_samples_split': 2,
 'Classifier__classifier__min_weight_fraction_leaf': 0.0,
 'Classifier__classifier__monotonic_cst': None,
 'Classifier__classifier__n_estimators': 100,
 'Classifier__classifier__n_jobs': None,
 'Classifier__classifier__oob_score': False,
 'Classifier__classifier__random_state': None,
 'Classifier__classifier__verbose': 0,
 'Classifier__classifier__warm_start': False,
 'Classifier__columns': <class 'py4dgeo.pbm3c2.SEGMENT_COLUMNS'>,
 'Classifier__diff_between_most_similar_2': 0.1,
 'Classifier__neighborhood_search_radius': 3,
 'Classifier__threshold_probability_most_similar': 0.8,
 'memory': None,
 'steps': [('Classifier', ClassifierWrapper())],
 'verbose': False}
----

References

Zahs, V., Winiwarter, L., Anders, K., Williams, J.G., Rutzinger, M. & Höfle, B. (2022): Correspondence-driven plane-based M3C2 for lower uncertainty in 3D topographic change quantification. ISPRS Journal of Photogrammetry and Remote Sensing, 183, pp. 541-559. DOI: 10.1016/j.isprsjprs.2021.11.018.

[ ]: