API Reference

This section provides detailed documentation for all Twocan classes and functions.

Core Classes

`twocan.RegEstimator`([...])	A scikit-learn compatible estimator for multimodal image registration.

Preprocessors

`twocan.IFProcessor`([binarize, ...])	Preprocessing pipeline for Immunofluorescence (IF) images.
`twocan.IMCProcessor`([arcsinh_normalize, ...])	Preprocessing pipeline for Imaging Mass Cytometry (IMC) images.

Optimization Functions

`twocan.iou_corr_single_objective`(trial, ...)	Objective function that optimizes for IoU (Intersection over Union).
`twocan.iou_corr_multi_objective`(trial, ...)	Multi-objective function that optimizes for both correlation and IoU.
`twocan.registration_trial`(trial, images, ...)

Callbacks

`twocan.SaveTrialsDFCallback`(filepath[, ...])	Callback to save optimization trials to a CSV file during study execution.
`twocan.ThresholdReachedCallback`(threshold[, ...])	Callback to stop optimization when a target metric threshold is reached.
`twocan.MatrixConvergenceCallback`([...])	Callback to monitor transformation matrix convergence during optimization.

Utilities

`twocan.stretch_255`(image)	Convert an image to 8-bit grayscale by stretching its range to [0, 255].
`twocan.read_M`(M)	Parse a string representation of an affine transformation matrix.
`twocan.multi_channel_corr`(source, target)	Calculate correlation coefficients between all pairs of channels from two images.
`twocan.get_aligned_coordinates`(...[, ...])	Apply a transformation to align two spatial elements in a new coordinate system.
`twocan.prep_zarr`(IF_arr, IMC_arr, IF_panel, ...)	Create a SpatialData object from IF and IMC arrays with their channel panels.
`twocan.pick_best_registration`(study_df)	Calculate triangle score and return best trial from optimization results.

Plotting

`twocan.plot_cartoon_affine`(w1, h1, M, w2, h2)	Plot a cartoon representation of an affine transformation.
`twocan.get_merge`(source, target)	Merge two images into a color-coded overlay.

Complete API

Twocan: Bayesian optimization for multimodal registration of spatial proteomics.

Twocan is a Python package that uses Bayesian optimization (via Optuna) to automatically find optimal parameters for registering images from spatial proteomics technologies such as IF (Immunofluorescence), IMC (Imaging Mass Cytometry), FISH, and IMS.

The package provides:

- Automated parameter optimization for preprocessing and registration
- Modality-specific preprocessing functions
- Quality metrics for registration assessment
- Visualization tools for registration results
- Extensible design for custom objectives and preprocessors

class twocan.AsinhNorm(vmin=0, vmax=100, cofactor=5, clip=True)[source]

Bases: Normalize

inverse(value)[source]

Maps the normalized value (i.e., index in the colormap) back to image data value.

Parameters: value – Normalized value.

class twocan.IFProcessor(binarize=True, binarization_threshold=0.1, sigma=1)[source]

Bases: object

Preprocessing pipeline for Immunofluorescence (IF) images.

This class provides a standardized preprocessing pipeline for IF images including channel summation, normalization, Gaussian blurring, and optional binarization. The processor can be configured with trial parameters from Optuna optimization.

Parameters:

binarize (bool, default=True) – Whether to apply binarization after preprocessing.
binarization_threshold (float, default=0.1) – Threshold value for binarization (0-1 range after normalization).
sigma (float, default=1) – Standard deviation for Gaussian blur kernel.

binarize

Whether binarization is enabled.

Type: bool

binarization_threshold

Current binarization threshold.

Type: float

sigma

Current Gaussian blur sigma value.

Type: float

configure(trial_params)[source]

Configure processor parameters from Optuna trial parameters.

This method updates the processor parameters based on values suggested by an Optuna trial. It looks for specific parameter names in the trial params dictionary and updates the corresponding attributes.

Parameters: trial_params (dict) – Dictionary of trial parameters from Optuna optimization. Expected keys: ‘IF_binarization_threshold’, ‘IF_gaussian_sigma’, ‘binarize_images’.
Returns: self – Returns self for method chaining.
Return type: IFProcessor
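
The key-to-attribute mapping described for configure can be pictured with a small stand-in class. This is a minimal sketch, not Twocan's implementation: `SketchIFProcessor` is a hypothetical name, and the behaviour for keys that are missing from the dictionary (the current value is kept) is an assumption.

```python
class SketchIFProcessor:
    """Hypothetical stand-in that mimics the documented configure() contract."""

    def __init__(self, binarize=True, binarization_threshold=0.1, sigma=1):
        self.binarize = binarize
        self.binarization_threshold = binarization_threshold
        self.sigma = sigma

    def configure(self, trial_params):
        # Assumption: keys absent from trial_params leave the attribute as-is.
        self.binarize = trial_params.get("binarize_images", self.binarize)
        self.binarization_threshold = trial_params.get(
            "IF_binarization_threshold", self.binarization_threshold
        )
        self.sigma = trial_params.get("IF_gaussian_sigma", self.sigma)
        return self  # returning self allows method chaining


proc = SketchIFProcessor().configure(
    {"IF_binarization_threshold": 0.25, "IF_gaussian_sigma": 2.0}
)
```

Because configure returns the processor itself, construction and configuration can be chained in one expression, as in the last statement.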

class twocan.IMCProcessor(arcsinh_normalize=True, arcsinh_cofactor=5, winsorize_limits=[None, None], binarize=True, binarization_threshold=2, sigma=1)[source]

Bases: object

Preprocessing pipeline for Imaging Mass Cytometry (IMC) images.

This class provides a comprehensive preprocessing pipeline specifically designed for IMC images, including arcsinh transformation for variance stabilization, winsorization for outlier handling, normalization, Gaussian blurring, and optional binarization.

Parameters:

arcsinh_normalize (bool, default=True) – Whether to apply arcsinh transformation for variance stabilization.
arcsinh_cofactor (float, default=5) – Cofactor for arcsinh transformation. Lower values increase the transformation strength.
winsorize_limits (list of float, default=[None, None]) – Lower and upper fractions of values to clip during winsorization. For example, [0.01, 0.01] clips the bottom 1% and the top 1% of values.
binarize (bool, default=True) – Whether to apply binarization after preprocessing.
binarization_threshold (float, default=2) – Threshold value for binarization.
sigma (float, default=1) – Standard deviation for Gaussian blur kernel.

arcsinh_normalize

Whether arcsinh transformation is enabled.

Type: bool

arcsinh_cofactor

Current arcsinh cofactor value.

Type: float

winsorize_limits

Current winsorization limits.

Type: list

binarize

Whether binarization is enabled.

Type: bool

binarization_threshold

Current binarization threshold.

Type: float

sigma

Current Gaussian blur sigma value.

Type: float

Notes

The arcsinh transformation is particularly useful for IMC data because it stabilizes variance across the intensity range, which is important for count-based mass spectrometry data.
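
To make the effect of the cofactor and of the winsorization limits concrete, here is a minimal NumPy sketch. The helper names `arcsinh_scale` and `clip_tails` are hypothetical, the form arcsinh(x / cofactor) is the common convention rather than a confirmed detail of IMCProcessor, and quantile clipping is used as an approximation of winsorization.

```python
import numpy as np


def arcsinh_scale(counts, cofactor=5.0):
    """Apply arcsinh(counts / cofactor), a common variance-stabilizing form."""
    return np.arcsinh(np.asarray(counts, dtype=float) / cofactor)


def clip_tails(values, lower=0.01, upper=0.01):
    """Clip the lowest `lower` and highest `upper` fractions of values.

    Quantile clipping approximates winsorization; it is an assumption here,
    not necessarily the routine IMCProcessor uses.
    """
    values = np.asarray(values, dtype=float)
    lo, hi = np.quantile(values, [lower, 1.0 - upper])
    return np.clip(values, lo, hi)


counts = np.array([0.0, 1.0, 10.0, 100.0, 1000.0])
scaled_5 = arcsinh_scale(counts, cofactor=5.0)
scaled_1 = arcsinh_scale(counts, cofactor=1.0)  # smaller cofactor, stronger effect
clipped = clip_tails(np.arange(101.0), lower=0.05, upper=0.05)
```

With the smaller cofactor, every positive count maps to a larger transformed value, which is the sense in which lower cofactors strengthen the transformation.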

configure(trial_params)[source]

Configure processor parameters from Optuna trial parameters.

This method updates the processor parameters based on values suggested by an Optuna trial. It looks for specific IMC parameter names in the trial params dictionary.

Parameters: trial_params (dict) – Dictionary of trial parameters from Optuna optimization. Expected keys: ‘IMC_arcsinh_normalize’, ‘IMC_arcsinh_cofactor’, ‘IMC_winsorization_lower_limit’, ‘IMC_winsorization_upper_limit’, ‘IMC_binarization_threshold’, ‘IMC_gaussian_sigma’, ‘binarize_images’.
Returns: self – Returns self for method chaining.
Return type: IMCProcessor

class twocan.MatrixConvergenceCallback(window_size: int = 10, tolerance: float = 0.01, min_trials: int = 20)[source]

Bases: object

Callback to monitor transformation matrix convergence during optimization.

This callback tracks the transformation matrices from recent trials and stops optimization when they converge (have low variance), indicating that the registration has stabilized. Useful for detecting when further optimization is unlikely to improve results.

Parameters:

window_size (int, default=10) – Number of recent trials to consider for convergence assessment.
tolerance (float, default=0.01) – Maximum allowed variance in matrix elements for convergence. Lower values require tighter convergence.
min_trials (int, default=20) – Minimum number of trials before convergence checking begins.

window_size

Current window size for convergence assessment.

Type: int

tolerance

Current tolerance for matrix element variance.

Type: float

min_trials

Minimum trials before convergence checking.

Type: int

matrices

List storing recent transformation matrices.

Type: List[np.ndarray]
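
The windowed convergence check can be sketched as follows. `SketchConvergenceMonitor` is a hypothetical stand-in, and the exact rule (every matrix element's variance over the window must be within the tolerance) is an assumption based on the parameter descriptions, not the callback's confirmed logic.

```python
from collections import deque

import numpy as np


class SketchConvergenceMonitor:
    """Hypothetical monitor mirroring the documented parameters."""

    def __init__(self, window_size=10, tolerance=0.01, min_trials=20):
        self.window_size = window_size
        self.tolerance = tolerance
        self.min_trials = min_trials
        self.matrices = deque(maxlen=window_size)  # keeps only recent matrices
        self.n_seen = 0

    def update(self, M):
        """Record a 2x3 matrix; return True once the window has converged."""
        self.matrices.append(np.asarray(M, dtype=float))
        self.n_seen += 1
        if self.n_seen < self.min_trials or len(self.matrices) < self.window_size:
            return False
        # Assumption: converged when every element's variance is within tolerance.
        elementwise_var = np.var(np.stack(list(self.matrices)), axis=0)
        return bool(elementwise_var.max() <= self.tolerance)


monitor = SketchConvergenceMonitor(window_size=3, tolerance=1e-6, min_trials=5)
base = np.array([[1.0, 0.0, 10.0], [0.0, 1.0, -4.0]])
noisy = [base + offset for offset in (5.0, -3.0, 2.0)]  # early, unstable trials
stable = [base.copy() for _ in range(4)]                # later, identical trials
flags = [monitor.update(M) for M in noisy + stable]
```

In this toy run the monitor stays silent until both conditions hold: at least `min_trials` matrices have been seen and the last `window_size` of them agree.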

class twocan.RegEstimator(registration_max_features: int = 10000, registration_percentile: float = 0.9)[source]

Bases: TransformerMixin, BaseEstimator, ABC

A scikit-learn compatible estimator for multimodal image registration.

This class implements feature-based image registration using OpenCV’s ORB (Oriented FAST and Rotated BRIEF) detector and a partial affine transformation model. It follows scikit-learn’s estimator API with fit, transform, and fit_transform methods, making it easy to integrate into machine learning pipelines.

The registration process consists of:

1. Feature detection using ORB on both images
2. Feature matching using brute-force Hamming distance
3. Affine transformation estimation using RANSAC
4. Image transformation using the estimated parameters

Parameters:

registration_max_features (int, default=10000) – Maximum number of features to detect in each image using ORB. Higher values can improve registration accuracy but increase computation time.
registration_percentile (float, default=0.9) – Fraction of matches to keep after sorting by match quality, in the range 0-1. Only the best matches by distance are used for transformation estimation, which helps remove outliers.

M_

The estimated 2x3 affine transformation matrix after fitting. Shape is (2, 3) representing the transformation [R|t] where R is rotation/scaling and t is translation.

Type: np.ndarray

y_shape_

Shape (height, width) of the target image used during fitting. Used as the default output shape for transformations.

Type: Tuple[int, int]

Notes

The estimator automatically converts input images to 8-bit grayscale for feature detection using the stretch_255 utility function. This ensures consistent feature detection regardless of input image dynamic range.
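
As an illustration of what a range stretch to 8 bits involves, here is a minimal NumPy sketch. `stretch_to_uint8` is a hypothetical helper written for this example; the rounding and the handling of constant images are assumptions and may differ from stretch_255.

```python
import numpy as np


def stretch_to_uint8(image):
    """Linearly rescale an array so its minimum maps to 0 and maximum to 255."""
    image = np.asarray(image, dtype=float)
    lo, hi = image.min(), image.max()
    if hi == lo:
        # Assumption: a constant image maps to all zeros to avoid dividing by 0.
        return np.zeros(image.shape, dtype=np.uint8)
    scaled = (image - lo) / (hi - lo) * 255.0
    return np.round(scaled).astype(np.uint8)


raw = np.array([[100.0, 150.0], [200.0, 300.0]])
eight_bit = stretch_to_uint8(raw)
```

The output always spans the full 8-bit range regardless of the input's original dynamic range, which is what makes feature detection consistent across modalities.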

The partial affine transformation model allows rotation, scaling, and translation but not shearing, which is appropriate for most microscopy registration tasks where imaging geometry is approximately preserved.

For best results:

- Ensure sufficient overlap between images
- Use images with distinct features (not uniform regions)
- Consider preprocessing to enhance relevant structures
- Adjust registration_max_features based on image complexity and computational budget
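
The partial affine model mentioned in these notes (rotation, uniform scaling, and translation, without shear) can be written out explicitly. The sketch below builds such a 2x3 matrix by hand; `similarity_matrix` and its parameters are hypothetical names used only for illustration.

```python
import numpy as np


def similarity_matrix(scale, angle_deg, tx, ty):
    """Build a 2x3 matrix [R|t] with uniform scale, rotation, and translation."""
    theta = np.deg2rad(angle_deg)
    a, b = scale * np.cos(theta), scale * np.sin(theta)
    return np.array([[a, -b, tx], [b, a, ty]])


M = similarity_matrix(scale=2.0, angle_deg=90.0, tx=5.0, ty=-1.0)
linear = M[:, :2]

# No shear: the two columns of the linear part are orthogonal and equally long.
col_dot = float(linear[:, 0] @ linear[:, 1])
col_norms = np.linalg.norm(linear, axis=0)

# Apply the transform to a point given as (x, y).
point = np.array([1.0, 0.0])
mapped = linear @ point + M[:, 2]
```

A matrix of this form has four degrees of freedom, compared with six for a full affine matrix, which is why it cannot represent shear or anisotropic scaling.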

fit(X: ndarray, y: ndarray) → RegEstimator[source]

Estimate the affine transformation matrix between source (X) and target (y) images.

This method detects features in both images using ORB, matches them, and estimates the best affine transformation that maps source features to target features using OpenCV’s robust estimation.

Parameters:

X (np.ndarray) – Source image to be registered. Shape can be (H, W) for single-channel or (C, H, W) for multi-channel. If multi-channel, all channels are summed for feature detection.
y (np.ndarray) – Target (reference) image to register to. Shape can be (H, W) for single-channel or (C, H, W) for multi-channel. If multi-channel, all channels are summed for feature detection.

Returns:

self – The fitted estimator with estimated transformation matrix in self.M_.

Return type:

RegEstimator

Raises:

cv2.error – If affine transformation cannot be estimated, typically due to insufficient or poorly matched features.

Notes

The fitting process:

1. Convert images to 8-bit for ORB compatibility
2. Detect up to registration_max_features keypoints in each image
3. Compute ORB descriptors for each keypoint
4. Match descriptors using brute-force Hamming distance
5. Keep top percentile of matches by distance
6. Estimate partial affine transformation using RANSAC

The method uses OpenCV’s estimateAffinePartial2D which finds the optimal similarity transformation (rotation, scaling, translation) rather than a full affine transformation. This is more robust for most registration scenarios.
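
The match-filtering step of the fitting process (keeping the best fraction of matches by distance) can be sketched in plain Python. `keep_best_matches` is a hypothetical helper, the distances are made up, and the rounding rule for the number of matches kept is an assumption.

```python
def keep_best_matches(distances, percentile=0.9):
    """Return indices of the lowest-distance matches, keeping a fixed fraction."""
    order = sorted(range(len(distances)), key=lambda i: distances[i])
    # Assumption: the kept count is the floor of len * percentile.
    n_keep = int(len(distances) * percentile)
    return order[:n_keep]


# Hypothetical Hamming distances for ten descriptor matches.
distances = [12, 40, 7, 33, 90, 15, 22, 8, 61, 5]
kept = keep_best_matches(distances, percentile=0.9)
```

Here the single worst match (distance 90) is discarded before the transformation would be estimated; RANSAC then handles any outliers that remain among the kept matches.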

fit_transform(X: ndarray, y: ndarray) → ndarray[source]

Fit to data, then transform it.

This convenience method combines fitting and transformation in a single call. It’s equivalent to calling fit(X, y).transform(X, y) but slightly more efficient.

Parameters:

X (np.ndarray) – Source image to fit the transformation to and then transform.
y (np.ndarray) – Target image to fit the transformation against. This image will also be included in the output stack.

Returns:

Stacked array containing the transformed X channels followed by the original y channels. Shape is (C_x + C_y, H_y, W_y) where C_x, C_y are the channel counts and H_y, W_y are target dimensions.

Return type:

np.ndarray
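
The layout of the stacked output can be illustrated with placeholder arrays. The channel counts and image sizes below are hypothetical, and the arrays stand in for an already transformed source and its target.

```python
import numpy as np

# Hypothetical channel counts and sizes, chosen only for illustration.
transformed_x = np.zeros((3, 64, 48))   # C_x = 3, already warped to (H_y, W_y)
target_y = np.ones((2, 64, 48))         # C_y = 2

# Source channels come first, followed by the target channels.
stacked = np.concatenate([transformed_x, target_y], axis=0)
x_part, y_part = stacked[:3], stacked[3:]
```

Slicing along the first axis at C_x recovers the two modalities from a stacked result.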

score(source: ndarray, target: ndarray) → Dict[str, float][source]

Calculate registration quality metrics between source and target images.

This method computes various metrics to assess the quality of registration between binary or continuous-valued images. Metrics are calculated only in regions where both images have valid data after transformation.

Parameters:

source (np.ndarray) – Source image, shape (H, W). Should be the same image used for fitting or a similar image from the same modality.
target (np.ndarray) – Target image, shape (H, W). Should be the same image used for fitting or a similar image from the same modality.

Returns:

Dictionary containing registration quality metrics:

'and' (float) – Count of pixels where both source and target are positive (logical AND operation). Higher values indicate better overlap.
'or' (float) – Count of pixels where either source or target is positive (logical OR operation).
'xor' (float) – Count of pixels where source and target disagree (logical XOR operation). Lower values indicate better agreement.
'iou' (float) – Intersection over Union ratio (and/or). Values range from 0-1 with 1 indicating perfect overlap. Returns 0.0 if no positive pixels exist.
'source_sum' (float) – Sum of all source pixel intensities in the overlap region.
'target_sum' (float) – Sum of all target pixel intensities in the overlap region.

Return type:

Dict[str, float]

Notes

Metrics are computed only in the intersection region where both images have valid data after transformation. This ensures fair comparison and avoids edge effects from the transformation.

For binary images, the metrics have intuitive interpretations:

- IoU is the standard Jaccard index
- ‘and’ counts overlapping positive pixels
- ‘xor’ counts disagreement pixels

For continuous images, the logical operations are applied after implicit conversion to boolean (non-zero values are True).
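
The overlap metrics can be reproduced on a toy example with NumPy. This is a minimal sketch with a hypothetical helper, `overlap_metrics`; it omits the restriction to the valid region after transformation and the intensity sums, so it covers only the logical metrics.

```python
import numpy as np


def overlap_metrics(source, target):
    """Compute and/or/xor counts and IoU after converting inputs to boolean."""
    s = np.asarray(source) != 0
    t = np.asarray(target) != 0
    n_and = float(np.logical_and(s, t).sum())
    n_or = float(np.logical_or(s, t).sum())
    n_xor = float(np.logical_xor(s, t).sum())
    iou = n_and / n_or if n_or > 0 else 0.0  # 0.0 when nothing is positive
    return {"and": n_and, "or": n_or, "xor": n_xor, "iou": iou}


source = np.array([[1, 1, 0], [0, 1, 0]])
target = np.array([[1, 0, 0], [0, 1, 1]])
metrics = overlap_metrics(source, target)
```

For these two masks, two pixels are positive in both images and four are positive in at least one, giving an IoU of 0.5.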

set_score_request(*, source: bool | None | str = '$UNCHANGED$', target: bool | None | str = '$UNCHANGED$') → RegEstimator

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

source (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for source parameter in score.
target (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for target parameter in score.

Returns:

self – The updated object.

Return type:

object

transform(X: ndarray, y: ndarray | None = None) → ndarray[source]

Apply the estimated transformation to the source image(s).

This method transforms the source image(s) using the affine transformation estimated during fitting. Optionally, a target image can be provided which will be stacked with the transformed source without transformation.

Parameters:

X (np.ndarray) – Source image(s) to transform. Shape can be (H, W) for single-channel or (C, H, W) for multi-channel. All channels are transformed using the same transformation matrix.
y (Optional[np.ndarray], default=None) – Target image to stack with transformed source. If provided, this image is NOT transformed but is included in the output for direct comparison. Shape should be (H, W) or (C, H, W).

Returns:

Transformed image(s). If y is None, returns transformed X with shape (C, H_out, W_out) where H_out, W_out match the target image from fitting. If y is provided, returns stacked array with transformed X channels followed by untransformed y channels.

Return type:

np.ndarray

Raises:

NotFittedError – If transform is called before fitting the estimator.
AssertionError – If the stored transformation matrix has invalid shape.

Notes

The transformation uses scikit-image’s warp function with the inverse transformation matrix. This ensures proper interpolation and handles edge cases automatically.

Output image dimensions match the target image used during fitting unless a different y image is provided during transformation.
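
Warping with an inverse matrix means that each output pixel is looked up in the source image. The sketch below shows the matrix algebra only, using NumPy; the matrix values are hypothetical and no actual image resampling is performed.

```python
import numpy as np


def to_homogeneous(M):
    """Append the row [0, 0, 1] so a 2x3 affine matrix becomes 3x3."""
    return np.vstack([np.asarray(M, dtype=float), [0.0, 0.0, 1.0]])


M = np.array([[0.0, -2.0, 5.0], [2.0, 0.0, -1.0]])  # hypothetical fitted matrix
forward = to_homogeneous(M)
inverse = np.linalg.inv(forward)

# An output pixel (x, y) is looked up at inverse @ (x, y, 1) in the source.
output_xy = np.array([5.0, 1.0, 1.0])
source_xy = inverse @ output_xy
```

The forward matrix maps the source point (1, 0) to (5, 1), so the inverse sends the output location (5, 1) back to (1, 0), which is where the warp would sample the source.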

class twocan.SaveTrialsDFCallback(filepath: str, save_frequency: int = 1)[source]

Bases: object

Callback to save optimization trials to a CSV file during study execution.

This callback writes trial results to a CSV file after each trial completion, allowing for real-time monitoring and persistent storage of optimization progress. Useful for long-running optimizations or when resuming studies.

Parameters:

filepath (str) – Path to the CSV file where trial data will be saved. File will be created if it doesn’t exist.
save_frequency (int, default=1) – Number of trials between saves. Set to 1 to save after every trial, higher values for less frequent saves to reduce I/O overhead.

filepath

Current filepath for saving trial data.

Type: str

save_frequency

Current save frequency setting.

Type: int

trial_count

Internal counter tracking number of completed trials.

Type: int
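
The interaction between trial_count and save_frequency can be sketched with a stand-in callback. Optuna invokes study callbacks as callback(study, trial); everything else here is hypothetical, including the class name `SketchSaveCallback` and the `save_fn` hook that replaces the actual CSV write.

```python
class SketchSaveCallback:
    """Hypothetical callback that saves every `save_frequency` trials."""

    def __init__(self, save_fn, save_frequency=1):
        self.save_fn = save_fn  # stand-in for writing the CSV file
        self.save_frequency = save_frequency
        self.trial_count = 0

    def __call__(self, study, trial):
        # Optuna invokes study callbacks as callback(study, trial).
        self.trial_count += 1
        if self.trial_count % self.save_frequency == 0:
            self.save_fn(study)


saved_at = []
callback = SketchSaveCallback(
    lambda study: saved_at.append(callback.trial_count), save_frequency=2
)
for _ in range(5):
    callback(study=None, trial=None)  # None stands in for real Optuna objects
```

With save_frequency=2 and five trials, a save happens after the second and fourth trials, so the fifth trial's results would only be written by a later save.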

class twocan.ThresholdReachedCallback(threshold: float, metric_name: str = 'iou', direction: str = 'maximize')[source]

Bases: object

Callback to stop optimization when a target metric threshold is reached.

This callback monitors a specified metric during optimization and raises optuna.TrialPruned to stop the study when the threshold is reached. Useful for stopping optimization early when satisfactory results are achieved.

Parameters:

threshold (float) – Target threshold value for the monitored metric.
metric_name (str, default='iou') – Name of the metric to monitor in trial.user_attrs. Must be a key present in the user attributes of trials.
direction (str, default='maximize') – Direction of the comparison: ‘maximize’ stops when the metric rises above the threshold, ‘minimize’ stops when it falls below it.

threshold

Current threshold value.

Type: float

metric_name

Current metric being monitored.

Type: str

direction

Current direction (‘maximize’ or ‘minimize’).

Type: str
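
The direction-dependent comparison can be sketched as a small function. `threshold_reached` is a hypothetical helper, the IoU values are made up, and treating a value exactly equal to the threshold as reached is an assumption.

```python
def threshold_reached(value, threshold, direction="maximize"):
    """Return True when `value` has met `threshold` in the given direction."""
    if direction == "maximize":
        return value >= threshold  # assumption: reaching the threshold counts
    if direction == "minimize":
        return value <= threshold
    raise ValueError(f"unknown direction: {direction!r}")


# Hypothetical IoU values recorded across successive trials.
iou_history = [0.41, 0.55, 0.72, 0.81]
first_hit = next(
    i for i, v in enumerate(iou_history) if threshold_reached(v, 0.7)
)
```

In this run the third trial (index 2) is the first to meet a threshold of 0.7, so that is where optimization would stop.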

twocan.iou_corr_multi_objective(trial, images, registration_channels, moving_image='IMC', static_image='IF', moving_preprocesser=<twocan.utils.IMCProcessor object>, static_preprocesser=<twocan.utils.IFProcessor object>)[source]

Multi-objective function that optimizes for both correlation and IoU.

twocan.iou_corr_single_objective(trial, images, registration_channels, moving_image='IMC', static_image='IF', moving_preprocesser=<twocan.utils.IMCProcessor object>, static_preprocesser=<twocan.utils.IFProcessor object>)[source]

Objective function that optimizes for IoU (Intersection over Union).

twocan.multi_channel_corr(source: ndarray, target: ndarray) → ndarray[source]

Calculate correlation coefficients between all pairs of channels from two images.

This function computes the Pearson correlation coefficient between every channel in the source image and every channel in the target image. The images are first normalized (z-scored) channel-wise before correlation calculation.

Parameters:

source (np.ndarray) – Source image array of shape (n_channels_1, n_pixels). Each row represents a flattened channel.
target (np.ndarray) – Target image array of shape (n_channels_2, n_pixels). Each row represents a flattened channel.

Returns:

Correlation matrix of shape (n_channels_1, n_channels_2). Element (i, j) contains the correlation between source channel i and target channel j.

Return type:

np.ndarray

Notes

The function assumes the input arrays have the same number of pixels (same second dimension). Channels are normalized to have zero mean and unit standard deviation before correlation calculation.
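
The computation described here (z-score each channel, then correlate every pair) can be sketched in NumPy. `pairwise_channel_corr` is a hypothetical re-implementation for illustration, and the data are randomly generated.

```python
import numpy as np


def pairwise_channel_corr(source, target):
    """Pearson correlation between every source row and every target row."""
    def zscore(a):
        a = np.asarray(a, dtype=float)
        return (a - a.mean(axis=1, keepdims=True)) / a.std(axis=1, keepdims=True)

    zs, zt = zscore(source), zscore(target)
    # The mean of products of z-scores equals the Pearson coefficient.
    return zs @ zt.T / zs.shape[1]


rng = np.random.default_rng(0)
source = rng.normal(size=(2, 500))          # 2 channels, 500 pixels each
target = np.vstack([
    source[0] * 3.0 + 1.0,                  # perfectly correlated with channel 0
    -source[1],                             # perfectly anti-correlated with channel 1
    rng.normal(size=500),                   # unrelated noise
])
corr = pairwise_channel_corr(source, target)
```

The result has one row per source channel and one column per target channel, so `corr[0, 0]` is close to 1 and `corr[1, 1]` is close to -1 in this example.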

twocan.pick_best_registration(study_df)[source]

Calculate triangle score and return best trial from optimization results.

This function implements a balanced scoring approach for selecting the best registration trial from a set of optimization results. It combines three key metrics (logical AND, IoU, and correlation) using a triangular scoring scheme that balances all three aspects of registration quality.

The triangle score is calculated as:

(1/3) * |norm_and * norm_corr + norm_corr * norm_iou + norm_iou * norm_and|

where each metric is normalized to [0,1] within the group.

Parameters: study_df (pd.DataFrame) – DataFrame containing trial results with required columns:

- ‘user_attrs_logical_and’: Logical AND overlap between images
- ‘user_attrs_logical_iou’: Intersection over Union score
- ‘user_attrs_reg_image_max_corr’: Maximum correlation between channels

Returns: DataFrame row containing the trial with the highest balanced score. The returned series includes all original columns plus computed normalization columns and the final ‘balanced_score’.
Return type: pd.Series

Notes

The logical AND values are log-transformed before normalization because they typically span several orders of magnitude. The triangle score approach ensures that no single metric dominates the selection, leading to more robust registration quality assessment.

Missing or NaN values in any of the required columns will result in NaN normalized scores, which may affect the final ranking.
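
The triangle score can be worked through on a handful of made-up trials. This is a minimal NumPy sketch: `min_max` and `triangle_score` are hypothetical helpers, min-max scaling is assumed for the normalization to [0, 1], and a natural logarithm is assumed for the log transform of the AND counts.

```python
import numpy as np


def min_max(values):
    """Scale values linearly to [0, 1] within the group."""
    values = np.asarray(values, dtype=float)
    return (values - values.min()) / (values.max() - values.min())


def triangle_score(logical_and, iou, corr):
    """Triangle score from the formula above, with AND log-transformed first."""
    # Assumption: a natural log is used; the text only says "log-transformed".
    norm_and = min_max(np.log(logical_and))
    norm_iou = min_max(iou)
    norm_corr = min_max(corr)
    return np.abs(
        norm_and * norm_corr + norm_corr * norm_iou + norm_iou * norm_and
    ) / 3.0


# Hypothetical metrics for four trials.
logical_and = np.array([10.0, 1_000.0, 100_000.0, 5_000.0])
iou = np.array([0.10, 0.40, 0.35, 0.60])
corr = np.array([0.05, 0.30, 0.20, 0.50])
scores = triangle_score(logical_and, iou, corr)
best_trial = int(np.argmax(scores))
```

The third trial has by far the largest overlap count but only middling IoU and correlation, so the last trial, which is strong on all three metrics, receives the highest score.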

twocan.plot_cartoon_affine(w1: float, h1: float, M: ndarray, w2: float, h2: float, ax: Axes | None = None, show_source: bool = False, source_color: str = 'green', target_color: str = 'purple') → Tuple[Axes, List[Line2D]][source]

Plot a cartoon representation of an affine transformation.

Visualizes how a rectangle is transformed by an affine matrix, useful for understanding registration transformations.

Parameters:

w1 (float) – Width of the source rectangle.
h1 (float) – Height of the source rectangle.
M (np.ndarray) – 2x3 affine transformation matrix.
w2 (float) – Width of the target rectangle.
h2 (float) – Height of the target rectangle.
ax (Optional[Axes], default=None) – Matplotlib axes for plotting. If None, current axes will be used.
show_source (bool, default=False) – Whether to show the original source rectangle.
source_color (str, default='green') – Color for source rectangle and its transformation.
target_color (str, default='purple') – Color for target rectangle.

Returns:

The matplotlib axes object and list of plotted lines.

Return type:

Tuple[Axes, List[Line2D]]
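
The geometry behind such a cartoon is the mapping of a rectangle's corners through the 2x3 matrix. The sketch below computes those corners with NumPy and does no plotting; `transform_rectangle` is a hypothetical helper, and placing the rectangle's first corner at the origin is an assumption.

```python
import numpy as np


def transform_rectangle(w, h, M):
    """Map the four corners of a w-by-h rectangle through a 2x3 matrix."""
    corners = np.array([[0.0, 0.0], [w, 0.0], [w, h], [0.0, h]])
    homogeneous = np.hstack([corners, np.ones((4, 1))])  # rows of (x, y, 1)
    return homogeneous @ np.asarray(M, dtype=float).T    # rows of (x', y')


# Hypothetical matrix: scale by 0.5, then shift by (10, 20).
M = np.array([[0.5, 0.0, 10.0], [0.0, 0.5, 20.0]])
moved = transform_rectangle(100.0, 40.0, M)
```

Connecting the rows of `moved` in order and closing the loop gives the outline of the transformed source rectangle, which can then be compared against a w2-by-h2 target rectangle.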

twocan.prep_zarr(IF_arr: ndarray, IMC_arr: ndarray, IF_panel: List[str], IMC_panel: List[str]) → SpatialData[source]

Create a SpatialData object from IF and IMC arrays with their channel panels.

This function takes raw image arrays and their corresponding channel names to create a properly formatted SpatialData object containing both modalities. The function handles both 2D and 3D input arrays, automatically adding a channel dimension if needed.

Parameters:

IF_arr (np.ndarray) – Immunofluorescence array of shape (H, W) or (C, H, W). If 2D, will be expanded to (1, H, W).
IMC_arr (np.ndarray) – Imaging mass cytometry array of shape (H, W) or (C, H, W). If 2D, will be expanded to (1, H, W).
IF_panel (List[str]) – List of channel names for IF data. Must match the number of channels in IF_arr.
IMC_panel (List[str]) – List of channel names for IMC data. Must match the number of channels in IMC_arr.

Returns:

SpatialData object containing both modalities with proper channel information and coordinate systems.

Return type:

sd.SpatialData
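
The input handling described for the arrays and panels can be sketched without SpatialData. `ensure_channel_axis` is a hypothetical helper, the channel names are placeholders, and raising ValueError on a mismatch is an assumption about how such a check might behave.

```python
import numpy as np


def ensure_channel_axis(arr, panel):
    """Expand (H, W) to (1, H, W) and check the panel length against channels."""
    arr = np.asarray(arr)
    if arr.ndim == 2:
        arr = arr[np.newaxis, ...]
    if arr.ndim != 3:
        raise ValueError(f"expected a 2D or 3D array, got {arr.ndim}D")
    if arr.shape[0] != len(panel):
        raise ValueError("panel length must equal the number of channels")
    return arr


# Placeholder channel names, used only for illustration.
if_arr = ensure_channel_axis(np.zeros((32, 48)), ["DAPI"])
imc_arr = ensure_channel_axis(np.zeros((2, 16, 24)), ["Ir191", "Ir193"])
```

After this step both modalities share the (C, H, W) layout, with one panel entry per channel.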

twocan.registration_trial(trial, images, registration_channels, moving_image='IMC', static_image='IF', moving_preprocessor=<twocan.utils.IMCProcessor object>, static_preprocessor=<twocan.utils.IFProcessor object>)[source]