PyTorch PCA API
Main module for PCA.
- class PCA(n_components=None, *, whiten=False, svd_solver='auto', iterated_power='auto', n_oversamples=10, power_iteration_normalizer='auto', random_state=None)
Bases: object
Principal Component Analysis (PCA).
Works with PyTorch tensors. API similar to sklearn.decomposition.PCA.
- Parameters:
n_components (int | float | str | None, optional) –
Number of components to keep.
If int, number of components to keep.
If float (should be between 0.0 and 1.0), the number of components to keep is determined by the cumulative percentage of variance explained by the components until the proportion is reached.
If “mle”, the number of components is selected using Minka’s MLE.
If None, all components are kept: n_components = min(n_samples, n_features).
By default, n_components=None.
svd_solver (str, optional) –
One of {‘auto’, ‘full’, ‘covariance_eigh’, ‘randomized’}
‘auto’: the solver is selected automatically based on the shape of the input.
‘full’: Run exact full SVD with torch.linalg.svd.
‘covariance_eigh’: Compute the covariance matrix and take its eigenvalue decomposition with torch.linalg.eigh. Most efficient for small n_features and large n_samples.
‘randomized’: Compute the randomized SVD by the method of Halko et al.
By default, svd_solver=‘auto’.
whiten (bool, optional) – If True, the components_ vectors are divided by sqrt(n_samples - 1) and scaled by the singular values to ensure uncorrelated outputs with unit component-wise variances. By default, False.
iterated_power (int | str, optional) – Integer or ‘auto’. Number of power-method iterations used by the randomized SVD. Must be >= 0. Ignored if svd_solver != ‘randomized’. By default, ‘auto’.
n_oversamples (int, optional) – Number of additional random vectors used to sample the range of the input data in the randomized solver, to ensure proper conditioning. Ignored if svd_solver != ‘randomized’. By default, 10.
power_iteration_normalizer (str, optional) – One of ‘auto’, ‘QR’, ‘LU’, ‘none’. Power iteration normalizer for the randomized SVD solver. Ignored if svd_solver != ‘randomized’. By default, ‘auto’.
random_state (int | None, optional) – Seed of the randomized SVD solver. Ignored if svd_solver != ‘randomized’. By default, None.
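To make the randomized-solver parameters concrete, here is a minimal sketch of a Halko-style randomized SVD written directly with torch. It is an illustration under stated assumptions, not the library's internal code: the function name randomized_svd_sketch and its arguments are hypothetical, QR is used as the only normalizer, and n_iter stands in for iterated_power.

```python
import torch


def randomized_svd_sketch(a, n_components, n_oversamples=10, n_iter=2, seed=0):
    """Hypothetical helper: approximate truncated SVD of `a` (n_samples, n_features)."""
    generator = torch.Generator().manual_seed(seed)  # plays the role of random_state
    n_random = n_components + n_oversamples          # oversampling improves conditioning
    q = torch.randn(a.shape[1], n_random, generator=generator, dtype=a.dtype)

    # Power iterations, re-orthonormalized with QR at every step.
    for _ in range(n_iter):
        q, _ = torch.linalg.qr(a @ q)
        q, _ = torch.linalg.qr(a.T @ q)
    q, _ = torch.linalg.qr(a @ q)                    # orthonormal basis for the range of `a`

    # Exact SVD of the small projected matrix, then map back.
    b = q.T @ a
    u_hat, s, vh = torch.linalg.svd(b, full_matrices=False)
    u = q @ u_hat
    return u[:, :n_components], s[:n_components], vh[:n_components]


torch.manual_seed(0)
# Rank-3 matrix, so three components capture it entirely.
low_rank = torch.randn(100, 3, dtype=torch.float64) @ torch.randn(3, 20, dtype=torch.float64)
u, s, vh = randomized_svd_sketch(low_rank, n_components=3)
exact = torch.linalg.svdvals(low_rank)[:3]
```

On a matrix whose rank does not exceed n_components, the approximate singular values agree with the exact ones up to floating-point error; on full-rank data, more power iterations generally tighten the approximation.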
- explained_variance_: Optional[Tensor]
The amount of variance explained by each of the selected components.
- explained_variance_ratio_: Optional[Tensor]
Percentage of variance explained by each of the selected components.
- singular_values_: Optional[Tensor]
Singular values corresponding to each of the selected components.
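The relationship between these attributes, and the way a float n_components can select a component count, can be sketched with plain torch. This is an assumption-laden illustration rather than the library's code: variable names are made up for the example, the variance uses an n_samples - 1 denominator, and the helper n_components_for_ratio (hypothetical) keeps the smallest number of components whose cumulative ratio reaches the threshold.

```python
import torch

torch.manual_seed(0)
inputs = torch.randn(200, 5, dtype=torch.float64)
n_samples = inputs.shape[0]
centered = inputs - inputs.mean(dim=0, keepdim=True)

# 'full'-style route: singular values of the centered data.
_, singular_values, _ = torch.linalg.svd(centered, full_matrices=False)
explained_variance = singular_values**2 / (n_samples - 1)
explained_variance_ratio = explained_variance / explained_variance.sum()

# 'covariance_eigh'-style route: eigenvalues of the covariance matrix.
covariance = centered.T @ centered / (n_samples - 1)
eigenvalues, _ = torch.linalg.eigh(covariance)              # ascending order
explained_variance_eigh = torch.flip(eigenvalues, dims=[0])  # descending order


def n_components_for_ratio(ratio, threshold):
    """Hypothetical helper: smallest k whose cumulative ratio reaches `threshold`."""
    cumulative = torch.cumsum(ratio, dim=0)
    k = int((cumulative < threshold).sum().item()) + 1
    return min(k, ratio.numel())


k = n_components_for_ratio(torch.tensor([0.5, 0.25, 0.125, 0.125]), 0.8)
```

Both routes give the same explained variances; the eigendecomposition works on an (n_features, n_features) matrix, which is why it suits small n_features and large n_samples.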
- fit_transform(inputs, *, determinist=True)
Fit the PCA model and apply the dimensionality reduction.
- Parameters:
inputs (Tensor) – Input data of shape (n_samples, n_features).
determinist (bool, optional) – If True, the SVD solver is deterministic but the gradient cannot be computed through the PCA fit (the PCA transform is always differentiable though). If False, the SVD can be non-deterministic but the gradient can be computed through the PCA fit. By default, determinist=True.
- Returns:
transformed – Transformed data.
- Return type:
Tensor
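The effect of whiten=True on the transformed data can be illustrated with plain torch. The sketch below is not the library's implementation; it assumes whitening amounts to dividing each projected coordinate by the square root of its explained variance, which yields uncorrelated outputs with unit variance.

```python
import torch

torch.manual_seed(0)
scales = torch.tensor([3.0, 2.0, 1.0, 0.5], dtype=torch.float64)
inputs = torch.randn(500, 4, dtype=torch.float64) * scales
n_samples = inputs.shape[0]
centered = inputs - inputs.mean(dim=0, keepdim=True)

_, s, vh = torch.linalg.svd(centered, full_matrices=False)
n_components = 2
components = vh[:n_components]                          # (n_components, n_features)
explained_variance = s[:n_components] ** 2 / (n_samples - 1)

projected = centered @ components.T                     # plain PCA projection
whitened = projected / torch.sqrt(explained_variance)   # rescale to unit variance

# Sample covariance of the whitened output; expected to be close to identity.
whitened_cov = whitened.T @ whitened / (n_samples - 1)
```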
- fit(inputs, *, determinist=True)
Fit the PCA model and return it.
- Parameters:
inputs (Tensor) – Input data of shape (n_samples, n_features).
determinist (bool, optional) – If True, the SVD solver is deterministic but the gradient cannot be computed through the PCA fit (the PCA transform is always differentiable though). If False, the SVD can be non-deterministic but the gradient can be computed through the PCA fit. By default, determinist=True.
- Returns:
The PCA model fitted on the input data.
- Return type:
PCA
- transform(inputs, center='fit')
Apply dimensionality reduction to inputs.
- Parameters:
inputs (Tensor) – Input data of shape (n_samples, n_features).
center (str) –
One of ‘fit’, ‘input’ or ‘none’. Specifies how to center the data.
‘fit’: center the data using the mean computed during fit (default).
‘input’: center the data using the mean of the input data.
‘none’: do not center the data.
By default, ‘fit’ (as in the sklearn PCA implementation).
- Returns:
transformed – Transformed data of shape (n_samples, n_components).
- Return type:
Tensor
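The three center modes differ only in which mean, if any, is subtracted before projecting. A minimal sketch, assuming the projection is a matrix product with the component rows (transform_sketch, components and fit_mean are hypothetical names, and whitening is ignored):

```python
import torch


def transform_sketch(inputs, components, fit_mean, center="fit"):
    """Hypothetical helper mirroring the three centering modes."""
    if center == "fit":
        inputs = inputs - fit_mean                           # mean seen during fit
    elif center == "input":
        inputs = inputs - inputs.mean(dim=0, keepdim=True)   # mean of this batch
    elif center != "none":
        raise ValueError(f"Unknown center mode: {center!r}")
    return inputs @ components.T                             # (n_samples, n_components)


torch.manual_seed(0)
train = torch.randn(100, 3, dtype=torch.float64)
fit_mean = train.mean(dim=0, keepdim=True)
_, _, vh = torch.linalg.svd(train - fit_mean, full_matrices=False)
components = vh[:2]

new_data = torch.randn(10, 3, dtype=torch.float64) + 5.0    # shifted batch
out_fit = transform_sketch(new_data, components, fit_mean, "fit")
out_input = transform_sketch(new_data, components, fit_mean, "input")
out_none = transform_sketch(new_data, components, fit_mean, "none")
```

With ‘input’, the projected batch always has zero mean, which can be useful when the new data is known to be shifted relative to the training data.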
- inverse_transform(inputs)
Map transformed data back to the original feature space.
- Parameters:
inputs (Tensor) – Transformed data of shape (n_samples, n_components).
- Returns:
de_transformed – De-transformed data of shape (n_samples, n_features) where n_features is the number of features in the input data before applying transform.
- Return type:
Tensor
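As a rough illustration of the inverse mapping, the sketch below assumes no whitening and reconstructs data as a product with the component rows plus the fitted mean. The name inverse_transform_sketch is hypothetical. When all components are kept the round trip is exact; when some are dropped, the squared reconstruction error equals the sum of the squared discarded singular values.

```python
import torch


def inverse_transform_sketch(transformed, components, fit_mean):
    """Hypothetical helper: back-project and restore the mean (no whitening)."""
    return transformed @ components + fit_mean


torch.manual_seed(0)
inputs = torch.randn(50, 3, dtype=torch.float64)
fit_mean = inputs.mean(dim=0, keepdim=True)
centered = inputs - fit_mean
_, s, vh = torch.linalg.svd(centered, full_matrices=False)

# All components kept: the round trip recovers the input.
full_round_trip = inverse_transform_sketch(centered @ vh.T, vh, fit_mean)

# Two of three components kept: lossy reconstruction.
components = vh[:2]
reconstructed = inverse_transform_sketch(centered @ components.T, components, fit_mean)
squared_error = ((inputs - reconstructed) ** 2).sum()
```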
- get_covariance()
Compute data covariance with the generative model.
- Return type:
Tensor
- get_exp_variance_diff()
Get explained variance difference (from noise).
- Return type:
Tuple[Tensor,Tensor]
- get_precision()
Compute data precision matrix with the generative model.
It is the inverse of the covariance matrix, but this method is more efficient than inverting the covariance matrix directly.
- Return type:
Tensor
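The generative-model covariance and the cheaper route to its inverse can be sketched as follows. This follows the probabilistic PCA convention used by scikit-learn and is an assumption about the approach, not a copy of this library's code: the noise variance is taken as the mean of the discarded variances, and the precision is obtained with the matrix inversion lemma so that only an (n_components, n_components) matrix is inverted.

```python
import torch

torch.manual_seed(0)
scales = torch.tensor([5.0, 4.0, 1.0, 1.0, 1.0, 1.0], dtype=torch.float64)
inputs = torch.randn(300, 6, dtype=torch.float64) * scales
n_samples, n_features = inputs.shape
centered = inputs - inputs.mean(dim=0, keepdim=True)

_, s, vh = torch.linalg.svd(centered, full_matrices=False)
variances = s**2 / (n_samples - 1)
n_components = 2
components = vh[:n_components]                        # (n_components, n_features)
noise_variance = variances[n_components:].mean()      # average discarded variance
exp_var_diff = torch.clamp(variances[:n_components] - noise_variance, min=0.0)

identity = torch.eye(n_features, dtype=torch.float64)
# Model covariance: low-rank signal part plus isotropic noise.
covariance = components.T @ torch.diag(exp_var_diff) @ components + noise_variance * identity

# Matrix inversion lemma: invert a small (n_components, n_components) matrix only.
small = torch.diag(1.0 / exp_var_diff) + components @ components.T / noise_variance
precision = (
    identity / noise_variance
    - components.T @ torch.linalg.inv(small) @ components / noise_variance**2
)
```

The saving matters when n_features is large and n_components is small, since the direct inverse costs on the order of n_features cubed.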
- score_samples(inputs)
Compute score of each sample based on log-likelihood.
- Returns:
log_likelihood – Log-likelihood of each sample under the current model, of shape (n_samples,)
- Return type:
Tensor
- score(inputs)
Return the average score (log-likelihood) of all samples.
- Return type:
Tensor
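Assuming the model is a multivariate Gaussian described by a mean and a precision matrix, the per-sample log-likelihood and its average can be sketched as below. The function score_samples_sketch is hypothetical and takes the mean and precision explicitly; the example data is synthetic.

```python
import math

import torch


def score_samples_sketch(inputs, mean, precision):
    """Hypothetical helper: Gaussian log-likelihood of each row of `inputs`."""
    centered = inputs - mean
    n_features = inputs.shape[1]
    # Squared Mahalanobis distance of each sample.
    mahalanobis = ((centered @ precision) * centered).sum(dim=1)
    log_det_precision = torch.linalg.slogdet(precision).logabsdet
    return -0.5 * (n_features * math.log(2 * math.pi) - log_det_precision + mahalanobis)


torch.manual_seed(0)
a = torch.randn(4, 4, dtype=torch.float64)
covariance = a @ a.T + 4.0 * torch.eye(4, dtype=torch.float64)  # symmetric positive definite
mean = torch.randn(4, dtype=torch.float64)
inputs = torch.randn(8, 4, dtype=torch.float64)

log_likelihood = score_samples_sketch(inputs, mean, torch.linalg.inv(covariance))  # (n_samples,)
average_score = log_likelihood.mean()                                              # scalar
```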
- to(*args, **kwargs)
Move the model to the specified device/dtype.
Call the native PyTorch .to() method on all tensors, parameters and NN modules to move the model to the specified device and/or dtype.
- Parameters:
args (Any) – Positional arguments to pass to the .to() method.
kwargs (Any) –
Keyword arguments to pass to the .to() method. They can be:
device (DeviceLikeType) – Device to move the model to.
dtype (torch.dtype) – Data type to move the model to.
non_blocking (bool, optional) – If True, the operation will be non-blocking. By default, False.
copy (bool, optional)
memory_format (torch.memory_format, optional)
- Return type:
Note
By default, the dtype and device of the parameters are the same as those of the input data used during the fit. Use this method if you want to change the dtype and/or device of the model after the fit, for instance if you fit the model on GPU and want to run inference on CPU.
Warning
Requires the model to be fitted first.
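The idea of forwarding .to() to every stored tensor can be sketched with a small stand-in class. FittedState below is hypothetical and is not the library's PCA class; returning self from to() is also an assumption made for the example.

```python
import torch


class FittedState:
    """Hypothetical stand-in holding tensors produced by a fit."""

    def __init__(self, mean, components):
        self.mean = mean
        self.components = components

    def to(self, *args, **kwargs):
        # Forward the arguments to the native .to() of every tensor attribute.
        for name, value in list(vars(self).items()):
            if isinstance(value, torch.Tensor):
                setattr(self, name, value.to(*args, **kwargs))
        return self


state = FittedState(
    mean=torch.zeros(1, 3, dtype=torch.float64),
    components=torch.eye(3, dtype=torch.float64)[:2],
)
state.to(torch.float32)                  # change dtype only
state.to(device="cpu", non_blocking=False)  # keyword form, as listed above
```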