For high-dimensional features, it is often useful to perform unsupervised dimensionality reduction before the supervised step.
Translations of the three sections below will be appended later.
4.4.1. PCA: principal component analysis
`decomposition.PCA` looks for a combination of features that captures well the variance of the original features. See Decomposing signals in components (matrix factorization problems).
Examples
Faces recognition example using eigenfaces and SVMs
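A minimal sketch of PCA-based reduction (the random data and the choice of `n_components=10` are assumptions for illustration, not from the original text):

```python
import numpy as np
from sklearn.decomposition import PCA

# Assumed toy data: 100 samples with 50 features each.
rng = np.random.RandomState(0)
X = rng.rand(100, 50)

# Project onto the 10 directions of highest variance.
pca = PCA(n_components=10)
X_reduced = pca.fit_transform(X)
print(X_reduced.shape)  # (100, 10)
```

The reduced matrix can then be fed to a supervised estimator in place of the original features.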
4.4.2. Random projections
The `random_projection` module provides several tools for data reduction by means of random projections. See the relevant section of the documentation: Random Projections.
Examples
The Johnson-Lindenstrauss bound for embedding with random projections
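As a sketch, a Gaussian random projection can compress a very wide feature matrix (the data shape and target dimensionality here are assumptions for illustration):

```python
import numpy as np
from sklearn.random_projection import GaussianRandomProjection

# Assumed toy data: 50 samples with 10,000 features.
rng = np.random.RandomState(0)
X = rng.rand(50, 10000)

# Project the features onto 500 random Gaussian directions.
transformer = GaussianRandomProjection(n_components=500, random_state=0)
X_new = transformer.fit_transform(X)
print(X_new.shape)  # (50, 500)
```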
4.4.3. Feature agglomeration
`cluster.FeatureAgglomeration` applies Hierarchical clustering to group together features that behave similarly.
Examples
Feature agglomeration vs. univariate selection
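A minimal sketch of feature agglomeration (the random data and `n_clusters=8` are assumptions for illustration): similar features are merged into clusters, and each cluster is replaced by one pooled feature.

```python
import numpy as np
from sklearn.cluster import FeatureAgglomeration

# Assumed toy data: 100 samples with 64 features.
rng = np.random.RandomState(0)
X = rng.rand(100, 64)

# Group the 64 features into 8 clusters; each cluster is
# reduced to a single feature (the mean of its members).
agglo = FeatureAgglomeration(n_clusters=8)
X_reduced = agglo.fit_transform(X)
print(X_reduced.shape)  # (100, 8)
```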
Feature scaling
Note that if features have very different scaling or statistical properties, `cluster.FeatureAgglomeration` may not be able to capture the links between related features. Using a `preprocessing.StandardScaler` can be useful in these settings.
Pipelining: The unsupervised data reduction and the supervised estimator can be chained in one step. See Pipeline: chaining estimators.
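A sketch of such a chain (the iris dataset, the scaler, and the logistic-regression estimator are assumptions chosen for illustration): scaling, PCA reduction, and a classifier combined into one estimator.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Chain: standardize -> reduce to 2 components -> classify.
pipe = make_pipeline(StandardScaler(), PCA(n_components=2), LogisticRegression())
pipe.fit(X, y)
print(pipe.score(X, y))
```

Calling `fit` on the pipeline fits each step in turn, so the supervised estimator only ever sees the reduced features.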
Copyright notice: this is the blogger's original article and may not be reproduced without the blogger's permission.