bdranalytics.sklearn.preprocessing package
-
class bdranalytics.sklearn.preprocessing.ScaledRegressor(scaler, estimator)
Bases: sklearn.base.BaseEstimator, sklearn.base.RegressorMixin
Allows a regressor to work with a scaled target if it does not support target scaling itself.
When fitting, y is transformed using the scaler before being passed to the model's fit method. When predicting, the predicted y is inverse-transformed to obtain a y_hat in the original range of values.
For example, this allows your regressor to predict transformed targets (e.g. log(y)) without additional pre- and postprocessing outside your sklearn pipeline.
Parameters:
- scaler (TransformerMixin) – the transformer applied to the target before it is passed to the model
- estimator (RegressorMixin) – the regressor that works in the transformed target space
Examples
>>> import numpy as np
>>> from sklearn.linear_model import LinearRegression
>>> from sklearn.preprocessing import StandardScaler
>>> from sklearn.pipeline import Pipeline
>>> from bdranalytics.sklearn.preprocessing import ScaledRegressor
>>> n_rows = 10
>>> X = np.random.rand(n_rows, 2)
>>> y = np.random.rand(n_rows)
>>> regressor = LinearRegression()
>>> scaler = StandardScaler()
>>> pipeline = Pipeline([("predict", ScaledRegressor(scaler, regressor))])
>>> y_hat = pipeline.fit(X, y).predict(X)
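The fit and predict flow described for ScaledRegressor can be sketched with plain scik-learn building blocks. The class below, `TargetScalingSketch`, is a hypothetical stand-in written for illustration and is not the bdranalytics implementation. It assumes a 1-D target and a scaler that expects 2-D input, so the target is reshaped to a column before scaling.

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin, clone
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler


class TargetScalingSketch(RegressorMixin, BaseEstimator):
    """Hypothetical illustration of target scaling; not the library's code."""

    def __init__(self, scaler, estimator):
        self.scaler = scaler
        self.estimator = estimator

    def fit(self, X, y):
        # Scalers expect 2-D input, so reshape the 1-D target to a column.
        y_col = np.asarray(y, dtype=float).reshape(-1, 1)
        self.scaler_ = clone(self.scaler).fit(y_col)
        y_scaled = self.scaler_.transform(y_col).ravel()
        # The wrapped regressor only ever sees the scaled target.
        self.estimator_ = clone(self.estimator).fit(X, y_scaled)
        return self

    def predict(self, X):
        # Predict in scaled space, then map back to the original range.
        y_scaled = np.asarray(self.estimator_.predict(X)).reshape(-1, 1)
        return self.scaler_.inverse_transform(y_scaled).ravel()


rng = np.random.RandomState(0)
X = rng.rand(10, 2)
y = 100.0 + 50.0 * rng.rand(10)

wrapped = TargetScalingSketch(StandardScaler(), LinearRegression()).fit(X, y)
y_hat = wrapped.predict(X)

# Reference: the same regressor fitted on the unscaled target.
y_plain = LinearRegression().fit(X, y).predict(X)
```

For ordinary least squares with an intercept, standardizing the target does not change the predictions once they are mapped back, so `y_hat` and `y_plain` agree here. The wrapper matters for regressors that are sensitive to the scale of the target.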
-
class bdranalytics.sklearn.preprocessing.WeightOfEvidenceEncoder(verbose=0, cols=None, return_df=True, smooth=0.5, fillna=0, dependent_variable_values=None)
Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin
Feature-engineering class that transforms a high-cardinality categorical feature into Weight of Evidence scores. Can be used in sklearn pipelines.
Parameters: smooth – value for additive smoothing, to prevent division by zero
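To make the role of `smooth` concrete, here is a minimal sketch of one common Weight of Evidence formulation with additive smoothing for a binary target. The exact formula used by WeightOfEvidenceEncoder is not stated here, so the function `woe_scores` and its handling of the totals are assumptions made for illustration only.

```python
import math
from collections import Counter


def woe_scores(categories, targets, smooth=0.5):
    """Smoothed Weight of Evidence per category for a binary (0/1) target.

    Hypothetical helper; not the bdranalytics implementation.
    """
    pos = Counter(c for c, t in zip(categories, targets) if t == 1)
    neg = Counter(c for c, t in zip(categories, targets) if t == 0)
    levels = set(pos) | set(neg)
    # Adding `smooth` to every count keeps each ratio finite and non-zero,
    # even for a category that has only positives or only negatives.
    pos_total = sum(pos.values()) + smooth * len(levels)
    neg_total = sum(neg.values()) + smooth * len(levels)
    return {
        c: math.log(((pos[c] + smooth) / pos_total) / ((neg[c] + smooth) / neg_total))
        for c in levels
    }


categories = ["a", "a", "a", "b", "b", "c"]
targets = [1, 1, 0, 0, 0, 1]
scores = woe_scores(categories, targets)
```

Category "b" has no positive rows and "c" has no negative rows. Without smoothing, their scores would require log(0) or a division by zero. With smoothing, "b" gets a finite negative score and "c" a finite positive one.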
-
class bdranalytics.sklearn.preprocessing.StringIndexer
Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin
-
class bdranalytics.sklearn.preprocessing.LeaveOneOutEncoder(with_stdevs=True)
Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin
Submodules
bdranalytics.sklearn.preprocessing.encoding module
-
class bdranalytics.sklearn.preprocessing.encoding.LeaveOneOutEncoder(with_stdevs=True)
Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin
-
class bdranalytics.sklearn.preprocessing.encoding.StringIndexer
Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin
-
class bdranalytics.sklearn.preprocessing.encoding.WeightOfEvidenceEncoder(verbose=0, cols=None, return_df=True, smooth=0.5, fillna=0, dependent_variable_values=None)
Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin
Feature-engineering class that transforms a high-cardinality categorical feature into Weight of Evidence scores. Can be used in sklearn pipelines.
Parameters: smooth – value for additive smoothing, to prevent division by zero
bdranalytics.sklearn.preprocessing.preprocessing module
bdranalytics.sklearn.preprocessing.scaling module
-
class bdranalytics.sklearn.preprocessing.scaling.ScaledRegressor(scaler, estimator)
Bases: sklearn.base.BaseEstimator, sklearn.base.RegressorMixin
Allows a regressor to work with a scaled target if it does not support target scaling itself.
When fitting, y is transformed using the scaler before being passed to the model's fit method. When predicting, the predicted y is inverse-transformed to obtain a y_hat in the original range of values.
For example, this allows your regressor to predict transformed targets (e.g. log(y)) without additional pre- and postprocessing outside your sklearn pipeline.
Parameters:
- scaler (TransformerMixin) – the transformer applied to the target before it is passed to the model
- estimator (RegressorMixin) – the regressor that works in the transformed target space
Examples
>>> import numpy as np
>>> from sklearn.linear_model import LinearRegression
>>> from sklearn.preprocessing import StandardScaler
>>> from sklearn.pipeline import Pipeline
>>> from bdranalytics.sklearn.preprocessing.scaling import ScaledRegressor
>>> n_rows = 10
>>> X = np.random.rand(n_rows, 2)
>>> y = np.random.rand(n_rows)
>>> regressor = LinearRegression()
>>> scaler = StandardScaler()
>>> pipeline = Pipeline([("predict", ScaledRegressor(scaler, regressor))])
>>> y_hat = pipeline.fit(X, y).predict(X)
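The log(y) use case mentioned above can also be illustrated with scikit-learn's own `TransformedTargetRegressor`, which addresses a similar need. The sketch below uses only scikit-learn and NumPy, not ScaledRegressor itself. The synthetic data is an assumption chosen so that log(y) is exactly linear in X.

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.linear_model import LinearRegression

rng = np.random.RandomState(0)
X = rng.rand(50, 2)
# Strictly positive target that is linear in X on the log scale.
y = np.exp(1.0 + 2.0 * X[:, 0] - X[:, 1])

model = TransformedTargetRegressor(
    regressor=LinearRegression(),
    func=np.log,          # applied to y before fitting
    inverse_func=np.exp,  # applied to the predictions afterwards
)
y_hat = model.fit(X, y).predict(X)
```

Because the linear model is fitted on log(y) and the predictions are exponentiated afterwards, `y_hat` comes back in the original units of y with no extra pre- or postprocessing around the model.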