bdranalytics.sklearn.preprocessing package

class bdranalytics.sklearn.preprocessing.ScaledRegressor(scaler, estimator)

Bases: sklearn.base.BaseEstimator, sklearn.base.RegressorMixin

Allows a regressor to work with a scaled target if it does not allow scaling itself.

When fitting, y is transformed using the scaler before being passed to the estimator's fit method. When predicting, the predicted y is inverse transformed to obtain a y_hat in the original range of values.

For example, this allows your regressor to predict manipulated targets (e.g. log(y)) without additional pre- and postprocessing outside your sklearn pipeline.

Parameters:
scaler (TransformerMixin) – the transformer applied to the target before it is passed to the estimator
estimator (RegressorMixin) – the regressor that works in the transformed target space

Examples

>>> import numpy as np
>>> from sklearn.linear_model import LinearRegression
>>> from sklearn.preprocessing import StandardScaler
>>> from sklearn.pipeline import Pipeline
>>> from bdranalytics.sklearn.preprocessing import ScaledRegressor
>>> n_rows = 10
>>> X = np.random.rand(n_rows, 2)
>>> y = np.random.rand(n_rows)
>>> regressor = LinearRegression()
>>> scaler = StandardScaler()
>>> pipeline = Pipeline([("predict", ScaledRegressor(scaler, regressor))])
>>> y_hat = pipeline.fit(X, y).predict(X)
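The fit and predict behaviour described for ScaledRegressor can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the bdranalytics source: the class name TargetScalingRegressor is hypothetical, and reshaping y into a single column is an assumption made here because scikit-learn scalers expect 2-D input.

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin, clone
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler


class TargetScalingRegressor(BaseEstimator, RegressorMixin):
    """Hypothetical stand-in illustrating the idea behind ScaledRegressor."""

    def __init__(self, scaler, estimator):
        self.scaler = scaler
        self.estimator = estimator

    def fit(self, X, y):
        # Assumption: y is 1-D, so it is reshaped to a column for the scaler.
        y_col = np.asarray(y, dtype=float).reshape(-1, 1)
        self.scaler_ = clone(self.scaler)
        self.estimator_ = clone(self.estimator)
        # The estimator only ever sees the scaled target.
        y_scaled = self.scaler_.fit_transform(y_col).ravel()
        self.estimator_.fit(X, y_scaled)
        return self

    def predict(self, X):
        # Predict in scaled space, then map back to the original range.
        y_scaled_hat = self.estimator_.predict(X)
        y_hat_col = self.scaler_.inverse_transform(y_scaled_hat.reshape(-1, 1))
        return y_hat_col.ravel()


rng = np.random.RandomState(0)
X = rng.rand(10, 2)
y = 100.0 + 50.0 * X[:, 0] - 20.0 * X[:, 1]  # exactly linear target

model = TargetScalingRegressor(StandardScaler(), LinearRegression()).fit(X, y)
y_hat = model.predict(X)
```

Because the inverse transform is applied inside predict, callers receive y_hat on the same scale as the y they supplied and never handle the scaled target themselves.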

fit(X, y)

predict(X)

class bdranalytics.sklearn.preprocessing.WeightOfEvidenceEncoder(verbose=0, cols=None, return_df=True, smooth=0.5, fillna=0, dependent_variable_values=None)

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

Feature-engineering class that transforms a high-cardinality categorical variable into Weight of Evidence scores. Can be used in sklearn pipelines.

Parameters:	smooth – value for additive smoothing, to prevent division by zero
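To make the role of smooth concrete, here is a small sketch of a Weight of Evidence calculation with additive smoothing. It is an illustration only: the function name weight_of_evidence, the exact smoothing formula, and the sign convention (log of event share over non-event share) are assumptions and may differ from what WeightOfEvidenceEncoder does internally.

```python
import math
from collections import Counter


def weight_of_evidence(categories, targets, smooth=0.5):
    """Return {category: WoE score} for a binary target (1 = event, 0 = non-event).

    Hypothetical helper; the smoothing variant is an assumption.
    """
    pos = Counter(c for c, t in zip(categories, targets) if t == 1)
    neg = Counter(c for c, t in zip(categories, targets) if t == 0)
    pos_total = sum(pos.values())
    neg_total = sum(neg.values())
    scores = {}
    for c in set(categories):
        # Adding `smooth` keeps both shares strictly positive, so neither the
        # division nor the logarithm fails for a category with zero events
        # or zero non-events.
        pos_share = (pos[c] + smooth) / (pos_total + smooth)
        neg_share = (neg[c] + smooth) / (neg_total + smooth)
        scores[c] = math.log(pos_share / neg_share)
    return scores


cats = ["a", "a", "a", "b", "b", "c"]
y = [1, 1, 0, 0, 0, 1]
woe = weight_of_evidence(cats, y)
```

In this toy data, category "b" has no events and "c" has no non-events. Without smoothing, one of the two shares would be zero for each of them and the score would be undefined.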

fit(X, y)

transform(X, y=None)

class bdranalytics.sklearn.preprocessing.StringIndexer

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

fit(X, y=None)

transform(X)

class bdranalytics.sklearn.preprocessing.LeaveOneOutEncoder(with_stdevs=True)

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

fit(X, y=None)

fit_transform(X, y)

Will be used during pipeline fit.

transform(X)
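The note that fit_transform is used during pipeline fit reflects how scikit-learn's Pipeline treats intermediate steps: on fit it calls fit_transform on each transformer, and on predict it calls transform. The sketch below demonstrates this with a hypothetical pass-through transformer, CallRecorder, which is not part of bdranalytics. Presumably (an assumption, not confirmed here) this is what lets a leave-one-out encoder treat training rows differently from new data.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline

calls = []  # module-level log of the methods the pipeline invokes


class CallRecorder(BaseEstimator, TransformerMixin):
    """Hypothetical pass-through transformer that records which methods run."""

    def fit(self, X, y=None):
        calls.append("fit")
        return self

    def fit_transform(self, X, y=None, **fit_params):
        # Training path: the target y is available here.
        calls.append("fit_transform")
        return X

    def transform(self, X):
        # Prediction path: no target is available.
        calls.append("transform")
        return X


rng = np.random.RandomState(0)
X = rng.rand(8, 2)
y = rng.rand(8)

pipeline = Pipeline([("record", CallRecorder()), ("predict", LinearRegression())])
pipeline.fit(X, y)
calls_after_fit = list(calls)
pipeline.predict(X)
calls_after_predict = list(calls)
```

After fit the log contains only fit_transform, and after predict it additionally contains transform, so a transformer can implement the two paths differently.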

Submodules

bdranalytics.sklearn.preprocessing.encoding module

class bdranalytics.sklearn.preprocessing.encoding.LeaveOneOutEncoder(with_stdevs=True)

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

fit(X, y=None)

fit_transform(X, y)

Will be used during pipeline fit.

transform(X)

class bdranalytics.sklearn.preprocessing.encoding.StringIndexer

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

fit(X, y=None)

transform(X)

class bdranalytics.sklearn.preprocessing.encoding.WeightOfEvidenceEncoder(verbose=0, cols=None, return_df=True, smooth=0.5, fillna=0, dependent_variable_values=None)

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

Feature-engineering class that transforms a high-cardinality categorical variable into Weight of Evidence scores. Can be used in sklearn pipelines.

Parameters:	smooth – value for additive smoothing, to prevent division by zero

fit(X, y)

transform(X, y=None)

bdranalytics.sklearn.preprocessing.preprocessing module

class bdranalytics.sklearn.preprocessing.preprocessing.ColumnSelector(columns)

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

fit(X, y=None)

transform(X)

bdranalytics.sklearn.preprocessing.scaling module

class bdranalytics.sklearn.preprocessing.scaling.ScaledRegressor(scaler, estimator)

Bases: sklearn.base.BaseEstimator, sklearn.base.RegressorMixin

Allows a regressor to work with a scaled target if it does not allow scaling itself.

When fitting, y is transformed using the scaler before being passed to the estimator's fit method. When predicting, the predicted y is inverse transformed to obtain a y_hat in the original range of values.

For example, this allows your regressor to predict manipulated targets (e.g. log(y)) without additional pre- and postprocessing outside your sklearn pipeline.

Parameters:
scaler (TransformerMixin) – the transformer applied to the target before it is passed to the estimator
estimator (RegressorMixin) – the regressor that works in the transformed target space

Examples

>>> import numpy as np
>>> from sklearn.linear_model import LinearRegression
>>> from sklearn.preprocessing import StandardScaler
>>> from sklearn.pipeline import Pipeline
>>> from bdranalytics.sklearn.preprocessing.scaling import ScaledRegressor
>>> n_rows = 10
>>> X = np.random.rand(n_rows, 2)
>>> y = np.random.rand(n_rows)
>>> regressor = LinearRegression()
>>> scaler = StandardScaler()
>>> pipeline = Pipeline([("predict", ScaledRegressor(scaler, regressor))])
>>> y_hat = pipeline.fit(X, y).predict(X)

fit(X, y)

predict(X)