Objective:
Learn how evaluation workflows are derived from the main ML pipeline.
Principles:
The hold-out evaluation method is based on withholding part of the training dataset for testing the predictions of a model trained on the remainder:
from sklearn import metrics
from forml import evaluation, project

EVALUATION = project.Evaluation(
    evaluation.Function(metrics.log_loss),  # LogLoss metric function
    evaluation.HoldOut(  # HoldOut evaluation method
        test_size=0.2, stratify=True, random_state=42
    ),
)
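To make the principle concrete, here is a minimal plain-scikit-learn sketch of the same hold-out evaluation, independent of ForML; the X features and y labels are hypothetical stand-ins for an actual dataset:

import numpy as np
from sklearn import metrics, model_selection
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)
X = rng.random((100, 3))  # hypothetical feature matrix
y = rng.integers(0, 2, 100)  # hypothetical binary labels

# withhold a stratified 20% test set, mirroring the HoldOut parameters above
X_train, X_test, y_train, y_test = model_selection.train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
model = LogisticRegression(random_state=42).fit(X_train, y_train)
print(metrics.log_loss(y_test, model.predict_proba(X_test)))  # LogLoss on the held-out split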
Based on the known SOURCE and PIPELINE components, ForML can produce a task graph to evaluate that solution using the provided definition:
from forml.pipeline import wrap
from dummycatalog import Foo

with wrap.importer():  # auto-wrap the sklearn import as a ForML operator
    from sklearn.linear_model import LogisticRegression

SOURCE = project.Source.query(Foo.select(Foo.Value), Foo.Label)
PIPELINE = LogisticRegression(random_state=42)

SOURCE.bind(PIPELINE, evaluation=EVALUATION).launcher(
    runner="graphviz"
).eval()
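Note that with the graphviz runner, the .eval() call merely renders the assembled task graph. Assuming an execution runner such as the Dask provider is available in the environment, the same expression would actually carry out the evaluation and return the computed loss:

SOURCE.bind(PIPELINE, evaluation=EVALUATION).launcher(runner="dask").eval()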
The cross-validation evaluation method is based on a number of independent train-test trials, each using a different part of the same training dataset for testing:
from sklearn import model_selection

EVALUATION = project.Evaluation(
    evaluation.Function(metrics.log_loss),  # LogLoss metric function
    evaluation.CrossVal(  # CrossValidation method
        crossvalidator=model_selection.StratifiedKFold(
            n_splits=3, shuffle=True, random_state=42
        )
    ),
)
SOURCE.bind(PIPELINE, evaluation=EVALUATION).launcher(
    runner="graphviz"
).eval()
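Again for illustration, here is a minimal plain-scikit-learn sketch of the same cross-validation principle, reusing the hypothetical X and y from the hold-out sketch above; averaging the per-fold losses is an illustrative choice and not necessarily how ForML aggregates the fold outcomes internally:

import numpy as np
from sklearn import metrics, model_selection
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)
X = rng.random((100, 3))  # hypothetical feature matrix
y = rng.integers(0, 2, 100)  # hypothetical binary labels

crossvalidator = model_selection.StratifiedKFold(
    n_splits=3, shuffle=True, random_state=42
)
losses = []
for train_index, test_index in crossvalidator.split(X, y):
    # each trial trains on two folds and tests on the remaining one
    model = LogisticRegression(random_state=42).fit(X[train_index], y[train_index])
    losses.append(metrics.log_loss(y[test_index], model.predict_proba(X[test_index])))
print(sum(losses) / len(losses))  # mean LogLoss across the three trials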
This will be demonstrated later as part of the final solution of the Avazu CTR Prediction challenge.