You wrote a Python script that trains and evaluates your machine learning model. Now you would like to tune its hyperparameters automatically to improve its performance? I got you!

In this article, I will show you how to convert your script into an objective function that can be optimized with any hyperparameter optimization library. It takes just 3 steps, and you will be tuning model parameters like there is no tomorrow. Ready? Let’s go!

I suppose your script looks something like this one:

main.py

import pandas as pd
import lightgbm as lgb
from sklearn.model_selection import train_test_split

data = pd.read_csv('data/train.csv', nrows=10000)
X = data.drop(['ID_code', 'target'], axis=1)
y = data['target']
X_train, X_valid, y_train, y_valid = train_test_split(X, y, test_size=0.2, random_state=1234)

train_data = lgb.Dataset(X_train, label=y_train)
valid_data = lgb.Dataset(X_valid, label=y_valid, reference=train_data)

params = {'objective': 'binary',
          'metric': 'auc',
          'learning_rate': 0.4,
          'max_depth': 15,
          'num_leaves': 20,
          'feature_fraction': 0.8,
          'subsample': 0.2}

# Early stopping is passed as a callback; LightGBM 4.0 removed the
# early_stopping_rounds argument that older versions of lgb.train accepted.
model = lgb.train(params, train_data,
                  num_boost_round=300,
                  valid_sets=[valid_data],
                  valid_names=['valid'],
                  callbacks=[lgb.early_stopping(30)])

score = model.best_score['valid']['auc']
print('validation AUC:', score)

Step 1: Decouple search parameters from code

Take the parameters that you want to tune and put them in a dictionary at the top of your script. By doing that, you effectively decouple search parameters from the rest of the code.
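The mechanics behind this step are plain dictionary unpacking. As a minimal sketch (FIXED_PARAMS and build_params are illustrative names that do not appear in the original script), the tunable values get merged into the fixed LightGBM settings like this:

```python
# Fixed settings that are not tuned (values taken from the script above).
FIXED_PARAMS = {'objective': 'binary', 'metric': 'auc'}

# Tunable settings, kept in one place at the top of the script.
SEARCH_PARAMS = {'learning_rate': 0.4, 'max_depth': 15}


def build_params(search_params):
    """Merge fixed and tunable settings into a new dict.

    If a key appears in both, the value from search_params wins,
    because it is unpacked last.
    """
    return {**FIXED_PARAMS, **search_params}


params = build_params(SEARCH_PARAMS)
print(params)
```

Since the merge builds a new dictionary every time, neither FIXED_PARAMS nor SEARCH_PARAMS is modified, which is handy once an optimizer starts passing in many different candidates.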

import pandas as pd
import lightgbm as lgb
from sklearn.model_selection import train_test_split

SEARCH_PARAMS = {'learning_rate': 0.4,
                 'max_depth': 15,
                 'num_leaves': 20,
                 'feature_fraction': 0.8,
                 'subsample': 0.2}

data = pd.read_csv('../data/train.csv', nrows=10000)
X = data.drop(['ID_code', 'target'], axis=1)
y = data['target']
X_train, X_valid, y_train, y_valid = train_test_split(X, y, test_size=0.2, random_state=1234)

train_data = lgb.Dataset(X_train, label=y_train)
valid_data = lgb.Dataset(X_valid, label=y_valid, reference=train_data)

# Fixed settings first, then the tunable ones unpacked from SEARCH_PARAMS.
params = {'objective': 'binary',
          'metric': 'auc',
          **SEARCH_PARAMS}

# Early stopping is passed as a callback; LightGBM 4.0 removed the
# early_stopping_rounds argument that older versions of lgb.train accepted.
model = lgb.train(params, train_data,
                  num_boost_round=300,
                  valid_sets=[valid_data],
                  valid_names=['valid'],
                  callbacks=[lgb.early_stopping(30)])

score = model.best_score['valid']['auc']
print('validation AUC:', score)

Step 2: Wrap training and evaluation into a function

Now you can put the entire training and evaluation logic inside a function. This function takes parameters as input and outputs the validation score.
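The only contract an optimizer relies on is "a dictionary of parameters goes in, a single number comes out". Here is a rough sketch of that shape, with a made-up scoring formula standing in for the LightGBM training so that it runs without any data (fake_train_evaluate and its formula are purely illustrative, not part of the original code):

```python
def fake_train_evaluate(search_params):
    """Stand-in for a training function: dict of parameters in, one float out.

    The formula is invented for illustration. It peaks at
    learning_rate=0.1 and num_leaves=32 and involves no real training.
    """
    lr_penalty = (search_params['learning_rate'] - 0.1) ** 2
    leaves_penalty = ((search_params['num_leaves'] - 32) / 100) ** 2
    return 1.0 - lr_penalty - leaves_penalty


if __name__ == '__main__':
    # Runs only when the file is executed directly, not when it is imported.
    for candidate in ({'learning_rate': 0.4, 'num_leaves': 20},
                      {'learning_rate': 0.1, 'num_leaves': 32}):
        print(candidate, '->', fake_train_evaluate(candidate))
```

The `if __name__ == '__main__'` guard matters as soon as another script imports the function: without it, the import alone would kick off a full training run.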

import pandas as pd
import lightgbm as lgb
from sklearn.model_selection import train_test_split

SEARCH_PARAMS = {'learning_rate': 0.4,
                 'max_depth': 15,
                 'num_leaves': 20,
                 'feature_fraction': 0.8,
                 'subsample': 0.2}


def train_evaluate(search_params):
    data = pd.read_csv('../data/train.csv', nrows=10000)
    X = data.drop(['ID_code', 'target'], axis=1)
    y = data['target']
    X_train, X_valid, y_train, y_valid = train_test_split(X, y, test_size=0.2, random_state=1234)

    train_data = lgb.Dataset(X_train, label=y_train)
    valid_data = lgb.Dataset(X_valid, label=y_valid, reference=train_data)

    params = {'objective': 'binary',
              'metric': 'auc',
              **search_params}

    # Early stopping is passed as a callback; LightGBM 4.0 removed the
    # early_stopping_rounds argument that older versions of lgb.train accepted.
    model = lgb.train(params, train_data,
                      num_boost_round=300,
                      valid_sets=[valid_data],
                      valid_names=['valid'],
                      callbacks=[lgb.early_stopping(30)])

    score = model.best_score['valid']['auc']
    return score


if __name__ == '__main__':
    score = train_evaluate(SEARCH_PARAMS)
    print('validation AUC:', score)

Step 3: Run Hyperparameter Tuning script

We are almost there. All you need to do now is use this train_evaluate function as an objective for the black-box optimization library of your choice.

I will use Scikit Optimize, which I have described in great detail in another article, but you can use any hyperparameter optimization library out there. In this example, I will try 100 different configurations, starting with 10 randomly chosen parameter sets.

In a nutshell, I:

- define the search SPACE,
- create the objective function that will be minimized,
- run the optimization via the skopt.forest_minimize function.
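One detail worth spelling out is the sign convention: a minimizer looks for the lowest value, while a score such as AUC should be as high as possible, so the objective returns the negated score. The sketch below walks through the same three steps with a plain random search from the standard library instead of a real optimization library. Everything in it (toy_score, sample, random_minimize, the dict-based SPACE and the scoring formula) is a hypothetical stand-in, invented so that the example runs without data.

```python
import random

# 1. Search space: name -> (low, high), all treated as continuous ranges here.
SPACE = {'learning_rate': (0.01, 0.5), 'feature_fraction': (0.1, 1.0)}


def toy_score(params):
    """Invented stand-in for a validation score; higher is better, max is 1.0."""
    return (1.0 - (params['learning_rate'] - 0.1) ** 2
            - (params['feature_fraction'] - 0.8) ** 2)


# 2. Objective to minimize: the negated score.
def objective(params):
    return -1.0 * toy_score(params)


def sample(space, rng):
    """Draw one configuration uniformly at random from the space."""
    return {name: rng.uniform(low, high) for name, (low, high) in space.items()}


# 3. Minimization loop: keep the configuration with the lowest objective value.
def random_minimize(func, space, n_calls, seed=0):
    rng = random.Random(seed)
    best_value, best_params = float('inf'), None
    for _ in range(n_calls):
        params = sample(space, rng)
        value = func(params)
        if value < best_value:
            best_value, best_params = value, params
    return best_value, best_params


best_value, best_params = random_minimize(objective, SPACE, n_calls=100)
best_score = -1.0 * best_value  # flip the sign back to read it as a score
print('best result: ', best_score)
print('best parameters: ', best_params)
```

A model-based optimizer replaces the uniform sampling with smarter proposals, but the objective it calls and the sign flip at the end stay the same.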

import skopt

from script_step2 import train_evaluate

SPACE = [
    skopt.space.Real(0.01, 0.5, name='learning_rate', prior='log-uniform'),
    skopt.space.Integer(1, 30, name='max_depth'),
    skopt.space.Integer(2, 100, name='num_leaves'),
    skopt.space.Real(0.1, 1.0, name='feature_fraction', prior='uniform'),
    skopt.space.Real(0.1, 1.0, name='subsample', prior='uniform')]


@skopt.utils.use_named_args(SPACE)
def objective(**params):
    # skopt minimizes, so return the negated validation AUC.
    return -1.0 * train_evaluate(params)


results = skopt.forest_minimize(objective, SPACE, n_calls=30, n_random_starts=10)
best_auc = -1.0 * results.fun
best_params = results.x

print('best result: ', best_auc)
print('best parameters: ', best_params)

This is it. The results object contains information about the best score and the parameters that produced it.

Note: If you want to visualize your training and save diagnostic charts after it finishes, you can add one callback and one function call to log every hyperparameter search to Neptune. Just use the helpers from neptunecontrib.monitoring.skopt.

import neptune
import neptunecontrib.monitoring.skopt as sk_utils
import skopt

from script_step2 import train_evaluate

neptune.init('jakub-czakon/blog-hpo')
neptune.create_experiment('hpo-on-any-script', upload_source_files=['*.py'])

SPACE = [
    skopt.space.Real(0.01, 0.5, name='learning_rate', prior='log-uniform'),
    skopt.space.Integer(1, 30, name='max_depth'),
    skopt.space.Integer(2, 100, name='num_leaves'),
    skopt.space.Real(0.1, 1.0, name='feature_fraction', prior='uniform'),
    skopt.space.Real(0.1, 1.0, name='subsample', prior='uniform')]


@skopt.utils.use_named_args(SPACE)
def objective(**params):
    # skopt minimizes, so return the negated validation AUC.
    return -1.0 * train_evaluate(params)


# The callback logs every iteration; log_results saves the final diagnostics.
monitor = sk_utils.NeptuneMonitor()
results = skopt.forest_minimize(objective, SPACE,
                                n_calls=100,
                                n_random_starts=10,
                                callback=[monitor])
sk_utils.log_results(results)

neptune.stop()

Now, when you run your parameter sweep, the charts and results are logged to Neptune. Check out the skopt hyperparameter sweep experiment with all the code, charts and results.

Final thoughts

In this article, you’ve learned how to optimize hyperparameters of pretty much any Python script in just 3 steps.

Hopefully, with this knowledge, you will build better machine learning models with less effort.

Happy training!

This article was originally posted on the Neptune blog. If you liked it, you may like it there :)

You can also find me tweeting @Neptune_ai or posting on LinkedIn about ML and Data Science stuff.

Hyperparameter Tuning on Any Python Script in 3 Easy Steps [A How-To Guide]
