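import random
import numpy as np
import torch

def set_seed(gpu=False):
    # Seed every random number generator in use so runs are reproducible.
    random.seed(42)
    np.random.seed(42)
    torch.manual_seed(42)
    if gpu:
        torch.cuda.manual_seed_all(42)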
- PostgreSQL
- Hive
2. Use Dask for multiprocessing
- A great tool for parallel computing in Python
- Parallelizes DataFrame operations, model training, and GridSearchCV
- Can also run on a cluster
..
- Custom CV is usually better. Use Dask to parallelize this step.
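
One way to parallelize a custom CV loop with Dask is to wrap the per-fold work in `dask.delayed` and evaluate all folds with a single `dask.compute` call. The sketch below is only an illustration: the helper names (`make_folds`, `score_fold`, `cross_validate`), the least-squares model, and the synthetic data are assumptions made for this example, not part of the original notes.

```python
import numpy as np
from dask import compute, delayed


def make_folds(n_samples, n_splits, seed=42):
    """Shuffle the row indices once and split them into validation folds."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_samples), n_splits)


def score_fold(X, y, val_idx):
    """Fit least squares on the training rows and return the validation MSE."""
    train_mask = np.ones(len(y), dtype=bool)
    train_mask[val_idx] = False
    coef, *_ = np.linalg.lstsq(X[train_mask], y[train_mask], rcond=None)
    residual = X[val_idx] @ coef - y[val_idx]
    return float(np.mean(residual ** 2))


def cross_validate(X, y, n_splits=5):
    """Build one lazy task per fold, then let Dask run them in parallel."""
    folds = make_folds(len(y), n_splits)
    tasks = [delayed(score_fold)(X, y, idx) for idx in folds]
    return list(compute(*tasks))


# Synthetic regression data (hypothetical, for demonstration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=200)

scores = cross_validate(X, y)
print(scores)
```

Because the folds are independent, each `score_fold` call becomes its own task in the Dask graph. With the default scheduler the tasks run on local threads. Creating a `dask.distributed.Client` before calling `compute` should send the same tasks to a cluster instead, without changing the CV code.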