The queries below rely on cuML models for the ML portion on GPU. For the CPU versions, we need to decide, based on performance, between a distributed (dask-ml) and a non-distributed (sklearn) implementation of the ML portion of these queries. I suggest benchmarking both and then choosing the one that performs best.
Query-05 GPU: cuml.LogisticRegression
- Non-distributed CPU: sklearn.linear_model.LogisticRegression
- Distributed CPU: dask_ml.linear_model.LogisticRegression
Query-20 GPU: cuml.cluster.KMeans
- Non-distributed CPU: sklearn.cluster.KMeans
- Distributed CPU: dask_ml.cluster.KMeans
Query-25 GPU: cuml.cluster.KMeans
- Non-distributed CPU: sklearn.cluster.KMeans
- Distributed CPU: dask_ml.cluster.KMeans
Query-26 GPU: cuml.cluster.KMeans
- Non-distributed CPU: sklearn.cluster.KMeans
- Distributed CPU: dask_ml.cluster.KMeans
Query-28 GPU: cuml.dask.naive_bayes
- Distributed CPU: dask_ml.naive_bayes
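
One possible way to run the suggested comparison is sketched below. It is only an illustration, not the benchmark harness used in this repo: the helper names (`time_fit`, `pick_fastest`, `kmeans_example`), the synthetic data, the chunk size, and `n_clusters=8` are all assumptions made for the example. Each candidate is timed on fitting only, and the fastest one is reported.

```python
import time


def time_fit(make_estimator, fit_args, repeats=3):
    """Return the best wall-clock time (seconds) over `repeats` fits.

    `make_estimator` is a zero-argument callable returning a fresh estimator,
    and `fit_args` is the tuple of positional arguments passed to `.fit`.
    """
    best = float("inf")
    for _ in range(repeats):
        estimator = make_estimator()
        start = time.perf_counter()
        estimator.fit(*fit_args)
        best = min(best, time.perf_counter() - start)
    return best


def pick_fastest(candidates, repeats=3):
    """Time every candidate and return (fastest_name, {name: seconds}).

    `candidates` maps a label to a `(make_estimator, fit_args)` pair, so each
    implementation can receive data in the container it expects.
    """
    timings = {
        name: time_fit(factory, fit_args, repeats)
        for name, (factory, fit_args) in candidates.items()
    }
    return min(timings, key=timings.get), timings


def kmeans_example(n_rows=100_000, n_cols=10):
    """Hypothetical comparison for the KMeans queries on synthetic data."""
    # Imports are local so the helpers above work without these packages.
    import numpy as np
    import dask.array as da
    from sklearn.cluster import KMeans as SkKMeans
    from dask_ml.cluster import KMeans as DaskKMeans

    rng = np.random.default_rng(0)
    X = rng.random((n_rows, n_cols))
    # Chunk size is an arbitrary choice for this sketch.
    dX = da.from_array(X, chunks=(max(n_rows // 10, 1), n_cols))

    candidates = {
        "sklearn": (lambda: SkKMeans(n_clusters=8, n_init=10, random_state=0), (X,)),
        "dask_ml": (lambda: DaskKMeans(n_clusters=8, random_state=0), (dX,)),
    }
    return pick_fastest(candidates)
```

Calling `kmeans_example()` returns the label of the faster implementation together with the per-implementation timings. For a meaningful decision, the synthetic data would need to be replaced with the actual query inputs at the target scale factor, and the dask-ml run should use the same cluster configuration as the rest of the query.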
CC: @DaceT, @randerzander
Related PRs: