EuroScipy 2024 - Skrub: Bringing everything into the model
Jérome Dockès and I presented the latest developments and vision of skrub at EuroScipy 2024 in Szczecin.
When it comes to designing predictive machine learning models, data scientists reportedly spend over 80% of their time preparing data for the learning algorithms.
Currently, no automated solution exists to address this problem. The skrub Python library is here to lighten some of the daily tasks of data scientists, and it integrates with the scikit-learn machine learning library.
Our main message is that all the preprocessing and tedious dataframe code should be in the model, because:
- It allows you to fine-tune the hyperparameters of your preprocessors
- It significantly eases deploying and versioning ML models
See how: Slides
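To make the two points above concrete, here is a minimal sketch that uses plain scikit-learn only. It is not skrub's API and not the code from the talk. The job-title strings, the labels, and the step names `vectorize` and `classify` are made up for illustration. The idea is simply that when the preprocessing step lives inside the pipeline, its parameters can be searched like any other hyperparameter, and the whole thing can be saved as one object.

```python
import pickle

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Toy, made-up data: raw job titles (strings) and a binary label.
X = [
    "senior data scientist", "data scientist", "junior data scientist",
    "data science lead", "staff data scientist", "data scientist intern",
    "police officer", "senior police officer", "police sergeant",
    "police officer ii", "police captain", "police officer recruit",
]
y = [1] * 6 + [0] * 6

# The preprocessing (string vectorization) is a step of the model itself.
model = Pipeline([
    ("vectorize", CountVectorizer(analyzer="char_wb")),
    ("classify", LogisticRegression()),
])

# Because the preprocessor is in the pipeline, its parameters can be tuned
# by cross-validation together with the rest of the model.
param_grid = {"vectorize__ngram_range": [(1, 1), (2, 3)]}
search = GridSearchCV(model, param_grid, cv=3)
search.fit(X, y)

# Deployment and versioning: one object holds preprocessing and estimator,
# so a single serialized artifact accepts the raw strings directly.
best_model = search.best_estimator_
restored = pickle.loads(pickle.dumps(best_model))
predictions = restored.predict(["data scientist", "police officer"])
```

The restored object takes raw strings as input, so nothing outside the serialized model needs to be kept in sync with it at prediction time.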