Setup
Based on: How to contribute
The major difference between tech companies and open source is that in open source you don’t work on the main repository directly.
-
Why?
Projects like scikit-learn receive hundreds of pull requests every month, with a comparable number of branches from many different contributors. If everyone developed directly on scikit-learn, the main repo would end up with an unmanageable number of branches, and we don’t want that.
Instead, you develop your branches on your own fork, then submit your pull request (PR) from your fork to the main branch of the original repo.
Let’s set up your local working environment.
-
Fork the original repository
-
Clone the project on your local machine
On your terminal:
git clone <Your Project URL>
-
Add the original scikit-learn upstream
git remote add upstream git@github.com:scikit-learn/scikit-learn.git
Check that
git remote -v
displays the following:
origin git@github.com:YourLogin/scikit-learn.git (fetch)
origin git@github.com:YourLogin/scikit-learn.git (push)
upstream git@github.com:scikit-learn/scikit-learn.git (fetch)
upstream git@github.com:scikit-learn/scikit-learn.git (push)
This lets you pull the latest changes from the original repo while pushing your commits to your fork.
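With both remotes in place, bringing your fork up to date before starting new work might look like the sketch below. It assumes the default branch is named main, as mentioned above. The helper names run and sync_fork and the DRY_RUN switch are hypothetical conveniences for this example, not part of git or scikit-learn. As written, the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: keep the fork's main branch in step with upstream.
# Assumes the remotes are named "origin" (your fork) and "upstream".

# run CMD...: execute CMD, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# sync_fork [BRANCH]: fast-forward BRANCH (default: main) to upstream,
# then push the result to the fork.
sync_fork() {
    branch="${1:-main}"
    run git fetch upstream &&
    run git checkout "$branch" &&
    run git merge --ff-only "upstream/$branch" &&
    run git push origin "$branch"
}

# Preview the commands without touching any repository.
DRY_RUN=1 sync_fork main
```

Calling sync_fork without DRY_RUN=1 from inside your clone runs the commands for real. The --ff-only option makes the merge stop with an error instead of creating a merge commit if your local main has diverged from upstream.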
-
Install mamba from forge
Mamba is a fast front-end for conda.
Conda is useful because it links all dependencies to the same backend, whereas pip sets up ad hoc backends for each library.
-
Install mamba compilers
mamba install compilers
-
Create and activate a mamba environment
mamba create -n sk
mamba activate sk
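Before installing anything, it can help to confirm that the sk environment is really the active one. The following is a minimal sketch that relies on the CONDA_DEFAULT_ENV variable, which conda-style activation sets to the name of the active environment. The helper name in_env is hypothetical.

```shell
#!/bin/sh
# in_env NAME: succeed if the active conda/mamba environment is NAME.
# Relies on CONDA_DEFAULT_ENV, which is set when an environment is activated.
in_env() {
    [ "${CONDA_DEFAULT_ENV:-}" = "$1" ]
}

if in_env sk; then
    echo "environment sk is active"
else
    echo "environment sk is not active, run: mamba activate sk"
fi
```

Running this check first avoids installing the dependencies into the base environment by mistake.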
-
Install scikit-learn dependencies
mamba install cython scipy numpy joblib
-
Install test dependencies
mamba install pytest pytest-cov flake8 mypy numpydoc black==22.3.0
-
Install pre-commit
pip install pre-commit
pre-commit install
-
Build and install sklearn locally
pip install --no-build-isolation -e . -v
When developing Cython files, you need to recompile them by running this command again before testing.
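One way to remember the rebuild is to check whether any Cython sources have changed since the last commit. The sketch below assumes that Cython sources use the standard .pyx, .pxd and .pxi extensions. The helper name needs_rebuild is hypothetical, and the check is only a reminder: it does not run the build itself.

```shell
#!/bin/sh
# needs_rebuild: read file names on stdin, one per line, and succeed
# if at least one of them is a Cython source (.pyx, .pxd or .pxi).
needs_rebuild() {
    grep -E -q '\.(pyx|pxd|pxi)$'
}

# Inside a git clone, list files that differ from the last commit
# (staged or not) and print a reminder if any Cython source is among them.
if git rev-parse --git-dir >/dev/null 2>&1; then
    if git diff --name-only HEAD 2>/dev/null | needs_rebuild; then
        echo "Cython sources changed: rerun the build command before testing"
    fi
fi
```

Because needs_rebuild only reads file names from standard input, it can also be fed any other list of paths, for example the files touched by a pull request.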
-
Check your installation
python -c "import sklearn; sklearn.show_versions()"
If the command is successful, go to the next tutorial to start contributing!