Setup

The major difference between tech companies and open source is that in open source you don’t work on the main repository directly.
Why?
Projects like scikit-learn receive hundreds of pull requests every month, with a comparable number of branches from many different contributors. If everyone developed directly on scikit-learn, the main repo would end up with an unmanageable number of branches, and we don’t want that.
Instead, you develop your branches on your own fork, then submit a pull request (PR) from your fork to the main branch of the original repo:
Figure from author
Let’s set up your local working environment.
Fork the original repository
Clone your fork to your local machine
In your terminal:
Shell
git clone <Your Project URL>
Add the original scikit-learn repository as the upstream remote
Shell
git remote add upstream git@github.com:scikit-learn/scikit-learn.git
Check that git remote -v displays the following:
Shell
origin git@github.com:YourLogin/scikit-learn.git (fetch)
origin git@github.com:YourLogin/scikit-learn.git (push)
upstream git@github.com:scikit-learn/scikit-learn.git (fetch)
upstream git@github.com:scikit-learn/scikit-learn.git (push)
This setup lets you pull the latest changes from the original repo while pushing your commits to your fork.
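With both remotes in place, a typical update cycle might look like the sketch below. The helper names `run` and `sync_fork_main`, the `DRY_RUN` switch, and the branch name `my-feature` are hypothetical, and the sketch assumes the default branch is called `main` on both your fork and the original repo. With `DRY_RUN=1` the git commands are only printed, so you can preview them safely.

```shell
# Run a command, or only print it when DRY_RUN is set (hypothetical helper).
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Bring the fork's main branch up to date, then start a feature branch.
# Assumes remotes named "origin" (your fork) and "upstream" (the original repo).
sync_fork_main() {
    branch="${1:?usage: sync_fork_main <new-branch-name>}"
    run git fetch upstream &&
    run git checkout main &&
    run git merge --ff-only upstream/main &&
    run git push origin main &&
    run git checkout -b "$branch"
}

# Preview the commands without touching any repository:
DRY_RUN=1 sync_fork_main my-feature
```

The `--ff-only` merge is a deliberate choice here: it refuses to create a merge commit, so your local main stays an exact copy of the original repo's main.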
Mamba is a fast front-end for conda.
Conda is useful because its packages link against shared native libraries (for example, a single BLAS or OpenMP runtime), whereas pip wheels typically bundle their own copies for each library.
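Because mamba accepts the same subcommands as conda, a setup script can fall back to conda when mamba is not installed. A minimal sketch; the function name `pick_conda_frontend` is hypothetical:

```shell
# Print "mamba" if it is on PATH, otherwise "conda"; fail if neither is found.
pick_conda_frontend() {
    for frontend in mamba conda; do
        if command -v "$frontend" >/dev/null 2>&1; then
            printf '%s\n' "$frontend"
            return 0
        fi
    done
    echo "neither mamba nor conda found on PATH" >&2
    return 1
}

# Usage: report which front-end the later steps would use.
if frontend="$(pick_conda_frontend)"; then
    echo "Using $frontend"
fi
```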
Install the compilers package with mamba
Shell
mamba install compilers
Create and activate a mamba environment
Shell
mamba create -n sk
mamba activate sk
Install scikit-learn dependencies
Shell
mamba install cython scipy numpy joblib
Install test dependencies
Shell
mamba install pytest pytest-cov flake8 mypy numpydoc black==22.3.0
Install pre-commit
Shell
pip install pre-commit
pre-commit install
Build and install sklearn locally
Shell
pip install --no-build-isolation -e . -v
When you modify Cython files, recompile them by rerunning the build command above before testing.
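One way to remember the rebuild is to check whether any edited file is a Cython source before running the tests. The sketch below is not part of scikit-learn: the name `needs_cython_rebuild` and the list of extensions (`.pyx`, `.pxd`, `.pxi`, plus their `.tp` templates) are assumptions that you may need to adjust.

```shell
# Succeed (exit 0) when any path read from stdin looks like a Cython source.
# The extension list is an assumption; adjust it to the files you edit.
needs_cython_rebuild() {
    grep -Eq '\.(pyx|pxd|pxi)(\.tp)?$'
}

# Usage, from the root of your clone (checks unstaged changes only):
#   git diff --name-only | needs_cython_rebuild \
#       && pip install --no-build-isolation -e . -v
```

Pure Python edits are picked up directly by the editable install, so the rebuild is only triggered when a matching file shows up in the list.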
Check your installation
Shell
python -c "import sklearn; sklearn.show_versions()"
If the command is successful, go to the next tutorial to start contributing!