Project Environment Preparation: Essential Steps for Smooth Workflow

1 Mar 2024


Machine Learning

Melanie

Introduction

If you are getting started on a Data Science project, this article will be of interest to you!

Before starting a Machine Learning project, it’s crucial to prepare and organise your development environment properly!

In this article, we’ll show you how to set up and prepare your environment in several stages.

a) Settings

Before getting started, you need to make a few adjustments to your working environment. This step is a prerequisite to tackling the project itself.

The first question is how you are going to install and run Python: there are several options.

You can install Python on its own or through the Anaconda distribution, and write your code in Spyder or any other suitable IDE (integrated development environment).

Generally speaking, we recommend using Anaconda for several reasons:

Firstly, Anaconda’s features integrate well into your environment and are very easy to use.
Secondly, Anaconda Navigator, its desktop user interface, is relatively easy to learn.
Finally, Anaconda ships with many of the packages and libraries needed for the most common tasks.

	On Windows	On Mac	On Linux
Python	https://www.python.org/downloads/		sudo apt-get install python3
Spyder	https://docs.spyder-ide.org/3/installation.html		sudo apt-get install spyder

Python and Spyder installation commands
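
Once the installation is done, a short script can confirm what is actually available on your machine. The following is a minimal sketch using only the standard library; the function name describe_setup and the list of tools it looks for are illustrative choices, not part of any installer.

```python
import shutil
import sys


def describe_setup(tools=("conda", "spyder", "pip")):
    """Return the running Python version and the location of a few tools.

    The tool names are illustrative. A value of None means the tool
    was not found on the PATH.
    """
    version = ".".join(str(part) for part in sys.version_info[:3])
    found = {tool: shutil.which(tool) for tool in tools}
    return {"python": version, "executable": sys.executable, "tools": found}


if __name__ == "__main__":
    setup = describe_setup()
    print(f"Python {setup['python']} ({setup['executable']})")
    for tool, path in setup["tools"].items():
        print(f"{tool}: {path or 'not found on PATH'}")
```

If a tool is reported as not found, it is either not installed or its folder is missing from your PATH.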

b) Virtual environment

Once Python has been installed, the next step is to create a virtual environment.

What for?

It is important to ensure that only the packages you are using are installed in your environment. A virtual environment gives the project its own Python interpreter and its own set of installed libraries and scripts, isolated from those of other projects.
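
To see this isolation from inside Python, you can check which environment the interpreter is running in. This is a minimal sketch, assuming an environment created with venv or a recent version of virtualenv (both make sys.prefix differ from sys.base_prefix) or activated with conda (which sets the CONDA_DEFAULT_ENV variable). The function name active_environment is hypothetical.

```python
import os
import sys


def active_environment(prefix=sys.prefix, base_prefix=sys.base_prefix, environ=None):
    """Describe the active environment, or return None if there is none.

    The parameters default to the values of the running interpreter and
    are exposed only to make the function easy to try with other values.
    """
    environ = os.environ if environ is None else environ
    # Inside a venv/virtualenv, sys.prefix points at the environment folder
    # while sys.base_prefix still points at the base installation.
    if prefix != base_prefix:
        return f"venv/virtualenv at {prefix}"
    # conda sets this variable when an environment is activated.
    conda_env = environ.get("CONDA_DEFAULT_ENV")
    if conda_env:
        return f"conda environment '{conda_env}'"
    return None


if __name__ == "__main__":
    print(active_environment() or "no virtual environment detected")
```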

virtualenv and venv are among the most widely used tools for this. venv is part of the Python 3 standard library, and Anaconda provides its own equivalent, conda environments. If the tool you choose is not already installed, you will need to install it before creating a new environment in the project folder.

The benefits

You can create and change virtual environments as often as you like. Each of them can have different libraries and associated packages.

You can also create environments in Anaconda quite easily and import the different packages you need directly by searching and installing them.

How do we do it?

There are specific commands for creating, activating and deactivating a virtual environment from the terminal on your machine:

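	Windows (conda)	Mac OS (virtualenvwrapper)
Creation	$ conda create -n nom_env python=3.6	$ mkvirtualenv nom_env
Activation	$ conda activate nom_env	$ workon nom_env
Deactivation	(nom_env)$ conda deactivate	(nom_env)$ deactivate

Here nom_env is a placeholder for the name of your environment. In recent versions of conda, the conda commands are the same on Windows, Mac OS and Linux; mkvirtualenv and workon are provided by the virtualenvwrapper package, which must be installed separately.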
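
The same creation step can also be done from Python with the standard library's venv module. The sketch below is one possible way to do it, not the method used in the table above: the helper name create_environment is hypothetical, and with_pip=False is chosen only to keep the example fast (pass True to have pip installed in the new environment).

```python
import sys
import tempfile
import venv
from pathlib import Path


def create_environment(env_dir, with_pip=False):
    """Create a virtual environment and return the path of its interpreter."""
    env_dir = Path(env_dir)
    venv.create(env_dir, with_pip=with_pip)
    # Windows puts executables in Scripts, other systems in bin.
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


if __name__ == "__main__":
    # A temporary folder is used here so that the example leaves nothing behind.
    with tempfile.TemporaryDirectory() as tmp:
        interpreter = create_environment(Path(tmp) / "nom_env")
        print("Interpreter created at:", interpreter)
```

In a real project you would pass a folder inside the project directory instead of a temporary one, then activate the environment from the terminal.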

c) Text editors

If you’re working directly with .py Python files, for example scripts that you import into other code, there are various editors available, such as Spyder, VSCode, PyCharm, Eclipse with PyDev, and so on.

We recommend using Spyder or VSCode because they have a built-in Python kernel and a library browser, which makes it easier to check code and manipulate installed modules.

Both also make it easy to add extensions, for example Black for formatting code and Flake8 for checking its style.

If, on the other hand, you only want to use Jupyter Notebook, you can install the following formatting extensions: AutoPep8, isort, Flake8_nb…
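
The checks that Black and Flake8 perform inside an editor can also be launched from a script. The sketch below assumes both tools may or may not be installed in the current environment and skips whichever is missing; the function names lint_commands and run_available_checks are hypothetical.

```python
import importlib.util
import subprocess
import sys


def lint_commands(target="."):
    """Build the command lines used to check a file or folder."""
    return {
        # --check makes Black report files it would reformat without changing them.
        "black": [sys.executable, "-m", "black", "--check", target],
        "flake8": [sys.executable, "-m", "flake8", target],
    }


def run_available_checks(target="."):
    """Run each installed tool and return its exit code (None if not installed)."""
    results = {}
    for tool, command in lint_commands(target).items():
        if importlib.util.find_spec(tool) is None:
            results[tool] = None  # tool not installed in this environment
            continue
        results[tool] = subprocess.run(command).returncode
    return results
```

Calling run_available_checks("my_script.py") returns a dictionary with one entry per tool; an exit code of 0 means the tool found nothing to report.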


d) Organising the file

The last step before moving on to the project itself is to gather the various Python files linked to the project in the same directory (where you created the virtual environment). You will also need to add a requirements.txt file listing all the libraries the project uses, so that the environment is documented and can be recreated.
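
As an illustration, the layout described above can be created with a few lines of Python. This is only a sketch: the function name scaffold_project and the default script name main.py are hypothetical, and existing files are left untouched.

```python
from pathlib import Path


def scaffold_project(root, scripts=("main.py",)):
    """Create the project folder with an empty requirements.txt and scripts.

    Returns the list of files that were actually created.
    """
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)
    created = []
    for name in ("requirements.txt", *scripts):
        path = root / name
        if not path.exists():  # never overwrite existing work
            path.touch()
            created.append(path)
    return created
```

For example, scaffold_project("my_project") creates the folder my_project containing requirements.txt and main.py.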

How do you display all the installed packages? With pip’s freeze command:

$ pip freeze

You then need to save this list to a text file:

$ pip freeze > requirements.txt
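
To make the content of this file more concrete, here is a rough Python approximation of what the freeze command lists, built on the standard library's importlib.metadata module (Python 3.8 or later). It is a sketch, not a replacement for pip freeze: the function names are hypothetical, and special cases that pip handles, such as packages installed from a URL or in editable mode, are ignored.

```python
from importlib import metadata
from pathlib import Path


def installed_requirements():
    """Return sorted 'name==version' lines for the installed distributions."""
    pins = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        version = dist.metadata.get("Version")
        if name and version:  # skip distributions with incomplete metadata
            pins.add(f"{name}=={version}")
    return sorted(pins, key=str.lower)


def write_requirements(path="requirements.txt"):
    """Write the pinned list to a text file and return the number of lines."""
    lines = installed_requirements()
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(lines)


if __name__ == "__main__":
    for line in installed_requirements():
        print(line)
```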

Now imagine that you want to take over a colleague’s project. To be able to run all their code, you need to install the same packages in your own environment. It couldn’t be simpler: run the following command:

$ pip install -r requirements.txt
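
After installing, you may want to confirm that your environment matches the file. The sketch below compares the pinned versions with what is installed, using importlib.metadata from the standard library. It assumes simple name==version lines such as those written by the freeze command and skips anything else; the function names parse_pins and check_requirements are hypothetical.

```python
from importlib import metadata


def parse_pins(text):
    """Extract {name: version} from 'name==version' lines, ignoring comments."""
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip()] = version.strip()
    return pins


def check_requirements(text):
    """Report, for each pinned package, whether it is installed as required."""
    report = {}
    for name, wanted in parse_pins(text).items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
        else:
            if installed == wanted:
                report[name] = "ok"
            else:
                report[name] = f"installed {installed}, wanted {wanted}"
    return report
```

You would typically call check_requirements with the text read from requirements.txt and look for any entry that is not "ok".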

Conclusion

The aim of this article was to help you configure your development environment. You’ve done it!

You can now tackle any Machine Learning project.

If you’d like to know more about the different stages of a Machine Learning project, we’ll see you in the next article.

See you soon!
