Setup – Introduction to Conda for (Data) Scientists (2023)

This lesson is part of The Carpentries Incubator, a place to share and use each other's Carpentries-style lessons. This lesson has not been reviewed by and is not endorsed by The Carpentries.

Check to see if Conda is already installed

If you have ever installed the Anaconda Python distribution on your local machine, then you already have Conda installed! Mac and Linux users can check whether Conda is installed by running the following command in a terminal.

$ which conda
/Users/$USERNAME/miniconda3/bin/conda

If Conda has already been installed on your machine, then this command should return the absolute path to the conda executable.
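The same check can be written so that it prints a clear message even when Conda is missing. This is a minimal sketch, not part of the lesson itself: it uses the POSIX `command -v` builtin in place of `which`, and `check_installed` is a hypothetical helper name chosen for illustration.

```shell
# Report whether a command is available, without erroring when it is missing.
# check_installed is a hypothetical helper name used only for this sketch.
check_installed() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found at: $(command -v "$1")"
    else
        echo "$1 not found on PATH"
    fi
}

check_installed conda
```

If Conda has been initialized in your shell as a function, the reported location may be a function name rather than an absolute path.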

Windows users should search for “Anaconda” to see whether the “Anaconda Command Prompt” shows up as an option. If it does, then you already have Conda installed.

Old version of Anaconda?

If you previously installed the Anaconda Python distribution, you may have an old version of Conda. You can check your version of Conda with the following command.

$ conda --version

If you have a version of Conda that is 4.5 (or older), then it is probably best to uninstall your Anaconda Python distribution and then reinstall the most recent version.
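The version comparison can also be scripted. The sketch below assumes that `conda --version` prints a line such as "conda 4.8.2" and that your `sort` supports the `-V` (version sort) option, as GNU coreutils does. The helper names `version_le` and `major_minor` are hypothetical.

```shell
# Succeeds when version $1 sorts at or before version $2.
# Assumes a `sort` that supports -V (version sort), as GNU coreutils does.
version_le() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Reduce a string such as "conda 4.8.2" to its major.minor part, "4.8".
major_minor() {
    echo "$1" | awk '{print $NF}' | cut -d. -f1,2
}

if command -v conda >/dev/null 2>&1; then
    current=$(major_minor "$(conda --version 2>&1)")
    if version_le "$current" "4.5"; then
        echo "Conda $current is 4.5 or older; consider reinstalling."
    else
        echo "Conda $current is newer than 4.5."
    fi
else
    echo "conda not found on PATH"
fi
```

Comparing with a version sort rather than as plain text matters because, for example, 4.10 is newer than 4.5 even though it sorts earlier alphabetically.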

Install Python 3 version of Miniconda

If Conda has not been installed on your machine, then install the Python 3 version of Miniconda for your OS. As the name suggests, Miniconda is a “mini” version of the Anaconda Python distribution that includes only Conda, a Python 3 distribution, and any necessary OS-specific dependencies.

For convenience here are links to the 64-bit Miniconda installers.

Prefer Miniconda to Anaconda

I suggest installing Miniconda which combines Conda with Python 3 (and a small number of core systems packages) instead of the full Anaconda distribution. Installing only Miniconda will encourage you to create separate environments for each project (and to install only those packages that you actually need for each project!). Project specific environments enhance portability and reproducibility of your research and workflows.

Besides, if you really want the full Anaconda distribution, you can always create a new conda environment and install it using the following command.

We will discuss the above command in great depth in the workshop.

Windows installation

After you have downloaded the Windows GUI installer, double-click it and follow the instructions (accept the license, etc.). You can use the defaults except under “Advanced Installation Options”, where you should tick “Add Miniconda3 to my PATH environment variable”.

Mac OSX installation

After you have downloaded the Mac OSX GUI installer, double-click it and follow the instructions (accept the license, etc.). When you are asked where to install Miniconda, leave the default option, “Install for me only”, selected. If you get the error message “You cannot install Miniconda in this location,” reselect “Install for me only”; you should then be able to continue to the next step. The default option will modify your PATH in ~/.bash_profile. If you open the terminal after the installation is over, you will see “(base)” on the left side of the prompt.

Linux installation

Installing on Linux systems is slightly more involved, so I will walk through the steps below. First, download the 64-bit Python 3 install script for Miniconda (clicking the link above will download the same script!).

wget --quiet https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh

Run the Miniconda install script. Follow the prompts on the installer screens. If you are unsure about any setting, accept the defaults (you can change them later if necessary).

bash Miniconda3-latest-Linux-x86_64.sh

Once the install script completes, you can remove it.

rm Miniconda3-latest-Linux-x86_64.sh
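The three Linux steps above can be chained so that each one runs only if the previous one succeeded. This is a sketch under stated assumptions: the `run` helper and the `DRY_RUN` variable are hypothetical additions for illustration, and by default the script only prints the commands instead of executing them.

```shell
# Installer name and URL are the ones used in the steps above.
installer="Miniconda3-latest-Linux-x86_64.sh"
url="https://repo.anaconda.com/miniconda/$installer"

# Hypothetical helper: print each command when DRY_RUN is 1 (the default
# here), and execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Download, run, then remove the installer; stop at the first failing step.
install_miniconda() {
    run wget --quiet "$url" &&
        run bash "$installer" &&
        run rm "$installer"
}

install_miniconda
```

Setting `DRY_RUN=0` before calling `install_miniconda` would execute the commands for real, including the interactive installer prompts.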

Verifying your Conda installation

To verify that you have installed Conda correctly, run the conda help command. The output of the command should look similar to the following.

$ conda help
usage: conda [-h] [-V] command ...

conda is a tool for managing and deploying applications, environments and packages.

Options:

positional arguments:
  command
    clean        Remove unused packages and caches.
    config       Modify configuration values in .condarc. This is modeled
                 after the git config command. Writes to the user .condarc
                 file (/Users/drpugh/.condarc) by default.
    create       Create a new conda environment from a list of specified
                 packages.
    help         Displays a list of available conda commands and their help
                 strings.
    info         Display information about current conda install.
    init         Initialize conda for shell interaction. [Experimental]
    install      Installs a list of packages into a specified conda
                 environment.
    list         List linked packages in a conda environment.
    package      Low-level conda package utility. (EXPERIMENTAL)
    remove       Remove a list of packages from a specified conda environment.
    uninstall    Alias for conda remove.
    run          Run an executable in a conda environment. [Experimental]
    search       Search for packages and display associated information. The
                 input is a MatchSpec, a query language for conda packages.
                 See examples below.
    update       Updates conda packages to the latest compatible version.
    upgrade      Alias for conda update.

optional arguments:
  -h, --help     Show this help message and exit.
  -V, --version  Show the conda version number and exit.

conda commands available from other packages:
  env

At the bottom of the help menu you will see a section with some optional arguments for the conda command. In particular you can pass the --version flag which will return the version number. Again output should look similar to the following.

$ conda --version
conda 4.8.2

Make sure you have the most recent version

Once Conda exists on your machine, then run the following command to make sure that you have the most recent version and patches.

$ conda update --name base --channel defaults --yes conda

You can re-run this command at any time to update to the most recent version of Conda.

Initializing your shell for Conda

Key parts of Conda’s functionality require that it interact directly with the shell within which Conda commands are being invoked. As such, each shell must be configured to make use of them. The conda init command initializes a shell for use with Conda by making changes to your system that are specific and customized for each shell. Conda supports a number of different shells and you can run conda init --help to see the complete list.

Mac OSX and Linux users will want to initialize Conda for Bash as follows. If you are installing on Linux, then you may be prompted to initialize Conda for your shell when running the installation script. If so, then you can safely skip this step.

$ conda init bash

Windows users can either use the Anaconda Command Prompt or the Anaconda Powershell Prompt, which are already initialized for Conda, or initialize Conda for Powershell as follows.

> conda init powershell

After running conda init you will need to close and restart your shell for the changes to take effect. Alternatively, Mac OS and Linux users can reload their ~/.bashrc profile (which was changed by running the conda init command). To reload your ~/.bashrc profile, use the following command.

$ source ~/.bashrc

If you want to reverse or “undo” the changes made by conda init, then you can re-run the conda init command and pass the --reverse option. Again, in order for the reversal to take effect you will likely need to close and restart your shell session.
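To see whether conda init has modified a startup file, you can look for the marker comment that opens the block it manages. A minimal sketch, assuming the block begins with the line `# >>> conda initialize >>>`; `has_conda_init` is a hypothetical helper name, and on some systems the modified file may be ~/.bash_profile rather than ~/.bashrc.

```shell
# Succeeds when the given file contains the marker line that `conda init`
# writes at the top of its managed block.
has_conda_init() {
    [ -f "$1" ] && grep -q '^# >>> conda initialize >>>' "$1"
}

if has_conda_init "$HOME/.bashrc"; then
    echo "~/.bashrc contains a conda init block"
else
    echo "~/.bashrc has no conda init block"
fi
```

Running the same check after reversing the changes is a quick way to confirm that the block was removed.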

Use of Binder instead of installing Conda (Optional)

If you wish to get started with this course without installing Conda, then you can use a pre-configured instance running on Binder by clicking on the link below.

Workspace for Conda environments

In order to maintain a consistent workspace for all your conda environments, we will create a new introduction-to-conda-for-data-scientists directory on your Desktop and store our conda environments in this directory. On Mac OSX and Linux, running the following commands in the Terminal will create the required directory on the Desktop.

$ cd ~/Desktop
$ mkdir introduction-to-conda-for-data-scientists
$ cd introduction-to-conda-for-data-scientists

Windows users may need to reverse the direction of the slash and run the commands from the command prompt.

> cd ~\Desktop
> mkdir introduction-to-conda-for-data-scientists
> cd introduction-to-conda-for-data-scientists

Alternatively, you can always “right-click” and “create new folder” on your Desktop. All the commands that are run during the workshop should be run in a terminal within the introduction-to-conda-for-data-scientists directory.
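On Mac OSX and Linux, the directory creation can be made safe to repeat by using `mkdir -p`, which does not fail when the directory already exists. The sketch below wraps this in a hypothetical helper, `make_workspace`, that takes the parent directory as its argument.

```shell
# Hypothetical helper: create the workshop directory under the given parent
# (if it does not already exist) and print its path.
make_workspace() {
    workspace="$1/introduction-to-conda-for-data-scientists"
    mkdir -p "$workspace" && echo "$workspace"
}
```

For example, `cd "$(make_workspace ~/Desktop)"` would create the directory on the Desktop if needed and then change into it.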

FAQs

Is Anaconda good for data science? ›

Anaconda is one of the best tools for ML and data science development.

One distribution bundles multiple tools, such as Jupyter, Spyder, and Glueviz, and supports many types of development. Most packages are available as soon as Anaconda is installed. Packages are easy to update, and the tooling is user-friendly for development.

What is conda in data science? ›

Conda is a platform agnostic, open source package and environment management system. Using a package and environment management tool facilitates portability and reproducibility of (data) science workflows. Conda solves both the package and environment management problems and targets multiple programming languages.

Can I use Python without conda? ›

You can use conda without Anaconda, but using Anaconda always involves the conda tool. Python 3 is the current version of the language and Python 2 is considered legacy. Generally you should choose Python 3 for new projects whenever possible.

Is Mamba better than conda? ›

mamba is a drop-in replacement and uses the same commands and configuration options as conda. The only difference is that you should still use conda for activation and deactivation.

Is Anaconda good for beginners? ›

Anaconda Python is the perfect platform for beginners who want to learn Python. It's easy to install, and you can get started quickly with the included Jupyter Notebook. Plus, Anaconda Python has many features and libraries that you can use for your projects.

Why conda is used in Python? ›

Conda as a package manager helps you find and install packages. If you need a package that requires a different version of Python, you do not need to switch to a different environment manager, because conda is also an environment manager.

Why is conda used for data science? ›

Conda's benefits include: Providing prebuilt packages which avoid the need to deal with compilers or figuring out how to set up a specific tool. Managing one-step installation of tools that are more challenging to install (such as TensorFlow or IRAF).

How do you use conda in Python? ›

Managing Python

If you want to use a different version of Python, for example Python 3.5, simply create a new environment and specify the version of Python that you want. When conda asks if you want to proceed, type "y" and press Enter. Activate the new environment: Windows: conda activate snakes.

Should I install Python or Anaconda first? ›

Anaconda recommends downloading the latest version of Anaconda prior to creating a Python 3.5 (or 3.6) environment. Or download the latest version of Anaconda and run the following command to install Python 3.5 (or 3.6) in the root environment: conda install python=3.5 or conda install python=3.6 .

What is the difference between Anaconda and conda? ›

conda is a virtual environment manager, software that allows you to create, remove, or package virtual environments as well as install software, while Anaconda (and Miniconda) includes conda along with some pre-downloaded libraries.

Is conda different from Anaconda? ›

Conda is a package manager. It helps you take care of your different packages by handling installing, updating and removing them. Anaconda contains all of the most common packages (tools) a data scientist needs and can be considered the hardware store of data science tools.

Why is conda so slow? ›

Unlike many package managers, Anaconda's repositories generally don't filter or remove old packages from the index. This allows old environments to be easily recreated. However, it does mean that the index metadata is always growing, and thus conda becomes slower as the number of packages increases.

Is conda still used? ›

In most real-world data science projects, conda-based packages and environments are widely used. I personally prefer installing and maintaining a project's packages with conda rather than directly with pip.

Should I use pip or conda? ›

It's fully recommended to use pip inside of conda. It's better to install using conda, but for any packages that don't have a conda build, it's perfectly acceptable to use pip.

How much RAM does an Anaconda need? ›

You need a minimum RAM size of 32 GB, or 16 GB RAM with 1600 MHz DDR3.

What is difference between Anaconda and Jupyter? ›

Anaconda is a Python distribution with many software tools in it. Spyder is an IDE and Jupyter Notebook is a web-based program to code Python for scientific purposes in Anaconda. PyCharm is a popular Python IDE for general purposes.

Is PyCharm better than Anaconda? ›

Though they are independent tools, PyCharm and AnaConda can be used together for projects that can benefit from both tools. PyCharm is an IDE built to make it easier to write Python code, by providing a text editor and debugging, among other features. Anaconda is a Python distribution focused on data driven projects.

What is the advantage of conda? ›

Conda is Better at Dependency Management

Pip may allow incompatible dependencies to be installed, depending on the order in which you install packages. Conda instead uses what its developers call a “satisfiability solver”, which checks that all dependencies are met at all times.

Why should I install conda? ›

Using Conda you can quickly install commonly used data science libraries and tools, such as R, NumPy, SciPy, Scikit-learn, Dask, TensorFlow, PyTorch, Fast.ai, NVIDIA RAPIDS, and more built using optimized, hardware specific libraries (such as Intel's MKL or NVIDIA's CUDA), which provides a speedup without having to ...

What's the difference between Python and conda? ›

The main difference between Python and Anaconda is that the former is a programming language and the latter is software to install and manage Python and other programming languages (such as R).

How do you set up an Anaconda for data science? ›

Graphical installation of Anaconda
  1. Choose either the Python 3.x or the Python 2.x version. Please make sure to choose the version specified in the tutorial you want to run. If you don't know, take the Python 3. ...
  2. If you're using Windows or Linux, make sure to pick the 64 bit installer if you have a 64 bit system.

Is conda only for Python? ›

Conda is a package, dependency, and environment management system that was originally developed for Python but was later extended for use with other languages such as R, Java, Scala, Fortran, and C/C++.

Which Python is conda using? ›

Conda treats Python the same as any other package, so it is easy to manage and update multiple installations. Anaconda supports Python 3.7, 3.8, 3.9 and 3.10. The current default is Python 3.9.

Can I use conda in Jupyter Notebook? ›

Jupyter Notebook can easily be installed using conda. Our plan is to only install it in the base environment, and then just switch between sub-environments to avoid setting up Jupyter Lab in each environment.

Can I install Python using conda? ›

The Earth Engine Python API can be installed to a local machine via conda, a Python package and environment manager. Conda is bundled with Anaconda and Miniconda Python distributions. Anaconda is a data science programming platform that includes 1500+ packages, while Miniconda includes only conda and its dependencies.

How do I enable conda in Python? ›

To activate your Conda environment, type source activate <yourenvironmentname> . Note that conda activate will not work on Discovery with this version. To install a specific package, type conda install -n <yourenvironmentname> [package] . To deactivate the current, active Conda environment, type conda deactivate .

Should I install Anaconda or Jupyter Notebook? ›

Installing Jupyter using Anaconda and conda

For new users, we highly recommend installing Anaconda. Anaconda conveniently installs Python, the Jupyter Notebook, and other commonly used packages for scientific computing and data science.

Do I need to download Python separately from Anaconda? ›

Anaconda includes not only Python but also R. You can use Spyder or Jupyter Notebook to edit your Python scripts.

Which Python software is best for beginners? ›

Beginner - IDLE, Thonny would be the perfect choice for first-time programmers who are just getting into Python. Intermediate - For intermediate level users PyCharm, VS Code, Atom, Sublime Text 3 are good options.

Is conda a programming language? ›

Anaconda is an open-source distribution of the Python and R programming languages for data science that aims to simplify package management and deployment.

Is conda a Python framework? ›

Conda is an open source, cross-platform, language-agnostic package manager and environment management system that installs, runs, and updates packages and their dependencies. It was created for Python programs, but it can package and distribute software for any language (e.g., R), including multi-language projects.

Is conda owned by Anaconda? ›

The Conda package and environment manager is included in all versions of Anaconda, Miniconda, and Anaconda Repository. Conda is a NumFOCUS affiliated project.

Is Anaconda good for deep learning? ›

The Anaconda distribution is a free and open-source platform for the Python and R programming languages. It can be easily installed on operating systems such as Windows, Linux, and macOS. It provides more than 1500 Python/R data science packages that are suitable for developing machine learning and deep learning models.

Is Anaconda no longer free? ›

Of course, if you're developing with Anaconda for your own personal project, or a small business (<200 employees) and have no intention to use it for commercial purposes, Anaconda is still free.

Is conda and pip same? ›

The fundamental difference between pip and Conda packaging is what they put in packages. Pip packages are Python libraries like NumPy or matplotlib. Conda packages include Python libraries (NumPy or matplotlib), C libraries (libjpeg), and executables (like C compilers, and even the Python interpreter itself).

Is conda same as Docker? ›

Conda and Docker are not directly comparable. Conda is a package manager, similar to npm or Yarn, whereas Docker is a container platform that lets you package your environment in an isolated container.

Is conda better than VENV? ›

Conda is both an environment manager and a package manager like pip. In short, if you don't have a strong preference already, conda is more robust than venv or pip, can be combined with pip, and is probably the better default option.

Is it bad to use both pip and conda? ›

Unfortunately, issues can arise when conda and pip are used together to create an environment, especially when the tools are used back-to-back multiple times, establishing a state that can be hard to reproduce.

Can I install both pip and conda? ›

Both pip and conda are included in Anaconda and Miniconda, so you do not need to install them separately. Conda environments replace virtualenv, so there is no need to activate a virtualenv before using pip. It is possible to have pip installed outside a conda environment or inside a conda environment.

Which Python is best for data science? ›

Top 10 Python Libraries for Data Science
  • TensorFlow.
  • NumPy.
  • SciPy.
  • Pandas.
  • Matplotlib.
  • Keras.
  • SciKit-Learn.
  • PyTorch.

Which Python IDE is best for data science? ›

6 Best Python IDEs for Data Science & Machine Learning [2023]
  • Spyder.
  • Thonny.
  • JupyterLab.
  • PyCharm.
  • Visual Studio Code.
  • Atom.

Is Jupyter notebook good for data science? ›

Jupyter Notebook has been the staple of any data scientists and data analysts out there who work with Python. In fact, most online Python and data science courses are taught using Jupyter Notebook.

Which Python version is best for data science? ›

Any Python 3.x will do. Most important is to start.

Is Python or C++ better for data science? ›

While C++ can perform machine learning and data analysis, it is no match for Python. Python's friendly approach in terms of syntax makes it a better option for beginners. C++ requires knowledge of various programming conventions and needs more research and time to learn.

Is Python or SQL better for data science? ›

Using SQL vs Python: Case Study

If someone is really looking to start their career as a developer, then they should start with SQL because it's a standard language and an easy-to-understand structure makes the developing and coding process even faster. On the other hand, Python is for skilled developers.

Is Python alone enough for data science? ›

Python is the programming language of choice for data scientists. Although it wasn't the first primary programming language, its popularity has grown throughout the years. In 2016, it overtook R on Kaggle, the premier platform for data science competitions.

Which programming language is the #1 choice for data scientists? ›

Python. Python is the most widely used data science programming language in the world today. It is an open-source, easy-to-use language that has been around since the year 1991. This general-purpose and dynamic language is inherently object-oriented.

What is the fastest Python IDE? ›

Top Python IDEs
  • Visual Studio Code.
  • Sublime Text 3.
  • Atom.
  • Jupyter.
  • Spyder.
  • PyDev.
  • Thonny. Thonny is an IDE ideal for teaching and learning Python programming.
  • Wing. Wing is also a popular IDE that provides many features to support a productive environment.

Why Python is a first choice for data scientist? ›

Thanks to Python's focus on simplicity and readability, it boasts a gradual and relatively low learning curve. This ease of learning makes Python an ideal tool for beginning programmers. Python offers programmers the advantage of using fewer lines of code to accomplish tasks than one needs when using older languages.

Why Python is better than Java for data science? ›

Java vs Python for Data Science- Syntax

Python is a dynamically typed language, whereas Java is a statically typed language. This means that in the case of Python, the data type of a variable is determined at runtime and can also change throughout the life of the program.

How much RAM do I need for data science laptop? ›

If you're strictly cloud-based or using clusters, big RAM matters less. Some pros claim to get by with 4GB, but most data science warriors like a minimum of 8GB, with 16GB as the sweet spot.

Should I learn SQL or Python first? ›

One thing to remember is that SQL is a big first step to some more complex languages (Python, R, JavaScript, etc.). Once you understand how a computer thinks, it is easy to learn a new programming language to analyze your data.

Why is R better than Python? ›

Python Vs R: Full Comparison

R is a statistical language used for the analysis and visual representation of data. Python is better suited to machine learning, deep learning, and large-scale web applications. R is suited to statistical learning and has powerful libraries for data experimentation and exploration.

Article information

Author: Ouida Strosin DO

Last Updated: 11/16/2022
