
PyPI: The complete guide to the Python third-party repository


If you program in Python, you have almost certainly pulled tools from PyPI. What began as a public platform for sharing Python software has become the central hub of the ecosystem. Any Python programmer benefits from understanding how this ready-to-use reservoir works.

What is PyPI?

The Python Package Index (PyPI) is a centralized repository of open source packages written in Python, freely accessible to all.

Any developer can find building blocks there for their projects: Machine Learning, statistics, scientific computing, graphical data representation tools…

Conversely, those who have created packages can share them with the Python community. PyPI now boasts almost 500,000 projects.

How was PyPI created?

The PyPI project was initiated in 2002 by Australian developer Richard Jones during a series of exchanges on the python-dev discussion forum. The aim was to standardize the free distribution of Python packages.

A particularly active member of the Python community, Richard Jones is a regular participant at PyCon AU, an annual Python conference held in Australia. He sometimes gives talks there, as he did in 2021.

Richard Jones released the first version of PyPI in 2003, and gradually many other volunteer contributors joined the adventure. The Python Software Foundation helped fund PyPI.

What's in PyPI?

PyPI is accessible on the Web at pypi.org.

Each package has its own page, showing information such as:

  • description,
  • metadata,
  • dependencies,
  • version history…

As most of the packages are open source, PyPI has not only fostered a culture of collaboration and sharing, but also stimulated innovation, whether by building programs on these existing solutions, or by proposing alternatives to them. Anyone looking to extract data from the Web (known as Web Scraping) will find Beautiful Soup, Scrapy and many other packages on PyPI.

But that’s not all. PyPI also exposes programmatic interfaces (APIs) that installers and other tools use to look up and download packages. It is backed by a robust server infrastructure with a reinforced security layer.
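As an illustration, the information shown on a project page can also be read as JSON. The sketch below assumes PyPI's JSON endpoint at `https://pypi.org/pypi/<project>/json` and parses a small, made-up payload shaped like its response. The helper names (`project_url`, `summarize`) and the sample data are hypothetical and exist only for this example.

```python
import json

# URL pattern of PyPI's JSON endpoint for a project.
PYPI_JSON_URL = "https://pypi.org/pypi/{project}/json"


def project_url(project: str) -> str:
    """Return the JSON URL for a project name."""
    return PYPI_JSON_URL.format(project=project)


def summarize(payload: dict) -> dict:
    """Pick out the fields a project page typically displays."""
    info = payload["info"]
    return {
        "name": info["name"],
        "latest_version": info["version"],
        "summary": info.get("summary"),
        # requires_dist is null when a project declares no dependencies.
        "dependencies": info.get("requires_dist") or [],
        # Version strings, taken from the releases mapping when it is present.
        "versions": list(payload.get("releases", {})),
    }


# Trimmed, made-up payload; a real response carries many more fields.
SAMPLE = json.loads("""
{
  "info": {
    "name": "example-package",
    "version": "1.2.0",
    "summary": "A hypothetical package",
    "requires_dist": ["requests>=2.0"]
  },
  "releases": {"1.0.0": [], "1.2.0": []}
}
""")

print(project_url("beautifulsoup4"))
print(summarize(SAMPLE))
```

To query the live service, the URL returned by `project_url` could be fetched with `urllib.request.urlopen` and the body passed through `json.load` before calling `summarize`.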

How does PyPI work?

As a Python programmer, you’ve used PyPI, probably without realizing it, whenever you ran the command:

pip install <package>

Each time that command runs, it sends a request to PyPI.

The default installation tool, ‘pip’, establishes a connection between your computer and PyPI. It not only locates the requested package, but also installs a version suited to your computer, its operating system and your version of Python. The same applies to dependencies: if a package requires a specific version of another package, this dependency is handled automatically.
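One way to picture how a suitable version is chosen: prebuilt packages ("wheels") encode the Python version, ABI and platform they target in their filename. The sketch below is a deliberately simplified illustration of that idea, not pip's actual algorithm. The filenames are made up, the helper names are hypothetical, and the ABI tag is ignored when matching.

```python
from typing import List, NamedTuple, Set


class WheelName(NamedTuple):
    distribution: str
    version: str
    python_tags: List[str]
    abi_tags: List[str]
    platform_tags: List[str]


def parse_wheel_filename(filename: str) -> WheelName:
    """Split a wheel filename into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    # name-version-python-abi-platform, with an optional build tag after version.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename: {filename!r}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    # Each tag may bundle several values separated by dots, e.g. "py2.py3".
    return WheelName(
        parts[0],
        parts[1],
        python_tag.split("."),
        abi_tag.split("."),
        platform_tag.split("."),
    )


def is_candidate(wheel: WheelName, supported_python: Set[str], platform: str) -> bool:
    """Simplified check: Python tag and platform must match (ABI is ignored)."""
    python_ok = any(tag in supported_python for tag in wheel.python_tags)
    platform_ok = any(tag in ("any", platform) for tag in wheel.platform_tags)
    return python_ok and platform_ok


# Hypothetical files published for one release of a made-up project.
FILES = [
    "example_pkg-1.0.0-cp312-cp312-win_amd64.whl",
    "example_pkg-1.0.0-cp312-cp312-manylinux_2_17_x86_64.whl",
    "example_pkg-1.0.0-py3-none-any.whl",
]

# Pretend environment: CPython 3.12 on 64-bit Windows.
usable = [
    name
    for name in FILES
    if is_candidate(parse_wheel_filename(name), {"cp312", "py3"}, "win_amd64")
]
print(usable)
```

A real installer works from an ordered list of supported tag combinations and prefers the most specific match; this sketch only filters out files that clearly do not fit.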

That’s the beauty of it. A single ‘pip’ command lets anyone install and use PyPI packages, some of them highly sophisticated, without worrying about internal details: pip and PyPI together take care of fetching everything the package needs in order to work. This ease of access has greatly contributed to the growth of Python.

‘pip’ is not the only tool available to PyPI users.

Others include:

  • ‘twine’, which a developer uses to upload a package they have created to PyPI;
  • ‘setuptools’, which lets you define a package’s metadata, specify its dependencies and any C or C++ extensions, etc.
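To make the setuptools item more concrete, here is a minimal sketch of the kind of metadata it accepts. Every value is an assumption made for illustration: the project name, version, dependency and C source path are hypothetical. The keyword names are standard `setuptools.setup()` arguments.

```python
# Metadata and dependencies for a made-up project.
SETUP_KWARGS = {
    "name": "example-package",              # hypothetical project name
    "version": "0.1.0",
    "description": "A hypothetical package used for illustration",
    "python_requires": ">=3.8",             # supported Python versions
    "install_requires": ["requests>=2.0"],  # dependencies pip will resolve
    "packages": ["example_package"],        # importable package to include
}


def run_setup() -> None:
    """Hand the metadata to setuptools (imported lazily, only when called)."""
    from setuptools import Extension, setup

    setup(
        **SETUP_KWARGS,
        # Optional C extension; the module name and source path are made up.
        ext_modules=[
            Extension("example_package._speedups", sources=["src/speedups.c"])
        ],
    )
```

Saved as `setup.py` with a final call to `run_setup()`, a file along these lines could be used to build distribution archives, which twine would then upload. Recent setuptools versions also accept the same metadata declaratively in a `pyproject.toml` file.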

Is PyPI safe?

In March and April 2023, the Python Software Foundation received three subpoenas from the US Department of Justice requesting data about PyPI users. The foundation was not told the legal context, but the episode is a reminder that PyPI, like any other platform, is not immune to threats such as malicious code uploaded by bad actors.

PyPI’s administrators remove malicious packages when they are detected or reported, but uploads are not reviewed before publication. Caution is therefore still called for, especially with packages that have only just arrived on PyPI.
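One basic precaution is to check that a downloaded file matches the SHA-256 digest PyPI publishes for it. The sketch below shows that comparison using only the standard library; the function names and sample bytes are hypothetical. It confirms that a file is the one that was published, not that its contents are trustworthy.

```python
import hashlib
import hmac


def sha256_of(data: bytes) -> str:
    """Return the hexadecimal SHA-256 digest of some bytes."""
    return hashlib.sha256(data).hexdigest()


def matches_published_digest(data: bytes, expected_hex: str) -> bool:
    """Compare the computed digest with the expected one."""
    return hmac.compare_digest(sha256_of(data), expected_hex.lower())


# Stand-in for the bytes of a downloaded archive and its published digest.
downloaded = b"pretend these are the bytes of a package archive"
published = sha256_of(downloaded)

print(matches_published_digest(downloaded, published))
print(matches_published_digest(downloaded + b"tampered", published))
```

pip can enforce the same kind of check itself when a requirements file lists hashes and the `--require-hashes` option is used.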

PyPI's major users

Since PyPI is the main index for Python packages, a very large number of major companies use it. While they usually develop their own tools in-house, it is common practice for them to draw on PyPI for public packages. Major PyPI users include:

NASA

A huge part of the space agency’s programming is based on Python.

IBM

IBM relies on Python for its cloud developments and various other activities.

Microsoft

Azure, Microsoft’s cloud computing service, is increasingly integrating tools programmed in Python.

Google

The search giant has always been a strong supporter of the Python ecosystem.

Instagram

This popular photo-sharing application is based on Django, a Python framework.

Netflix

The leading movie-streaming service uses Python in the backend of some of its services, including data analysis.

Dropbox

This storage service was originally developed mainly in Python. Other languages and technologies were added later.

An essential part of the Python ecosystem

PyPI is not just a repository for Python packages: it’s the central hub of the Python ecosystem. Without this infrastructure, the distribution and discovery of such packages would be far more complex.
