Python has recently established itself as one of the leading programming languages for Data Science: easy to learn for people with a statistician's background, well supplied with libraries, capable of anything, it's a faithful companion for the Data Scientist.
However, it’s possible to do Data Science at a fairly high level without knowing all of the language’s particularities: how many Data Scientists use decorators? How many could say whether Python is multi-threaded or multi-core?
So the other day, trying to answer the question of why the following code displays two different results,
For those who, like me, have already worked with Java, this may evoke memories (or even wounds), but for an inexperienced Python user, it may come as a surprise.
Dynamic typing vs. static typing
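Python is a dynamically typed language, meaning that the type of a variable is determined at run time by the value it currently holds, so the same variable can hold an integer and then a string. A statically typed language, on the other hand, fixes the type of a variable before the program runs, usually through a declaration, and keeps it for the life of the variable.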
This obviously makes Python very easy to use, since you don’t need to specify the type of a variable in advance, and you can easily change it. But trying to do that with Java…
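In Python, changing the type held by a variable can look like the minimal sketch below (the variable names are illustrative):

```python
# The same name is bound first to an int, then to a str.
value = 42
first_type = type(value).__name__
print(first_type)   # int

value = "forty-two"
second_type = type(value).__name__
print(second_type)  # str
```

No declaration is needed and no error is raised: the type follows the value, not the name.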
Python’s ease of use can also be a source of confusion.
If you don’t know exactly what type of variable a function can accept, you may find it difficult to read someone else’s code.
But one of the features offered by Python allows us to give the illusion of static typing: annotations.
Annotations
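In Python, you can use annotations to attach information to the code you are writing, for example to the definition of a function.
You specify the nature of each argument with a “:” followed by an annotation (a type such as int, or simply a string), and the nature of the function’s output with “->” before the final colon. These annotations can be read back through the function’s “__annotations__” attribute,
which returns a dictionary mapping each argument name, plus “return”, to its annotation.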
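As a minimal sketch of the above, assuming a hypothetical function named add whose arguments and output are annotated with int (the names are illustrative):

```python
def add(a: int, b: int) -> int:
    """Return the sum of a and b."""
    return a + b


# Annotations are stored on the function object as a plain dictionary.
annotations = add.__annotations__
print(annotations)
# {'a': <class 'int'>, 'b': <class 'int'>, 'return': <class 'int'>}
```

The output of the function appears under the key "return", next to the argument names.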
However, I haven’t changed the internal workings of Python. Worse still, the annotations are not enforced: a call that respects them and a call that ignores them both work very well…
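A minimal sketch of this lack of enforcement, again assuming a hypothetical function add annotated with int:

```python
def add(a: int, b: int) -> int:
    return a + b


# A call that respects the annotations.
as_annotated = add(1, 2)
print(as_annotated)      # 3

# A call that ignores them: Python raises no error at run time.
not_as_annotated = add("a", "b")
print(not_as_annotated)  # ab
```

The interpreter stores the annotations but never checks the arguments against them.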
So what’s going on? Apart from the (relative) aesthetic aspect, is it totally useless?
Well, yes and no…
Indeed, these annotations are only a means of obtaining information on the various arguments and outputs of your function, but the gain in clarity is not negligible when using the function.
In fact, when working with several people, or more simply when sharing code, it can be very useful to be able to precisely identify the arguments and outputs of a function you’re using.
MyPy
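But you can combine these annotations with another tool: MyPy. This is a static type checker, a tool that (tongue in cheek) hasn’t really understood how Python works and treats annotations as genuine type declarations. It analyses your code without running it and reports the places where the types do not match the annotations.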
It allows you to check that you’re doing things correctly… that you’re putting squares inside squares, circles inside circles and triangles inside triangles.
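A minimal sketch of what this can look like in practice, assuming MyPy is installed (for example with pip install mypy) and that the code is saved in a hypothetical file named example.py. Running mypy example.py should report the second call as an argument type error, while Python itself runs both calls without complaint:

```python
def add(a: int, b: int) -> int:
    return a + b


# Consistent with the annotations: accepted by both Python and MyPy.
total = add(1, 2)

# Inconsistent with the annotations: Python still runs it,
# but a static checker such as MyPy is expected to flag this line.
joined = add("py", "thon")

print(total, joined)  # 3 python
```

The check happens before execution, so the mismatch can be caught without ever calling the function.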
The annotations should therefore be seen as an aid to development and code sharing, and MyPy as a guide to clean development. Admittedly, it doesn’t make much difference when you’re coding in a notebook and importing all your functions from sklearn or pandas. And admittedly, true static typing would also speed up Python’s operation (the interpreter would no longer need to look up the type of each value at run time), which annotations do not do. But they can help you find your way around and debug a large number of lines of code.
And even though I know that annotations are like good resolutions – you tell yourself you’re going to use them and two functions later, it’s forgotten – I at least wanted to tell you about my discovery.