Whether it's for an IT project, an HR file or a sales file, a table rarely holds just one type of data. Quite often, numerical and textual data are combined to provide more context. That's why Python dtypes are so useful.
What's a Dtype?
- Dtypes are the data types of the NumPy (Numerical Python) library. This library, largely written in C, facilitates numerical calculations with Python, and in particular the management of arrays of numbers. This is made possible by dtypes: every array has a dtype that describes the type of its elements.
- The dtype describes how the values are encoded and how they should be interpreted. In concrete terms, it provides the following information:
- Data type: integer, real, text data, Python object, etc.
- Data size: the amount of memory allocated to store each value. Depending on the range and precision required, data can take the form of int16, uint32, complex64, etc., where the number in the name is the size in bits.
- Data byte order: the “<” sign indicates little-endian encoding and the “>” sign indicates big-endian encoding. The character is placed at the beginning of the data type string, as in “<i4”.
- For structured data: you’ll have additional information about the fields in the structure, the type of data for each field, the part of the memory block occupied by each field, etc. The idea here is to describe as precisely as possible a table made up of different variables.
- For sub-arrays: here, the dtype provides the shape and the data type of the sub-array.
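The information listed above can be read directly from a dtype object. Here is a minimal sketch; the field names `name` and `age` and the chosen type strings are illustrative assumptions, not taken from a particular dataset.

```python
import numpy as np

# Simple dtype: 32-bit signed integer, explicitly big-endian
dt = np.dtype('>i4')
print(dt.kind)       # 'i' means signed integer
print(dt.itemsize)   # size of one value in bytes: 4
print(dt.byteorder)  # '>' here, or '=' if big-endian is the machine's native order

# Structured dtype: each field has a name, a type and a byte offset
person = np.dtype([('name', 'U10'), ('age', 'i4')])  # hypothetical fields
print(person.names)          # ('name', 'age')
print(person.fields['age'])  # (dtype, byte offset): age starts after the 40-byte name field

# Sub-array dtype: a base type plus a shape
sub = np.dtype(('f4', (2, 3)))
print(sub.base, sub.shape)
```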
What are Python's dtypes?
When people talk about dtypes in Python, they usually mean the data types of the NumPy library, which are more varied than Python's built-in types.
In Python
The Python programming language offers these data types:
- String: used to represent textual data. Here, the text is enclosed in inverted commas. For example, “Python’s dtypes”.
- Integer: this dtype represents integer numbers, such as -1, -2, -3.
- Float: similar to the previous data type, this is used to represent real numbers. So not just integers, but also decimal values such as 1.2, 0.4, 5.69, etc.
- Boolean: this is simply data representing True or False.
- Complex: these are complex numbers, with a real part and an imaginary part, such as 1.0 + 2.3j.
These are the built-in data types in Python. NumPy offers a wider variety of data types.
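To see how these built-in types relate to NumPy, the short sketch below prints the Python type of a few sample values next to the dtype NumPy infers for them. The sample values are illustrative, and the integer dtype that NumPy picks can vary by platform and version.

```python
import numpy as np

# One sample value per built-in type described above
values = ["Python's dtypes", -3, 5.69, True, 1.0 + 2.3j]

python_types = [type(v).__name__ for v in values]
numpy_dtypes = [np.array(v).dtype for v in values]

for value, py_type, np_type in zip(values, python_types, numpy_dtypes):
    print(repr(value), '->', py_type, '->', np_type)
```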
With the NumPy library
In NumPy, the dtype indicates the type of data and its size. Here is a table summarising the different possible combinations:
Data Type | Description |
---|---|
bool_ | Boolean data (True or False) |
byte | Signed integer the size of a C char, usually 8 bits (positive or negative values) |
ubyte | Unsigned integer the size of a C char, usually 8 bits (zero or positive values only) |
int8 | Signed 8-bit integer. |
int16 | Signed 16-bit integer. |
int32 | Signed 32-bit integer. |
int64 | Signed 64-bit integer. |
uint8 | Unsigned 8-bit integer. |
uint16 | Unsigned 16-bit integer. |
uint32 | Unsigned 32-bit integer. |
uint64 | Unsigned 64-bit integer. |
float16 | 16-bit floating point number. |
float32 | 32-bit floating point number. |
float64 | 64-bit floating point number. |
complex64 | 64-bit complex number (two 32-bit floats). |
complex128 | 128-bit complex number (two 64-bit floats). |
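The sizes and value ranges in the table can be checked with `itemsize`, `np.iinfo` and `np.finfo`. The sketch below covers only a few of the types listed, chosen for illustration.

```python
import numpy as np

# Size in bytes of a few dtypes from the table
sizes = {np.dtype(t).name: np.dtype(t).itemsize
         for t in (np.int8, np.uint16, np.float16, np.float64, np.complex64)}
print(sizes)

# Smallest and largest values an integer dtype can hold
for t in (np.int8, np.uint8, np.int16, np.uint16):
    info = np.iinfo(t)
    print(np.dtype(t).name, info.min, info.max)

# Floating point types have limits too
print(np.finfo(np.float32).max)
```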
What are dtypes used for in Python?
NumPy's dtypes allow you to create structured arrays (or arrays of records). Within these structured arrays, each column can hold a different type of data: for example numerical data, textual data, Booleans, etc. They are very similar to traditional Excel or CSV files.
They can be used by different departments within the organisation, such as human resources to compile data on all the company’s employees, logistics to have visibility of suppliers and their prices, etc.
Without a structured dtype, it is not possible to create structured arrays, only ndarrays whose elements all share a single data type.
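The difference can be sketched as follows. Without an explicit dtype, NumPy converts a mixed list to one common type; with a structured dtype, each field keeps its own type. The field names `a`, `b` and `c` are hypothetical.

```python
import numpy as np

# Plain ndarray: mixed values are all converted to one common type (text here)
mixed = np.array([1, 'two', 3.0])
print(mixed.dtype.kind)  # 'U' means Unicode string
print(mixed)

# Structured array: each field keeps its own type
record = np.array([(1, 'two', 3.0)],
                  dtype=[('a', 'i4'), ('b', 'U5'), ('c', 'f8')])
print(record['a'].dtype, record['b'].dtype, record['c'].dtype)
```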
How do I create an array with a dtype?
Here are some basic examples of creating arrays with a dtype in Python.
An array with a single data type
There are two options for doing this.
1 / Define the data type using a string.
Here’s the code:
import numpy as np
# The dtype is given as a string
a1 = np.array([1, 2, 3], dtype='int64')
print(a1)
2 / Or define the data type by referencing the NumPy type object.
Here’s the code:
# The dtype is given as a NumPy type object
a2 = np.array([1, 2, 3], dtype=np.int64)
print(a2)
In these examples we have used data representing integers, but it is of course possible to use any other dtype desired.
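As a brief illustration with other dtypes, the sketch below creates a float array and a Boolean array, then converts an existing array with `astype`. The variable names are illustrative.

```python
import numpy as np

# The same list of integers stored as 32-bit floats
prices = np.array([1, 2, 3], dtype='float32')
print(prices, prices.dtype)

# Zero becomes False, any other number becomes True
flags = np.array([0, 1, 2], dtype=bool)
print(flags)

# astype returns a converted copy with the requested dtype
as_int = prices.astype(np.int16)
print(as_int, as_int.dtype)
```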
An array with several types of data
It is possible to create a structured array with several columns, each with a distinct type of data. To help you understand better, here is an example with 3 columns:
- A “name” field with text data (string)
- An “age” field with integers
- A “salary” field with decimals (float)
Here is the code:
# The text field needs an explicit width ('U20' = up to 20 Unicode characters)
employee = np.dtype([('name', 'U20'), ('age', 'i4'), ('salary', 'f4')])
a = np.array([('vincent dupont', 32, 2368.45), ('emilie martin', 26, 2689.23)],
             dtype=employee)
print(a)
print(a.dtype)
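Once a structured array exists, each column can be read by its field name and used like an ordinary array. The following is a minimal sketch that rebuilds the employee example; the field widths `U20`, `i4` and `f8` and the variable name `staff` are assumptions made for illustration.

```python
import numpy as np

# Assumed field types: 20-character text, 32-bit integer, 64-bit float
employee = np.dtype([('name', 'U20'), ('age', 'i4'), ('salary', 'f8')])
staff = np.array([('vincent dupont', 32, 2368.45),
                  ('emilie martin', 26, 2689.23)],
                 dtype=employee)

# Read one column by its field name
print(staff['name'])

# Columns behave like regular arrays
mean_age = staff['age'].mean()
print(mean_age)

# Filter the rows on one field, then read another field
well_paid = staff[staff['salary'] > 2500]['name']
print(well_paid)
```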
These are just a few simplified examples of dtypes in Python. To go further, the best approach is structured training and practice.
Join DataScientest to master dtypes in Python
Mastering dtypes in Python requires in-depth knowledge and practice. That’s why DataScientest offers training in data analysis and data science, where you will learn the programming skills you need.