
Mastering Structured Tables with Dtype in Python


Whether it's for an IT project, an HR or sales file, a table rarely has just one type of data. Quite often, numerical and textual data merge to provide more context. That's why Python Dtypes are so useful.

What's a Dtype?

  • Dtypes are the data types of the NumPy (Numerical Python) library. NumPy, which is largely implemented in C, speeds up numerical computation in Python and, in particular, the handling of arrays of numbers. Dtypes make this possible: every array has a dtype that describes the data it contains.
  • The dtype describes how the values are encoded and how they should be interpreted. In concrete terms, it provides the following information:
  • Data type: integer, real, text data, Python object, etc.
  • Data size: the number of bytes used to store each value in memory. The size appears in the type name as a number of bits: int16, uint32, complex64, etc.
  • Byte order: “<” indicates little-endian encoding and “>” indicates big-endian encoding. The character is placed at the beginning of the type string.
  • For structured data: the names of the fields in the structure, the data type of each field, and the part of the memory block occupied by each field. The idea is to describe as precisely as possible a table made up of different variables.
  • For sub-arrays: the shape and the data type of the sub-array.
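The pieces of information listed above can be read directly from a dtype object. The following is a minimal sketch, assuming NumPy is installed; the field names (label, x, y) and the sizes chosen are illustrative only.

```python
import numpy as np

# A plain dtype: kind, size in bytes, and type string with byte-order prefix
dt = np.dtype('<i4')   # little-endian, signed integer, 4 bytes
print(dt.kind)         # 'i' (signed integer)
print(dt.itemsize)     # 4 (bytes, i.e. 32 bits)
print(dt.str)          # '<i4'

# A structured dtype: named fields, each with its own type and byte offset
# (hypothetical field names, chosen for illustration)
point = np.dtype([('label', 'U8'), ('x', 'f8'), ('y', 'f8')])
print(point.names)     # field names, in declaration order
for name in point.names:
    field_dtype, offset = point.fields[name][:2]
    print(name, field_dtype, offset)

# A sub-array dtype: a base type plus a shape
block = np.dtype((np.int32, (2, 3)))
print(block.base, block.shape)
```

The byte offsets printed for each field show which part of the memory block that field occupies.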

What are Python's dtypes?

Strictly speaking, dtypes come from the NumPy library rather than from Python itself. Python has its own built-in data types, and NumPy offers an even wider variety.

In Python

The Python programming language offers these data types:

  • String: used to represent textual data. The text is enclosed in quotation marks. For example, “Python’s dtypes”.
  • Integer: represents whole numbers, positive or negative, such as -3, 0 or 42.
  • Float: represents real numbers, that is, numbers with a decimal part, such as 1.2, 0.4 or 5.69.
  • Boolean: data that is either True or False.
  • Complex: numbers with a real part and an imaginary part, such as 1.0 + 2.3j.

These are Python's built-in types. NumPy offers a wider variety of data types.
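As a quick illustration of the list above, the sketch below prints the built-in type of a few Python values and then shows the dtype NumPy picks when similar values are placed in an array. The sample values are arbitrary.

```python
import numpy as np

# Built-in Python types
values = ["Python's dtypes", -3, 5.69, True, 1.0 + 2.3j]
type_names = [type(v).__name__ for v in values]
print(type_names)  # ['str', 'int', 'float', 'bool', 'complex']

# NumPy infers a dtype from the Python values it receives
floats = np.array([1.2, 5.69])
flags = np.array([True, False])
cplx = np.array([1.0 + 2.3j])
print(floats.dtype, flags.dtype, cplx.dtype)  # float64 bool complex128
```

The default dtype for Python integers depends on the platform and NumPy version, which is why it is left out of this sketch.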

With the NumPy library

In NumPy, the dtype indicates the type of data and its size. Here is a table summarising the most common ones:

Data type     Description
bool_         Boolean (True or False).
byte          Signed 8-bit integer (positive or negative values), same as int8.
ubyte         Unsigned 8-bit integer (non-negative values only), same as uint8.
int8          Signed 8-bit integer.
int16         Signed 16-bit integer.
int32         Signed 32-bit integer.
int64         Signed 64-bit integer.
uint8         Unsigned 8-bit integer.
uint16        Unsigned 16-bit integer.
uint32        Unsigned 32-bit integer.
uint64        Unsigned 64-bit integer.
float16       16-bit floating point number.
float32       32-bit floating point number.
float64       64-bit floating point number.
complex64     64-bit complex number (two 32-bit floats).
complex128    128-bit complex number (two 64-bit floats).
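The sizes in the table determine the range of values each type can hold. As a rough sketch, np.iinfo and np.finfo report these limits for integer and floating point types respectively; the selection of types below is just a sample.

```python
import numpy as np

# Integer types: size in bytes and the smallest/largest storable value
int_ranges = {}
for t in [np.int8, np.uint8, np.int16, np.uint16, np.int32, np.int64]:
    info = np.iinfo(t)
    int_ranges[t.__name__] = (int(info.min), int(info.max))
    print(t.__name__, np.dtype(t).itemsize, info.min, info.max)

# Floating point types: number of bits and largest representable value
for t in [np.float16, np.float32, np.float64]:
    info = np.finfo(t)
    print(t.__name__, info.bits, info.max)

# A complex64 value is stored as two 32-bit floats
complex64_bits = np.dtype(np.complex64).itemsize * 8
print(complex64_bits)  # 64
```

Choosing the smallest type that fits the data reduces memory use, at the cost of a narrower range.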

What are dtypes used for in Python?

NumPy's dtypes allow you to create structured arrays (also called record arrays). In a structured array, each column (field) can hold a different type of data: integers, decimal numbers, text, Booleans, etc. This makes them very similar to traditional Excel or CSV files.

They can be used by different departments within the organisation, such as human resources to compile data on all the company’s employees, logistics to have visibility of suppliers and their prices, etc.

Without a structured dtype, it is not possible to create such tables: a regular ndarray can only contain homogeneous data, that is, values that all share the same type.
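The difference can be seen in a short sketch. Below, the same mixed values are first placed in a regular array, where NumPy converts them all to a common type, and then in a structured array, where each field keeps its own type. The field names and widths ('U20', 'i4', 'f8') are assumptions made for this illustration.

```python
import numpy as np

# A regular ndarray has a single dtype: mixed values are coerced to one type
mixed = np.array(['vincent dupont', 32, 2368.45])
print(mixed.dtype.kind)  # 'U': every value was converted to text
print(mixed)

# A structured dtype keeps one type per field
# (hypothetical field names and sizes)
record = np.dtype([('name', 'U20'), ('age', 'i4'), ('salary', 'f8')])
row = np.array([('vincent dupont', 32, 2368.45)], dtype=record)
print(row['age'].dtype)     # int32
print(row['salary'].dtype)  # float64
```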

How do I create a table with dtype?

Here are some basic examples of creating arrays with a dtype in Python.

A table with a single data type

There are two options for doing this.

1 / Define the data type using a string.

Here’s the code:

import numpy as np

# The dtype is given as a string
a1 = np.array([1, 2, 3], dtype='int64')
print(a1)

2 / Or define the data type by referencing the corresponding NumPy type.

Here’s the code:

import numpy as np

# The dtype is given as a NumPy type object
a2 = np.array([1, 2, 3], dtype=np.int64)
print(a2)

In these examples we have used data representing integers, but it is of course possible to use any other dtype desired.
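For instance, the same list can be stored with other dtypes. This is a small sketch; the choice of float32, uint8 and int16 is arbitrary.

```python
import numpy as np

# Same values, different dtypes
f = np.array([1, 2, 3], dtype='float32')
u = np.array([1, 2, 3], dtype=np.uint8)
print(f, f.dtype)
print(u, u.dtype)

# The string form and the NumPy type describe the same dtype
same = np.dtype('int64') == np.dtype(np.int64)
print(same)  # True

# astype returns a converted copy of an existing array
as_int = f.astype(np.int16)
print(as_int, as_int.dtype)
```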

A table with several types of data

It is possible to create a structured table with several columns, each with a distinct type of data. To help you understand better, here is an example with 3 columns:

  • A “name” field with text data (string)
  • An “age” field with integers
  • A “salary” field with decimals (float)

Here is the code: 

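import numpy as np

# One field per column: text (up to 20 characters), 32-bit integer, 32-bit float
employee = np.dtype([('name', 'U20'), ('age', 'i4'), ('salary', 'f4')])

a = np.array([('vincent dupont', 32, 2368.45), ('emilie martin', 26, 2689.23)],
             dtype=employee)

print(a)        # the two records
print(a.dtype)  # the field names and their types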
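Once a structured array exists, its columns can be read by field name, much like columns in a spreadsheet. The sketch below rebuilds a similar employee table (with an assumed 'U20' text width and a 64-bit float salary) and shows a few typical operations.

```python
import numpy as np

# Field widths and types here are assumptions made for this illustration
employee = np.dtype([('name', 'U20'), ('age', 'i4'), ('salary', 'f8')])
staff = np.array([('vincent dupont', 32, 2368.45),
                  ('emilie martin', 26, 2689.23)], dtype=employee)

# Select a whole column by its field name
names = staff['name']
print(names)

# Compute on a numeric column
mean_age = staff['age'].mean()
print(mean_age)  # 29.0

# Filter rows with a condition on one field
over_30 = staff[staff['age'] > 30]
print(over_30['name'])

# Sort rows by a field, highest salary first
by_salary = np.sort(staff, order='salary')[::-1]
print(by_salary['name'])
```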

These are just some simplified examples of using dtypes in Python. Going further takes study and practice.

Join DataScientest to master dtypes in Python

Mastering dtypes in Python requires in-depth knowledge and practice. That’s why DataScientest offers training in data analysis and data science, where you will learn what you need to know about programming languages.
