We have the answers to your questions! - Don't miss our next open house about the data universe!

Power BI Python: Harnessing the Synergy of tools Data Science

- Reading Time: 6 minutes
Python and Power BI: combining tools for Data Science

Power BI Python: Python can be integrated with Power BI to leverage the Data Science capabilities of Microsoft’s Business Intelligence platform. Find out how to combine Python and Power BI, what the possibilities are, and how to take a training course to learn how to master them.

The Python programming language is one of the most widely used for data science and data analysis. It is ideal for complex tasks involving data transformation, visualization, machine learning and artificial intelligence.

Microsoft’s business intelligence platform, Power BI, enables self-service business intelligence, including the visualization of data from a variety of sources.

However, since 2018, Power BI has enabled the integration of statistical and general-purpose programming languages such as R and Python. This integration brings new possibilities and extends Power BI’s capabilities, notably for data extraction and transformation or the creation of Machine Learning models.

Business Intelligence professionals no longer need to call on a Data Science team for data science tasks, and Python developers no longer need a BI team to present their analyses in report or dashboard form.

The modern Data Scientist must harness data to solve business problems. This involves collecting, cleansing, exploring and transforming data to predict future events, and then presenting the results of the analysis in the form of a report or dashboard.

As a result, the boundary between data scientist and business analyst no longer really exists. The integration between Python and Power BI makes it possible to adapt to this new reality.

Why use Power BI Python together?

When Using both Power BI + Python together, it opens up a wide range of possibilities for working with data. As a result, the platform’s capabilities become almost limitless.

You can use Python for data cleaning, writing scripts to automate even the most tedious tasks, such as deleting missing values or correcting date formats.

Python scripts are also very useful for data transformation. This process can be slow and tedious with the Power Query Editor.

Python also reveals the full value of data, thanks to predictive analysis and Machine Learning techniques. This also makes it possible to compensate for any missing data.

For Data Visualization, Python lets you add complex, personalized visuals to a report. Finally, this language offers the ability to connect data sources not natively supported by Power BI.

Power BI Python: The limits of these two tools

Despite the multitude of possibilities offered, integration between Python and Power BI also has its limitations. First of all, data sources added with Python must be public.

Only Pandas DataFrames can be imported into Power BI using Python. In addition, execution time is limited to 30 minutes for scripts and 5 minutes for visuals.

The number of Python libraries supported by Power BI Service is limited. Only Matplotlib, NumPy, Pandas, Scikit-learn, Scipy, Seaborn, Statsmodels and XGBoost are compatible. Note that Pandas and Matplotlib must be installed for integration between Power BI and Python to work.

Finally, Python visuals do not support cross-filtering. This means that selecting an item on a Python visual does not activate the same filter on other visuals. This reduces interactivity on a Power BI report.

Can we use Python instead of DAX?

The DAX or Data Analysis Expressions language enables you to create different measures and manipulate data in Power BI.

For all elements that don’t need to be dynamically generated, there are numerous alternatives. In particular, you can add new columns to a dashboard using Python.

What are the prerequisites for using Python in Power BI?

There are several prerequisites for using Python on Power BI. You need to install the Python runtime on which the scripts are executed, and the Pandas, Matplotlib and Seaborn libraries.

It is also possible to use the Visual Studio Code editor to write Python scripts more efficiently, but this is an optional extra. You can also write Python scripts from the Power BI script editor.

However, the code editor offers useful features such as syntax highlighting. You can download Visual Studio Code from the official website. Finally, be sure to update your Power BI settings so that you can work with Python.

How do I configure these two tools Power BI Python?

Before writing scripts on Power BI, it is necessary to install the latest version of Python from the official site. In the Power BI documentation, Microsoft recommends avoiding environment managers such as Anaconda to avoid problems when executing scripts.

It is possible to work around this limitation to use a custom environment on Anaconda by activating the environment from the command line before opening Power BI Destkop from the command line too.

If you’re unfamiliar with these programming notions, it’s best to stick to Power BI’s recommendations and download Python from the website without using custom environments.

Once you’ve installed Python, you’ll need to specify the file path on Power BI Desktop. From the Ribbon, select “File”, then “Options & Settings” and then “Options”.

In the list of options, look for “Scripting Python”. All that’s left to do is find the folder containing your Python distribution, in the list containing all the Python distributions installed on your computer.

If, however, you wish to enter the file path manually, bear in mind that only absolute file paths are supported. You cannot enter a relative file path.

Once you’ve installed and configured Python on Power BI, you can use scripts in a variety of ways. For example, you can use them as a data source to import new data, to enrich an existing data source in the Power Query Editor, or to visualize data on the Power BI report canvas.

Python as a data source for Power BI

To use Python as a data source for Power BI, go to the home tab and choose the “Get data” option to display the full list of data collections.

Choose the “other” category and search for “Python script” in the list. You can now write a Python script to import a dataset. A dialog box appears on the screen, asking you to enter your Python code.

Remember that Power BI can only import Pandas DataFrames. It is therefore necessary to convert the dataset using “pd.DataFrame” first.

This feature is very useful for connecting a data source that is not supported by Power BI by default. It is also handy for datasets requiring numerous transformations prior to import, for example from an API.

This method allows you to create complex data frames using Python and import them into Power BI. You can also perform data manipulations using the Pandas library, then import the final dataset into Power BI to create visualizations.

This makes Power BI much easier to use for data scientists and analysts already accustomed to manipulating data in Python.

Python for k-means partitioning on Power BI

K-means partitioning or k-means clustering is a Machine Learning technique for finding groups of similar data points: clusters.

This technique is particularly useful for marketing or customer service, to better understand the customer base.

It is perfectly possible to perform cluster analysis on Power BI data without using Python. However, this native functionality is limited, since Power BI automatically determines how many clusters to break the data into, and doesn’t allow for any changes.

Using Python overcomes this problem, as you can freely write your script to meet your specific needs.

From the Power Query Editor, select the “Transform” tab and then “Execute a Python script”. The dialog box that appears is the same as when importing a dataset, but a “dataset” location is dedicated to the current dataset.

By selecting “OK”, you can import a reduced table. You need to expand the “Value” column and select the columns you want to include in the query.

Select all columns, including the “Cluster” column just created from the script. Make sure you uncheck the “Use original column name as prefix” box.

Your dataset now contains all the original columns, plus an additional column indicating to which cluster each row of data is assigned.

Statistical calculation in Python on Power BI

By default, creating a correlation matrix is not an easy task in Power BI. Yet a correlation matrix heatmap is an integral part of a data analysis report.

To create a correlation matrix heatmap on Power BI, you can use Python’s correlation function. This heatmap will be displayed in the Report section of Power BI.

Go to the Report section of Power BI, and click on the Python visual represented by the “Py” symbol under the Visualizations section. An empty Python visual appears on the left, and a script editor opens at the bottom of the screen. You can now create visualizations in Power BI using Python scripts.

To illustrate the correlation heatmap, enter all continuous variables in the Values field. This is an essential step, as Power BI will only take into account the variables indicated for visualization.

Then simply write a few lines of code in Python to create a correlation matrix heatmap using the seaborn package. Finally, click on the “Execute Script” button to produce the heatmap.

How do I learn to use Python and Power BI?

We’ve only explored a tiny fraction of the possibilities offered by Python on Power BI. If you want to learn how to fully master these tools, you can choose DataScientest.

Our Data Analyst and Data Management training courses include an introduction to Python programming, and a Business Intelligence module focusing on Power BI.

You’ll also learn about data analysis, Machine Learning, DataViz and database manipulation. By the end of the course, you’ll have all the skills you need to become a Data Analyst or Data Manager.

These courses are entirely distance learning, combining online learning on our coached platform and Masterclasses. You can choose between an intensive BootCamp or a continuous training course, depending on your preferences.

We also offer an expert course entirely dedicated to Power BI. This 3 to 5-day course covers Power Query, DAX language, Data Visualization and workspace management.

At the end of the program, you’ll be able to take the exam for official Microsoft Power BI certification. You’ll be ready to use these tools in your business.

All our training courses are eligible for funding options and jobseekers can also benefit from full funding. Don’t wait any longer and discover DataScientest’s training courses!

You are not available?

Leave us your e-mail, so that we can send you your new articles when they are published!
icon newsletter

DataNews

Get monthly insider insights from experts directly in your mailbox