Integrating Python Script with Power BI

Organizations with two separate divisions, one for data analysis through visualization and the other for data processing and transformation, face a lot of problems due to interdependence. Microsoft recently integrated the Python programming language into Power BI to get the best of both worlds on one platform. Python can be used to analyze and manipulate complex data sets, and Power BI helps in the integration and visualization of data from various data sources.

The recent release provides two alternatives to integrate your Python code into Power BI:

  • Run Python scripts directly in Power BI Desktop.
  • Create Python visuals in Power BI.

In this article, we will implement both alternatives i.e. create a python script that would directly run on Power BI to import data and then use Python visuals to plot line and bar charts.

Prerequisites

  • Install Power BI Desktop – Download from Power BI website and install.
  • Install Python – Download from Python website and install.
  • Install Pandas – Download from Panda website and install. It is software library that provides data structures and operations for data manipulation and analysis.
  • Install Matplotlib – Download from Matplotlib website and install. It is a plotting library that provides an object-oriented API for embedding plots into applications.

In a console or shell, use the pip command-line tool to install the packages

pip install pandas
pip install matplotlib

Prepare Python Script

 Create a script in your local Python development environment and test it

import pandas as pd 
df = pd.DataFrame({ 
    'Name':['Raman','Ramanujam','Chandrasekhar','Sarabhai','Bose','Khorana'], 
    'Age':[82,33,85,52,80,89], 
    'Birthyear': [1888, 1887, 1910, 1919, 1894, 1922], 
    'Deathyear': [1970, 1920, 1995, 1971, 1974, 2011], 
    'Nobel':['Y','N','Y','N','N','Y'], 
    'Field':['Physics','Mathematics','Physics','Space','Physics','Medicine'],
}) 
print (df)

Output

           Name  Age  Birthyear  Deathyear Nobel        Field
0          Raman   82       1888       1970     Y      Physics
1      Ramanujam   33       1887       1920     N  Mathematics
2  Chandrasekhar   85       1910       1995     Y      Physics
3       Sarabhai   52       1919       1971     N        Space
4           Bose   80       1894       1974     N      Physics
5        Khorana   89       1922       2011     Y     Medicine

Enable Python Scripting

  • Open Power BI Desktop
  • Go to File > Options and settings > Options > Python Scripting
  • Provide local directory in Detected Python home directories. Click ok

Run Python Script

To run your Python Script in Power BI Desktop:

  • In the Home ribbon, select Get Data > More..
  • Select Other > Python script. Click on Connect
  • Copy Python Script into dialog box. Click OK.
  • Select the data frame in Navigator. Validate the data and Click Load.
  • Select a Visual in the Visualizations pane. Select Columns from data frame.

Create Python Visual

  • Select a Python Visual in the Visualizations pane.
  • Select Enable In the Enable script visuals dialog box.
  • Select the Columns needed to plot the visual
  • Line Plot: Copy below copy to Python script dialog box.
import matplotlib.pyplot as plt 
ax = plt.gca() 
dataset.plot(kind='line',x='Name',y='Age',ax=ax) 
plt.show() 
  • Click on Run Script Icon. A line plot is plotted in the Python Visual.
  • Bar Plot: Copy below copy to Python script dialog box.
import matplotlib.pyplot as plt 
ax = plt.gca() 
dataset.plot(kind='bar',x='Name',y='Age') 
plt.show() 
  • Click on Run Script Icon. A bar plot is plotted in the Python Visual.

Similarly, you can create Scatter plots, Multiline plots, and other complex plots using python script. Data used by the Python visual for plotting is limited to 150,000 rows or 250 MB. Also, the maximum length of a column used in the input dataset is limited to 32766 characters.



Categories: Power BI, Python

Tags:

Leave a Reply