Explore statistical summary
Data is often intertwined with statistics because statistics are one way in which you can explore your data. Statistics can show you the distribution of your data, help you to identify key takeaways and trends, and determine whether outliers exist.
A statistical summary provides a quick and simple description of your data. Power BI has many features that help you conduct a statistical analysis. Exploring the statistical summary gives the user a high-level view of the available data, where they can see clusters, patterns on behavioral data, data averages, and more. They can gain insights about their data that help drive business decisions.
For example, the Supply Chain team asks you to create a report that shows the frequency of orders for certain products and what the top 10 products are in terms of sales.
Statistical functions
Power BI supports many Data Analytics Expressions (DAX) functions that you can use to get quick statistics based on your data. You can access these quick functions by right-clicking a summarizable field assigned to the well of a visual in the Visualizations pane, as illustrated in the following image.
However, to avoid performance issues, it's better to create the statistical measures yourself by using an expression written in DAX. For example, to analyze the average order quantity for each product, you could create the following measure:
Average Qty =
AVERAGE ( Sales[Order Qty] )
Histogram
Histograms and bell curves are a common way to display statistics about your semantic models. In Power BI terms, you can represent a histogram with a bar or column chart visual and represent a bell curve with an area chart visual, as illustrated in the following image. You can also use the Q&A visual to ask a direct question about the top or bottom items in a list.
A typical bar or column chart visual in Power BI relates two data points: a measure and a dimension. A histogram differs slightly from a standard bar chart in that it only visualizes a single data point.
In this example, you can use the clustered column chart visual to present a histogram that determines the order quantities by order sizes.
You start by selecting the clustered column chart icon on the Visualization pane. Next, create a new grouping for the X-axis. We cover grouping and binning later in this module, but for now understand that they're useful in this context also.
To create the group, in the Data pane, right-click the data field that you want to analyze and then select New Group. In this case, you use the OrderQty
field. In the Groups window, configure the bin group as follows:
- Rename the group as Order Bins (Buckets).
- Set the Group type option to Bin and the Bin Type option to Number of bins.
- Set the Bin count to 5, the Min value to 1, and the Max value to 44.
Next, populate the visual as follows:
- Drag the
OrderQty
field from the Data pane into the Value well in the Visualizations pane. - Drag the
Order Bins (Buckets)
field from the Data pane into the Axis well in the Visualizations pane.
The visual now shows that the data is grouped into buckets on the X-axis, with the order quantities of that variable on the Y-axis.
The histogram displays the order quantity by order size buckets for the Supply Chain team.
Top N analysis
The TOPN
DAX function returns the top N rows of a specified table. Top N analysis is a common technique to present data that might be important, such as the top 10 selling products, top 10 performers in an organization, or top 10 customers. Alternatively, you can look at it from the other perspective and present the bottom N items in a list. In other words, the worst performers. Depending on the requirements, you might want to use one or both of these techniques.
Consider a scenario where the Supply Chain team wants to know what the top 10 selling products are. You can accomplish this task by using a Q&A visual, a Top N filter, or creating a DAX measure.
Use the Q&A visual to find the top N
Assume you created a report for the Supply Chain team, and now the team members have questions about various other views or insights that they're interested in. Power BI has a built-in Q&A visual that allows users to ask their own questions and get answers. That means you don't have to address each individual question with a report visual.
The Q&A visual is an effective tool because it allows users to quickly and independently get answers about the data. That saves time for everyone involved. The Q&A visual is unique in that it doesn't require prior knowledge of Power BI; users can ask their question.
Add the Q&A visual to your report, and then reposition the visual and customize its formatting, as required.
Now, you can use the visual to get answers. In this case, you want to know what the top 10 selling products are, so you enter a question such as, What are my top 10 products by sales? Power BI automatically displays the result for you.
Use a Top N filter
Top N is a filtering option that is available in the Filters pane. Select the field that you want to analyze on your report page (in this example, it's the Product Name
field). In the Filters pane, expand the Filter type list and select Top N. In the Show items settings, select Top and 10. Then, set the Cost of Sales
field as the value that you want to filter by.
Use a TOPN function
You can also calculate your top 10 products with DAX by using the TOPN
function. This function can be used to present a top 10 list in a different context, such as how much of the top 10 best-selling products contributed toward the overall total sales.
Start by creating a measure named Top 10 Products
. Then, use the TOPN
function along with the SUMX
function, to calculate the top 10 products by total sales, as follows:
Top 10 Products =
SUMX (
TOPN (
10,
'Product',
'Product'[Total Sales]
),
[Total Sales]
)
The following image shows the top 10 products compared to total sales.
You can adjust the DAX formula to present the same result in percentages.
For more information about the statistical capabilities of DAX, see Statistical Functions.