Plotting a pie chart with 'other' slice using matplotlib and data prepared via Pandas
In this post we will look at summarizing data with a pie chart that has an 'other' slice to capture the less frequent values.
We will first use the Python Pandas library to load data from a JSON file and prepare it for plotting.
Then we will use matplotlib to render the pie chart.
The data is in JSON format and looks like this:
{
  "data": [
    { "ram_size_gb": 4, "processor_count": 8 },
    { "ram_size_gb": 8, "processor_count": 6 },
    { "ram_size_gb": 12, "processor_count": 16 }
  ]
}
Preparing the data
First, we install the dependencies from a terminal:
python3 -m pip install matplotlib pandas
Next, inside a new Python file, we import the libraries:
import matplotlib.pyplot as plt
import pandas as pd
Next, we use Pandas to read in the JSON file and normalize the data for use:
data1 = pd.read_json("../data/file1.json")
normalized_data = pd.json_normalize(data1['data'])
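To see what the normalization step produces without needing the file on disk, here is a minimal sketch that feeds the sample JSON shown earlier through the same two calls. The in-memory string and the use of io.StringIO are assumptions made for illustration; the post itself reads from a file.

```python
import io

import pandas as pd

# Sample payload copied from the JSON shown earlier in the post.
raw = """
{ "data": [
    { "ram_size_gb": 4, "processor_count": 8 },
    { "ram_size_gb": 8, "processor_count": 6 },
    { "ram_size_gb": 12, "processor_count": 16 }
] }
"""

# read_json yields a single 'data' column whose cells are dicts.
data1 = pd.read_json(io.StringIO(raw))

# json_normalize expands each dict into its own row with one column per key.
normalized_data = pd.json_normalize(data1["data"])
print(normalized_data)
```

After this step each record is a row, with ram_size_gb and processor_count as ordinary columns that can be grouped and counted.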
We use Pandas to aggregate the data, grouping by a column and counting the rows in each group, then sorting by 'count' in descending order:
column = 'ram_size_gb'  # example choice; use whichever column you want to summarize
df_grouped = normalized_data.groupby([column])[column].count().reset_index(name='count').sort_values('count', ascending=False)
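As a quick illustration of what this aggregation returns, here is a small sketch on made-up values. The DataFrame contents below are hypothetical and only exist to show the shape of the result.

```python
import pandas as pd

# Hypothetical values, made up for illustration only.
df = pd.DataFrame({"ram_size_gb": [4, 8, 8, 16, 8, 4, 32]})
column = "ram_size_gb"

df_grouped = (
    df.groupby([column])[column]
    .count()                                # number of rows per distinct value
    .reset_index(name="count")              # back to a DataFrame with a 'count' column
    .sort_values("count", ascending=False)  # most frequent value first
)
print(df_grouped)
```

The result has one row per distinct value and a 'count' column, with the most frequent value at the top, which is the ordering the next step relies on.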
We then use Pandas to take the top 5 categories, and separately all 'other' categories:
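One possible way to do this split and draw the chart is sketched below. This is not necessarily the code the post originally used: the counts are hypothetical, and the names top_n, top, other_count, labels and sizes are chosen here for illustration. It assumes df_grouped is already sorted by 'count' in descending order, as in the previous step.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical pre-aggregated counts, already sorted in descending order.
df_grouped = pd.DataFrame({
    "ram_size_gb": [8, 16, 4, 32, 64, 12, 2, 128],
    "count": [40, 25, 15, 8, 5, 3, 2, 1],
})

top_n = 5

# The first top_n rows are the most frequent categories.
top = df_grouped.head(top_n)

# Everything after the first top_n rows is folded into a single total.
other_count = int(df_grouped["count"].iloc[top_n:].sum())

labels = top["ram_size_gb"].astype(str).tolist()
sizes = top["count"].tolist()

# Only add the 'other' slice when there is something to put in it.
if other_count > 0:
    labels.append("other")
    sizes.append(other_count)

fig, ax = plt.subplots()
ax.pie(sizes, labels=labels, autopct="%1.1f%%")
ax.set_title("RAM size (GB)")
```

To view the chart, call plt.show() in an interactive session, or fig.savefig() with a file name of your choice to write it to disk.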