Outputting a data table in Markdown format using Python Pandas
Pandas is a powerful data manipulation library for Python. One of its many features is exporting a DataFrame as a Markdown table with to_markdown(), which relies on the tabulate library.
In this post, we will look at processing some hardware stats data.
The JSON data looks like this:
{
  "data": [
    { "ram_size_gb": 4, "processor_count": 8 },
    { "ram_size_gb": 8, "processor_count": 6 },
    { "ram_size_gb": 12, "processor_count": 16 }
  ]
}
First, we install the dependencies from a terminal:
python3 -m pip install pandas tabulate
Next, inside a new Python file, we import the necessary libraries:
import pandas as pd
Next, we use Pandas to read in the JSON data:
jsonFs1 = "../../data/hardware-stats.json"
data1 = pd.read_json(jsonFs1)
Then we normalize the data structure:
df = pd.json_normalize(data1['data'])
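To see what the normalization step produces, here is a minimal sketch that inlines the sample JSON from above as a string instead of reading it from disk. The inline string and the variable names raw and records are assumptions for illustration only.

```python
import json

import pandas as pd

# Sample payload from the post, inlined so the sketch runs without a file.
raw = """
{ "data": [
    { "ram_size_gb": 4, "processor_count": 8 },
    { "ram_size_gb": 8, "processor_count": 6 },
    { "ram_size_gb": 12, "processor_count": 16 }
] }
"""

records = json.loads(raw)["data"]

# json_normalize turns the list of dicts into one row per record,
# with one column per key.
df = pd.json_normalize(records)

print(df.columns.tolist())
print(len(df))
```

Each entry under "data" becomes one row, so the later grouping step can treat ram_size_gb as an ordinary column.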
Counting and aggregating
Now we count the categories and then aggregate the count values:
column = 'ram_size_gb'
df_grouped = df.groupby([column])[column].count().reset_index(name='count').sort_values('count', ascending=False)
df_grouped = df_grouped.set_index(column)
Next, we calculate a percentage column from the counts:
df_grouped['percent'] = (df_grouped['count'] / df_grouped['count'].sum()) * 100
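The counting and percentage steps can be tried in isolation. The sketch below assumes a small made-up DataFrame (four machines, two of them with 8 GB of RAM) rather than the real hardware stats file.

```python
import pandas as pd

# Hypothetical sample: four machines, two of which have 8 GB of RAM.
df = pd.DataFrame({"ram_size_gb": [4, 8, 8, 12]})

column = "ram_size_gb"

# Count rows per category, most frequent first, indexed by the category.
df_grouped = (
    df.groupby(column)[column]
    .count()
    .reset_index(name="count")
    .sort_values("count", ascending=False)
    .set_index(column)
)

# Each category's share of all rows, as a percentage.
df_grouped["percent"] = df_grouped["count"] / df_grouped["count"].sum() * 100

print(df_grouped)
```

Because the category is moved into the index, the percentage has to be computed from the count column, and the percentages add up to 100.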
The data is now ready to render.
Rendering the markdown
We use the Pandas to_markdown() method, which relies on the tabulate library. The category column is already the index at this point, so we call it directly on the grouped DataFrame:
markdown_text = df_grouped.to_markdown()
Finally, we save the markdown text to a new file:
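# md_filepath and title are assumed to be defined earlier in the script
with open(md_filepath, 'w') as f:
    f.write('# ' + title)
    # In text mode, '\n' is translated to the platform's line ending
    f.write('\n\n')
    f.write(markdown_text)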
The resulting markdown is something like this:
# RAM (Gb)

| ram_size_gb | percent |
|------------:|--------:|
| 4 | 23.1707 |
| 8 | 20.7317 |
| 12 | 19.5122 |
Complete Example
For a full code example, see this Python script in my Athena-cli GitHub project.
Further Reading
Pandas is a powerful data manipulation and analysis library for Python.
tabulate is a pretty-printing library for Python, to print out, well, tabular tables!