Extracting JSON and nested JSON from an Athena column

Extracting JSON and nested JSON from an Athena column


In case you have data in Athena where there is JSON nested inside a string column:


{

  “room_size”: "large"

}



We can read the contents of the JSON data by using the Athena SQL function json_extract():


SELECT

json_extract(my_json_column, ‘$.room_size) as room_size

FROM my_data


RESULT: large



NESTED JSON STRUCTURE:


If room_size is itself a JSON structure, then we can extract parts of it out by nesting calls to json_extract() or json_extract_scalar() as appropriate.


Example:


JSON value with more JSON:


{

   “room_size”: {“X”: 50, “y”: 20}

}



SELECT

CAST(json_extract(json_extract_scalar(my_json_column, ‘$.room_size), '$.x') AS INTEGER) as room_size_width

FROM my_data


RESULT: 50

Comments