Module 03#

Exercise 3a Saving and loading data#

Relevant sections: 3.1.2, 3.1.3

Use YAML or JSON to save your maze data structure to disk and load it again.

The maze would have looked something like this:

house = {
    "living": {
        "exits": {"north": "kitchen", "outside": "garden", "upstairs": "bedroom"},
        "people": ["James"],
        "capacity": 2,
    },
    "kitchen": {"exits": {"south": "living"}, "people": [], "capacity": 1},
    "garden": {"exits": {"inside": "living"}, "people": ["Sue"], "capacity": 3},
    "bedroom": {
        "exits": {"downstairs": "living", "jump": "garden"},
        "people": [],
        "capacity": 1,
    },
}
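
Before saving a structure like this, it can be worth checking that every exit leads to a room that actually exists. The following is a minimal sketch and not part of the original exercise; the helper name `dangling_exits` is made up for illustration.

```python
house = {
    "living": {
        "exits": {"north": "kitchen", "outside": "garden", "upstairs": "bedroom"},
        "people": ["James"],
        "capacity": 2,
    },
    "kitchen": {"exits": {"south": "living"}, "people": [], "capacity": 1},
    "garden": {"exits": {"inside": "living"}, "people": ["Sue"], "capacity": 3},
    "bedroom": {
        "exits": {"downstairs": "living", "jump": "garden"},
        "people": [],
        "capacity": 1,
    },
}


def dangling_exits(maze):
    """Return (room, direction, target) for every exit whose target is not a room."""
    return [
        (room, direction, target)
        for room, details in maze.items()
        for direction, target in details["exits"].items()
        if target not in maze
    ]


print(dangling_exits(house))  # an empty list means every exit is valid
```

If a room name were misspelt in one of the "exits" dictionaries, the offending room, direction and target would show up in the returned list.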

Exercise 3a Answer#

Save as JSON or YAML

import json
import yaml
# Write with json.dump
with open("myfile.json", "w") as f:
    json.dump(house, f)
# Look at the file on disk
!cat myfile.json
{"living": {"exits": {"north": "kitchen", "outside": "garden", "upstairs": "bedroom"}, "people": ["James"], "capacity": 2}, "kitchen": {"exits": {"south": "living"}, "people": [], "capacity": 1}, "garden": {"exits": {"inside": "living"}, "people": ["Sue"], "capacity": 3}, "bedroom": {"exits": {"downstairs": "living", "jump": "garden"}, "people": [], "capacity": 1}}
# Or with file.write, using json.dumps to convert to a string
with open("myotherfile.json", "w") as json_maze_out:
    json_maze_out.write(json.dumps(house))
# Look at the file on disk
!cat myotherfile.json
{"living": {"exits": {"north": "kitchen", "outside": "garden", "upstairs": "bedroom"}, "people": ["James"], "capacity": 2}, "kitchen": {"exits": {"south": "living"}, "people": [], "capacity": 1}, "garden": {"exits": {"inside": "living"}, "people": ["Sue"], "capacity": 3}, "bedroom": {"exits": {"downstairs": "living", "jump": "garden"}, "people": [], "capacity": 1}}
# Write with yaml.safe_dump
with open("myfile.yml", "w") as f:
    yaml.safe_dump(house, f, default_flow_style=False)
# Look at the file on disk
!cat myfile.yml
bedroom:
  capacity: 1
  exits:
    downstairs: living
    jump: garden
  people: []
garden:
  capacity: 3
  exits:
    inside: living
  people:
  - Sue
kitchen:
  capacity: 1
  exits:
    south: living
  people: []
living:
  capacity: 2
  exits:
    north: kitchen
    outside: garden
    upstairs: bedroom
  people:
  - James
# Or with file.write, using yaml.dump to convert to a string
with open("myotherfile.yaml", "w") as yaml_maze_out:
    yaml_maze_out.write(yaml.dump(house, default_flow_style=True))
# Look at the file on disk
!cat myotherfile.yaml
{bedroom: {capacity: 1, exits: {downstairs: living, jump: garden}, people: []}, garden: {
    capacity: 3, exits: {inside: living}, people: [Sue]}, kitchen: {capacity: 1, exits: {
      south: living}, people: []}, living: {capacity: 2, exits: {north: kitchen, outside: garden,
      upstairs: bedroom}, people: [James]}}

Loading with JSON or YAML

# Read into a string then load with json.loads
with open("myfile.json", "r") as f:
    mydataasstring = f.read()
my_json_data = json.loads(mydataasstring)
print(my_json_data["living"])
{'exits': {'north': 'kitchen', 'outside': 'garden', 'upstairs': 'bedroom'}, 'people': ['James'], 'capacity': 2}
# Read directly with json.load
with open("myotherfile.json") as f_json_maze:
    maze_again = json.load(f_json_maze)
print(maze_again["living"])
{'exits': {'north': 'kitchen', 'outside': 'garden', 'upstairs': 'bedroom'}, 'people': ['James'], 'capacity': 2}
# Read into a string then load with yaml.safe_load
# (the file was saved above as "myfile.yml")
with open("myfile.yml", "r") as f:
    mydataasstring = f.read()
my_yaml_data = yaml.safe_load(mydataasstring)
print(my_yaml_data["living"])
{'capacity': 2, 'exits': {'north': 'kitchen', 'outside': 'garden', 'upstairs': 'bedroom'}, 'people': ['James']}
# Read directly with yaml.safe_load
with open("myotherfile.yaml") as f_yaml_maze:
    maze_again = yaml.safe_load(f_yaml_maze)
print(maze_again["living"])
{'capacity': 2, 'exits': {'north': 'kitchen', 'outside': 'garden', 'upstairs': 'bedroom'}, 'people': ['James']}
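
A quick way to gain confidence in the save and load steps is a round-trip check: write the data, read it back, and compare with the original. The sketch below uses only the standard library and a cut-down maze. The temporary directory and the file name `maze.json` are choices made here for illustration, and `indent=2` is an optional argument that makes the file easier to read.

```python
import json
import os
import tempfile

# A cut-down maze, just to keep the example short
house = {
    "living": {"exits": {"north": "kitchen"}, "people": ["James"], "capacity": 2},
    "kitchen": {"exits": {"south": "living"}, "people": [], "capacity": 1},
}

# Write to a temporary directory so no files are left behind
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "maze.json")
    with open(path, "w") as f:
        json.dump(house, f, indent=2)  # indent=2 spreads the output over several lines
    with open(path) as f:
        reloaded = json.load(f)

# Strings, integers, lists and dictionaries all survive the round trip
print(reloaded == house)
```

The comparison holds here because the maze uses only string keys, lists, strings and integers. JSON turns non-string dictionary keys into strings and tuples into lists, so a structure using those would not compare equal after reloading.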

Exercise 3b Plotting with matplotlib#

Generate two plots, next to each other (on the same row).

The first plot should show sin(x) and cos(x) for x in the range -pi to +pi.

The second plot should show sin(x), cos(x) and the sum of sin(x) and cos(x) over the same -pi to +pi range. Set suitable limits on the axes and pick colours, markers, or line-styles that will make it easy to differentiate between the curves. Add legends to both axes.

Exercise 3b Answer#

import matplotlib.pyplot as plt
import numpy as np

# Use numpy to get the range of x values (math should work too)
x = np.arange(-np.pi, np.pi, 0.1)

# Define figure dimensions
fig = plt.figure(figsize=(15,5))


ax1 = fig.add_subplot(1,2,1)
ax1.plot(x, np.sin(x),label="sin(x)",color='black', linestyle='dashed')
ax1.plot(x, np.cos(x),label="cos(x)", color='#56B4E9')
ax1.legend()
ax1.set_ylim(-1.5, 1.5)

ax2 = fig.add_subplot(1,2,2)
ax2.plot(x, np.sin(x),label="sin(x)",color='black', linestyle='dashed')
ax2.plot(x, np.cos(x),label="cos(x)", color='#56B4E9')
ax2.plot(x, np.cos(x)+np.sin(x), label='cos(x) + sin(x)', color='#E69F00', marker=".")
ax2.legend()
ax2.set_ylim(-1.5, 1.5)
(-1.5, 1.5)
../_images/module_03_22_1.png
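
The comment in the answer notes that `math` should work too. As a rough sketch of that idea, the x values and curves can be built with the standard library alone. It also checks that the y-limits of ±1.5 leave room for the summed curve, whose amplitude is sqrt(2) because sin(x) + cos(x) = sqrt(2) sin(x + pi/4). The variable names are chosen here for illustration.

```python
import math

# Build x from -pi towards +pi in steps of 0.1, mirroring np.arange(-np.pi, np.pi, 0.1)
step = 0.1
n_points = int(2 * math.pi / step) + 1
x = [-math.pi + i * step for i in range(n_points)]

sin_x = [math.sin(v) for v in x]
cos_x = [math.cos(v) for v in x]
sum_x = [s + c for s, c in zip(sin_x, cos_x)]

# The summed curve never exceeds sqrt(2), roughly 1.414, so limits of +/-1.5 fit it
largest = max(abs(v) for v in sum_x)
print(f"{len(x)} points, largest |sin + cos| = {largest:.3f}")
```

These lists can be passed to `ax.plot` in place of the numpy arrays.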

Exercise 3c The biggest earthquake in the UK this century#

The Problem#

GeoJSON is a JSON-based file format for sharing geographic data. One example dataset is the USGS earthquake data:

import requests

quakes = requests.get(
    "http://earthquake.usgs.gov/fdsnws/event/1/query.geojson",
    params={
        "starttime": "2000-01-01",
        "maxlatitude": "58.723",
        "minlatitude": "50.008",
        "maxlongitude": "1.67",
        "minlongitude": "-9.756",
        "minmagnitude": "1",
        "endtime": "2021-01-19",
        "orderby": "time-asc",
    },
)
quakes.text[0:100]
'{"type":"FeatureCollection","metadata":{"generated":1717403140000,"url":"https://earthquake.usgs.gov'

Exercise 3c Answer#

Relevant sections: 3.1, 2.5.2, 2.5.1

Load the data#

  • Get the text of the web result

  • Parse the data as JSON

import requests
quakes = requests.get(
    "http://earthquake.usgs.gov/fdsnws/event/1/query.geojson",
    params={
        "starttime": "2000-01-01",
        "maxlatitude": "58.723",
        "minlatitude": "50.008",
        "maxlongitude": "1.67",
        "minlongitude": "-9.756",
        "minmagnitude": "1",
        "endtime": "2022-11-02", # Change the date to yesterday
        "orderby": "time-asc",
    },
)
import json

# Option 1: take the response text and parse it ourselves
my_quake_data = json.loads(quakes.text)  # Section 3.1 - structured data

# Option 2: use the JSON parser built into requests (gives the same result as my_quake_data)
requests_json = quakes.json()

Investigate the data#

  • Understand how the data is structured into dictionaries and lists

    • Where is the magnitude?

    • Where is the place description or coordinates?

There is no foolproof way of doing this. A good first step is to see the type of our data!

type(requests_json)
dict

Now we can navigate through this dictionary to see how the information is stored in the nested dictionaries and lists. The keys method can indicate what kind of information each dictionary holds, and the len function tells us how many entries are contained in a list. How you explore is up to you!

requests_json.keys()
dict_keys(['type', 'metadata', 'features', 'bbox'])
type(requests_json["features"])
list
len(requests_json["features"])
131
requests_json["features"][0]
{'type': 'Feature',
 'properties': {'mag': 2.6,
  'place': '12 km NNW of Penrith, United Kingdom',
  'time': 956553055700,
  'updated': 1415322596133,
  'tz': None,
  'url': 'https://earthquake.usgs.gov/earthquakes/eventpage/usp0009rst',
  'detail': 'https://earthquake.usgs.gov/fdsnws/event/1/query?eventid=usp0009rst&format=geojson',
  'felt': None,
  'cdi': None,
  'mmi': None,
  'alert': None,
  'status': 'reviewed',
  'tsunami': 0,
  'sig': 104,
  'net': 'us',
  'code': 'p0009rst',
  'ids': ',usp0009rst,',
  'sources': ',us,',
  'types': ',impact-text,origin,phase-data,',
  'nst': None,
  'dmin': None,
  'rms': None,
  'gap': None,
  'magType': 'ml',
  'type': 'earthquake',
  'title': 'M 2.6 - 12 km NNW of Penrith, United Kingdom'},
 'geometry': {'type': 'Point', 'coordinates': [-2.81, 54.77, 14]},
 'id': 'usp0009rst'}
requests_json["features"][0].keys()
dict_keys(['type', 'properties', 'geometry', 'id'])

It looks like the coordinates are in the geometry section and the magnitude is in the properties section.

requests_json["features"][0]["geometry"]
{'type': 'Point', 'coordinates': [-2.81, 54.77, 14]}
requests_json["features"][0]["properties"].keys()
dict_keys(['mag', 'place', 'time', 'updated', 'tz', 'url', 'detail', 'felt', 'cdi', 'mmi', 'alert', 'status', 'tsunami', 'sig', 'net', 'code', 'ids', 'sources', 'types', 'nst', 'dmin', 'rms', 'gap', 'magType', 'type', 'title'])
requests_json["features"][0]["properties"]["mag"]
2.6
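
The manual exploration above (type, keys, len, look at the first item) can also be wrapped in a small helper. This is only a sketch: the function `outline` is hypothetical, and the sample is a hand-written, abridged copy of the structure shown above so that the example runs without a network connection.

```python
import json

# Abridged copy of the response structure, hand-written for illustration
sample_text = """
{"type": "FeatureCollection",
 "features": [
   {"type": "Feature",
    "properties": {"mag": 2.6, "place": "12 km NNW of Penrith, United Kingdom"},
    "geometry": {"type": "Point", "coordinates": [-2.81, 54.77, 14]},
    "id": "usp0009rst"}
 ]}
"""
sample = json.loads(sample_text)


def outline(obj, name="root", depth=0, max_depth=3):
    """Return indented lines describing the type and size of each nested container."""
    indent = "  " * depth
    if isinstance(obj, dict):
        lines = [f"{indent}{name}: dict with keys {list(obj)}"]
        children = obj.items()
    elif isinstance(obj, list):
        lines = [f"{indent}{name}: list of {len(obj)} item(s)"]
        # Only look inside the first item; the rest usually share its shape
        children = [("[0]", obj[0])] if obj else []
    else:
        return [f"{indent}{name}: {type(obj).__name__}"]
    if depth < max_depth:
        for child_name, child in children:
            lines.extend(outline(child, child_name, depth + 1, max_depth))
    return lines


print("\n".join(outline(sample)))
```

Calling `outline(requests_json)` on the real response should give the same kind of summary, although looking only at the first list item assumes that every feature has the same shape.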

Search through the data#

  • Program a search through all the quakes to find the biggest quake

  • Find the place of the biggest quake

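# Note: this rebinds `quakes` from the HTTP response to the list of features
quakes = requests_json["features"]
largest_so_far = quakes[0]
for quake in quakes:
    # The strict > keeps the first quake seen when magnitudes tie
    if quake["properties"]["mag"] > largest_so_far["properties"]["mag"]:
        largest_so_far = quake
largest_so_far["properties"]["mag"]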
4.8
lon = largest_so_far["geometry"]["coordinates"][0]
lat = largest_so_far["geometry"]["coordinates"][1]

print(f"Latitude: {lat} Longitude: {lon}")
Latitude: 52.52 Longitude: -2.15
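
The same search can be written with the built-in `max` and a key function. The sketch below uses a hand-written sample in the same shape as the feature list, with magnitudes and places taken from the results on this page; the helper name `magnitude` is chosen here for illustration. It also shows one way to handle ties, since the data contain two magnitude 4.8 events.

```python
# Hand-written sample in the same shape as requests_json["features"],
# using three magnitudes and places that appear in the real results
sample_quakes = [
    {"properties": {"mag": 2.6, "place": "12 km NNW of Penrith, United Kingdom"}},
    {"properties": {"mag": 4.8, "place": "2 km ESE of Wombourn, United Kingdom"}},
    {"properties": {"mag": 4.8, "place": "1 km NNE of Market Rasen, United Kingdom"}},
]


def magnitude(quake):
    """Return the magnitude stored in a quake's properties."""
    return quake["properties"]["mag"]


# max returns the first item with the largest key, like a loop with a strict >
largest = max(sample_quakes, key=magnitude)
print(largest["properties"]["place"])

# Collect every quake that shares the top magnitude
joint_largest = [q for q in sample_quakes if magnitude(q) == magnitude(largest)]
print([q["properties"]["place"] for q in joint_largest])
```

If any feature had a magnitude of `None`, both the loop and `max` would raise a `TypeError` on the comparison, so such entries would need filtering out first.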

Visualise your answer#

  • Form a URL for an online map service at that latitude and longitude: look back at the introductory example

  • Display that image

import requests
from IPython.display import Image


# This reuses a solution to one of the questions in module 2. The only differences
# are that the map layer is "map" rather than satellite view, and the zoom is 10 rather than 12.
def op_response(lat, lon):
    response = requests.get(
        "https://static-maps.yandex.ru:443/1.x",
        params={
            "size": "400,400",  # size of map
            "ll": str(lon) + "," + str(lat),  # longitude & latitude of centre
            "z": 10,  # zoom level
            "l": "map",  # map layer (map image)
            "lang": "en_US",  # language
        },
    )
    return response.content


op = op_response(lat, lon)

Image(op)
../_images/module_03_49_0.png
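
To see what "forming a URL" means here without making a request, the query string can be assembled by hand with the standard library. This is a sketch of roughly what happens to the `params` dictionary; the exact encoding used by `requests` may differ in detail, and the final URL of a real request can be inspected through the response's `url` attribute.

```python
from urllib.parse import urlencode

lat, lon = 52.52, -2.15  # values printed by the search above

params = {
    "size": "400,400",
    "ll": f"{lon},{lat}",  # longitude first, then latitude
    "z": 10,
    "l": "map",
    "lang": "en_US",
}

# urlencode joins the pairs with & and percent-encodes the commas as %2C
url = "https://static-maps.yandex.ru:443/1.x?" + urlencode(params)
print(url)
```

Note the order in the `ll` parameter: longitude comes before latitude, which is the reverse of how the coordinates are printed above.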

[Optional] Equivalent solution using pandas#

In this instance pandas probably isn’t the first tool you would reach for: the data are nested dictionaries, which the JSON approach handles well. If we really want to use pandas, we first need to flatten the nested values before constructing a DataFrame.

features = requests_json["features"]
features[0]
{'type': 'Feature',
 'properties': {'mag': 2.6,
  'place': '12 km NNW of Penrith, United Kingdom',
  'time': 956553055700,
  'updated': 1415322596133,
  'tz': None,
  'url': 'https://earthquake.usgs.gov/earthquakes/eventpage/usp0009rst',
  'detail': 'https://earthquake.usgs.gov/fdsnws/event/1/query?eventid=usp0009rst&format=geojson',
  'felt': None,
  'cdi': None,
  'mmi': None,
  'alert': None,
  'status': 'reviewed',
  'tsunami': 0,
  'sig': 104,
  'net': 'us',
  'code': 'p0009rst',
  'ids': ',usp0009rst,',
  'sources': ',us,',
  'types': ',impact-text,origin,phase-data,',
  'nst': None,
  'dmin': None,
  'rms': None,
  'gap': None,
  'magType': 'ml',
  'type': 'earthquake',
  'title': 'M 2.6 - 12 km NNW of Penrith, United Kingdom'},
 'geometry': {'type': 'Point', 'coordinates': [-2.81, 54.77, 14]},
 'id': 'usp0009rst'}
# {**a, **b} unpacks the key-value pairs of both dictionaries into one new dictionary.
# Where a key appears in both (here "type"), the value from the later dictionary wins.
combined_features = [{**f["geometry"], **f["properties"]} for f in features]
combined_features[0]
{'type': 'earthquake',
 'coordinates': [-2.81, 54.77, 14],
 'mag': 2.6,
 'place': '12 km NNW of Penrith, United Kingdom',
 'time': 956553055700,
 'updated': 1415322596133,
 'tz': None,
 'url': 'https://earthquake.usgs.gov/earthquakes/eventpage/usp0009rst',
 'detail': 'https://earthquake.usgs.gov/fdsnws/event/1/query?eventid=usp0009rst&format=geojson',
 'felt': None,
 'cdi': None,
 'mmi': None,
 'alert': None,
 'status': 'reviewed',
 'tsunami': 0,
 'sig': 104,
 'net': 'us',
 'code': 'p0009rst',
 'ids': ',usp0009rst,',
 'sources': ',us,',
 'types': ',impact-text,origin,phase-data,',
 'nst': None,
 'dmin': None,
 'rms': None,
 'gap': None,
 'magType': 'ml',
 'title': 'M 2.6 - 12 km NNW of Penrith, United Kingdom'}
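
Notice that the merged record has 'type': 'earthquake' and that the geometry's 'type': 'Point' has gone. The sketch below reproduces this on an abridged, hand-written feature and shows one possible way to keep both values; the key name `geometry_type` is an assumption made for illustration and is not part of the USGS data.

```python
feature = {
    "type": "Feature",
    "properties": {"mag": 2.6, "type": "earthquake"},
    "geometry": {"type": "Point", "coordinates": [-2.81, 54.77, 14]},
}

# Later unpacked dictionaries win when keys clash
merged = {**feature["geometry"], **feature["properties"]}
print(merged["type"])  # the geometry's "Point" has been replaced

# One way to keep both: rename the clashing geometry key before merging
# ("geometry_type" is a name chosen here for illustration)
kept = {
    "geometry_type": feature["geometry"]["type"],
    "coordinates": feature["geometry"]["coordinates"],
    **feature["properties"],
}
print(kept["geometry_type"], kept["type"])
```

For this exercise the lost value does not matter, since every geometry is a point, but the same pattern could silently drop data in other datasets.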
import pandas as pd

df = pd.DataFrame.from_records(combined_features)
df.head()
  | type | coordinates | mag | place | time | updated | tz | url | detail | felt | ... | code | ids | sources | types | nst | dmin | rms | gap | magType | title
0 | earthquake | [-2.81, 54.77, 14] | 2.6 | 12 km NNW of Penrith, United Kingdom | 956553055700 | 1415322596133 | None | https://earthquake.usgs.gov/earthquakes/eventp... | https://earthquake.usgs.gov/fdsnws/event/1/que... | NaN | ... | p0009rst | ,usp0009rst, | ,us, | ,impact-text,origin,phase-data, | NaN | NaN | NaN | NaN | ml | M 2.6 - 12 km NNW of Penrith, United Kingdom
1 | earthquake | [-1.61, 52.28, 13.1] | 4.0 | 1 km WSW of Warwick, United Kingdom | 969683025790 | 1415322666913 | None | https://earthquake.usgs.gov/earthquakes/eventp... | https://earthquake.usgs.gov/fdsnws/event/1/que... | NaN | ... | p000a0pm | ,usp000a0pm, | ,us, | ,impact-text,origin,phase-data, | 55.0 | NaN | NaN | NaN | ml | M 4.0 - 1 km WSW of Warwick, United Kingdom
2 | earthquake | [1.564, 53.236, 10] | 4.0 | 38 km NNE of Cromer, United Kingdom | 977442788510 | 1415322705662 | None | https://earthquake.usgs.gov/earthquakes/eventp... | https://earthquake.usgs.gov/fdsnws/event/1/que... | NaN | ... | p000a6hd | ,usp000a6hd, | ,us, | ,origin,phase-data, | 27.0 | NaN | 1.12 | NaN | ml | M 4.0 - 38 km NNE of Cromer, United Kingdom
3 | earthquake | [0.872, 58.097, 10] | 3.3 | 171 km ENE of Peterhead, United Kingdom | 984608438660 | 1415322741153 | None | https://earthquake.usgs.gov/earthquakes/eventp... | https://earthquake.usgs.gov/fdsnws/event/1/que... | NaN | ... | p000abdr | ,usp000abdr, | ,us, | ,origin,phase-data, | 36.0 | NaN | 1.44 | NaN | mb | M 3.3 - 171 km ENE of Peterhead, United Kingdom
4 | earthquake | [-1.845, 51.432, 10] | 2.9 | 8 km W of Marlborough, United Kingdom | 984879824720 | 1415322742102 | None | https://earthquake.usgs.gov/earthquakes/eventp... | https://earthquake.usgs.gov/fdsnws/event/1/que... | NaN | ... | p000abnc | ,usp000abnc, | ,us, | ,origin,phase-data, | 19.0 | NaN | 0.57 | NaN | ml | M 2.9 - 8 km W of Marlborough, United Kingdom

5 rows × 27 columns

df.sort_values("mag", ascending=False, inplace=True)
df.head()
  | type | coordinates | mag | place | time | updated | tz | url | detail | felt | ... | code | ids | sources | types | nst | dmin | rms | gap | magType | title
19 | earthquake | [-2.15, 52.52, 9.4] | 4.8 | 2 km ESE of Wombourn, United Kingdom | 1032738794600 | 1600455819229 | None | https://earthquake.usgs.gov/earthquakes/eventp... | https://earthquake.usgs.gov/fdsnws/event/1/que... | NaN | ... | p000bcxg | ,usp000bcxg,atlas20020922235314, | ,us,atlas, | ,impact-text,origin,phase-data,shakemap,trump-... | 268.0 | NaN | NaN | NaN | mb | M 4.8 - 2 km ESE of Wombourn, United Kingdom
81 | earthquake | [-0.332, 53.403, 18.4] | 4.8 | 1 km NNE of Market Rasen, United Kingdom | 1204073807800 | 1710463229619 | None | https://earthquake.usgs.gov/earthquakes/eventp... | https://earthquake.usgs.gov/fdsnws/event/1/que... | 13655.0 | ... | p000g02w | ,us2008nyae,usp000g02w,atlas20080227005647, | ,us,us,atlas, | ,associate,dyfi,impact-text,origin,phase-data,... | 361.0 | NaN | NaN | 19.2 | mb | M 4.8 - 1 km NNE of Market Rasen, United Kingdom
72 | earthquake | [1.009, 51.085, 10] | 4.6 | 1 km WNW of Lympne, United Kingdom | 1177744691360 | 1657780288041 | None | https://earthquake.usgs.gov/earthquakes/eventp... | https://earthquake.usgs.gov/fdsnws/event/1/que... | 201.0 | ... | p000fase | ,us2007bsal,usp000fase,atlas20070428071811, | ,us,us,atlas, | ,associate,dyfi,impact-text,origin,phase-data,... | 295.0 | NaN | 1.12 | 31.8 | mb | M 4.6 - 1 km WNW of Lympne, United Kingdom
23 | earthquake | [-2.219, 53.478, 5] | 4.3 | 1 km ESE of Manchester, United Kingdom | 1035200554900 | 1415323007416 | None | https://earthquake.usgs.gov/earthquakes/eventp... | https://earthquake.usgs.gov/fdsnws/event/1/que... | NaN | ... | p000beyx | ,usp000beyx, | ,us, | ,impact-text,origin,phase-data, | 46.0 | NaN | NaN | NaN | ml | M 4.3 - 1 km ESE of Manchester, United Kingdom
113 | earthquake | [-3.8559, 51.7231, 11.55] | 4.3 | 5 km NE of Clydach, United Kingdom | 1518877865070 | 1681205336855 | None | https://earthquake.usgs.gov/earthquakes/eventp... | https://earthquake.usgs.gov/fdsnws/event/1/que... | 3410.0 | ... | 2000d3uw | ,us2000d3uw, | ,us, | ,dyfi,impact-text,origin,phase-data,shakemap, | NaN | 2.167 | 1.14 | 92.0 | mb | M 4.3 - 5 km NE of Clydach, United Kingdom

5 rows × 27 columns

You can see that we haven’t really gained much over the JSON solution. We still needed to look at the data to see its structure and we had to manually flatten the structure.
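
The manual flattening step can be made more general with a short recursive helper. The sketch below is one possible approach and not the course's method: the function `flatten` and the dotted key names are choices made here, and the feature is an abridged, hand-written copy of the first record. pandas also provides `pandas.json_normalize` for this kind of job; it is not used in the answer above, so check its documentation before relying on it.

```python
def flatten(nested, prefix=""):
    """Flatten nested dictionaries into one level, joining keys with dots."""
    flat = {}
    for key, value in nested.items():
        full_key = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{full_key}."))
        else:
            flat[full_key] = value  # lists and scalars are kept as they are
    return flat


# Abridged, hand-written feature in the same shape as the USGS data
feature = {
    "type": "Feature",
    "properties": {"mag": 2.6, "place": "12 km NNW of Penrith, United Kingdom"},
    "geometry": {"type": "Point", "coordinates": [-2.81, 54.77, 14]},
    "id": "usp0009rst",
}

row = flatten(feature)
print(sorted(row))
```

Because each key keeps its parent's name as a prefix, the two "type" entries no longer clash, and a list of such rows could be passed to `pd.DataFrame.from_records` as before.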