mrau1.github.io
The aim of this project is to examine major coronavirus events and their impact on various economies.
Coronavirus Event Data:
Market Data:
During drastic times when things go wrong, people scramble; they become uncertain and tend to make rash decisions. These scenarios are all too common in our society and recently played out during the COVID-19 pandemic. The stock market and foreign exchange rates are perfect vehicles for this channeled uncertainty and provide insights into investor sentiment and economic strength. To understand this project, we need a brief background in economics.
People are never comfortable with uncertainty and tend to seek certainty whenever possible. This is especially prevalent in financial markets, where people, countries, and firms want certainty around how their money is allocated. Most investors prefer reliable returns that meet their goals, and countries allocate their wealth to currencies or commodities that hold value. During major events throughout history, such as September 11, 2001, the Great Recession, and President John F. Kennedy's assassination, markets tumbled because investors did not know what would happen next.
Another concept that should be understood is that a stock or currency acts like a voting machine. The more people buy the stock, the higher the price goes, and vice versa. In foreign exchange markets, the more people demand one currency relative to another, the more the demanded currency increases in value (appreciates) while the other decreases in value (depreciates). If more Americans want to travel to Britain than Britons to the United States, and they demand more of Britain's currency (the British pound), then the pound will appreciate in value and the United States dollar will depreciate.
Lastly, investors psychologically tend to act like cartels, precisely because a stock is a voting machine. Even if a company or currency has no intrinsic value, as long as people are willing to keep buying the underlying at its current price, its value will not drop. All of the market players act rationally so that nobody loses their investment. However, if one major investor (or enough smaller investors) sells enough of the underlying at once, the value will drop and other investors will panic and close their positions. This downward trend causes prices to drop rapidly, and it only stops when enough people have the confidence to buy the underlying at its reduced price.
The purpose of this project is to analyze the uncertainty of market participants. To do so, I'll analyze the foreign exchange rates and major stock market indices of 14 large countries. These countries are a mix of developing and developed economies that have active stock markets and are relevant to the world economy.
This project aims to analyze which markets were the most volatile, dependable, and risky during a time of crisis. Furthermore, sentiment around COVID-19 events is analyzed to capture investor reactions to major announcements.
!pip install mapclassify
!pip install pycountry-convert
!pip install geopy
import geopandas as gpd
import requests
import csv
import time
import pandas as pd
import scipy.stats as sp
import logging
import threading
import json
import os
import datetime as dt
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm
import math
from sklearn.linear_model import LinearRegression
from IPython.core.display import display, HTML
display(HTML("<style>.container { width:95% !important; }</style>"))
#Show all the rows of a dataframe
pd.options.display.max_rows = 10000
pd.options.display.width = 1000
Extraction: multiple datasets were extracted from finance.yahoo.com. The code below uses RESTful APIs to get historical data for thousands of financial tickers, instruments, and currencies. This code uses the requests library to pull the data and pandas to convert it into a pandas DataFrame. Each historical dataset can be web-scraped or downloaded as a .csv file. Each DataFrame has the same column structure, with columns for Date, Open, Close, High, Low, Adj Close, and Volume. Volume, however, is not regularly recorded and, for the sake of this project, will be dropped from consideration.
# Url of Historical Data from yahoo.finance.com
historical_url = "https://finance.yahoo.com/quote/%s/history?p=%s"
"""Adjusted to automatically take in symbols for automation"""
# Creates urls to grab current price of the underlying
def generateUrl(symbol=False):
url_1 = "https://finance.yahoo.com/quote/"
url_2 = "?p="
url_3 = "&.tsrc=fin-srch"
if symbol == False:
# Allows user to input stock symbol and returns the Yahoo finance URL
symbol = input("Enter a ticker symbol for a stock: ")
url = url_1 + symbol + url_2 + symbol + url_3
else:
url = url_1 + symbol + url_2 + symbol + url_3
return url
"""Adjusted to allow symbols and different URL's to pass through
Allows us to grab from historical, options, and other data sources"""
def dataGrabber(symbol=False, url=None):
if url == None:
# Obtain server's response to HTTP request
url = generateUrl(symbol)
src = requests.get(url)
# Converts source code to string
data = src.text
return data
# Returns the current price of the underlying
def getCurrentPrice(symbol=False):
# Isolates the currentPrice variable
data = dataGrabber(symbol)
split = data.split("currentPrice\":", 1)
halfList = split[1]
split2 = halfList.split("}", 1)
currentPrice = split2[0] + "}"
finalPrice = currentPrice.split("""{"raw":""")[1].split(",")[0]
# Print and return current price
print(symbol, finalPrice)
return round(float(finalPrice),3)
# Returns historical data in Pandas DataFrame format
def get_historical(symbol=None, to_csv=False):
# Gets CSV of historical data on stock for the past 5 years
t = str(int(time.time()))
url_1 = "https://query1.finance.yahoo.com/v7/finance/download/"
url_2 = "?period1=1433980800&period2=" + t
url_3 = "&interval=1d&events=history"
def fixNull(x):
if x == 'null':return 0
else:return x
if symbol == None:
symbol = input("Enter a ticker symbol for a stock: ")
url = url_1 + symbol + url_2 + url_3
# Keeps track of url that could be used as a download link
with requests.Session() as s:
download = s.get(url)
decoded_content = download.content.decode('utf-8')
cr = csv.reader(decoded_content.splitlines(), delimiter=',')
historical_data = list(cr)
c = True
data = []
for row in historical_data:
row = list(map(fixNull, row))
if c:
columns = row
c=False
else:
data.append(row)
df = pd.DataFrame(data=data, columns=columns)
# Resets Parsing Error
if symbol == "^KS11":
df = df.set_index('Date')
df.loc['2020-05-06', 'Low'] = str(float(df.loc['2020-05-06', 'Low']) / 10)
df = df.reset_index()
# Allows us to save as a CSV
if to_csv != False:
df.to_csv(symbol + ".csv")
return df
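For a quick sanity check, get_historical() can be called directly with any Yahoo Finance ticker (a minimal usage sketch, assuming network access; '^GSPC' is Yahoo's ticker for the S&P 500):
# Quick usage check of get_historical(): pull daily S&P 500 data.
# '^GSPC' is Yahoo Finance's ticker for the S&P 500; this call requires network access.
sp500 = get_historical('^GSPC')
sp500.tail()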
Firstly, I want to load each country's index and currency ticker. To do this I compiled a csv file (data/Stock Market Indices by Country - Sheet1.csv) to hold this information. I use the pandas .read_csv() function to load the data into a pandas DataFrame. After this step I select only Symbol, Country, and Currency as variables. These are the only variables needed to tidy this DataFrame.
# Load country index and currency tickers
# Note that only `Symbol`, `Country`, and `Currency` will be used.
country_indices = pd.read_csv("data/Stock Market Indices by Country - Sheet1.csv")[['Symbol', 'Country', 'Currency']]
print(country_indices)
The CoronaNet project features over 1000 researchers worldwide who compiled event data about Covid-19 dating back to January 2020.
I downloaded their dataset as a .csv file and stored it in my Jupyter repository. I use the pandas .read_csv() function to load the data and store it in a pandas DataFrame. This dataset will be used to classify major COVID-19 announcements.
# Download COVID-19 event data
# Data set goes from January 2020 until the end of August, thus only those portions will be analyzed.
corona = pd.read_csv("data/coronanet_release.csv")
corona.head()
Yahoo Finance provides great baseline statistics for historical data; however, I need to expand on that data in order to gain the insights I seek. To do this, I created the clean_df() function to convert the data returned from API calls into workable datasets.
I do this by converting Open, Close, High, Low, and Adj Close into floats, and Date into a datetime class.
I then manipulate the data further to create Range, Change, AbsoluteChange, percentChange, AbsolutePercentChange, and Volatility to provide key statistics on price movement. This data will be used further on to gain insights into how each index performed relative to its peers.
# Historical Data From yahoo.finance.com is packaged the same way, so it makes sense to create functions to process these DataFrames.
# `Open`, `Close`, `High`, `Low`, and `Adj Close` are converted into floats
# `Date` is converted into a datetime class
# `Range`, `Change`, `AbsoluteChange`, `percentChange`, `AbsolutePercentChange`, and `Volatility` are created from the columns above.
def clean_df(df:pd.DataFrame, start_date=dt.datetime(2020, 1, 2), end_date=dt.datetime(2020, 11, 18)):
    df['Date'] = df['Date'].fillna(0)  # assign back so the fill actually applies
df['Date'] = pd.to_datetime(df['Date'])
df = df[df['Date'] >= start_date]
df = df[df['Date'] <= end_date]
columns = ['Open', 'Close', 'High', 'Low', 'Adj Close']
for col in columns:
        df[col] = df[col].fillna(0)  # assign back so the fill actually applies
try:df[col] = df[col].astype('float')
except:pass
df['Range'] = df['High'] - df['Low']
df['Change'] = df['Close'] - df['Open']
df['AbsoluteChange'] = df['Change'].abs()
df['percentChange'] = df['Change'] / df['Open']
df['AbsolutePercentChange'] = df['percentChange'].abs()
df['Volatility'] = df['Range'].abs() / df['Open']
return df
Index data comes from finance.yahoo.com and has the same file structure as listed above. The load_indices() function creates a master pandas DataFrame with each of the columns below and their corresponding country. This allows me to keep a tidy DataFrame that is easy to manipulate.
# load_indices() creates a dictionary holding each country's designated index data.
def load_indices(df=country_indices):
master = pd.DataFrame(columns=['Country',
'Date',
'Open',
'High',
'Low',
'Close',
'Adj Close',
'Volume',
'Range',
'Change',
'AbsoluteChange',
'percentChange',
'AbsolutePercentChange',
'Volatility'])
for index in range(len(df)):
new_df = clean_df(get_historical(df.loc[index].Symbol))
new_df['Country'] = df.loc[index,"Country"]
master = pd.concat([master, new_df])
return master
indices = load_indices()
# Display the S&P 500 index which represents the United States stock market
# Note that the top index of each nation will be represented.
# The United States also has the Russell-2000, Dow Jones, VIX, among many others.
indices.head()
# Note that the number of trading days varies as a result of national holidays.
indices.Country.value_counts()
get_forex() is built off the get_historical() function and takes in two currencies to compare. A finance.yahoo.com DataFrame will be returned with the same columns seen for the index data.
"""
functions to set up forex data
Clean data to accurately merge
"""
def get_forex(cur1:str, cur2:str):
return clean_df(get_historical(cur1+cur2+'=X'))
get_forex('USD','GBP').head()
The CoronaNet Project's dataset has 41 different columns; however, we'll only use country, target_country, init_country_level, date_announced, and type. We'll use this dataset to track COVID-19 events throughout 2020 and will eventually merge it with the index data. Note that this dataset is mainly meant to provide baselines and dummy variables. Metadata analyses are drawn for observational purposes and for understanding the COVID-19 timeline.
In this section I plot the number of events per country, using only events at the national level. These events have backing from the national government and have a greater magnitude than local or provincial events. The variable init_country_level indicates the level of each event. Furthermore, in the first cell, the United States is listed as United States of America, so the name is changed to properly join with the index data.
corona.columns
# Replace `United States of America` with `United States`
corona['target_country'] = corona['target_country'].str.replace('United States of America','United States')
corona['country'] = corona['country'].str.replace('United States of America','United States')
corona.country.unique()
# Views the number of events in each category
corona.init_country_level.value_counts()
# `No, it is at the national level` looks like it is the same as `National`
# View the source of the supposed error
corona[corona['init_country_level']=='No, it is at the national level'].country.value_counts()
# Finds that the error is from China!
# Replace `No, it is at the national level` values with `National`
corona['init_country_level'] = corona['init_country_level'].str.replace('No, it is at the national level', 'National')
corona.init_country_level.value_counts()
# Create a national events count to isolate each country's national events
national_events = corona[corona['init_country_level']=='National'].country.value_counts()
national_events = national_events.to_frame().reset_index()
national_events.columns = ['country', 'events_count']
national_events.head(15).plot.bar(x='country', y='events_count', title='Amount of National COVID-19 Events per Country', figsize=(10, 7))
import plotly.graph_objs as go
from plotly.offline import init_notebook_mode,iplot,plot
init_notebook_mode(connected=True)
# Set up specific data structure for Map
data = dict(
type = 'choropleth',
colorscale = 'Viridis',
locations = national_events['country'],
locationmode = "country names",
z = national_events['events_count'],
text = national_events['country'],
colorbar = {'title' : 'National Covid Events'},
)
layout = dict(title = 'National Covid Event Counts by Country',
geo = dict(projection = {'type':'mercator'})
)
# Merge National Event counts back into the main `corona` DataFrame
corona = pd.merge(national_events, corona, how='right', on='country')
corona['date_announced'] = pd.to_datetime(corona['date_announced'])
corona.head()
# Display Map
choromap = go.Figure(data = [data],layout = layout)
iplot(choromap,validate=False)
# Build Dummy Variables for Event Type
corona = corona.merge(pd.get_dummies(corona['type']), left_index=True, right_index=True)
corona.columns
# Determine the amount of events per day during 2020
# `date_announced` is the variable indicating a new event
# Note that `date_announced` does not start a new event, but rather indicates that an event will happen
dd = corona[corona['date_announced'] <= dt.datetime.today()].date_announced.value_counts().to_frame()
dd.columns = ['Count']
plt.xlabel("Date")
plt.ylabel('Count')
dd.plot(title='Number of National COVID-19 Events Per Day', figsize=(10, 7))
The VIX is a measure of market volatility derived from S&P 500 option spreads. The higher the VIX, the more uncertainty in the marketplace.
The chart above closely resembles the VIX (Volatility Index). Below I quickly plot the VIX and run a regression between its closing price (Close) and the number of COVID-19 announcements per day (date_announced) between January and August 2020.
# Web scrape VIX historical data and clean the DataFrame
VIX = clean_df(get_historical('^VIX'))
# Plot the VIX closing price throughout 2020
VIX.set_index('Date')['Close'].plot(title='VIX Closing Price 2020 Levels', figsize=(10,7))
# Merge VIX and events data
VX = pd.merge(VIX[['Close', 'Date']], dd.reset_index(), left_on='Date', right_on='index').set_index('index')
VX.sort_values('Close', ascending=False).head()
# Set up a Linear Regression using statsmodels.api
X = sm.add_constant(VX[['Count']])
Y = VX['Close']
output = sm.OLS(Y, X).fit()
# Output the statistical summary of the linear regression
print(output.summary())
The correlation between the closing price of the VIX and the number of COVID-19 announcements per day is statistically significant. This model, with an R^2 of 0.792, explains 79.2% of the variance in the VIX's closing price. This model could be an accurate indicator of the VIX; however, we would like to increase our R^2 if possible. More independent variables, such as event magnitude, could be added to create a more representative model.
Regarding the beta coefficients, both the constant and the count of date_announced events are statistically significant. Count has a t-stat of 23.847, which is significant at the 0.01% level.
A t-statistic beyond +-1.96 is statistically significant at the 95% confidence level.
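These quantities can also be read directly off the fitted model object (a small sketch, assuming the statsmodels result stored in output above):
# Pull the quantities discussed above directly from the fitted statsmodels result
print("R-squared:", round(output.rsquared, 3))             # share of variance in VIX Close explained by Count
print("t-statistics:", output.tvalues.round(2).to_dict())  # constant and Count
print("p-values:", output.pvalues.round(4).to_dict())      # |t| > 1.96 corresponds to p < 0.05 (two-sided)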
The VIX formula accounts for option spreads and is a measure of market uncertainty. Feel free to learn more about the VIX formula in the link below.
I plot the number of events for each nation using plotly's graphing capabilities. I generate a world map and use the National event counts we accumulated to view the number of national events for each nation.
As we can see above, Russia had the most national events from January 2020 to August 2020, followed by China, Bangladesh, and Guatemala. China and Russia both have highly centralized governments. Guatemala and Bangladesh also have centralized control, but their governments are generally poorly run. Government uncertainty may also be a reason for the quantity of events: these governments may have enacted, then retracted, numerous policies.
# Display Index data for the United States of America
indices[indices['Country'] == 'United States'].tail()
In this section I build a matrix() function to calculate the relationships between DataFrames. I use this function to analyze the relationships between different indices and currency pairs. The function takes in a dependent column and computes the sample mean of that column over a time period. These time periods are specified by start_date and end_date. The means are sampled weekly, but the frequency can be changed with period.
# `col` is the dependent column in the matrix
# `period` is the resampling/grouping method
# `values` is the Pandas DataFrame
# `start_date` and `end_date` are date parameters
# `country_indices` are the list of tracked countries
def matrix(col,
period='W',
values=indices,
start_date=dt.datetime(2020, 2, 15),
end_date=dt.datetime(2020, 11, 2),
country_indices=country_indices['Country'].to_list()):
count = 0
# Master DataFrame
df = pd.DataFrame()
for index in country_indices:
# Builds dataFrame if it is null
if count == 0:
df = values[values['Country'] == index].set_index('Date').resample(period).mean()[col].to_frame().reset_index()
df.columns = ['Date', index]
count += 1
else: # Continually Resamples and Appends to master DataFrame
df2 = values[values['Country'] == index].set_index('Date').resample(period).mean()[col].to_frame().reset_index()
df2.columns = ['Date', index]
df = pd.merge(df,
df2,
how='left',
on='Date')
count += 1
# Returns DataFrame between the selected dates
df = df[(df['Date'] >= start_date) & (df['Date'] <= end_date)]
return df.set_index('Date')
# Gets variance of volatility over March and April 2020: the main effect of COVID-19 on markets
variance = matrix(col='Volatility', period='W', start_date=dt.datetime(2020, 3, 1), end_date=dt.datetime(2020, 5, 1)).var().to_frame()
variance.columns = ['Variance']
# Plots the Variance in ascending order based on total variance in volatility
variance.sort_values('Variance').plot.bar(title='Variance of Weekly Volatility', figsize=(12, 6))
India had the highest variance of any stock market in the sample set, closely followed by Australia, Indonesia, and the United States. The East and Southeast Asian countries were expected to have higher volatility because the pandemic hit them first; however, China, the first country infected, had the second lowest volatility. This may be a result of its decisive actions to mitigate the virus, or of the world's relative certainty in the Chinese economy. India is known to be a volatile developing economy and is seen as such in this analysis.
# Create Matrix
vol = matrix(col='Volatility')
variance = vol.var().to_frame()
fig, ax = plt.subplots(figsize=(20,20))
# Plot each country's mean weekly volatility over the sample period
for country in vol.columns:
plt.scatter(x=vol.index, y=vol[country], label=country)
plt.legend()
plt.title("Sample Nation Volatility 2020")
plt.ylabel("Volatility")
plt.xlabel("Date")
build_forex_matrix() is built to create a matrix comparing each of the currency pairs. Below, volatility is compared among the currency pairs.
def build_forex_matrix(data=country_indices.to_dict('list'), GET='matrix'):
    # Re-categorizes currencies so that euro data is not pulled multiple times
try:
currency = data['Currency']
country = data['Country']
country.remove('Germany')
country.remove('Netherlands')
country.remove('France')
country.append('European Union')
currency.remove('EUR')
currency.remove('EUR')
except: pass
master = {}
master_df = pd.DataFrame(columns=['Country',
'Date',
'Open',
'High',
'Low',
'Close',
'Adj Close',
'Volume',
'Range',
'Change',
'AbsoluteChange',
'percentChange',
'AbsolutePercentChange',
'Volatility'])
    # Iterate through countries and use each country as a dependent variable
for index, c in enumerate(country):
main_currency = currency[index]
currencies_to_use = (currency[0:index] + currency[index+1:-1])
country_df = pd.DataFrame(columns=['Country',
'Date',
'Open',
'High',
'Low',
'Close',
'Adj Close',
'Volume',
'Range',
'Change',
'AbsoluteChange',
'percentChange',
'AbsolutePercentChange',
'Volatility'])
        # Iterate through the other countries and use them as independent variables
for curr in currencies_to_use:
try:
# Returns forex data from yahoo.finance.com
temp_df = get_forex(main_currency, curr)
temp_df['Country'] = curr
# Appends to master DataFrame
country_df = pd.concat([country_df, temp_df])
except:
pass
        # Creates a final matrix of Volatility for each dependent currency
final = matrix(col='Volatility',
period='W',
values=country_df,
start_date=dt.datetime(2020, 2, 15),
end_date=dt.datetime(2020, 4, 1),
country_indices=currencies_to_use).mean()
country_df['currency_pair'] = main_currency + '/' + country_df['Country']
master_df = pd.concat([master_df, country_df])
final[main_currency] = 0
master[main_currency] = final
# Returns DataFrame of Raw data if selected
if GET == 'DF':
return master_df
# Returns Matrix data
return pd.DataFrame(master)
foreign = build_forex_matrix()
foreign = foreign.fillna(0)
foreign = foreign.reindex(sorted(foreign.columns), axis=1)
foreign * 100
Note that the IDR/KRW value is an outlier skewing our results. To correct this, I reset the value to 0 so that the other values display correctly on the heatmap's color scale.
foreign.loc['IDR', 'KRW'] = 0
import seaborn as sns
fig, ax = plt.subplots(figsize=(20,20))
colormap = sns.diverging_palette(220, 10, as_cmap=True)
sns.heatmap(foreign, cmap=colormap, annot=True, fmt = ".2f")
plt.xticks(range(len(foreign.columns)), foreign.columns)
plt.yticks(range(len(foreign.columns)), foreign.columns)
plt.title("Currency Pair Volatility Correlations")
plt.show()
forex = build_forex_matrix(GET='DF')
forex_df = forex[(forex['Date'] >= dt.datetime(2020, 2, 15)) & (forex['Date'] <= dt.datetime(2020, 4, 4)) & (forex['currency_pair'] != 'IDR/JPY')]
forex_df = forex_df.groupby(['currency_pair']).mean().sort_values('AbsolutePercentChange', ascending=False).dropna()
forex_df.head(10)['AbsolutePercentChange'].plot.bar(title='Average Daily Percent Changes During Initial Outbreaks')
def winners_losers(country_indices=forex, target='Close', p_value=1.96):
currencies = forex.Country.unique()
master = {}
scores = {}
for main_currency in currencies:
master[main_currency] = {'Score':0, 'Wins':list(), "Loses":list(), 'Ties':list()}
scores[main_currency] = 0
for curr in currencies:
if curr != main_currency:
try:
k = get_forex(main_currency, curr)
k = k[(k['Date'] < dt.datetime(2020, 5, 1)) &(k["Date"] > dt.datetime(2020, 2, 10))]
y=np.array(k[target].dropna().values, dtype=float)
x=np.array(pd.to_datetime(k['Date'].dropna()).index.values, dtype=float)
                    slope, intercept, r_value, p_val, std_err = sp.linregress(x, y)  # renamed so the p_value threshold parameter is not shadowed
t_stat = slope / std_err
if t_stat > p_value:
master[main_currency]['Score'] += 1
scores[main_currency] += 1
master[main_currency]['Wins'].append(curr)
elif math.sqrt(t_stat * t_stat) <= p_value:
master[main_currency]['Ties'].append(curr)
else:
scores[main_currency] -= 1
master[main_currency]['Score'] -= 1
master[main_currency]['Loses'].append(curr)
except:
pass
return master, scores
data, scores = winners_losers()
pd.Series(scores).sort_values().plot.bar(figsize=(16, 6))
plt.title("Statistical Significance Scores of Currencies")
plt.ylabel("Score")
From the chart above we can see that the Indian rupee was the most volatile currency during the heart of the COVID-19 surge. This could be a result of India's developing economy and its large, vulnerable population. India also has one of the most heavily traded stock markets in Asia. The sheer growth of India's democratic economy, with few to no restrictions on investment, has spurred heavy investment. However, India, unlike China, South Korea, and Japan, did not stem the spread of COVID-19. COVID-19 still runs rampant in India, while other countries have now reopened.
In general, the most stable currencies become reserve currencies. Reserve currencies are stable currencies in which nations keep their money. For instance, unstable South American countries store their wealth in United States dollars, and former British colonies (like the US in 1783) would keep stores of the British pound or gold. In troubling times, reserve currencies tend to appreciate in value. In this scenario, the Japanese yen appreciated the most, followed by the United States dollar, the Hong Kong dollar, and the euro. Japan was the most successful in combating COVID-19, compared to the United States, which is facing a massive outbreak and more shutdowns. Furthermore, we can see that developing Asian countries are depreciating the most compared to their developed counterparts. What is interesting is that the British pound saw net depreciation. This may be because Prime Minister Boris Johnson ignored COVID-19 warnings, or a sign that Britain's financial dominance has deteriorated after Brexit.
In this section I analyze the effect different COVID-19 events had on their respective stock market indices. For each of the events listed below, I create dummy variables and then combine the results into distinct groups. I then run regression analyses to look for correlation among the variables. I hypothesize that days with more events will have higher volatility, and that the days following event-heavy days will see some relief in volatility.
corona.type.unique()
# This function creates dummy variables for each type of event announcement
def build_dummies(df:pd.DataFrame=corona, countries:list=country_indices['Country'].tolist()):
df = df[df['init_country_level'] == 'National']
dates = df['date_announced'].unique()
df = df.groupby(['country', 'date_announced'])[df['type'].unique()].sum()
df['next_day'] = 0
return df
master = build_dummies()
master.head()
The dummy data is left-joined onto the indices to match each country's trading days with its COVID-19 events. NaN events are set to 0. The reasoning for this is to set up a dummy-variable regression on Volatility. This regression aims to see how each event type affects consumer sentiment.
m = pd.merge(indices[['Country', 'Date']], master, left_on=['Country','Date'], right_on=['country', 'date_announced'], how='left')
m = m.fillna(0)
m.head()
# Head of indices to show the stock market data to be merged.
indices.head()
The next_day column: this dummy variable allows us to analyze the day-after change due to an event. We can use this data to run a difference-in-differences regression model. We will run a pooled regression on Date and Country. The aim is to see if individual nations' dummy variables can be combined to provide insight into volatile markets.
We will merge the indices DataFrame with dummy on Country and Date to properly match our data. Then we will combine similar dummy variables into groups (Health, Restrictions, Lockdown, and Information).
# Length of the merged index
count = len(m.index)
# Creates temporary DataFrame
dummy = m
# Groups index data in temporary dataFrame
index_data = indices.groupby(['Country', 'Date']).mean()
for ind in m.index:
# Appends next day
get = dummy.loc[ind].tolist()
get[1] = dummy.loc[ind+1].tolist()[1]
# Sets day after variable to next market day
get[-1] = 1
dummy.loc[count] = get
count += 1
# Creates table of dummy variables
dummy.sort_values('Date').head()
# Merge Coronavirus dummy data on market days for respective indices
# Create four broad columns representing the categories for each event type
dummy['Health'] = dummy['Health Testing'] + dummy['Health Resources'] + dummy['Health Monitoring']
dummy['Restrictions'] = (dummy['Restriction and Regulation of Government Services'] + dummy['Restriction and Regulation of Businesses'] +
dummy['Restrictions of Mass Gatherings'] + dummy['External Border Restrictions']
+ dummy['Internal Border Restrictions'] + dummy['Social Distancing'])
dummy['Lockdown'] = (dummy['Declaration of Emergency'] + dummy['Lockdown'] + dummy['Curfew'])
dummy['Information'] = (dummy['Other Policy Not Listed Above'] + dummy['Public Awareness Measures'] +
dummy['New Task Force, Bureau or Administrative Configuration'] +
dummy['Anti-Disinformation Measures'] + dummy['Hygiene'])
dummies_to_delete = ['Health Testing', 'Health Resources', 'Restriction and Regulation of Government Services',
'Restriction and Regulation of Businesses', 'Restrictions of Mass Gatherings',
'Closure and Regulation of Schools', 'External Border Restrictions', 'Other Policy Not Listed Above',
'Internal Border Restrictions', 'Public Awareness Measures', 'Declaration of Emergency',
'Health Monitoring', 'Quarantine', 'Social Distancing', 'New Task Force, Bureau or Administrative Configuration',
'Anti-Disinformation Measures', 'Hygiene', 'Curfew']
for dumb in dummies_to_delete:
del dummy[dumb]
dummy.head()
# Merge Dummy data with indices
master = pd.merge(dummy, indices, on=['Date', 'Country'], how='left').sort_values(['Country', 'Date'])
# Drop null results
master = master.dropna()
# Event_Count is the amount of dummies on a given day
master['Event_Count'] = (master['Lockdown'] + master['Health'] + master['Restrictions'] + master['Information'])
# Multiplying by -1 lets next_day sort correctly (event day before its next-day row); found by trial and error.
master['next_day'] = master['next_day'] * -1
# Sort values by Country, then Date, and lastly next_day
master = master.sort_values(['Country', 'Date', 'next_day'], ascending=True)
master['next_day'] = master['next_day'] * -1
# Save dataset for other uses
master.to_excel("MatthewRau_IndexData.xlsx", index=False)
master.tail()
# Full Regression of all countries on volatility
temp_df = master
independents = ['Lockdown', 'Health', 'Restrictions', 'Information']
X = sm.add_constant(temp_df[independents])
Y = temp_df['Volatility']
output = sm.OLS(Y, X).fit()
output.summary()
# Create differences in Volatility and the absolute change in Volatility
# Needed for the first differences model
master['Volatility_Change'] = master['Volatility'].diff(1)
master['Absolute_Volatility_Change'] = master['Volatility_Change'].abs()
master.head()
# Creates immediate effect DataFrame
# Only takes next_day == 1 (Day After Events)
master_next = master.loc[master['next_day'] == 1.0].dropna()
master_next.head()
# Makes each dummy variable binary instead of numerical
# Normally dummy variables are binary but I combined numerous dummies for certain days.
binary_master = master
binary_master['Lockdown'] = (binary_master['Lockdown'] > 0)
binary_master['Health'] = (binary_master['Health'] > 0)
binary_master['Restrictions'] = (binary_master['Restrictions'] > 0)
binary_master['Information'] = (binary_master['Information'] > 0)
binary_master.tail()
A pooled first differences regression is a regression on panel data and its changes over time. I take the difference in volatility for the day after an event happens, capturing the immediate effect, and group the data by its country index. This model is split to provide clarity among groups while looking at the overall effect.
# Pooled OLS package for first differences model.
from linearmodels.panel import PooledOLS
# Create temporary DataFrame
temp_df = binary_master
# Only use February 15 - May 1. Aims to capture COVID-19 surge.
temp_df = temp_df[(temp_df['Date'] > dt.datetime(2020, 2, 15)) & (temp_df['Date'] < dt.datetime(2020, 5, 1))]
# Categorical Variable for Country
co = pd.Categorical(temp_df.Country)
# Set double index for first differences
temp_df = temp_df.set_index(['Country', 'Date'])
temp_df['Country'] = co
# Signify exogenous variables
exog_vars = ['Lockdown', 'Health', 'Restrictions', 'Information']
exog = sm.add_constant(temp_df[exog_vars])
# Regress and Fit model
mod = PooledOLS(temp_df.Volatility_Change, exog)
pooled_res = mod.fit()
pooled_res
# Built to analyze regressions on chosen dependent and independent variables for ALL nations.
def highest_corr(dependant='Volatility', df:pd.DataFrame=master, independants=['Lockdown', 'Health', 'Restrictions', 'Information'] ):
data = {}
    for country in df.Country.unique():
temp_df = df.loc[df['Country'] == country].dropna()
X = temp_df[independants]
y = temp_df[dependant]
#Fit the linear regression
regr = LinearRegression()
regr.fit(X, y)
data[country] = (regr.score(X, y))
return pd.Series(data)
# Bar Chart to show overall correlation of Dummy Variables on Volatility
fig1, ax = plt.subplots(1,3, figsize=(20,5))
final = pd.DataFrame()
final['Volatility'] = highest_corr().sort_values().plot.bar(ax=ax[0], color='blue', title="R2 of Dummy Variables Regressed on Volatility")
final['Volatility_Change'] = highest_corr(dependant='Volatility_Change').sort_values().plot.bar(ax=ax[1], color='purple', title="R2 of Dummy Variables Regressed on Volatility Change")
final['Absolute_Volatility_Change'] = highest_corr(dependant="Absolute_Volatility_Change").sort_values().plot.bar(ax=ax[2], color='green', title="R2 of Dummy Variables Regressed on Absolute Volatility Change")
# Bar Chart to show overall correlation of Dummy Variables on Volatility
fig1, ax = plt.subplots(1,3, figsize=(20,5))
final = pd.DataFrame()
final['Volatility'] = highest_corr(independants=['Event_Count']).sort_values().plot.bar(ax=ax[0], color='blue', title="R2 of Events/Day Regressed on Volatility")
final['Volatility_Change'] = highest_corr(independants=['Event_Count'], dependant='Volatility_Change').sort_values().plot.bar(ax=ax[1], color='purple', title="R2 of Events/Day Regressed on Volatility Change")
final['Absolute_Volatility_Change'] = highest_corr(independants=['Event_Count'], dependant="Absolute_Volatility_Change").sort_values().plot.bar(ax=ax[2], color='green', title="R2 of Events/Day Regressed on Absolute Volatility Change")
# Create simple regressions to calculate which dummy variable has the largest effect on Volatility
last_df = pd.DataFrame()
for var in ['Lockdown', 'Health', 'Restrictions', 'Information']:
last_df[var] = highest_corr(independants=[var])
last_df.sort_values('Restrictions').plot.bar(figsize=(16, 9))
Overall, each coronavirus event had little to no effect on overall volatility. Consumers were unfazed by constant COVID-19 updates; their level of uncertainty did not change much. However, some nations had higher correlation than others. For instance, the United States had higher correlation between its dummy variables and volatility than other nations. Malaysia and Japan had the highest R-squared for Volatility Changes and Absolute Volatility Changes respectively; however, the R-squared was nowhere near significant. The United States had an R-squared around 0.25, meaning the model explained only about 25% of the variation in volatility. The United States, however, did relate the most to the dummy variables. COVID-19 events shaped part of day-after consumer sentiment about markets, but there is significant omitted variable bias that needs to be explained.
The pooled first differences regression model and the cumulative multiple linear regression model were not explanatory of market volatility in any form. The pooled regression model had an extremely low R-squared and its independent variables did not show any significant t-scores. On the other hand, the full regression including all countries and the four dummy variables yielded significant t-scores for all dummy variables. The t-scores were significant past the 1% level, showing that Information, Restrictions, Lockdown, and Health should each be included in a full regression model.
Taken individually, the dummy variable regressions showed that Restrictions was the most explanatory dummy variable in singular linear regressions on Volatility. This means that Restrictions probably weighs more on Volatility than the other dummy variables.
People put their wealth in safe places in times of uncertainty, and in the foreign exchange market Japan proved to be the safest place. Japan did not have a robust financial system until the 1960s and has grown ever since. Japan's ability to control the pandemic better than its developed partners proved fruitful, resulting in an appreciation of the yen. Developing nations like India and Indonesia, though seen as major world economies, were not as fortunate: their currencies saw net depreciation and may mean-revert in the future. The Indian rupee and Indonesian rupiah may be solid investments on the prospect of a near-future mean reversion.
Although COVID-19 event data did not fully capture volatility movement in major stock market indices, it did show that some event types would play a significant role in a more robust model. Data that should be included next time are COVID-19 case rates and death rates. These rates are the baseline for lockdowns and may be better indicators of both the events themselves and of volatility. The events have some direct effect on markets, and COVID-19 case rates could cause uncertainty in markets before restrictions are put in place. This pandemic has confused market participants and everyone alike, and no one model will capture the overall effect this pandemic has had on humanity.
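As a rough sketch of how such case data could be folded into the existing panel (the file path and column names below are placeholders for illustration, not part of this project's data):
# Hypothetical extension: merge daily case counts onto the master panel.
# 'data/covid_daily_cases.csv' with Country, Date, new_cases columns is a placeholder file.
cases = pd.read_csv("data/covid_daily_cases.csv", parse_dates=['Date'])
master_ext = pd.merge(master, cases[['Country', 'Date', 'new_cases']], on=['Country', 'Date'], how='left')
master_ext['new_cases'] = master_ext['new_cases'].fillna(0)
# new_cases could then join Lockdown, Health, Restrictions, and Information as an additional regressor.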
Papers you might enjoy