mrau1.github.io

Final Tutorial: Impact of Coronavirus on Macro Markets

The aim of this project is to examine the relationships between major coronavirus events and their impact on various economies.

Datasets

Background Knowledge

During drastic times when things go wrong, people scramble, they are uncertain and tend to make rash decisions. These scenarios are all too common in our society and recently occured as a result of the COVID-19 Pandemic. The stock market and foreign exhcange rates are perfect vehicles of this channeled uncertainty and provide insights into investor sentiment and economic strength. To understand this project we need a brief background into economics.

It is known that people are never comfortable with uncertainty and tend to seek certainty whenever the possible. This is especially prevelant in financial markets where people, countries, and firms want to have certain plans around how their money is allocated. Most investors prefer reliable returns that meet their goals and countries allocate their wealth in currencies or commodities that hold value. During major events throughout history such as September 11, 2001; The Great Recession, and President John F. Kennedy's assasination, markets tumbled because investors did not know what would happen next.

Another concept that should be understood is that a stock/currency is like a voting machine. The more people that buy the stock the higher the price will go, and vice-versa. In foreign exchange markets the more people that demand a currency in relation to another, the highly demand currency will increase in value (appreciate) while the lessor will decrease in value (depreciate). If more Americans want to travel to Britain than the United States and demand more of Britain's currency (British Pound), then the British Pound will appreciate in value and the United States Dollar will depreciate.

Lastly, investors psychologically tend to act like cartels. This is because a stock is a voting machine. Even if a company or currency has no intrinsic value, if people are willing to keep buying the underlying at its current value, then the underlying will not drop. All the market players are acting rationally so nobody loses their investment. However, if one major investor (or enough smaller investors) sell enough of the underlying at once, then the value will drop and other investors will panic and close their positions. This downward trend will cause prices to rapidly drop, and it only stops when enough people have confidence to buy the underlying at its reduced price.

Purpose

The purpose of this project is to analyze the uncertainty of market participants. To do so i'll analyze the foreign exchange rates and major stock market indices of 14 large countries. These countries are a mix of developing and developed countries who have active stock markets, and are relevant to the world economy.

This project aims to analyze which markets were the most volatile, dependable, and risky during a time of crisis. Furthermore, consumer sentiment is analyzed regarding COVID-19 events to capture investor reaction to events.

  • Which stock markets were the most volatile?
  • What currencies were the most volatile?
  • What currencies had the highest appreciation? Depreciation?
  • How did COVID-19 event announcements effect volatility?
  • How did different COVID-19 events effect their respective stock markets?
In [1]:
!pip install mapclassify
!pip install pycountry-convert
!pip install geopy
Requirement already satisfied: mapclassify in c:\users\matthew raw\anaconda3\lib\site-packages (2.3.0)
Requirement already satisfied: networkx in c:\users\matthew raw\anaconda3\lib\site-packages (from mapclassify) (2.4)
Requirement already satisfied: scikit-learn in c:\users\matthew raw\anaconda3\lib\site-packages (from mapclassify) (0.22.1)
Requirement already satisfied: scipy>=1.0 in c:\users\matthew raw\anaconda3\lib\site-packages (from mapclassify) (1.4.1)
Requirement already satisfied: numpy>=1.3 in c:\users\matthew raw\anaconda3\lib\site-packages (from mapclassify) (1.18.1)
Requirement already satisfied: pandas>=1.0 in c:\users\matthew raw\anaconda3\lib\site-packages (from mapclassify) (1.0.3)
Requirement already satisfied: decorator>=4.3.0 in c:\users\matthew raw\anaconda3\lib\site-packages (from networkx->mapclassify) (4.4.2)
Requirement already satisfied: joblib>=0.11 in c:\users\matthew raw\anaconda3\lib\site-packages (from scikit-learn->mapclassify) (0.14.1)
Requirement already satisfied: python-dateutil>=2.6.1 in c:\users\matthew raw\anaconda3\lib\site-packages (from pandas>=1.0->mapclassify) (2.8.1)
Requirement already satisfied: pytz>=2017.2 in c:\users\matthew raw\anaconda3\lib\site-packages (from pandas>=1.0->mapclassify) (2019.3)
Requirement already satisfied: six>=1.5 in c:\users\matthew raw\anaconda3\lib\site-packages (from python-dateutil>=2.6.1->pandas>=1.0->mapclassify) (1.14.0)
Requirement already satisfied: pycountry-convert in c:\users\matthew raw\anaconda3\lib\site-packages (0.7.2)
Requirement already satisfied: pycountry>=16.11.27.1 in c:\users\matthew raw\anaconda3\lib\site-packages (from pycountry-convert) (20.7.3)
Requirement already satisfied: pprintpp>=0.3.0 in c:\users\matthew raw\anaconda3\lib\site-packages (from pycountry-convert) (0.4.0)
Requirement already satisfied: pytest-cov>=2.5.1 in c:\users\matthew raw\anaconda3\lib\site-packages (from pycountry-convert) (2.10.1)
Requirement already satisfied: repoze.lru>=0.7 in c:\users\matthew raw\anaconda3\lib\site-packages (from pycountry-convert) (0.7)
Requirement already satisfied: wheel>=0.30.0 in c:\users\matthew raw\anaconda3\lib\site-packages (from pycountry-convert) (0.34.2)
Requirement already satisfied: pytest-mock>=1.6.3 in c:\users\matthew raw\anaconda3\lib\site-packages (from pycountry-convert) (3.3.1)
Requirement already satisfied: pytest>=3.4.0 in c:\users\matthew raw\anaconda3\lib\site-packages (from pycountry-convert) (5.4.1)
Requirement already satisfied: coverage>=4.4 in c:\users\matthew raw\anaconda3\lib\site-packages (from pytest-cov>=2.5.1->pycountry-convert) (5.3)
Requirement already satisfied: py>=1.5.0 in c:\users\matthew raw\anaconda3\lib\site-packages (from pytest>=3.4.0->pycountry-convert) (1.8.1)
Requirement already satisfied: packaging in c:\users\matthew raw\anaconda3\lib\site-packages (from pytest>=3.4.0->pycountry-convert) (20.3)
Requirement already satisfied: attrs>=17.4.0 in c:\users\matthew raw\anaconda3\lib\site-packages (from pytest>=3.4.0->pycountry-convert) (19.3.0)
Requirement already satisfied: more-itertools>=4.0.0 in c:\users\matthew raw\anaconda3\lib\site-packages (from pytest>=3.4.0->pycountry-convert) (8.2.0)
Requirement already satisfied: pluggy<1.0,>=0.12 in c:\users\matthew raw\anaconda3\lib\site-packages (from pytest>=3.4.0->pycountry-convert) (0.13.1)
Requirement already satisfied: wcwidth in c:\users\matthew raw\anaconda3\lib\site-packages (from pytest>=3.4.0->pycountry-convert) (0.1.8)
Requirement already satisfied: importlib-metadata>=0.12 in c:\users\matthew raw\anaconda3\lib\site-packages (from pytest>=3.4.0->pycountry-convert) (1.5.0)
Requirement already satisfied: atomicwrites>=1.0 in c:\users\matthew raw\anaconda3\lib\site-packages (from pytest>=3.4.0->pycountry-convert) (1.3.0)
Requirement already satisfied: colorama in c:\users\matthew raw\anaconda3\lib\site-packages (from pytest>=3.4.0->pycountry-convert) (0.4.3)
Requirement already satisfied: pyparsing>=2.0.2 in c:\users\matthew raw\anaconda3\lib\site-packages (from packaging->pytest>=3.4.0->pycountry-convert) (2.4.6)
Requirement already satisfied: six in c:\users\matthew raw\anaconda3\lib\site-packages (from packaging->pytest>=3.4.0->pycountry-convert) (1.14.0)
Requirement already satisfied: zipp>=0.5 in c:\users\matthew raw\anaconda3\lib\site-packages (from importlib-metadata>=0.12->pytest>=3.4.0->pycountry-convert) (2.2.0)
Requirement already satisfied: geopy in c:\users\matthew raw\anaconda3\lib\site-packages (2.0.0)
Requirement already satisfied: geographiclib<2,>=1.49 in c:\users\matthew raw\anaconda3\lib\site-packages (from geopy) (1.50)

mrau1.github.io

Initial Data Extraction, Transform, Load & Tidy

In [2]:
import geopandas as gpd
import requests
import csv
import time
import pandas as pd
import scipy.stats as sp
import logging
import threading
import time
import json
import os
import datetime as dt
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm
import math
from sklearn.linear_model import LinearRegression

from IPython.core.display import display, HTML
display(HTML("<style>.container { width:95% !important; }</style>"))

#Show all the rows of a dataframe
pd.options.display.max_rows = 10000
pd.options.display.width = 1000
Note for Data Extraction:

Extraction: multiple datasets were extracted from finance.yahoo.com. The code below uses RESTFUL api's to get historical data for thousands of financial tickers, instruments, and currencies. This code uses the requests library to pull the data, and pandas to convert into a pandas DataFrame. Each historical dataset can be webscraped or downloaded as a .csv file. Each DataFrame has the same column structure with columns for Date, Open, Close, High, Low, Adj Close, and Volume. Volume however, is not regularly recoreded, and for the sake of this project, will be dropped from consideration.

In [3]:
# Url of Historical Data from yahoo.finance.com
historical_url = "https://finance.yahoo.com/quote/%s/history?p=%s"

"""Adjusted to automatically take in symbols for automation"""

# Creates urls to grab current price of the underlying
def generateUrl(symbol=False):
    url_1 = "https://finance.yahoo.com/quote/"
    url_2 = "?p="
    url_3 = "&.tsrc=fin-srch"
    if symbol == False:
        # Allows user to input stock symbol and returns the Yahoo finance URL
        symbol = input("Enter a ticker symbol for a stock: ")
        url = url_1 + symbol + url_2 + symbol + url_3
    else:
        url = url_1 + symbol + url_2 + symbol + url_3       
    return url
       
"""Adjusted to allow symbols and different URL's to pass through
    Allows us to grab from historical, options, and other data sources"""
def dataGrabber(symbol=False, url=None):
    if url == None:
        # Obtain server's response to HTTP request
        url = generateUrl(symbol)   
    src = requests.get(url)
    # Converts source code to string
    data = src.text
    return data
       

# Returns the current price of the underlying
def getCurrentPrice(symbol=False):
    # Isolates the currentPrice variable
    data = dataGrabber(symbol)
    split = data.split("currentPrice\":", 1)
    halfList = split[1]
    split2 = halfList.split("}", 1)
    currentPrice = split2[0] + "}"
    finalPrice = currentPrice.split("""{"raw":""")[1].split(",")[0]
    # Print and return current price
    print(symbol, finalPrice)
    return round(float(finalPrice),3)

# Returns historical data in Pandas DataFrame format
def get_historical(symbol=None, to_csv=False):
    # Gets CSV of historical data on stock for the past 5 years
    t = str(int(time.time()))
    url_1 = "https://query1.finance.yahoo.com/v7/finance/download/"
    url_2 = "?period1=1433980800&period2=" + t
    url_3 = "&interval=1d&events=history"

    def fixNull(x):
        if x == 'null':return 0
        else:return x
    if symbol == None:
        symbol = input("Enter a ticker symbol for a stock: ")
    url = url_1 + symbol + url_2 + url_3
    # Keeps track of url that could be used as a download link
    with requests.Session() as s:
        download = s.get(url)
        decoded_content = download.content.decode('utf-8')
        cr = csv.reader(decoded_content.splitlines(), delimiter=',')
        historical_data = list(cr)
        c = True       
        data = []
        for row in historical_data:
            row = list(map(fixNull, row))
            if c:
                columns = row
                c=False
            else:
                data.append(row)
        
        df = pd.DataFrame(data=data, columns=columns)
        # Resets Parsing Error
        if symbol == "^KS11":
            df = df.set_index('Date')
            df.loc['2020-05-06', 'Low'] = str(float(df.loc['2020-05-06', 'Low']) / 10)
            df = df.reset_index()
        # Allows us to save as a CSV
        if to_csv != False:
            df.to_csv(symbol + ".csv")
    return df

Load and Extract - Country Indices

Firstly, I want to load each country's index and currency ticker. To do this I compiled a csv file (data/Stock Market Indices by Country - Sheet1.csv) to hold this information. I use the pandas .read_csv() function to load the data into a pandas DataFrame. After this step I select only Symbol, Country, and Currency as variables. These variables are the only variables needed to tidy this DataFrame.

In [4]:
# Load country index and currency tickers
# Note that only `Symbol`, `Country`,  and `Currency` will be used. 
country_indices = pd.read_csv("data/Stock Market Indices by Country - Sheet1.csv")[['Symbol', 'Country', 'Currency']]
print(country_indices)
       Symbol         Country Currency
0       ^GSPC   United States      USD
1      ASX.AX       Australia      AUD
2        ^HSI       Hong Kong      HKD
3      ^BSESN           India      INR
4       ^JKII       Indonesia      IDR
5       ^N225           Japan      JPY
6       ^KLSE        Malaysia      MYR
7     PSEI.PS    Phillippines      PHP
8       ^KS11     South Korea      KRW
9       ^FCHI          France      EUR
10     ^GDAXI         Germany      EUR
11       ^AEX     Netherlands      EUR
12   ^BUK100P  United Kingdom      GBP
13  ^TA125.TA          Israel      ILS
14  000001.SS           China      CNY
Load and Extract - CoronaNet data

The CoronaNet project features over 1000 researchers worldwide who compiled event data about Covid-19 dating back to January 2020. I downloaded their dataset as a .csv file and stored the file in my jupyter repository. I use the pandas .read_csv() function to properly load the data and store in and pandas DataFrame. This dataset will be used to classify major COVID-19 announcements.

In [5]:
# Download COVID-19 event data
# Data set goes from January 2020 until the end of August, thus only those portions will be analyzed.
corona = pd.read_csv("data/coronanet_release.csv")
corona.head()
Out[5]:
record_id policy_id entry_type correct_type update_type update_level description date_announced date_start date_end ... travel_mechanism compliance enforcer index_high_est index_med_est index_low_est index_country_rank link date_updated recorded_date
0 R_1gdyC25xLGlP5H1Bv 3889409 new_entry original NaN NaN US state Michigan creates new task force for e... 2020-05-29 2024-05-29 2024-05-29 ... NaN Mandatory (Unspecified/Implied) New (COVID-19 specific) Task Force NaN NaN NaN NaN https://www.michigan.gov/whitmer/0,9309,7-387-... 2020-07-05 2020-07-05T12:40:29Z
1 R_vp4xmb4cOiQ3nb3NA 1429889 new_entry original NaN NaN At June 9th 2020 the state government of Hesse... 2020-06-09 2020-06-09 2023-01-01 ... NaN Voluntary/Recommended but No Penalties Provincial/State Government NaN NaN NaN NaN https://www.hessen.de/sites/default/files/medi... 2020-06-22 2020-06-22T09:20:26Z
2 R_ZxFGXQx5sEuEvbHNA 3386408 new_entry original NaN NaN The Parliament of Japan enacted on March 13, 2... 2020-03-13 2020-03-15 2022-03-13 ... NaN Mandatory (Unspecified/Implied) National Government 53.308786 52.497797 51.686694 158.0 https://www.aa.com.tr/en/asia-pacific/japans-a... 2020-04-15 2020-04-15T11:56:10Z
3 R_1KxVwu4XvnTKkcDCe 1846710 new_entry original NaN NaN Finland's Ministry of Social Affairs and Healt... 2020-05-28 2020-05-28 2021-12-31 ... NaN Mandatory (Unspecified/Implied) Ministry/Department of Health NaN NaN NaN NaN https://stm.fi/en/artikkeli/-/asset_publisher/... 2020-06-12 2020-06-12T11:16:38Z
4 R_22kWz6RdbvMhvnnNA 1919243 new_entry correction NaN NaN Egypt will deduct 1% from people’s salaries fo... 2020-05-20 2020-07-01 2021-07-01 ... NaN Mandatory with Legal Penalties (Jail Time) National Government NaN NaN NaN NaN https://www.reuters.com/article/health-coronav... 2020-07-17 2020-07-17T18:16:50Z

5 rows × 39 columns

Clean & Manipulate - Yahoo Finance

Yahoo finance provides great baseline statistics for historical data, however, I need to expand on that data in order to gain the insights I seek. To do this, I created the clean_df() function to convert the data returned from API calls into workable datasets. I do this by converting Open, Close, High, Low, and Adj Close into floats, and Date into a datetime class. I then manipulate the data further to create Range, Change, AbsoluteChange, percentChange, AbsolutePercentChange, and Volatility to provide key statistics on price movement. This data will be used further on to gain insights into how each index performed relative to their peers.

In [6]:
# Historical Data From yahoo.finance.com is packaged the same way, so it makes sense to create functions to process these DataFrames.
# `Open`, `Close`, `High`, `Low`, and `Adj Close` are converted into floats
# `Date` is converted into a datetime class
# `Range`, `Change`, `AbsoluteChange`, `percentChange`, `AbsolutePercentChange`, and `Volatility` are created from the columns above.

def clean_df(df:pd.DataFrame, start_date=dt.datetime(2020, 1, 2), end_date=dt.datetime(2020, 11, 18)):
    df['Date'].fillna(0)
    df['Date'] = pd.to_datetime(df['Date'])
    df = df[df['Date'] >= start_date]
    df = df[df['Date'] <= end_date]
    columns = ['Open', 'Close', 'High', 'Low', 'Adj Close']
    for col in columns:
        df[col].fillna(0)
        try:df[col] = df[col].astype('float')
        except:pass
    df['Range'] = df['High'] - df['Low']
    df['Change'] = df['Close'] - df['Open']
    df['AbsoluteChange'] = df['Change'].abs()
    df['percentChange'] = df['Change'] / df['Open']
    df['AbsolutePercentChange'] = df['percentChange'].abs()
    df['Volatility'] = df['Range'].abs() / df['Open']
    
    return df

Tidy & Transform - Index Data

Index data comes from yahoo.finance and has the same file structure as listed above. The load_indices() function creates a master pandas DataFrame with each of the columns below and their correpsonding country. This allows me to keep a tidy DataFrame that is easily manipulatable.

In [7]:
# load_indices() creates a dictionary holding each country's designated index data.
def load_indices(df=country_indices):
    master = pd.DataFrame(columns=['Country',
                                   'Date', 
                                   'Open', 
                                   'High', 
                                   'Low',
                                   'Close', 
                                   'Adj Close', 
                                   'Volume', 
                                   'Range', 
                                   'Change', 
                                   'AbsoluteChange', 
                                   'percentChange', 
                                   'AbsolutePercentChange', 
                                   'Volatility'])
    for index in range(len(df)):
        new_df = clean_df(get_historical(df.loc[index].Symbol))
        new_df['Country'] = df.loc[index,"Country"]
        master = pd.concat([master, new_df])
    return master

indices = load_indices()
In [8]:
# Display the S&P 500 index which represents the United States stock market
# Note that the top index of each nation will be represented.
# The United States also has the Russell-2000, Dow Jones, VIX, among many others.
indices.head()
Out[8]:
Country Date Open High Low Close Adj Close Volume Range Change AbsoluteChange percentChange AbsolutePercentChange Volatility
1148 United States 2020-01-02 3244.669922 3258.139893 3235.530029 3257.850098 3257.850098 3458250000 22.609864 13.180176 13.180176 0.004062 0.004062 0.006968
1149 United States 2020-01-03 3226.360107 3246.149902 3222.340088 3234.850098 3234.850098 3461290000 23.809814 8.489991 8.489991 0.002631 0.002631 0.007380
1150 United States 2020-01-06 3217.550049 3246.840088 3214.639893 3246.280029 3246.280029 3674070000 32.200195 28.729980 28.729980 0.008929 0.008929 0.010008
1151 United States 2020-01-07 3241.860107 3244.909912 3232.429932 3237.179932 3237.179932 3420380000 12.479980 -4.680175 4.680175 -0.001444 0.001444 0.003850
1152 United States 2020-01-08 3238.590088 3267.070068 3236.669922 3253.050049 3253.050049 3720890000 30.400146 14.459961 14.459961 0.004465 0.004465 0.009387
In [9]:
# Note that the number of trading days varies as a result of national holidays.
indices.Country.value_counts()
Out[9]:
Phillippines      230
France            227
Netherlands       227
Australia         226
United Kingdom    226
Germany           226
United States     224
India             221
South Korea       219
Hong Kong         218
Malaysia          217
Israel            216
Indonesia         215
Japan             213
China             212
Name: Country, dtype: int64

Getting Forex Data

get_forex() is built off the get_historical() function and takes in two currencies to compare. A finance.yahoo.com DataFrame will be returned with the same columns seen for the index data.

In [10]:
"""
functions to set up forex data
Clean data to accurately merge
"""
def get_forex(cur1:str, cur2:str):
    return clean_df(get_historical(cur1+cur2+'=X'))

get_forex('USD','GBP').head() 
Out[10]:
Date Open High Low Close Adj Close Volume Range Change AbsoluteChange percentChange AbsolutePercentChange Volatility
1190 2020-01-02 0.75469 0.76190 0.753700 0.75470 0.75470 0 0.008200 0.00001 0.00001 0.000013 0.000013 0.010865
1191 2020-01-03 0.76081 0.76600 0.759810 0.76030 0.76030 0 0.006190 -0.00051 0.00051 -0.000670 0.000670 0.008136
1192 2020-01-06 0.76445 0.76538 0.759071 0.76452 0.76452 0 0.006309 0.00007 0.00007 0.000092 0.000092 0.008253
1193 2020-01-07 0.75919 0.76357 0.756880 0.75930 0.75930 0 0.006690 0.00011 0.00011 0.000145 0.000145 0.008812
1194 2020-01-08 0.76270 0.76437 0.759330 0.76256 0.76256 0 0.005040 -0.00014 0.00014 -0.000184 0.000184 0.006608

Load, Extract, & Tidy: Coronanet Project

The CoronaNet Project's dataset has 41 different columns, however, we'll only use country, target_country, init_country_level, date_announced, and type. We'll use this dataset to track COVID-19 events throughout 2020 and will eventually merge with Index data. Note that this dataset is mainly meant to provide baselines and dummy variables. Meta data analyses are drawn for observational purposes and understanding of the COVID-19 timeline.

In this section I plot the amount of events per country, I use events that are at the national level. These events have backing from the national government and have a greater magnitude than local or provincial events. The variable init_country_level indicates the level of each event. Furthermore, in the first cell, the United States is listed as United States of America, thus, the name is changed to properly join with the Index data.

In [11]:
corona.columns
Out[11]:
Index(['record_id', 'policy_id', 'entry_type', 'correct_type', 'update_type', 'update_level', 'description', 'date_announced', 'date_start', 'date_end', 'country', 'ISO_A3', 'ISO_A2', 'init_country_level', 'domestic_policy', 'province', 'city', 'type', 'type_sub_cat', 'type_text', 'school_status', 'target_country', 'target_geog_level', 'target_region', 'target_province', 'target_city', 'target_other', 'target_who_what', 'target_direction', 'travel_mechanism', 'compliance', 'enforcer', 'index_high_est', 'index_med_est', 'index_low_est', 'index_country_rank', 'link', 'date_updated', 'recorded_date'], dtype='object')
In [12]:
# Replace `United States of America` with `United States`
corona['target_country'] = corona['target_country'].str.replace('United States of America','United States')
corona['country'] = corona['country'].str.replace('United States of America','United States')
corona.country.unique()
Out[12]:
array(['United States', 'Germany', 'Japan', 'Finland', 'Egypt', 'India',
       'Sweden', 'Ivory Coast', 'Switzerland', 'European Union',
       'Guatemala', 'New Zealand', 'Russia', 'Bolivia', 'Sierra Leone',
       'Argentina', 'Palau', 'Ethiopia', 'Israel', 'Uzbekistan',
       'Australia', 'Honduras', 'North Korea', 'Chile', 'San Marino',
       'Laos', 'Colombia', 'Luxembourg', 'Panama', 'El Salvador',
       'Netherlands', 'Gambia', 'Grenada', 'Spain', 'Chad',
       'Saint Kitts and Nevis', 'Italy', 'Algeria', 'Qatar', 'Brunei',
       'Estonia', 'Saint Vincent and the Grenadines', 'Costa Rica',
       'Brazil', 'Burkina Faso', 'Norway', 'China', 'Namibia',
       'Philippines', 'Tonga', 'Indonesia', 'Turkey', 'Mali', 'Tunisia',
       'Niger', 'Rwanda', 'Canada', 'Cyprus', 'Mongolia', 'Kuwait',
       'Liechtenstein', 'Azerbaijan', 'Myanmar', 'Ireland', 'Bangladesh',
       'Austria', 'Lebanon', 'Guinea Bissau', 'Hungary', 'Tajikistan',
       'Iraq', 'South Africa', 'Papua New Guinea', 'Bahrain', 'Iceland',
       'Botswana', 'Morocco', 'Oman', 'Nigeria', 'Sri Lanka', 'Nicaragua',
       'Seychelles', 'Mexico', 'Taiwan', 'Thailand', 'Haiti',
       'Sao Tome and Principe', 'Guyana', 'Mozambique', 'Uganda',
       'Republic of Congo', 'Solomon Islands', 'Bosnia and Herzegovina',
       'France', 'East Timor', 'Liberia', 'Zimbabwe', 'Dominica',
       'Kazakhstan', 'Nauru', 'United Arab Emirates', 'Portugal',
       'Armenia', 'Sudan', 'Vanuatu', 'Dominican Republic', 'Suriname',
       'Monaco', 'Belarus', 'Kenya', 'Poland', 'Nepal', 'Latvia', 'Samoa',
       'Bhutan', 'Czech Republic', 'Kyrgyzstan', 'United Kingdom',
       'Moldova', 'Malta', 'The Bahamas', 'Northern Cyprus', 'Vietnam',
       'Barbados', 'Singapore', 'Vatican', 'Palestine', 'Albania', 'Fiji',
       'Saudi Arabia', 'Swaziland', 'Cuba', 'Libya', 'Romania', 'Guinea',
       'Greece', 'Madagascar', 'Antigua and Barbuda', 'Peru',
       'Trinidad and Tobago', 'South Sudan', 'Cameroon', 'Ukraine',
       'Venezuela', 'Ghana', 'Yemen', 'Jordan', 'Syria', 'Pakistan',
       'Angola', 'Afghanistan', 'Comoros', 'South Korea', 'Jamaica',
       'Iran', 'Lesotho', 'Montenegro', 'Cape Verde', 'Bulgaria', 'Gabon',
       'Croatia', 'Lithuania', 'Ecuador', 'Benin', 'Malawi', 'Djibouti',
       'Tuvalu', 'Turkmenistan', 'Mauritius', 'Senegal', 'Macedonia',
       'Belgium', 'Belize', 'Equatorial Guinea', 'Georgia', 'Paraguay',
       'Kiribati', 'Central African Republic', 'Uruguay', 'Eritrea',
       'Zambia', 'Democratic Republic of the Congo',
       'United Republic of Tanzania', 'Cambodia', 'Slovakia', 'Maldives',
       'Denmark', 'Saint Lucia', 'Somalia', 'Slovenia',
       'Federated States of Micronesia', 'Andorra', 'Malaysia',
       'Marshall Islands', 'Republic of Serbia', 'Togo', 'Mauritania',
       'Burundi'], dtype=object)
In [13]:
# Views the number of events in each category
corona.init_country_level.value_counts()
# `No, it is at the national level` looks like it is the same as `National`
Out[13]:
National                           14831
Provincial                         14256
Municipal                           1911
No, it is at the national level      210
Other (e.g., county)                   6
Name: init_country_level, dtype: int64
In [14]:
# View the source of the supposed error
corona[corona['init_country_level']=='No, it is at the national level'].country.value_counts()
# Finds that the error is from China!
Out[14]:
China    210
Name: country, dtype: int64
In [15]:
# Replace `No, it is at the national level` values with `National`
corona['init_country_level'] = corona['init_country_level'].str.replace('No, it is at the national level', 'National')
corona.init_country_level.value_counts()
Out[15]:
National                15041
Provincial              14256
Municipal                1911
Other (e.g., county)        6
Name: init_country_level, dtype: int64
In [16]:
# Create a national events count to isolate each countries national events
national_events = corona[corona['init_country_level']=='National'].country.value_counts()
national_events = national_events.to_frame().reset_index()
national_events.columns = ['country', 'events_count']
national_events.head(15).plot.bar(x='country', y='events_count', title='Amount of National COVID-19 Events per Country', figsize=(10, 7))
Out[16]:
<matplotlib.axes._subplots.AxesSubplot at 0x1763c20bf88>
In [17]:
import plotly.graph_objs as go 
from plotly.offline import init_notebook_mode,iplot,plot
init_notebook_mode(connected=True)
In [18]:
# Set up specific data structure for Map
data = dict(
        type = 'choropleth',
        colorscale = 'Viridis',
        locations = national_events['country'],
        locationmode = "country names",
        z = national_events['events_count'],
        text = national_events['country'],
        colorbar = {'title' : 'National Covid Events'},
      )

layout = dict(title = 'National Covid Event Counts by Country',
              geo = dict(projection = {'type':'mercator'})
             )
In [19]:
# Merge National Event counts back into the main `corona` DataFrame
corona = pd.merge(national_events, corona, how='right', on='country')
corona['date_announced'] = pd.to_datetime(corona['date_announced'])
corona.head()
Out[19]:
country events_count record_id policy_id entry_type correct_type update_type update_level description date_announced ... travel_mechanism compliance enforcer index_high_est index_med_est index_low_est index_country_rank link date_updated recorded_date
0 Russia 491 R_escP3JJEVT6V9OVNA 4074912 new_entry original NaN NaN Russian government is allocating about 3.5 bil... 2020-06-23 ... NaN Mandatory with Legal Penalties (Jail Time) National Government,New (COVID-19 specific) Ta... NaN NaN NaN NaN http://government.ru/docs/39949/ 2020-06-29 2020-06-29T19:25:01Z
1 Russia 491 R_1lnv6O7Vp9a0bmABg 9097760 new_entry original NaN NaN On April 12, the Russian Government allocated ... 2020-04-12 ... NaN Mandatory (Unspecified/Implied) National Government,Ministry/Department of Hea... 60.579280 56.532497 53.744462 192.0 http://government.ru/docs/39486/ 2020-07-31 2020-07-31T15:58:23Z
2 Russia 491 R_1lnv6O7Vp9a0bmADu 9097760 new_entry original NaN NaN On April 12, the Russian Government allocated ... 2020-04-12 ... NaN Mandatory (Unspecified/Implied) National Government,Ministry/Department of Hea... 60.579280 56.532497 53.744462 192.0 http://government.ru/docs/39486/ 2020-07-31 2020-07-31T15:58:23Z
3 Russia 491 R_pFd5yq7M25iOUlbDw 116208 new_entry original NaN NaN In Russia, on April 15 a temporary procedure f... 2020-04-15 ... NaN Mandatory (Unspecified/Implied) National Government,Ministry/Department of Health 61.181590 56.526482 53.086042 191.0 http://government.ru/docs/all/127505/ 2020-05-19 2020-05-19T11:00:48Z
4 Russia 491 R_1DPaQBw2ijnsxQtBt 1581062 new_entry original NaN NaN On April 3, to regulate health resources, the ... 2020-04-03 ... NaN Mandatory with Legal Penalties (Jail Time),Man... National Government,Ministry/Department of Health 58.587156 56.661673 55.182878 194.0 http://government.ru/docs/39399/ 2020-06-06 2020-06-06T04:18:10Z

5 rows × 40 columns

In [20]:
# Display Map
choromap = go.Figure(data = [data],layout = layout)
iplot(choromap,validate=False)
In [21]:
# Build Dummy Variables for Event Type
corona = corona.merge(pd.get_dummies(corona['type']), left_index=True, right_index=True)
corona.columns
Out[21]:
Index(['country', 'events_count', 'record_id', 'policy_id', 'entry_type', 'correct_type', 'update_type', 'update_level', 'description', 'date_announced', 'date_start', 'date_end', 'ISO_A3', 'ISO_A2', 'init_country_level', 'domestic_policy', 'province', 'city', 'type', 'type_sub_cat', 'type_text', 'school_status', 'target_country', 'target_geog_level', 'target_region', 'target_province', 'target_city', 'target_other', 'target_who_what', 'target_direction', 'travel_mechanism', 'compliance', 'enforcer', 'index_high_est', 'index_med_est', 'index_low_est', 'index_country_rank', 'link', 'date_updated', 'recorded_date', 'Anti-Disinformation Measures', 'Closure and Regulation of Schools', 'Curfew', 'Declaration of Emergency', 'External Border Restrictions', 'Health Monitoring', 'Health Resources', 'Health Testing', 'Hygiene', 'Internal Border Restrictions', 'Lockdown', 'New Task Force, Bureau or Administrative Configuration', 'Other Policy Not Listed Above', 'Public Awareness Measures',
       'Quarantine', 'Restriction and Regulation of Businesses', 'Restriction and Regulation of Government Services', 'Restrictions of Mass Gatherings', 'Social Distancing'],
      dtype='object')
In [22]:
# Determine the amount of events per day during 2020
# `date_announced` is the variable indicating a new event
# Note that `date_announced` does not start a new event, but rather indicates that an event will happen
dd = corona[corona['date_announced'] <= dt.datetime.today()].date_announced.value_counts().to_frame()
dd.columns = ['Count']
plt.xlabel("Date")
plt.ylabel('Count')
dd.plot(title='Number of National COVID-19 Events Per Day', figsize=(10, 7))
Out[22]:
<matplotlib.axes._subplots.AxesSubplot at 0x17637a73448>

INSIGHT - Correlation between events and VIX

What is the VIX?

The VIX is a measure of Market Volatility in the S&P 500 option spreads. The higher the VIX the more uncertainty in the marketplace.

The chart above high resembles the VIX (Volatility Index-Index), Below I quickly plot and run a regression between the closing price of the VIX (close) and the number of Covid Announcements (date_announced) between January and August 2020.

In [23]:
# Web Scrap VIX historical data and clean the DataFrame
VIX = clean_df(get_historical('^VIX'))
In [24]:
# WebScrap Vix historical data and clean the DataFrame
VIX.set_index('Date')['Close'].plot(title='VIX Closing Price 2020 Levels', figsize=(10,7))
Out[24]:
<matplotlib.axes._subplots.AxesSubplot at 0x17637ae3a88>
In [25]:
# Merge VIX and events data

VX = pd.merge(VIX[['Close', 'Date']], dd.reset_index(), left_on='Date', right_on='index').set_index('index')
VX.sort_values('Close', ascending=False).head()
Out[25]:
Close Date Count
index
2020-03-16 82.690002 2020-03-16 1090
2020-03-18 76.449997 2020-03-18 708
2020-03-17 75.910004 2020-03-17 956
2020-03-12 75.470001 2020-03-12 530
2020-03-19 72.000000 2020-03-19 629
In [26]:
# Set up a Linear Regression using statsmodels.api
X = sm.add_constant(VX[['Count']])
Y = VX['Close']
output = sm.OLS(Y, X).fit()

# Output the statistical summary of the linear regression
print(output.summary())
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                  Close   R-squared:                       0.792
Model:                            OLS   Adj. R-squared:                  0.791
Method:                 Least Squares   F-statistic:                     568.7
Date:                Tue, 24 Nov 2020   Prob (F-statistic):           9.98e-53
Time:                        23:53:50   Log-Likelihood:                -504.09
No. Observations:                 151   AIC:                             1012.
Df Residuals:                     149   BIC:                             1018.
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const         20.7300      0.728     28.477      0.000      19.292      22.168
Count          0.0661      0.003     23.847      0.000       0.061       0.072
==============================================================================
Omnibus:                        2.965   Durbin-Watson:                   0.733
Prob(Omnibus):                  0.227   Jarque-Bera (JB):                2.519
Skew:                           0.228   Prob(JB):                        0.284
Kurtosis:                       3.439   Cond. No.                         342.
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.

Analysis - VIX vs. Covid Events

The correlation between the Closing Price of the VIX and the # of Covid-19 announcements per day show significance. This model, with an R^2 of 0.792 is representative of 79.2% of the sample. This model could be an accurate indicator of the VIX, however, we would like to increase our R^2 if possible. More independent variables such as event magnitude could be added to create a more representative model.

Regarding the Beta Coefficients, both the constant and count of date_announcements are statistically significant. Count has a t-stat of 23.847 which is significant to the 0.01% level.

+- 1.96 is statistically significant at the 95% significance level.

The VIX formula accounts for options spreads and is a measure of market uncertainty. Feel free to learn more about the VIX formula in the link below

https://www.cboe.com/micro/vix/vixwhite.pdf

CoronaNet Data - Mapping

I plot the amount of events of each nation using plotly's graphing capabilities. I generate a world map and use the National event counts we accumulated to view the amount of national events for each nation.

Coronavirus Events Per Nation: Analysis

As we can see above, Russia has the most national events from January 2020 - August 2020. Russia is followed by China, Bangladesh, and Guatemala. China and Russia have communist governments who have centralized control of the government. Guatemala and Bangladesh do have centralized control, however, they are generally poorly run governments. Government uncertainty may also be a reason for the quantity of events. These governments may have enacted, then retracted numerous events.

In [27]:
# Display Index data for the United States of America
indices[indices['Country'] == 'United States'].tail()
Out[27]:
Country Date Open High Low Close Adj Close Volume Range Change AbsoluteChange percentChange AbsolutePercentChange Volatility
1367 United States 2020-11-12 3562.669922 3569.020020 3518.580078 3537.010010 3537.010010 4890120000 50.439942 -25.659912 25.659912 -0.007202 0.007202 0.014158
1368 United States 2020-11-13 3552.570068 3593.659912 3552.570068 3585.149902 3585.149902 4709670000 41.089844 32.579834 32.579834 0.009171 0.009171 0.011566
1369 United States 2020-11-16 3600.159912 3628.510010 3600.159912 3626.909912 3626.909912 5281980000 28.350098 26.750000 26.750000 0.007430 0.007430 0.007875
1370 United States 2020-11-17 3610.310059 3623.110107 3588.679932 3609.530029 3609.530029 4799570000 34.430175 -0.780030 0.780030 -0.000216 0.000216 0.009537
1371 United States 2020-11-18 3612.090088 3619.090088 3567.330078 3567.790039 3567.790039 5274450000 51.760010 -44.300049 44.300049 -0.012264 0.012264 0.014330

Create Index Analysis

In this section I build a matrix() function to calculate the relationships between DataFrames. I use this function to analyze the relationships between different indices and currency pairs. This function takes in a dependant column and takes the sample mean of that given dependant column through a time period. These time periods are specified by start_date and end_date. The means are sampled weekly, but can be changed by period.

In [28]:
# `col` is the dependent column in the matrix
# `period` is the resampling/grouping method
# `values` is the Pandas DataFrame
# `start_date` and `end_date` are date parameters
# `country_indices` are the list of tracked countries

def matrix(col,
           period='W',
           values=indices, 
           start_date=dt.datetime(2020, 2, 15), 
           end_date=dt.datetime(2020, 11, 2),
           country_indices=country_indices['Country'].to_list()):
    
    count = 0
    # Master DataFrame
    df = pd.DataFrame()
    
    for index in country_indices:   
        # Builds dataFrame if it is null
        if count == 0:
            df = values[values['Country'] == index].set_index('Date').resample(period).mean()[col].to_frame().reset_index()
            df.columns = ['Date', index]
            count += 1
            
        else: # Continually Resamples and Appends to master DataFrame
            df2 = values[values['Country'] == index].set_index('Date').resample(period).mean()[col].to_frame().reset_index()
            df2.columns = ['Date', index]
            df = pd.merge(df,
                          df2,
                          how='left',
                          on='Date')
            count += 1
    
    # Returns DataFrame between the selected dates
    df = df[(df['Date'] >= start_date) & (df['Date'] <= end_date)]
    return df.set_index('Date')

# Gets variance of volotility over March and April 2020: the main effect of COVID-19 of markets
variance = matrix(col='Volatility', period='W', start_date=dt.datetime(2020, 3, 1), end_date=dt.datetime(2020, 5, 1)).var().to_frame()
variance.columns = ['Variance'] 

# Plots the Variance in ascending order based on total variance in volatility
variance.sort_values('Variance').plot.bar(title='Variance of Percent Changes', figsize=(12, 6))
Out[28]:
<matplotlib.axes._subplots.AxesSubplot at 0x17637b006c8>

Which country had the highest volatility in their stock market?

India had the highest variance of any stock market in the sample set. India was closely followed by Australia, Indonesia, and the United States. The south eastern countries are expected to have higher volatility because the pandemic hit them first, however, China, the first country infected, had the second lowest volatility. This may be a result from their decisive actions to mitigate the virus, or the world's relative certainty in the chinese economy. India is known to be a volatile developing economy and is seen as such in this analysis.

In [29]:
# Create Matrix
vol = matrix(col='Volatility')
variance = vol.var().to_frame()

fig, ax = plt.subplots(figsize=(20,20))

# Plot the frequency of volatile weeks during March and April

# Plot each countries mean weekly volatility
for country in vol.columns:
    plt.scatter(x=vol.index, y=vol[country], label=country)

plt.legend()
plt.title("Sample Nation Volatility 2020")
plt.ylabel("Volatility")
plt.xlabel("Date")
Out[29]:
Text(0.5, 0, 'Date')

Forex Comparisons

build_forex_matrix is built to create a matrix correlating each of the currency pairs. Below Volatility is compared amongst the currency pairs.

In [30]:
def build_forex_matrix(data=country_indices.to_dict('list'), GET='matrix'):
    
    # Re-Categorizes currencies to not get Euro Dollar data multiple times
    try:
        currency = data['Currency']
        country = data['Country']
        country.remove('Germany')
        country.remove('Netherlands')
        country.remove('France')
        country.append('European Union')
        currency.remove('EUR')
        currency.remove('EUR')
    except: pass

    master = {}
    master_df = pd.DataFrame(columns=['Country',
                                       'Date', 
                                       'Open', 
                                       'High', 
                                       'Low',
                                       'Close', 
                                       'Adj Close', 
                                       'Volume', 
                                       'Range', 
                                       'Change', 
                                       'AbsoluteChange', 
                                       'percentChange', 
                                       'AbsolutePercentChange', 
                                       'Volatility'])
    
    # Iterate through countries and use each Country as a dependant variable 
    for index, c in enumerate(country):
        slave = {}
        main_currency = currency[index]
        currencies_to_use = (currency[0:index] + currency[index+1:-1])
        country_df = pd.DataFrame(columns=['Country',
                                   'Date', 
                                   'Open', 
                                   'High', 
                                   'Low',
                                   'Close', 
                                   'Adj Close', 
                                   'Volume', 
                                   'Range', 
                                   'Change', 
                                   'AbsoluteChange', 
                                   'percentChange', 
                                   'AbsolutePercentChange', 
                                   'Volatility'])
        
        # Iterate through other countries and use as indpendant variables
        for curr in currencies_to_use:
            try:
                # Returns forex data from yahoo.finance.com
                temp_df = get_forex(main_currency, curr)
                temp_df['Country'] = curr
                # Appends to master DataFrame
                country_df = pd.concat([country_df, temp_df])

            except: 
                pass
 
        # Creates a final matrix for Volatility for each Dependant Currency
        final = matrix(col='Volatility',
                       period='W', 
                       values=country_df, 
                       start_date=dt.datetime(2020, 2, 15), 
                       end_date=dt.datetime(2020, 4, 1),
                       country_indices=currencies_to_use).mean()
        country_df['currency_pair'] = main_currency + '/' + country_df['Country']
        master_df = pd.concat([master_df, country_df])
        final[main_currency] = 0
        master[main_currency] = final
    
    # Returns DataFrame of Raw data if selected
    if GET == 'DF':
        return master_df
    
    # Returns Matrix data
    return pd.DataFrame(master)
    

foreign = build_forex_matrix()
In [31]:
foreign = foreign.fillna(0)
foreign = foreign.reindex(sorted(foreign.columns), axis=1)
foreign * 100 
Out[31]:
AUD CNY EUR GBP HKD IDR ILS INR JPY KRW MYR PHP USD
AUD 0.000000 1.765105 1.617726 1.402286 1.778191 25.885671 1.966947 1.936221 2.030862 1.812342 1.733630 0.000000 1.802198
CNY 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
EUR 1.614610 1.108511 0.000000 1.222037 0.975067 25.630153 0.000000 1.237742 1.102839 1.465417 1.187733 0.000000 0.978058
GBP 1.413005 1.192424 1.221819 0.000000 1.270415 0.526125 0.000000 1.936310 1.457494 0.475725 0.503219 1.346891 1.290847
HKD 1.775846 0.625612 0.971492 1.273583 0.000000 26.066931 0.000000 0.792299 1.174393 1.216914 1.170800 0.000000 0.092875
IDR 58.880200 0.000000 3.722352 4.621472 3.432996 0.000000 0.000000 0.000000 3.550658 480.011783 25.883569 0.000000 2.696909
ILS 1.947763 0.000000 4.193446 1.569960 0.000000 0.000000 0.000000 0.000000 1.999840 0.000000 0.000000 0.000000 1.435784
INR 2.095171 0.000000 1.456146 1.863981 0.958101 2.640660 0.000000 0.000000 1.467508 1.827610 1.364379 0.000000 1.469060
JPY 2.060336 1.412815 1.083625 1.455782 1.173537 15.058261 2.014596 1.471059 0.000000 1.768616 1.358183 1.336382 1.171618
KRW 1.708108 0.000000 1.436220 1.600700 1.241811 2.206590 0.000000 1.913017 1.819804 0.000000 1.395260 0.000000 1.378621
MYR 1.763106 0.000000 0.533141 1.388302 0.678967 2.246060 0.000000 1.542217 1.428620 1.330424 0.000000 0.000000 0.646466
PHP 0.000000 0.000000 1.125521 1.353483 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 1.230734
USD 1.790510 0.530555 0.976929 1.291675 0.092931 2.533675 1.438555 1.471069 1.171497 1.381792 0.648162 1.242660 0.000000

Correction - IDR/KRW

Note that the value of IDR, KRW is an outlier skewing our results. To correct this error, I reset the value to 0. This value allows other values to correctly

In [32]:
foreign.loc['IDR', 'KRW'] = 0
In [33]:
import seaborn as sns
fig, ax = plt.subplots(figsize=(20,20))
colormap = sns.diverging_palette(220, 10, as_cmap=True)
sns.heatmap(foreign, cmap=colormap, annot=True, fmt = ".2f")
plt.xticks(range(len(foreign.columns)), foreign.columns)
plt.yticks(range(len(foreign.columns)), foreign.columns)
plt.title("Currency Pair Volatility Correlations")
plt.show()
In [34]:
forex = build_forex_matrix(GET='DF')
In [35]:
forex_df = forex[(forex['Date'] >= dt.datetime(2020, 2, 15)) & (forex['Date'] <= dt.datetime(2020, 4, 4)) & (forex['currency_pair'] != 'IDR/JPY')]
forex_df = forex_df.groupby(['currency_pair']).mean().sort_values('AbsolutePercentChange', ascending=False).dropna()
forex_df.head(10)['AbsolutePercentChange'].plot.bar(title='Average Daily Percent Changes During Initial Outbreaks')
Out[35]:
<matplotlib.axes._subplots.AxesSubplot at 0x1763dd78448>
In [36]:
def winners_losers(country_indices=forex, target='Close', p_value=1.96):
    currencies = forex.Country.unique()
    master = {}
    scores = {}
    for main_currency in currencies:
        master[main_currency] = {'Score':0, 'Wins':list(), "Loses":list(), 'Ties':list()}
        scores[main_currency] = 0
        for curr in currencies:
            if curr != main_currency:
                try:
                    k = get_forex(main_currency, curr)
                    k = k[(k['Date'] < dt.datetime(2020, 5, 1)) &(k["Date"] > dt.datetime(2020, 2, 10))]
                    y=np.array(k[target].dropna().values, dtype=float)
                    x=np.array(pd.to_datetime(k['Date'].dropna()).index.values, dtype=float)
                    slope, intercept, r_value, p_value, std_err = sp.linregress(x,y)
                    t_stat = slope / std_err
                    if t_stat > p_value:
                        master[main_currency]['Score'] += 1
                        scores[main_currency] += 1
                        master[main_currency]['Wins'].append(curr)
                    elif math.sqrt(t_stat * t_stat) <= p_value:
                        master[main_currency]['Ties'].append(curr)
                    else:
                        scores[main_currency] -= 1
                        master[main_currency]['Score'] -= 1
                        master[main_currency]['Loses'].append(curr)                        
                        
                except:
                    pass
                    
    return master, scores

data, scores = winners_losers()
C:\Users\Matthew Raw\Anaconda3\lib\site-packages\ipykernel_launcher.py:16: RuntimeWarning:

invalid value encountered in double_scalars

In [37]:
pd.Series(scores).sort_values().plot.bar(figsize=(16, 6))
plt.title("Statistical Significance Scores of Currencies")
plt.ylabel("Score")
Out[37]:
Text(0, 0.5, 'Score')

Analysis: Forex

From the chart above we can see that the Indian Rupee was the most volatile currency during the heart of the COVID-19 surge. This could be a result of India's developing economy and their large, vulnerable population. India is also one of the most heavily traded stock markets in Asia. The sheer growth of India's democratic economy has few to no restrictions on investment which has triggered investment. However, India in relation to China, South Korea, and Japan, did not stem the spread of COVID-19. COVID-19 still runs rampant in India, but other countries are now reopened.

In general, the most stable currencies become reserve currencies. Reserve currencies are stable currencies where nations will keep their money. For instance, unstable South American countries store their wealth in the United States; or former British colonies (like the US in 1783) would keep stores of the British pound or gold to store their wealth. In troubling times, reserve currencies tend to appreciate in value. In this scenario, the Japanese Yen appreciated the most, followed by the United States Dollar, Hong Kong Dollar, and Euro. Japan was the most successful in combatting COVID-19 compared to the United States who is facing a massive outbreak and more shutdowns. Furthermore, we can see that developing Asian countries are depreciating the most compared to their developed counterparts. What is interesting is that the British Pound had net depreciation. This may be because their leader Boris Johnson ignored COVID-19 warnings, or a sign that the financial dominance of a Britain has deteriorated after Brexit.

COVID-19 Event Analysis: Regressions

In this section I analyze the effect different COVID-19 events have on their respective stock market indices. For each of the events listed below, I create dummy variables and then combine the results into distinct groups. I will then run regression analyses to look for correlation amongst the variables. I hypothesize that days with more events will have have a higher volatility and days with high normally high counts will have a relief in volatility.

Regressions:
  1. Dummy Variables on
In [38]:
corona.type.unique()
Out[38]:
array(['Health Testing', 'Health Resources',
       'Restriction and Regulation of Government Services',
       'Restriction and Regulation of Businesses',
       'Restrictions of Mass Gatherings',
       'Closure and Regulation of Schools',
       'External Border Restrictions', 'Lockdown',
       'Declaration of Emergency', 'Quarantine', 'Hygiene',
       'Internal Border Restrictions', 'Social Distancing',
       'Other Policy Not Listed Above', 'Public Awareness Measures',
       'New Task Force, Bureau or Administrative Configuration',
       'Health Monitoring', 'Curfew', 'Anti-Disinformation Measures'],
      dtype=object)
In [39]:
# This function creates dummy variables for each type of event announcement
def build_dummies(df:pd.DataFrame=corona, countries:list=country_indices['Country'].tolist()):
    df = df[df['init_country_level'] == 'National']
    dates = df['date_announced'].unique()
    df = df.groupby(['country', 'date_announced'])[df['type'].unique()].sum()
    df['next_day'] = 0
    return df
In [40]:
master = build_dummies()
master.head()
Out[40]:
Health Testing Health Resources Restriction and Regulation of Government Services Restriction and Regulation of Businesses Restrictions of Mass Gatherings Closure and Regulation of Schools External Border Restrictions Other Policy Not Listed Above Internal Border Restrictions Public Awareness Measures Declaration of Emergency Health Monitoring Quarantine Social Distancing New Task Force, Bureau or Administrative Configuration Lockdown Anti-Disinformation Measures Hygiene Curfew next_day
country date_announced
Afghanistan 2020-01-30 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
2020-02-12 1 0 0 0 0 0 0 1 0 1 0 0 0 0 1 0 0 0 0 0
2020-02-23 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0
2020-02-27 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
2020-03-01 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0

Formatting / Tidying

The dummy data is left joined on indices to merge each trading day of each country on their COVID-19 events. Nan events are set to 0. The reasoning for this is to set up a dummy variable regression on Volatility. This regression aims to see how each event affects consumer sentiment.

In [41]:
m = pd.merge(indices[['Country', 'Date']], master, left_on=['Country','Date'], right_on=['country', 'date_announced'], how='left')
m = m.fillna(0)
m.head()
Out[41]:
Country Date Health Testing Health Resources Restriction and Regulation of Government Services Restriction and Regulation of Businesses Restrictions of Mass Gatherings Closure and Regulation of Schools External Border Restrictions Other Policy Not Listed Above ... Declaration of Emergency Health Monitoring Quarantine Social Distancing New Task Force, Bureau or Administrative Configuration Lockdown Anti-Disinformation Measures Hygiene Curfew next_day
0 United States 2020-01-02 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
1 United States 2020-01-03 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
2 United States 2020-01-06 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
3 United States 2020-01-07 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
4 United States 2020-01-08 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0

5 rows × 22 columns

In [42]:
# Head of indices to show the stock market data to be merged.
indices.head()
Out[42]:
Country Date Open High Low Close Adj Close Volume Range Change AbsoluteChange percentChange AbsolutePercentChange Volatility
1148 United States 2020-01-02 3244.669922 3258.139893 3235.530029 3257.850098 3257.850098 3458250000 22.609864 13.180176 13.180176 0.004062 0.004062 0.006968
1149 United States 2020-01-03 3226.360107 3246.149902 3222.340088 3234.850098 3234.850098 3461290000 23.809814 8.489991 8.489991 0.002631 0.002631 0.007380
1150 United States 2020-01-06 3217.550049 3246.840088 3214.639893 3246.280029 3246.280029 3674070000 32.200195 28.729980 28.729980 0.008929 0.008929 0.010008
1151 United States 2020-01-07 3241.860107 3244.909912 3232.429932 3237.179932 3237.179932 3420380000 12.479980 -4.680175 4.680175 -0.001444 0.001444 0.003850
1152 United States 2020-01-08 3238.590088 3267.070068 3236.669922 3253.050049 3253.050049 3720890000 30.400146 14.459961 14.459961 0.004465 0.004465 0.009387
Creating next_day Column

This dummy variable allows us to analyze the day after change due to an event. We can use this data to run a difference in differences regression model. We will run a pooled regression on Date and Country. The aim is to see if individual nations' dummy variables can be combined to provide insight into volatile markets.

We will merge the indices DataFrame with dummy on Country and Date to properly match our data. Then we will combine similar dummy variables into groups (Health, Restriction, Lockdown, & Information).

In [43]:
# Length of the merged index
count = len(m.index)

# Creates temporary DataFrame
dummy = m

# Groups index data in temporary dataFrame
index_data = indices.groupby(['Country', 'Date']).mean()

for ind in m.index:
    # Appends next day
    get = dummy.loc[ind].tolist()
    get[1] = dummy.loc[ind+1].tolist()[1]
    # Sets day after variable to next market day
    get[-1] = 1
    dummy.loc[count] = get
    count += 1

# Creates table of dummy variables
dummy.sort_values('Date').head()
Out[43]:
Country Date Health Testing Health Resources Restriction and Regulation of Government Services Restriction and Regulation of Businesses Restrictions of Mass Gatherings Closure and Regulation of Schools External Border Restrictions Other Policy Not Listed Above ... Declaration of Emergency Health Monitoring Quarantine Social Distancing New Task Force, Bureau or Administrative Configuration Lockdown Anti-Disinformation Measures Hygiene Curfew next_day
0 United States 2020-01-02 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
3984 Hong Kong 2020-01-02 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0
3766 Australia 2020-01-02 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0
3105 China 2020-01-02 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
2889 Israel 2020-01-02 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0

5 rows × 22 columns

In [44]:
# Merge Coronavirus dummy data on market days for respective indices
# Create four broad columns representing the categories for each event type
dummy['Health'] = dummy['Health Testing'] + dummy['Health Resources'] + dummy['Health Monitoring']
dummy['Restrictions'] = (dummy['Restriction and Regulation of Government Services'] + dummy['Restriction and Regulation of Businesses'] +
                        dummy['Restrictions of Mass Gatherings'] + dummy['External Border Restrictions'] 
                        + dummy['Internal Border Restrictions'] + dummy['Social Distancing'])
dummy['Lockdown'] = (dummy['Declaration of Emergency'] + dummy['Lockdown'] + dummy['Curfew'])
dummy['Information'] = (dummy['Other Policy Not Listed Above'] + dummy['Public Awareness Measures'] + 
                        dummy['New Task Force, Bureau or Administrative Configuration'] + 
                        dummy['Anti-Disinformation Measures'] + dummy['Hygiene'])

dummies_to_delete = ['Health Testing', 'Health Resources', 'Restriction and Regulation of Government Services', 
           'Restriction and Regulation of Businesses', 'Restrictions of Mass Gatherings', 
           'Closure and Regulation of Schools', 'External Border Restrictions', 'Other Policy Not Listed Above', 
           'Internal Border Restrictions', 'Public Awareness Measures', 'Declaration of Emergency', 
           'Health Monitoring', 'Quarantine', 'Social Distancing', 'New Task Force, Bureau or Administrative Configuration',
            'Anti-Disinformation Measures', 'Hygiene', 'Curfew']

for dumb in dummies_to_delete:
    del dummy[dumb]
    
dummy.head()
Out[44]:
Country Date Lockdown next_day Health Restrictions Information
0 United States 2020-01-02 0.0 0.0 0.0 0.0 0.0
1 United States 2020-01-03 0.0 0.0 0.0 0.0 0.0
2 United States 2020-01-06 0.0 0.0 0.0 0.0 0.0
3 United States 2020-01-07 0.0 0.0 0.0 0.0 0.0
4 United States 2020-01-08 0.0 0.0 0.0 0.0 0.0
In [45]:
# Merge Dummy data with indices
master = pd.merge(dummy, indices, on=['Date', 'Country'], how='left').sort_values(['Country', 'Date'])

# Drop null results
master = master.dropna()

# Event_Count is the amount of dummies on a given day
master['Event_Count'] = (master['Lockdown'] + master['Health'] + master['Restrictions'] + master['Information'])

# *-1 is used to properly sort the data, was used by trial.
master['next_day'] = master['next_day'] * -1

# Sort values by Country, then Date, and lastly next_day
master = master.sort_values(['Country', 'Date', 'next_day'], ascending=True)
master['next_day'] = master['next_day'] * -1

# Saved Dataset for other used
master.to_excel("MatthewRau_IndexData.xlsx", index=False)
master.tail()
Out[45]:
Country Date Lockdown next_day Health Restrictions Information Open High Low Close Adj Close Volume Range Change AbsoluteChange percentChange AbsolutePercentChange Volatility Event_Count
221 United States 2020-11-16 0.0 0.0 0.0 0.0 0.0 3600.159912 3628.510010 3600.159912 3626.909912 3626.909912 5281980000 28.350098 26.750000 26.750000 0.007430 0.007430 0.007875 0.0
3538 United States 2020-11-17 0.0 1.0 0.0 0.0 0.0 3610.310059 3623.110107 3588.679932 3609.530029 3609.530029 4799570000 34.430175 -0.780030 0.780030 -0.000216 0.000216 0.009537 0.0
222 United States 2020-11-17 0.0 0.0 0.0 0.0 0.0 3610.310059 3623.110107 3588.679932 3609.530029 3609.530029 4799570000 34.430175 -0.780030 0.780030 -0.000216 0.000216 0.009537 0.0
3539 United States 2020-11-18 0.0 1.0 0.0 0.0 0.0 3612.090088 3619.090088 3567.330078 3567.790039 3567.790039 5274450000 51.760010 -44.300049 44.300049 -0.012264 0.012264 0.014330 0.0
223 United States 2020-11-18 0.0 0.0 0.0 0.0 0.0 3612.090088 3619.090088 3567.330078 3567.790039 3567.790039 5274450000 51.760010 -44.300049 44.300049 -0.012264 0.012264 0.014330 0.0
In [46]:
# Full Regression of all countries on volatility
temp_df = master
independents = ['Lockdown', 'Health', 'Restrictions', 'Information']
X = sm.add_constant(temp_df[independents])
Y = temp_df['Volatility']
output = sm.OLS(Y, X).fit()
output.summary()
Out[46]:
OLS Regression Results
Dep. Variable: Volatility R-squared: 0.063
Model: OLS Adj. R-squared: 0.062
Method: Least Squares F-statistic: 108.7
Date: Tue, 24 Nov 2020 Prob (F-statistic): 7.50e-90
Time: 23:56:04 Log-Likelihood: 18535.
No. Observations: 6517 AIC: -3.706e+04
Df Residuals: 6512 BIC: -3.703e+04
Df Model: 4
Covariance Type: nonrobust
coef std err t P>|t| [0.025 0.975]
const 0.0160 0.000 88.060 0.000 0.016 0.016
Lockdown 0.0111 0.002 7.214 0.000 0.008 0.014
Health 0.0037 0.000 10.256 0.000 0.003 0.004
Restrictions 0.0021 0.000 10.233 0.000 0.002 0.003
Information 0.0039 0.001 6.508 0.000 0.003 0.005
Omnibus: 4508.187 Durbin-Watson: 0.429
Prob(Omnibus): 0.000 Jarque-Bera (JB): 96970.034
Skew: 3.040 Prob(JB): 0.00
Kurtosis: 20.892 Cond. No. 9.27


Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
In [47]:
# Create differences in Volalatility and Absolute Change in Volatility
# Is necessary for first differences model
master['Volatility_Change'] = master['Volatility'].diff(1)
master['Absolute_Volatility_Change'] = master['Volatility_Change'].abs()
master.head()
Out[47]:
Country Date Lockdown next_day Health Restrictions Information Open High Low ... Volume Range Change AbsoluteChange percentChange AbsolutePercentChange Volatility Event_Count Volatility_Change Absolute_Volatility_Change
3766 Australia 2020-01-02 0.0 1.0 0.0 0.0 0.0 78.139999 79.419998 78.059998 ... 207926 1.360000 0.949997 0.949997 0.012158 0.012158 0.017405 0.0 NaN NaN
224 Australia 2020-01-02 0.0 0.0 0.0 0.0 0.0 78.139999 79.419998 78.059998 ... 207926 1.360000 0.949997 0.949997 0.012158 0.012158 0.017405 0.0 0.000000 0.000000
3541 Australia 2020-01-03 0.0 1.0 0.0 0.0 0.0 79.470001 80.860001 79.349998 ... 174908 1.510003 0.309998 0.309998 0.003901 0.003901 0.019001 0.0 0.001596 0.001596
225 Australia 2020-01-03 0.0 0.0 0.0 0.0 0.0 79.470001 80.860001 79.349998 ... 174908 1.510003 0.309998 0.309998 0.003901 0.003901 0.019001 0.0 0.000000 0.000000
3542 Australia 2020-01-06 0.0 1.0 0.0 0.0 0.0 79.000000 79.959999 78.639999 ... 166217 1.320000 0.889999 0.889999 0.011266 0.011266 0.016709 0.0 -0.002292 0.002292

5 rows × 22 columns

In [48]:
# Creates immediate effect DataFrame
# Only takes next_day == 1 (Day After Events)
master_next = master.loc[master['next_day'] == 1.0].dropna()
master_next.head()
Out[48]:
Country Date Lockdown next_day Health Restrictions Information Open High Low ... Volume Range Change AbsoluteChange percentChange AbsolutePercentChange Volatility Event_Count Volatility_Change Absolute_Volatility_Change
3541 Australia 2020-01-03 0.0 1.0 0.0 0.0 0.0 79.470001 80.860001 79.349998 ... 174908 1.510003 0.309998 0.309998 0.003901 0.003901 0.019001 0.0 0.001596 0.001596
3542 Australia 2020-01-06 0.0 1.0 0.0 0.0 0.0 79.000000 79.959999 78.639999 ... 166217 1.320000 0.889999 0.889999 0.011266 0.011266 0.016709 0.0 -0.002292 0.002292
3543 Australia 2020-01-07 0.0 1.0 0.0 0.0 0.0 79.830002 81.190002 79.750000 ... 208398 1.440002 1.229996 1.229996 0.015408 0.015408 0.018038 0.0 0.001329 0.001329
3544 Australia 2020-01-08 0.0 1.0 0.0 0.0 0.0 80.680000 81.080002 79.360001 ... 291714 1.720001 -0.080002 0.080002 -0.000992 0.000992 0.021319 0.0 0.003280 0.003280
3545 Australia 2020-01-09 0.0 1.0 0.0 0.0 0.0 81.339996 82.099998 80.610001 ... 337113 1.489997 -0.289993 0.289993 -0.003565 0.003565 0.018318 0.0 -0.003001 0.003001

5 rows × 22 columns

In [49]:
# Makes each dummy variable binary instead of numerical
# Normally dummy variables are binary but I combined numerous dummies for certain days.
binary_master = master
binary_master['Lockdown'] = (binary_master['Lockdown'] > 0)
binary_master['Health'] = (binary_master['Health'] > 0)
binary_master['Restrictions'] = (binary_master['Restrictions'] > 0)
binary_master['Information'] = (binary_master['Information'] > 0)
binary_master.tail()
Out[49]:
Country Date Lockdown next_day Health Restrictions Information Open High Low ... Volume Range Change AbsoluteChange percentChange AbsolutePercentChange Volatility Event_Count Volatility_Change Absolute_Volatility_Change
221 United States 2020-11-16 False 0.0 False False False 3600.159912 3628.510010 3600.159912 ... 5281980000 28.350098 26.750000 26.750000 0.007430 0.007430 0.007875 0.0 0.000000 0.000000
3538 United States 2020-11-17 False 1.0 False False False 3610.310059 3623.110107 3588.679932 ... 4799570000 34.430175 -0.780030 0.780030 -0.000216 0.000216 0.009537 0.0 0.001662 0.001662
222 United States 2020-11-17 False 0.0 False False False 3610.310059 3623.110107 3588.679932 ... 4799570000 34.430175 -0.780030 0.780030 -0.000216 0.000216 0.009537 0.0 0.000000 0.000000
3539 United States 2020-11-18 False 1.0 False False False 3612.090088 3619.090088 3567.330078 ... 5274450000 51.760010 -44.300049 44.300049 -0.012264 0.012264 0.014330 0.0 0.004793 0.004793
223 United States 2020-11-18 False 0.0 False False False 3612.090088 3619.090088 3567.330078 ... 5274450000 51.760010 -44.300049 44.300049 -0.012264 0.012264 0.014330 0.0 0.000000 0.000000

5 rows × 22 columns

Regression: Pooled First Differences

A pooled first differences regression is a regression on panel data and changes over time. I take the differences between volatility for the day-after an event happens. I capture the immediate effect and group the data by its country index. This model is slit to provide clarity among groups while looking at the overall effect.

In [50]:
# Pooled OLS package for first differences model.
from linearmodels.panel import PooledOLS

# Create temporary DataFrame
temp_df = binary_master

# Only use February 15 - May 1. Aims to capture COVID-19 surge.
temp_df = temp_df[(temp_df['Date'] > dt.datetime(2020, 2, 15)) & (temp_df['Date'] < dt.datetime(2020, 5, 1))]

# Categorical Variable for Country
co = pd.Categorical(temp_df.Country)

# Set Double index for first differnces
temp_df = temp_df.set_index(['Country', 'Date'])
temp_df['Country'] = co

# Signify exogenous variables
exog_vars = ['Lockdown', 'Health', 'Restrictions', 'Information']
exog = sm.add_constant(temp_df[exog_vars])

# Regress and Fit model
mod = PooledOLS(temp_df.Volatility_Change, exog)
pooled_res = mod.fit()
pooled_res
Out[50]:
PooledOLS Estimation Summary
Dep. Variable: Volatility_Change R-squared: 0.0012
Estimator: PooledOLS R-squared (Between): -3.4681
No. Observations: 1516 R-squared (Within): 0.0015
Date: Tue, Nov 24 2020 R-squared (Overall): 0.0012
Time: 23:56:05 Log-likelihood 4318.7
Cov. Estimator: Unadjusted
F-statistic: 0.4683
Entities: 15 P-value 0.7591
Avg Obs: 101.07 Distribution: F(4,1511)
Min Obs: 72.000
Max Obs: 108.00 F-statistic (robust): 0.4683
P-value 0.7591
Time periods: 54 Distribution: F(4,1511)
Avg Obs: 28.074
Min Obs: 8.0000
Max Obs: 30.000
Parameter Estimates
Parameter Std. Err. T-stat P-value Lower CI Upper CI
const 0.0003 0.0004 0.7986 0.4247 -0.0005 0.0012
Lockdown -0.0016 0.0024 -0.6429 0.5204 -0.0063 0.0032
Health -0.0011 0.0011 -0.9533 0.3406 -0.0033 0.0011
Restrictions -0.0001 0.0010 -0.1063 0.9154 -0.0020 0.0018
Information -0.0005 0.0012 -0.3921 0.6951 -0.0027 0.0018

id: 0x17642922508
In [51]:
# Built to analyze regressions on certain dependant and independant variables for ALL nations.
def highest_corr(dependant='Volatility', df:pd.DataFrame=master, independants=['Lockdown', 'Health', 'Restrictions', 'Information'] ):
    data = {}
    for country in master.Country.unique():
        temp_df = df.loc[df['Country'] == country].dropna()
        X = temp_df[independants]
        y = temp_df[dependant]
        #Fit the linear regression
        regr = LinearRegression()
        regr.fit(X, y)
        data[country] = (regr.score(X, y))
    return pd.Series(data)
In [52]:
# Bar Chart to show overall correlation of Dummy Variables on Volatility
fig1, ax = plt.subplots(1,3, figsize=(20,5))
final = pd.DataFrame()
final['Volatility'] = highest_corr().sort_values().plot.bar(ax=ax[0], color='blue', title="R2 of Dummy Variables Regressed on Volatility")
final['Volatility_Change'] = highest_corr(dependant='Volatility_Change').sort_values().plot.bar(ax=ax[1], color='purple', title="R2 of Dummy Variables Regressed on Volatility Change")
final['Absolute_Volatility_Change'] = highest_corr(dependant="Absolute_Volatility_Change").sort_values().plot.bar(ax=ax[2], color='green', title="R2 of Dummy Variables Regressed on Absolute Volatility Change")
In [53]:
# Bar Chart to show overall correlation of Dummy Variables on Volatility
fig1, ax = plt.subplots(1,3, figsize=(20,5))
final = pd.DataFrame()
final['Volatility'] = highest_corr(independants=['Event_Count']).sort_values().plot.bar(ax=ax[0], color='blue', title="R2 of Events/Day Regressed on Volatility")
final['Volatility_Change'] = highest_corr(independants=['Event_Count'], dependant='Volatility_Change').sort_values().plot.bar(ax=ax[1], color='purple', title="R2 of Events/Day Regressed on Volatility Change")
final['Absolute_Volatility_Change'] = highest_corr(independants=['Event_Count'], dependant="Absolute_Volatility_Change").sort_values().plot.bar(ax=ax[2], color='green', title="R2 of Events/Day Regressed on Absolute Volatility Change")
In [54]:
# Create Simple regression to calculate which dummy variables has the largest effect on Volatility
last_df = pd.DataFrame()
for var in ['Lockdown', 'Health', 'Restrictions', 'Information']:
    last_df[var] = highest_corr(independants=[var])

last_df.sort_values('Restrictions').plot.bar(figsize=(16, 9))
Out[54]:
<matplotlib.axes._subplots.AxesSubplot at 0x176420e3408>

Discussion:

Overall there was little to no effect each coronvirus event on the overall volatility. Consumers were unfazed by constant COVID-19 updates. Their level of uncertainty did not change much, however, some nations had higher correlation than others. For instance, the United States had higher correlation among its dummy variables to volatility than other nations. Malaysia and Japan had the highest R-squared for Volatility Changes and Absolute Volatility Changes respectively, however, the R-squared was nowhere near significant. The United States had an R-squared around 0.25 meaning only 25% of the regression model was explained. The United States, however, did relate the most to the dummy variables. COVID-19 events shaped part of day-after consumer sentiment about markets, however, there is significant omitted variable bias that needs to be explained.

The pooled first differences regression model and cumulative multiple linear regression model were not explanatory of market volatility in any form. The pooled regression model had an extremely low R-squared and the independant variables did not show any significant T-scores. On the other hand, the full regression including all countries and the four dummy variables yielded significant T-scores for all dummy variables. The T-scores were significant past the 1% level, showing that each Information, Restrictions, Lockdown and Health should be included in a full regression model.

Each dummy variable by itself, showed that Restrictions were the most dependant dummy variable for singular linear regressions on Volatility. This means that Restrictions probably weighs more on Volatility than other dummy variables.

Conclusion

People put their wealth in safe places in times of uncertainty and in the foreign exchange market it is shown that Japan had been the safest place. Japan did not have a robust financial system until the 1960's and has grown ever since. Japan's ability to control the pandemic better than its developed partners proved fruitful in an appreciation of the Yen. Developing nations like India and Indonesia seen as major world economies were not as fortunate. They received net depreciation and may possibly mean revert in the future. The Indian Rupee and Indonesian Rupiah may be a solid investment for the prospects of a near-future mean reversion.

Although COVID-19 event data did not fully capture volatility movement in major stock market indices, it did show that some event types would a play significant effect on a more robust model. Data that should be included for next time would be COVID-19 case rates and death rates. These rates are the baseline for lockdowns and may be better indicators of different events and volatility itself.. These events in effect have some effect on markets directly and COVID-19 case rates could cause uncertainty in markets before restrictions are put in place. This pandemic has confused market participants and everyone alike, and no one model will capture the overall effect this pandemic has had on humanity.

Papers you might enjoy