Plotting raw data

This notebook focuses on plotting raw measurements (VOD calculation and plotting is covered in the next one)

import gnssvod as gv
import pandas as pd
import xarray as xr
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.collections import PatchCollection
import matplotlib.dates as mdates

The first step is to load all files of the merged data that were saved as a NetCDF.

We can read them as a series of xarray datasets.

ds = xr.open_mfdataset('data_RINEX2.11/Dav_paired/*.nc',combine='nested',concat_dim='Epoch')
ds

<xarray.Dataset> Size: 9MB
Dimensions:    (Station: 2, Epoch: 1441, SV: 77)
Coordinates:
  * Station    (Station) <U9 72B 'Dav1_Grnd' 'Dav2_Twr'
  * SV         (SV) <U3 924B 'C06' 'C07' 'C09' 'C10' ... 'R23' 'S23' 'S27' 'S36'
  * Epoch      (Epoch) datetime64[ns] 12kB 2021-04-28T21:07:00 ... 2021-04-29...
Data variables:
    S1         (Station, Epoch, SV) float64 2MB dask.array<chunksize=(2, 692, 55), meta=np.ndarray>
    S2         (Station, Epoch, SV) float64 2MB dask.array<chunksize=(2, 692, 55), meta=np.ndarray>
    S7         (Station, Epoch, SV) float64 2MB dask.array<chunksize=(2, 692, 55), meta=np.ndarray>
    Azimuth    (Station, Epoch, SV) float64 2MB dask.array<chunksize=(2, 692, 55), meta=np.ndarray>
    Elevation  (Station, Epoch, SV) float64 2MB dask.array<chunksize=(2, 692, 55), meta=np.ndarray>

Here we will convert the xarray.DataSet to a pandas.DataFrame for further processing

df = ds.to_dataframe().dropna(how='all').reorder_levels(["Station","Epoch","SV"]).sort_index()
df

			S1	S2	S7	Azimuth	Elevation
Station	Epoch	SV
Dav1_Grnd	2021-04-28 21:07:00	C06	NaN	NaN	25.0	36.6	10.1
		C09	35.0	35.0	30.0	49.0	32.7
		C11	33.0	33.0	30.5	177.2	35.1
		C14	36.3	36.3	35.2	-96.4	76.8
		C16	30.7	30.7	30.8	38.3	15.2
...	...	...	...	...	...	...	...
Dav2_Twr	2021-04-29 03:07:00	R16	46.4	35.7	NaN	-173.5	68.9
		R23	43.0	NaN	NaN	NaN	NaN
		S23	45.0	NaN	NaN	NaN	NaN
		S27	29.5	NaN	NaN	NaN	NaN
		S36	44.0	NaN	NaN	NaN	NaN

89349 rows × 5 columns

Hemispheric plot of raw measurements

Subsetting the data

For plotting, it will be useful to know how to easily subset data. As the dataframe contains a MultiIndex, we use the .xs function to subset it. For instance, to get all data from the Galileo satellite #3 at the ground station:

df.xs('E03',level='SV').xs('Dav1_Grnd',level='Station')

	S1	S2	S7	Azimuth	Elevation
Epoch
2021-04-28 21:07:00	NaN	NaN	26.5	-141.2	25.9
2021-04-28 21:07:15	NaN	NaN	27.4	-141.2	26.0
2021-04-28 21:07:30	NaN	NaN	28.6	-141.1	26.1
2021-04-28 21:07:45	25.4	NaN	30.2	-141.1	26.2
2021-04-28 21:08:00	27.0	NaN	31.0	-141.0	26.3
...	...	...	...	...	...
2021-04-29 03:06:00	39.0	NaN	24.5	110.9	36.3
2021-04-29 03:06:15	39.0	NaN	20.9	111.0	36.3
2021-04-29 03:06:30	39.0	NaN	22.2	111.0	36.2
2021-04-29 03:06:45	38.0	NaN	25.2	111.1	36.1
2021-04-29 03:07:00	37.1	NaN	27.6	111.2	36.0

1441 rows × 5 columns

Plotting observation points in polar coordinates

Single satellite, single site

mySV = 'E03'
mystation_name = 'Dav1_Grnd'
# initialize figure with polar axes
fig, ax = plt.subplots(figsize=(10,10),subplot_kw=dict(projection='polar'))
# subset the dataset
subdf = df.xs(mySV,level='SV').xs(mystation_name,level='Station')

# polar plots need a radius and theta direction in radians
radius = 90-subdf.Elevation
theta = np.deg2rad(subdf.Azimuth)

# plot each measurement and color by signal to noise ratio
hs = ax.scatter(theta,radius,c=subdf.S1)
ax.set_rlim([0,90])
ax.set_theta_zero_location("N")
plt.colorbar(hs, shrink=.5, label='SNR (L1)')
plt.title(mystation_name)

Text(0.5, 1.0, 'Dav1_Grnd')

../_images/55dc6268198b3009249a36bf60c04043399bf3eb412cb012a2f792450939e690.png

Single satellite, all sites

mySV = 'E03'
# get all sites as list
station_names = df.index.get_level_values('Station').unique()
# ensure we use the same color limits in all plots
clim = [15,47]
# initialize figure with polar axes
fig, ax = plt.subplots(1,len(station_names),figsize=(10,10),subplot_kw=dict(projection='polar'))
for i, iname in enumerate(station_names):
    # subset the dataset
    subdf = df.xs(mySV,level='SV').xs(iname,level='Station')
    # polar plots need a radius and theta direction in radians
    radius = 90-subdf.Elevation
    theta = np.deg2rad(subdf.Azimuth)
    # plot each measurement and color by signal to noise ratio
    hs = ax[i].scatter(theta,radius,c=subdf.S1)
    hs.set_clim(clim)
    ax[i].set_rlim([0,90])
    ax[i].set_theta_zero_location("N")
    ax[i].set_title(iname)

plt.colorbar(hs, ax=ax, location='bottom', shrink=.5, pad=0.05, label='SNR (L1)')

<matplotlib.colorbar.Colorbar at 0x7f42cd141910>

../_images/60cca72cddc2cb3562cd379c1408ffafe8ffae336bb14c3a36b8dd59e56f04cd.png

All satellites, all sites

# get all sites as list
station_names = df.index.get_level_values('Station').unique()
# ensure we use the same color limits in all plots
clim = [15,47]
# initialize figure with polar axes
fig, ax = plt.subplots(1,len(station_names),figsize=(10,10),subplot_kw=dict(projection='polar'))
for i, iname in enumerate(station_names):
    # subset the dataset
    subdf = df.xs(iname,level='Station')
    # polar plots need a radius and theta direction in radians
    radius = 90-subdf.Elevation
    theta = np.deg2rad(subdf.Azimuth)
    # plot each measurement and color by signal to noise ratio
    hs = ax[i].scatter(theta,radius,c=subdf.S1,s=10)
    hs.set_clim(clim)
    ax[i].set_rlim([0,90])
    ax[i].set_theta_zero_location("N")
    ax[i].set_title(iname)

plt.colorbar(hs, ax=ax, location='bottom', shrink=.5, pad=0.05, label='SNR (L1)')

<matplotlib.colorbar.Colorbar at 0x7f42cc077af0>

../_images/fa44a48c28b040d15cf1c0e49fda1d76d868781a7713a4cfb792a8c526c38ceb.png

As expected, signal to noise is larger at the clear-sky (Dav2_Twr) site. We also see that SNR increases as a function of elevation. This mainly occurs because the GNSS antenna gain is strongest in the zenith direction.

We also see that no measurements can be found towards the North of the stations, this is because no GNSS satellites go over this area of the sky.

Finally, to the West, measurements are missing below elevations of about 20°. This is because the surrounding terrain and the mountains are cutting off the line of sight to the satellite.

Plotting observation points as time series

mySV = 'E03'
# get all sites as list
station_names = df.index.get_level_values('Station').unique()

fig, ax = plt.subplots(1,figsize=(10,5))
for i, iname in enumerate(station_names):
    # subset the dataset
    subdf = df.xs(iname,level='Station').xs(mySV,level='SV')
    # plot each measurement and color by signal to noise ratio
    hs = ax.plot(subdf.index.get_level_values('Epoch'),subdf.S1,label=iname)
    
myFmt = mdates.DateFormatter('%H:%M')
ax.xaxis.set_major_formatter(myFmt)
ax.set_ylabel('SNR (L1)')
ax.legend()
plt.title(mySV)

Text(0.5, 1.0, 'E03')

../_images/5ba42dc35a4d2e7bcd31455e90bfc7e0ae2e360e6f6fc203670a6c2318eda915.png

Here we see several interesting features. In the beginning, the SNR of the clear-sky antenna (Dav2_Twr) increases over time as the satellite gets closer to the zenith, then starts to decrease after passing zenith.

We also observe oscillations of the clear-sky signal. These are likely the result of ground multipath reflections. The ground reflected signal reaches the receiver with a delay, causing an either constructive (signal enhancement) or destructive (signal attenuation) interference, in alternance as the satellite rises in the sky.

Finally, we see that the signal is severely attenuated at the subcanopy site, as the satellite is obscured by more or less dense parts of the canopy. Around 00:00-02:00 the satellite likely was moving through a gap in the canopy and the signal strength is more or less the same at the two receivers.

Geostationary satellites

Some of the logged satellites are geostationary (meaning they stay in the same spot in the sky) and have an “S” as prefix. These are satellites of the regional Satellite-based Augmentation Systems (SBAS), here the European Geostationary Navigation Overlay Service (EGNOS). These satellites broadcast correction information calculated in real time, which improves positioning (if your GNSS receiver has the ability to receive it).

As they have a fixed position in the sky, they do not sample the whole canopy. As a result, their signals are much more constant over time.

mySV = ['S23','S36']
# get all sites as list
station_names = df.index.get_level_values('Station').unique()

fig, ax = plt.subplots(len(mySV),figsize=(10,10))
for isv, svname in enumerate(mySV):
    for i, iname in enumerate(station_names):
        # subset the dataset
        subdf = df.xs(iname,level='Station').xs(svname,level='SV')
        # plot each measurement and color by signal to noise ratio
        hs = ax[isv].plot(subdf.index.get_level_values('Epoch'),subdf.S1,label=iname)
        
    myFmt = mdates.DateFormatter('%H:%M')
    ax[isv].xaxis.set_major_formatter(myFmt)
    ax[isv].set_ylabel('SNR (L1)')
    ax[isv].legend()
    ax[isv].set_title(svname)

../_images/aec3271c49fe62445051d8379b123c082b19d121aaede1bd69555b219ac2d101.png

Those measurements can be useful to diagnose other possible effects on SNR. For instance, the density of the atmosphere has an effect on SNR which can change over the day and will influence both stations similarly. The temperature of the sensing equipment (receiver, cables, antenna) can also vary over the day and have a small influence on SNR.

The plateaus occurs because the SNR is encoded as an integer on the Reach M2 receiver. We are logging data every second, but we averaged to 15 seconds during preprocessing, explaing why we also have intermediate values.

mySV = 'S27'
# get all sites as list
station_names = df.index.get_level_values('Station').unique()

fig, ax = plt.subplots(1,figsize=(10,5))
for i, iname in enumerate(station_names):
    # subset the dataset
    subdf = df.xs(iname,level='Station').xs(mySV,level='SV')
    # plot each measurement and color by signal to noise ratio
    hs = ax.plot(subdf.index.get_level_values('Epoch'),subdf.S1,label=iname,marker='.',markersize=5)
    
myFmt = mdates.DateFormatter('%H:%M')
ax.xaxis.set_major_formatter(myFmt)
ax.set_ylabel('SNR (L1)')
ax.legend()
plt.title(mySV)

Text(0.5, 1.0, 'S27')

../_images/8f10022fdafd30967acdba9e3154e02b32fe9db87d197ce59e4af94da94fa4a7.png

For this geostationary satellite, there is an outlier at about 22:30 for Dav2_Twr, causing a rather awkward time series.

If a satellite is barely visible or has poor health, the receiver may lose track of the signal for some amount of time (a loss of signal tracking is seen for Dav1_Grnd around 02:30 for instance).

Since both receivers record similar signal strength, it is most likely that the geostationary satellite appears in a gap in the canopy and the subcanopy receiver has a clear line of sight as well.

We also see a significant jump at around 01:50. This happens if the transmit power of the satellite is changed (for any power management reason). Such jumps also occur with normal (non-geostationary) satellites but may be less obvious to detect. Because such jumps can occur anytime and on any satellite, it is always recommended to calculate the signal attenuation caused by the canopy (i.e. the difference between the two sensors) at the level of instantaneous observations. If we took here the average of Dav1_Grnd and compared it to the average of Dav2_Twr, Dav1_Grnd would have higher signal strength, but only because Dav2_Twr did not take as many measurements when ‘S27’ was transmitting at a higher power. In summary, calculate the difference first, then average.

Calculating and plotting hemispheric averages

Note, here because we only use 6 hours worth of data, plotting individual measurements as we did above is still possible. However, when working with larger datasets, this quickly becomes impractical. gnssvod offers a solution to calculate statistics (like an average) over a hemispheric grid.

Hemispheric grid objects

The Hemi class is a convenience class of gnssvod for creating, using, and plotting hemispheric grid objects.

They are constructed by specifying the desired angular resolution of cells in degrees. The smaller the angular resolution, the finer the grid.

hemi = gv.hemibuild(4)
hemi

<gnssvod.hemistats.hemistats.Hemi at 0x7f42b55bc070>

Hemi object structure

Hemi objects expose a small set of attributes describing the grid geometry, as well as helper methods to work with hemispheric binning.

Attributes

Hemi.angular_resolution
Angular diameter of the zenith cell (degrees).
Hemi.ncells
Total number of grid cells in the hemisphere.
Hemi.grid
pandas.DataFrame describing cell geometry (centers and edges).

Methods

Hemi.patches()
Returns a pandas.Series of matplotlib patches representing grid cells, typically used for hemispheric plotting.
Hemi.add_CellID()
Assigns hemispheric CellIDs to an existing dataframe of GNSS observations, enabling spatial binning and statistics.

hemi.grid

	azi	ele	azimin	azimax	elemin	elemax
CellID
0	0.000000	90.0	0.000000	360.000000	88.0	90.0
1	22.500000	86.0	0.000000	45.000000	84.0	88.0
2	67.500000	86.0	45.000000	90.000000	84.0	88.0
3	112.500000	86.0	90.000000	135.000000	84.0	88.0
4	157.500000	86.0	135.000000	180.000000	84.0	88.0
...	...	...	...	...	...	...
1523	345.789474	6.0	344.210526	347.368421	4.0	8.0
1524	348.947368	6.0	347.368421	350.526316	4.0	8.0
1525	352.105263	6.0	350.526316	353.684211	4.0	8.0
1526	355.263158	6.0	353.684211	356.842105	4.0	8.0
1527	358.421053	6.0	356.842105	360.000000	4.0	8.0

1528 rows × 6 columns

To plot the grid, we first retrieve the patches:

# get patches
patches = hemi.patches() 
patches

CellID
     Rectangle(xy=(0, 0), width=6.28319, height=2, ...
     Rectangle(xy=(0, 2), width=0.785398, height=4,...
     Rectangle(xy=(0.785398, 2), width=0.785398, he...
     Rectangle(xy=(1.5708, 2), width=0.785398, heig...
     Rectangle(xy=(2.35619, 2), width=0.785398, hei...
                              ...                        
  Rectangle(xy=(6.00761, 82), width=0.0551157, h...
  Rectangle(xy=(6.06272, 82), width=0.0551157, h...
  Rectangle(xy=(6.11784, 82), width=0.0551157, h...
  Rectangle(xy=(6.17295, 82), width=0.0551157, h...
  Rectangle(xy=(6.22807, 82), width=0.0551157, h...
Name: Patches, Length: 1528, dtype: object

Creating the patches can take time, especially if there are many. With an angular resolution of 6° and 693 patches, this is still quite fast, but at 0.1°, it will take a lot of time. This is why patches are only calculated if the method .patches() is called by the user and are otherwise not calculated by default when calling hemibuild

The patches can then be easily plotted as follows:

fig, ax = plt.subplots(figsize=(7,7),subplot_kw=dict(projection='polar'))
pc = PatchCollection(patches.values,facecolor='none',linewidth=1)
#ax.bar(0, 1).remove() # bar hack to force curved patch edges, activate this if the patches look square and not curved
ax.add_collection(pc)
ax.set_rlim([0,90])
ax.set_theta_zero_location("N")

../_images/5136ee184b6c8c0deb8ef37aa8ec5ba18aeccfd72dca06adff358a5c87c32803.png

Assigning measurements to grid cells

The Hemi.add_CellID assigns a CellID to an existing dataframe of GNSS measurements.

df = hemi.add_CellID(
    df,
    aziname="Azimuth",
    elename="Elevation",
    idname="CellID",
    drop=True,
)

By default, it will assume that the dataframe df contains columns named Azimuth and Elevation, and that the CellID should be added to a new column named CellID. All these defaults can be easily overriden by manipulating the arguments. drop=True means that measurements that do not fall within in any cell are dropped off (discarded from the df). This can affect a substantial number of observations if an elevation cutoff larger than zero was specified when building the grid with hemibuild.

newdf = hemi.add_CellID(df)
newdf

			S1	S2	S7	Azimuth	Elevation	CellID
Station	Epoch	SV
Dav1_Grnd	2021-04-28 21:07:00	C06	NaN	NaN	25.0	36.6	10.1	1312
		C09	35.0	35.0	30.0	49.0	32.7	690
		C11	33.0	33.0	30.5	177.2	35.1	724
		C14	36.3	36.3	35.2	-96.4	76.8	42
		C16	30.7	30.7	30.8	38.3	15.2	1201
...	...	...	...	...	...	...	...	...
Dav2_Twr	2021-04-29 03:07:00	R05	49.4	39.0	NaN	123.8	40.1	532
		R07	46.5	37.0	NaN	-44.0	20.7	1068
		R09	42.4	39.3	NaN	-154.1	18.9	1143
		R15	50.5	40.6	NaN	40.7	45.7	432
		R16	46.4	35.7	NaN	-173.5	68.9	101

80816 rows × 6 columns

It now becomes relatively easy to calculate any sort of statistics per grid cell using the base pandas functions.

hemi_average = newdf.groupby(['CellID','Station']).mean()
hemi_average

		S1	S2	S7	Azimuth	Elevation
CellID	Station
1	Dav1_Grnd	47.853333	43.027273	39.0	4.746667	84.620000
1	Dav2_Twr	48.353333	43.581818	41.0	4.746667	84.620000
3	Dav1_Grnd	47.677419	41.145161	NaN	114.351613	84.716129
3	Dav2_Twr	48.361290	43.958065	NaN	114.351613	84.722581
4	Dav1_Grnd	47.925000	42.715000	NaN	146.860000	84.560000
...	...	...	...	...	...	...
1522	Dav1_Grnd	35.250000	NaN	NaN	-15.933333	7.750000
1522	Dav2_Twr	44.333333	35.150000	NaN	-15.933333	7.750000
1523	Dav1_Grnd	30.300000	NaN	NaN	-15.700000	7.400000
1523	Dav2_Twr	44.133333	35.550000	NaN	-15.666667	7.300000
1524	Dav2_Twr	36.300000	NaN	NaN	-9.700000	8.000000

1779 rows × 5 columns

We can now plot this data in grid form. Here we look at the mean signal to noise ratio (SNR) at each site.

fig, ax = plt.subplots(1,2,figsize=(10,10),subplot_kw=dict(projection='polar'))

station_names = df.index.get_level_values('Station').unique()
for i, iname in enumerate(station_names):
    # associate the mean values to the patches, join inner will drop patches with no data, making plotting slightly faster
    ipatches = pd.concat([patches,hemi_average.xs(iname, level='Station')],join='inner',axis=1)
    # plotting with colored patches
    pc = PatchCollection(ipatches.Patches,array=ipatches.S1,edgecolor='face', linewidth=1)
    pc.set_clim([25,50])
    ax[i].add_collection(pc)
    ax[i].set_rlim([0,90])
    ax[i].set_theta_zero_location("N")
    ax[i].set_title(iname)

plt.colorbar(pc, ax=ax, location='bottom', shrink=.5, pad=0.05, label='SNR (L1)')
plt.savefig('figures/illustration_snr.png',facecolor='white', transparent=False,bbox_inches='tight')

../_images/268517b7d17559a67aadfbc8e3532c5839a30bf1fad2d34a3c007598f7fde902.png

We can also look at the number of observations by calculating another statistic

hemi_count = newdf.groupby(['CellID','Station']).count()
hemi_count

fig, ax = plt.subplots(1,2,figsize=(10,10),subplot_kw=dict(projection='polar'))

station_names = df.index.get_level_values('Station').unique()
for i, iname in enumerate(station_names):
    # associate the mean values to the patches, join inner will drop patches with no data, making plotting slightly faster
    ipatches = pd.concat([patches,hemi_count.xs(iname, level='Station')],join='inner',axis=1)
    # plotting with colored patches
    pc = PatchCollection(ipatches.Patches,array=ipatches.S1,edgecolor='face', linewidth=1)
    pc.set_clim([0,200])
    ax[i].add_collection(pc)
    ax[i].set_rlim([0,90])
    ax[i].set_theta_zero_location("N")
    ax[i].set_title(iname)

plt.colorbar(pc, ax=ax, location='bottom', shrink=.5, pad=0.05, label='# of obs')

<matplotlib.colorbar.Colorbar at 0x7f42b4fcf7f0>

../_images/e8223a68f574adb0aa7d3306cc769e0d1be197092d3b957d45e9044df2dc2333.png

In the next notebook, we illustrate how to calculate and plot GNSS-VOD.