3. Processing physiological data

There are a few common processing steps often performed when working with physiological data: visual inspection, interpolation, temporal filtering, and peak detection.

We have already seen that peakdet.operations has functions to perform a few of these steps, but it is worth going into all of them in a bit more detail:

3.1. Visual inspection

One of the first steps with raw data is to inspect it visually. No amount of processing can fix bad data, so it’s good to check that your data quality is adequate before continuing. Data can be plotted with plot_physio():

>>> from peakdet import load_physio, operations
>>> data = load_physio('ECG.csv', fs=1000.0)
>>> ax = operations.plot_physio(data)
>>> ax.set_xlim(0, 10)  # doctest: +SKIP
../_images/processing-1.png

For now this will simply plot the raw waveform, but we’ll see later how this function has some added benefits.

3.2. Interpolation

Raw data is often collected at a sampling rate higher than is biologically meaningful. Human respiration, for example, is relatively slow, so acquiring it at 250 Hz is often more than sufficient; when it is acquired at a higher rate it can be quite noisy. In this case, we might want to interpolate (or decimate) the data to a lower sampling rate, which can be done with interpolate_physio():

>>> data = load_physio('RESP.csv', fs=1000.)
>>> print(data)
Physio(size=24000, fs=1000.0)
>>> data = operations.interpolate_physio(data, target_fs=250.)
>>> print(data)
Physio(size=6000, fs=250.0)

Note that the size of the data decreased by a factor of four (24000 to 6000), the same as the decrease in sampling rate.
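
To make the relationship between sampling rate and data size concrete, here is a minimal sketch of resampling by linear interpolation using only NumPy. It is not necessarily how interpolate_physio() is implemented; the helper name resample_linear and the synthetic sine wave are assumptions made for illustration. Note also that a careful decimation would usually lowpass filter the data first to avoid aliasing, which this sketch omits.

```python
import numpy as np


def resample_linear(signal, fs, target_fs):
    """Resample `signal` from `fs` to `target_fs` Hz by linear interpolation.

    Hypothetical helper for illustration; not part of peakdet.
    """
    signal = np.asarray(signal, dtype=float)
    n_out = int(round(len(signal) * target_fs / fs))
    # Time stamps (in seconds) of the original and the resampled points
    t_in = np.arange(len(signal)) / fs
    t_out = np.arange(n_out) / target_fs
    return np.interp(t_out, t_in, signal)


# Synthetic 24-second trace: a 0.25 Hz sine wave sampled at 1000 Hz
fs = 1000.0
t = np.arange(24000) / fs
resp = np.sin(2 * np.pi * 0.25 * t)

resampled = resample_linear(resp, fs, target_fs=250.0)
print(resp.size, resampled.size)  # 24000 6000
```

The output length is simply the input length scaled by target_fs / fs, which is why a fourfold drop in sampling rate gives a fourfold drop in size.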

Data can also be upsampled via interpolation, though care must be taken in interpreting the results of such a procedure:

>>> data = load_physio('PPG.csv', fs=25.0)
>>> print(data)
Physio(size=24000, fs=25.0)
>>> data = operations.interpolate_physio(data, target_fs=250.0)
>>> print(data)
Physio(size=240000, fs=250.0)

3.3. Temporal filtering

Once our data is at an appropriate sampling rate, we may want to apply a temporal filter with filter_physio(). This function supports lowpass, highpass, bandpass, and bandstop filters with user-specified frequency cutoffs. First, let’s take a look at our interpolated PPG data:

>>> ax = operations.plot_physio(data)
>>> ax.set_xlim(0, 10)  # doctest: +SKIP
../_images/processing-4.png

If we’re going to do peak detection, it would be great to get rid of the venous pulsations in the waveform to avoid potentially picking them up. If we apply a lowpass filter with a 1.0 Hz cutoff we can do just that:

>>> data = operations.filter_physio(data, cutoffs=1.0, method='lowpass')
>>> ax = operations.plot_physio(data)
>>> ax.set_xlim(0, 10)  # doctest: +SKIP
../_images/processing-5.png

Filter settings are highly dependent on the data, so visually confirming that the filter is performing as expected is important!
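
For readers curious about what a lowpass filter does to a signal, the following is a rough sketch using SciPy's Butterworth design and zero-phase filtering. The filter type, the order of 3, the helper name lowpass, and the synthetic test signal are all assumptions for illustration; filter_physio() may use different settings internally.

```python
import numpy as np
from scipy import signal


def lowpass(data, fs, cutoff, order=3):
    """Zero-phase Butterworth lowpass filter.

    Hypothetical helper for illustration; not part of peakdet.
    """
    # Normalize the cutoff by the Nyquist frequency (fs / 2)
    b, a = signal.butter(order, cutoff / (fs / 2), btype='lowpass')
    # filtfilt runs the filter forward and backward, cancelling phase shift
    return signal.filtfilt(b, a, data)


fs = 250.0
t = np.arange(5000) / fs                   # 20 seconds of samples
slow = np.sin(2 * np.pi * 0.5 * t)         # 0.5 Hz component to keep
fast = 0.5 * np.sin(2 * np.pi * 10 * t)    # 10 Hz component to remove
noisy = slow + fast

filtered = lowpass(noisy, fs, cutoff=1.0)

# Compare the deviation from the slow component before and after filtering,
# ignoring the first and last 4 seconds to avoid edge effects
core = slice(1000, -1000)
print(np.abs(noisy[core] - slow[core]).max())
print(np.abs(filtered[core] - slow[core]).max())
```

The second printed value should be much smaller than the first, since the 10 Hz component lies well above the 1.0 Hz cutoff while the 0.5 Hz component lies below it.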

3.4. Peak detection

Many physiological processing pipelines require peak detection on the data (e.g., to calculate heart rate, respiratory rate, or pulse rate). This can be accomplished with peakfind_physio():

>>> data = operations.peakfind_physio(data, thresh=0.1, dist=100)
>>> data.peaks[:10]
array([ 164,  529,  901, 1278, 1628, 1983, 2381, 2774, 3153, 3486])
>>> data.troughs[:10]
array([ 356,  732, 1111, 1465, 1817, 2205, 2603, 2989, 3330, 3677])
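
As a point of comparison, peaks and troughs can also be located with the general-purpose scipy.signal.find_peaks function. This sketch is not the algorithm used by peakfind_physio(), and its height and distance arguments are not assumed to match the meaning of thresh and dist above; the synthetic 1 Hz wave is likewise an assumption for illustration.

```python
import numpy as np
from scipy.signal import find_peaks

fs = 250.0
t = np.arange(5000) / fs                  # 20 seconds of samples
wave = np.sin(2 * np.pi * 1.0 * t)        # synthetic 1 Hz oscillation

# Peaks: local maxima above 0.5 that are at least 0.5 seconds apart
min_distance = int(0.5 * fs)
peaks, _ = find_peaks(wave, height=0.5, distance=min_distance)

# Troughs: the peaks of the inverted signal
troughs, _ = find_peaks(-wave, height=0.5, distance=min_distance)

print(len(peaks), len(troughs))
print(peaks[:5], troughs[:5])
```

As with the peaks and troughs attributes, the returned arrays hold sample indices rather than times.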

The peaks and troughs attributes hold the sample indices of the detected peaks and troughs in the data; these can be converted to times (in seconds) by dividing by the sampling frequency:

>>> data.peaks[:10] / data.fs
array([ 0.656,  2.116,  3.604,  5.112,  6.512,  7.932,  9.524, 11.096,
       12.612, 13.944])
>>> data.troughs[:10] / data.fs
array([ 1.424,  2.928,  4.444,  5.86 ,  7.268,  8.82 , 10.412, 11.956,
       13.32 , 14.708])
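
Peak times like these are typically the starting point for rate estimates. As a small sketch using only NumPy, the peak-to-peak intervals and the corresponding rate can be computed from the ten peak indices shown above; the variable names are illustrative and not part of peakdet.

```python
import numpy as np

fs = 250.0
# First ten peak indices from the example above
peaks = np.array([164, 529, 901, 1278, 1628, 1983, 2381, 2774, 3153, 3486])

peak_times = peaks / fs               # peak times, in seconds
intervals = np.diff(peak_times)       # peak-to-peak intervals, in seconds
rate = 60.0 / intervals               # instantaneous rate, in cycles per minute

mean_rate = 60.0 / intervals.mean()   # average rate over these ten peaks
print(intervals.round(3))
print(round(mean_rate, 1))  # 40.6
```

Ten peaks yield nine intervals, so the instantaneous rate has one fewer entry than the peak array.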

Once these attributes are populated, subsequent calls to plot_physio() will mark peaks with red dots and troughs with green dots to aid visual inspection:

>>> ax = operations.plot_physio(data)
>>> ax.set_xlim(0, 10)  # doctest: +SKIP
../_images/processing-7.png