xliucs · huq02 · Dec 10, 2021 · Dec 10, 2021 · Dec 12, 2021 · Dec 24, 2021
diff --git a/.gitignore b/.gitignore
@@ -0,0 +1,15 @@
+code/__pycache__/*
+
+rPPG-checkpoints
+rPPG-checkpoints/*
+.vscode
+.vscode/*
+.vs
+.vs/*
+0.png
+1.png
+log.txt
+.gitignore
+picture1.jpg
+picture.jpg
+picture2.jpg
diff --git a/README.md b/README.md
@@ -1,102 +1,121 @@
-## MTTS-CAN: Multi-Task Temporal Shift Attention Networks for On-Device Contactless Vitals Measurement
+# Analysis and optimization of photoplethysmography imaging methods for non-contact measurement of heart variability parameters
 
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
 [![made-with-python](https://img.shields.io/badge/Made%20with-Python-1f425f.svg)](https://www.python.org/)
 
+Deep learning neural networks for remote photoplethysmography. The pulse signal is extracted from video using machine learning, with a focus on heart rate variability parameters.
+Source code of the master's thesis titled "Analysis and optimization of photoplethysmography imaging methods for non-contact measurement of heart variability parameters".
+
+## Cite as
+
+Sarah Quehl. (2022, April 11). Analysis and Optimization of Photoplethysmography Imaging Methods for Non-Contact Measurement of Heart Variability Parameters
+
+## Abstract 
+Heart rate variability is an important physiological parameter for health and refers to the natural
+variation of the time between two heartbeats. Heart rate variability describes the adaptability of an
+organism to external and internal factors and can be measured with common measuring devices, like
+electrocardiogram or photoplethysmogram. Today, this is even possible with the smartphone via apps.
+Photoplethysmography Imaging as a non-contact method is a further development of state-of-the-art
+photoplethysmography for recording cardiac activity by detecting minimal pulse-induced fluctuations
+on the skin with an RGB camera. Most Photoplethysmography Imaging methods focus on heart rate
+measurement and do not consider heart rate variability. In recent years, many new approaches based
+on signal filtering or neural networks have been presented. However, the accuracy required for medical
+purposes, especially with regard to heart rate variability, has not yet been achieved and represents a
+major challenge.
+This thesis compares current Photoplethysmography Imaging methods based on neural networks. For
+this purpose, four basis methods are implemented and tested for functionality. Based on these findings,
+two new networks were developed, the PTS-CAN and the PPTS-CAN. These are based on multi-objective
+optimization and add one and two additional outputs to the neural network, respectively. The additional
+output of the PTS-CAN is a binary signal that has a value of one at peaks. For this output, two new
+loss functions, ownGauss and TE, were developed with the goal of reducing the temporal error of the
+peaks; the latter allows the error to be interpreted in seconds. Both manipulate the ground truth to
+generate a loss that rewards predicted peaks close to the real peak and penalizes peaks that are
+further away. A further output
+was added to the first model, which outputs various variable parameters and is evaluated by the mean
+absolute percentage error loss function. All used models are trained on the same database and are
+compared. In addition, there is a comparison with the first developed methods on the subject of vital
+parameters extraction from video. A final comparison shows an improvement in HR and HRV parameter
+calculation with the new methods. The heart rate calculation can be improved by about 20%. In the field
+of HRV parameters, an improvement of 5.7% can be achieved for the parameter SDNN, for example.
+In a cross-validation, improvements are achieved over the baseline methods and there is also a slight
+improvement over the basis models. For the parameters in the frequency domain, the improvements are
+a bit less clear than in the time domain, since the frequency analysis is more challenging here.
+A project was generated, which can be used as a basis for further experiments with further approaches
+and loss functions. The integration of further network architectures as well as loss functions is easily
+possible.
+
+## Preprocessing
+It is recommended to save the important information of each video into an HDF5 file using the `prepare_databases.py` script. The file combines pixel data, ground truth and various parameters.
+
+## Training
 
-## Paper
-
-#### [Xin Liu](https://homes.cs.washington.edu/~xliu0/), [Josh Fromm](https://www.linkedin.com/in/josh-fromm-2a4a2258/), [Shwetak Patel](https://ubicomplab.cs.washington.edu/members/), [Daniel McDuff](https://www.microsoft.com/en-us/research/people/damcduff/), “Multi-Task Temporal Shift Attention Networks for On-Device Contactless Vitals Measurement”, NeurIPS 2020, Oral Presentation (105 out of 9454 submissions) 
-
-#### Link: <https://papers.nips.cc/paper/2020/file/e1228be46de6a0234ac22ded31417bc7-Paper.pdf>
-
-
-## New Pre-Trained Model (Updated Nov 2021)
-
-Working in pregress. Check back later! 
-
-#### Abstract
-
-Telehealth and remote health monitoring have become increasingly important during the SARS-CoV-2 pandemic and it is widely expected that this will have a lasting impact on healthcare practices. These tools can help reduce the risk of exposing patients and medical staff to infection, make healthcare services more accessible, and allow providers to see more patients. However, objective measurement of vital signs is challenging without direct contact with a patient. We present a video-based and on-device optical cardiopulmonary vital sign measurement approach. It leverages a novel multi-task temporal shift convolutional attention network (MTTS-CAN) and enables real-time cardiovascular and respiratory measurements on mobile platforms. We evaluate our system on an ARM CPU and achieve state-of-the-art accuracy while running at over 150 frames per second which enables real-time applications. Systematic experimentation on large benchmark datasets reveals that our approach leads to substantial (20\%-50\%) reductions in error and generalizes well across datasets.
-
-
+`python code/train.py --exp_name [e.g., test] --data_dir [DATASET_PATH] --temporal [e.g., MTTS_CAN]`
 
-## Waveform Samples
+examples:
 
-### Pulse
+`python code/train.py --exp_name test1 --data_dir /mnt/share/StudiShare/sarah/Databases/ --temporal TS_CAN --database_name MIX2`
 
-![pulse_waveform](./pulse_waveform.png)
 
+#### Issues:
 
-### Respiration 
+In PPTS_CAN, the frame rate is derived from the video length, which is determined by the data sets. It still needs to be passed to the layers in a generalized form.
 
-![resp_waveform](./resp_waveform.png)
+## Inference
 
+`python code/predict_vitals_oneVideo.py --video_path [VIDEO_PATH] --save_dir [SAVE_PATH] --trained_model [CHECKPOINT_PATH]
+        --model_name [e.g., TS_CAN, PTS_CAN, PPTS_CAN] --parameter [e.g., "bpm, sdnn, pnn50, lfhf"]`
 
-## Citation 
+## Path dependencies in the following scripts
+final_evaluation.py
 
-``` bash
-@article{liu2020multi,
-  title={Multi-Task Temporal Shift Attention Networks for On-Device Contactless Vitals Measurement},
-  author={Liu, Xin and Fromm, Josh and Patel, Shwetak and McDuff, Daniel},
-  journal={arXiv preprint arXiv:2006.03790},
-  year={2020}
-}
-```
+model_evaluation.py
 
-## Demo
+pre_process.py
 
-**Try out our live demo via link [here](https://vitals.cs.washington.edu/).**
+predict_vitals_comparison.py
 
-Our demo code: https://github.com/ubicomplab/rppg-web
+predict_vitals_new.py
 
+predict_vitals_oneVideo.py
 
-## TVM
+predict_vitals.py
 
-If you want to use TVM, pleaea follow [this tutorial](https://tvm.apache.org/docs/) to set it up. Then, you will need to replace the code in `incubator-tvm/python/tvm/relay/frontend/keras.py` with our `code/tvm-ops-mtts-can.py`. We implemented required tensor operations for attention, tensor shift module used in our models. 
+layer_output.py
 
-## Training 
 
-`python code/train.py --exp_name test --exp_name [e.g., test] --data_dir [DATASET_PATH] --temporal [e.g., MMTS_CAN]`
+In the current scripts, the data is divided into the folders 1) Training and 2) Validation.
 
-## Inference 
+## evaluation_iPhys.py
+Script for evaluating the predictions of the iPhys models (GreenChannel, POH, CHROM) with the same procedure and outputs as the `final_evaluation.py` script.
 
-`python code/predict_vitals.py --video_path [VIDEO_PATH]`
+### Requirements:
+Predictions of the models, saved as .txt files named `*GC.txt`, `*ICA_POH.txt` and `*CHROM.txt`.
 
-The default video sampling rate is 30Hz. 
-
-#### Note
-
-During the inference, the program will generate a sample pre-processed frame. Please ensure it is in portrait orientation. If not, you can comment out line 30 (rotation) in the `inference_preprocess.py`. 
+They are located in the same folder as the ground truth files.
 
 
 ## Requirements
 
 
 Tensorflow 2.0+
+(tested with tensorflow-gpu 2.3)
 
+`conda create -n tf-gpu tensorflow-gpu cudatoolkit=10.1` -- this command takes care of both CUDA and TF environments.
 
-`conda create -n tf-gpu tensorflow-gpu cudatoolkit=10.1` -- this command takes care of both CUDA and TF environments. 
-
-`pip install opencv-python scipy numpy matplotlib`
+`pip install opencv-python scipy numpy matplotlib heartpy scikit-learn`
 
-If`pip install opencv-python` does not work, I found these commands always work on my mac. 
+If `pip install opencv-python` does not work, I found these commands always work on my Mac.
 
 ```
 conda install -c menpo opencv -y
 pip install opencv-python
 ```
 
 
-
+## Basis Paper
+The code is based on the following paper:
+#### [Xin Liu](https://homes.cs.washington.edu/~xliu0/), [Josh Fromm](https://www.linkedin.com/in/josh-fromm-2a4a2258/), [Shwetak Patel](https://ubicomplab.cs.washington.edu/members/), [Daniel McDuff](https://www.microsoft.com/en-us/research/people/damcduff/), “Multi-Task Temporal Shift Attention Networks for On-Device Contactless Vitals Measurement”, NeurIPS 2020, Oral Presentation (105 out of 9454 submissions)
 
 ## Contact
 
-Please post your technical questions regarding this repo via Github Issues. 
-
-
-
-
-
-
-
+Please post your technical questions regarding this repo via Github Issues.
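Note on the `--parameter` values listed in the README's inference command (`bpm`, `sdnn`, `pnn50`): the sketch below shows one common way to derive these time-domain values from detected peak timestamps. It is an illustration under stated assumptions, not the repository's implementation. The function name `hrv_from_peaks` is hypothetical, SDNN is taken as the sample standard deviation of the inter-beat intervals, and `lfhf` is left out because it requires a spectral estimate.

```python
import statistics


def hrv_from_peaks(peak_times_s):
    """Return bpm, SDNN (ms) and pNN50 (%) from peak timestamps in seconds.

    Hypothetical helper for illustration; conventions (e.g. sample vs.
    population standard deviation) may differ from the repository's code.
    """
    # Inter-beat (NN) intervals in milliseconds.
    nn_ms = [(b - a) * 1000.0 for a, b in zip(peak_times_s, peak_times_s[1:])]
    if len(nn_ms) < 2:
        raise ValueError("need at least three peaks")

    # Mean heart rate from the mean NN interval.
    bpm = 60000.0 / statistics.mean(nn_ms)

    # SDNN: standard deviation of the NN intervals (sample version, assumed).
    sdnn = statistics.stdev(nn_ms)

    # pNN50: percentage of successive NN differences larger than 50 ms.
    diffs = [abs(b - a) for a, b in zip(nn_ms, nn_ms[1:])]
    pnn50 = 100.0 * sum(d > 50.0 for d in diffs) / len(diffs)

    return {"bpm": bpm, "sdnn": sdnn, "pnn50": pnn50}
```

Because every value depends on differences between peak times, a small temporal error in the detected peaks feeds directly into SDNN and pNN50, which is the motivation the abstract gives for the peak-oriented loss functions.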
diff --git a/code/custom_fit.py b/code/custom_fit.py
@@ -0,0 +1,184 @@
+import tensorflow as tf
+from tensorflow import keras
+from tensorflow.python.keras.engine import data_adapter
+from tensorflow.python.eager import backprop
+from tensorflow.python.keras.mixed_precision.experimental import loss_scale_optimizer as lso
+from tensorflow.python.distribute import parameter_server_strategy
+
+
+class CustomModel(keras.Model):
+    def train_step(self, data):
+        # Unpack the data. Its structure depends on your model and
+        # on what you pass to `fit()`.
+        data = data_adapter.expand_1d(data)
+        x, y, sample_weight = data_adapter.unpack_x_y_sample_weight(data)
+
+        # with tf.GradientTape() as tape:
+        #     y_pred = self(x, training=True)  # Forward pass
+        #     # Compute the loss value
+        #     # (the loss function is configured in `compile()`)
+        #     loss = self.compiled_loss(y, y_pred, regularization_losses=self.losses)
+
+        # # Compute gradients
+        # trainable_vars = self.trainable_variables
+        # gradients = tape.gradient(loss, trainable_vars)
+        # # Update weights
+        # self.optimizer.apply_gradients(zip(gradients, trainable_vars))
+        # # Update metrics (includes the metric that tracks the loss)
+        # self.compiled_metrics.update_state(y, y_pred)
+        # # Return a dict mapping metric names to current value
+        # return {m.name: m.result() for m in self.metrics}
+
+        with backprop.GradientTape() as tape:
+            y_pred = self(x, training=True)
+
+            # y_pred = get_peaks(y_pred)
+            # y = get_peaks(y)
+            # y, y_pred = filt_peaks(y, y_pred)
+            # y = tf.cast(y, tf.float32)
+            # y_pred = tf.cast(y_pred, tf.float32)
+
+            loss = self.compiled_loss(
+                y, y_pred, sample_weight, regularization_losses=self.losses)
+            # For custom training steps, users can just write:
+            #   trainable_variables = self.trainable_variables
+            #   gradients = tape.gradient(loss, trainable_variables)
+            #   self.optimizer.apply_gradients(zip(gradients, trainable_variables))
+            # The _minimize call does a few extra steps unnecessary in most cases,
+            # such as loss scaling and gradient clipping.
+        _minimize(self.distribute_strategy, tape, self.optimizer, loss,
+                    self.trainable_variables)
+
+        self.compiled_metrics.update_state(y, y_pred, sample_weight)
+        return {m.name: m.result() for m in self.metrics}
+
+
+def _minimize(strategy, tape, optimizer, loss, trainable_variables):
+  """Minimizes loss for one step by updating `trainable_variables`.
+
+  This is roughly equivalent to
+
+  ```python
+  gradients = tape.gradient(loss, trainable_variables)
+  self.optimizer.apply_gradients(zip(gradients, trainable_variables))
+  ```
+
+  However, this function also applies gradient clipping and loss scaling if the
+  optimizer is a LossScaleOptimizer.
+
+  Args:
+    strategy: `tf.distribute.Strategy`.
+    tape: A gradient tape. The loss must have been computed under this tape.
+    optimizer: The optimizer used to minimize the loss.
+    loss: The loss tensor.
+    trainable_variables: The variables that will be updated in order to minimize
+      the loss.
+  """
+
+  with tape:
+    if isinstance(optimizer, lso.LossScaleOptimizer):
+      loss = optimizer.get_scaled_loss(loss)
+
+  gradients = tape.gradient(loss, trainable_variables)
+
+  # Whether to aggregate gradients outside of optimizer. This requires support
+  # of the optimizer and doesn't work with ParameterServerStrategy and
+  # CentralStorageStrategy.
+  aggregate_grads_outside_optimizer = (
+      optimizer._HAS_AGGREGATE_GRAD and  # pylint: disable=protected-access
+      not isinstance(strategy.extended,
+                     parameter_server_strategy.ParameterServerStrategyExtended))
+
+  if aggregate_grads_outside_optimizer:
+    # We aggregate gradients before unscaling them, in case a subclass of
+    # LossScaleOptimizer all-reduces in fp16. All-reducing in fp16 can only be
+    # done on scaled gradients, not unscaled gradients, for numeric stability.
+    gradients = optimizer._aggregate_gradients(zip(gradients,  # pylint: disable=protected-access
+                                                   trainable_variables))
+  if isinstance(optimizer, lso.LossScaleOptimizer):
+    gradients = optimizer.get_unscaled_gradients(gradients)
+  gradients = optimizer._clip_gradients(gradients)  # pylint: disable=protected-access
+  if trainable_variables:
+    if aggregate_grads_outside_optimizer:
+      optimizer.apply_gradients(
+          zip(gradients, trainable_variables),
+          experimental_aggregate_gradients=False)
+    else:
+      optimizer.apply_gradients(zip(gradients, trainable_variables))
+
+@tf.function
+def get_peaks(y):
+    # y: (N,)
+    data_reshaped = tf.reshape(y, (1, -1, 1))  # (1, N, 1)
+    # A sample counts as a peak if it equals the maximum of its 20-sample window.
+    max_pooled_in_tensor = tf.nn.max_pool(data_reshaped, (20,), 1, 'SAME')
+    maxima = tf.equal(data_reshaped, max_pooled_in_tensor)  # (1, N, 1), bool
+    maxima = tf.squeeze(maxima)  # (N,)
+    peaks = tf.where(maxima)  # indices of the A peaks, shape (A, 1)
+    peaks = tf.reshape(peaks, (-1,))  # (A,); the number of peaks A is usually not N
+
+    return peaks
+
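+# x: peak indices of the ground truth, y: peak indices of the prediction (1-D tensors).
+# Returns both tensors reduced to the peaks that have a partner closer than max_offset samples.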
+@tf.function
+def filt_peaks(x,y):
+    def true_fn():
+        return min
+    def false_fn():
+        return tf.cast(-1, tf.int64)
+    max_offset = 10
+    mask = tf.cast(tf.zeros(tf.size(x)),tf.bool) # tensor with size of x (truth data)
+    # check which peaks of truth are recognized in pred
+    min = 0
+    min = tf.cast(min, tf.int64)
+
+    # for item in y: # items of prediction
+    #     diff = tf.abs(x - item) # diff of truth data and item
+    #     min = tf.reduce_min(diff) # minimum of diff
+    #     min = tf.cond(tf.less(min, max_offset), true_fn, false_fn)
+    #     temp_mask = tf.equal(min, diff)
+    #     mask = tf.logical_or(mask, temp_mask)
+
+    # x = tf.boolean_mask(x, mask)
+    def fn(item):
+        def true_fn():
+            return tf.cast(min, tf.float64)
+        def false_fn():
+            return tf.cast(-1, tf.float64)
+        diff = tf.abs(x - item) # diff of truth data and item
+        diff = tf.cast(diff, tf.float64)
+        min = tf.reduce_min(diff) # minimum of diff
+        min = tf.cond(tf.less(min, max_offset), true_fn, false_fn)
+        temp_mask = tf.equal(min, diff)
+        return temp_mask
+    mask1 = tf.map_fn(fn=lambda item: fn(item), elems=y, fn_output_signature=tf.bool)
+    mask1 = tf.reduce_any(mask1, 0)
+    x = tf.boolean_mask(x,mask1)
+
+    # check if outliers are in pred
+    # mask = tf.cast(tf.zeros(tf.size(y)), tf.bool)
+    # for item in x: 
+    #     diff = tf.abs(y - item) # diff of truth data and item
+    #     min = tf.reduce_min(diff) # minimum of diff
+    #     min = tf.cond(tf.less(min, max_offset), true_fn, false_fn)
+    #     temp_mask = tf.equal(min, diff)
+    #     mask = tf.logical_or(mask, temp_mask)
+    # y = tf.boolean_mask(y,mask)
+
+    def fn2(item):
+        def true_fn():
+            return tf.cast(min, tf.float64)
+        def false_fn():
+            return tf.cast(-1, dtype=tf.float64)
+        diff = tf.abs(y - item) # diff of prediction peaks and truth item
+        diff = tf.cast(diff, tf.float64)
+        min = tf.reduce_min(diff) # minimum of diff
+        min = tf.cond(tf.less(min, max_offset), true_fn, false_fn)
+        temp_mask = tf.equal(min, diff)
+        return temp_mask
+    mask2 = tf.map_fn(fn=lambda item: fn2(item), elems=x, fn_output_signature=tf.bool)
+    mask2 = tf.reduce_any(mask2, 0)
+    y = tf.boolean_mask(y,mask2)
+
+    return x, y
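Note on `filt_peaks` above: the function is easier to follow in plain Python. The sketch below mirrors what the TensorFlow version appears to do: for every peak of one signal, the nearest peak of the other signal is marked as matched if it lies closer than `max_offset` samples. The names `keep_nearest` and `match_peaks` are hypothetical and not part of the repository; this is a reading aid under that interpretation, not a drop-in replacement.

```python
def keep_nearest(candidates, queries, max_offset):
    """Keep each candidate that is the nearest candidate to some query
    and lies closer than max_offset samples to that query."""
    keep = [False] * len(candidates)
    for q in queries:
        dists = [abs(c - q) for c in candidates]
        if not dists:
            continue
        nearest = min(dists)
        if nearest < max_offset:
            # Mark every candidate at the minimal distance (ties included),
            # like tf.equal(min, diff) in the TensorFlow version.
            for i, d in enumerate(dists):
                if d == nearest:
                    keep[i] = True
    return [c for c, k in zip(candidates, keep) if k]


def match_peaks(truth, pred, max_offset=10):
    """Filter truth peaks against the prediction, then the prediction
    against the already filtered truth peaks."""
    truth_kept = keep_nearest(truth, pred, max_offset)
    pred_kept = keep_nearest(pred, truth_kept, max_offset)
    return truth_kept, pred_kept
```

For example, `match_peaks([10, 15, 100], [11, 60])` keeps only the truth peak at 10 and the predicted peak at 11: the truth peak at 15 is within the offset of 11 but is not its nearest neighbour, and the predicted peak at 60 has no truth peak within 10 samples.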