.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "examples/smooth.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_examples_smooth.py>`
        to download the full example code, or to run this example
        in your browser via Binder.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_examples_smooth.py:

Smooth pose tracks
=====================

Smooth pose tracks using the rolling median and Savitzky-Golay filters.

.. GENERATED FROM PYTHON SOURCE LINES 8-10

Imports
-------

.. GENERATED FROM PYTHON SOURCE LINES 10-21

.. code-block:: Python

    import matplotlib.pyplot as plt
    from scipy.signal import welch

    from movement import sample_data
    from movement.filtering import (
        interpolate_over_time,
        rolling_filter,
        savgol_filter,
    )

.. GENERATED FROM PYTHON SOURCE LINES 22-28

Load a sample dataset
---------------------
Let's load a sample dataset and print it to inspect its contents.
Note that if you are running this notebook interactively, you can simply
type the variable name (here ``ds_wasp``) in a cell to get an interactive
display of the dataset's contents.

.. GENERATED FROM PYTHON SOURCE LINES 28-32

.. code-block:: Python

    ds_wasp = sample_data.fetch_dataset("DLC_single-wasp.predictions.h5")
    print(ds_wasp)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    <xarray.Dataset> Size: 61kB
    Dimensions:      (time: 1085, space: 2, keypoints: 2, individuals: 1)
    Coordinates:
      * time         (time) float64 9kB 0.0 0.025 0.05 0.075 ... 27.05 27.07 27.1
      * space        (space)

` to compute the rolling mean, maximum, and minimum values (instead of the
median), by setting ``statistic`` to ``"mean"``, ``"max"``, or ``"min"``,
respectively.

.. GENERATED FROM PYTHON SOURCE LINES 132-143

.. code-block:: Python

    window = int(0.1 * ds_wasp.fps)
    ds_wasp_smooth = ds_wasp.copy()
    ds_wasp_smooth.update(
        {
            "position": rolling_filter(
                ds_wasp.position, window, statistic="median", print_report=True
            )
        }
    )

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    No missing points (marked as NaN) in input.
    No missing points (marked as NaN) in output.

.. raw:: html

<xarray.Dataset> Size: 61kB
    Dimensions:      (time: 1085, space: 2, keypoints: 2, individuals: 1)
    Coordinates:
      * time         (time) float64 9kB 0.0 0.025 0.05 0.075 ... 27.05 27.07 27.1
      * space        (space) <U1 8B 'x' 'y'
      * keypoints    (keypoints) <U7 56B 'head' 'stinger'
      * individuals  (individuals) <U12 48B 'individual_0'
    Data variables:
        position     (time, space, keypoints, individuals) float64 35kB 1.086e+03...
        confidence   (time, keypoints, individuals) float64 17kB 0.05305 ... 0.0
    Attributes:
        source_software:  DeepLabCut
        ds_type:          poses
        fps:              40.0
        time_unit:        seconds
        source_file:      /home/runner/.movement/data/poses/DLC_single-wasp.predi...
        frame_path:       /home/runner/.movement/data/frames/single-wasp_frame-10...


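The ``statistic`` options mentioned above differ mainly in how they treat
outliers. The following is a minimal pure-Python sketch, not ``movement``'s
implementation: ``rolling_stat`` is a hypothetical helper, and truncating the
window at the edges is an assumption made for brevity. It compares a rolling
median with a rolling mean on a toy signal containing a one-frame "spike".

.. code-block:: Python

    from statistics import mean, median


    def rolling_stat(values, window, stat):
        """Apply ``stat`` over a centred, odd-length window.

        Assumption: the window is simply truncated at the edges.
        """
        half = window // 2
        return [
            stat(values[max(0, i - half): i + half + 1])
            for i in range(len(values))
        ]


    # Toy track: constant position with a single-frame spike at index 3.
    signal = [10.0, 10.0, 10.0, 50.0, 10.0, 10.0, 10.0]

    smooth_median = rolling_stat(signal, window=3, stat=median)
    smooth_mean = rolling_stat(signal, window=3, stat=mean)

    print(smooth_median)  # the spike is gone: every value is 10.0
    print(smooth_mean)    # the spike is smeared over three frames

With a 3-frame window the median ignores the lone outlier, whereas the mean
spreads it over the neighbouring frames, which is why the median is the usual
choice for removing brief jumps.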
.. GENERATED FROM PYTHON SOURCE LINES 144-147

The report printed by the rolling filter shows that the wasp dataset has no
missing values either before or after smoothing. Let's visualise the effects
of applying the rolling median filter in the time and frequency domains.

.. GENERATED FROM PYTHON SOURCE LINES 147-152

.. code-block:: Python

    plot_raw_and_smooth_timeseries_and_psd(
        ds_wasp, ds_wasp_smooth, keypoint="stinger"
    )

.. image-sg:: /examples/images/sphx_glr_smooth_001.png
   :alt: Time Domain, Frequency Domain
   :srcset: /examples/images/sphx_glr_smooth_001.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 153-163

We see that applying the filter has removed the "spikes" present around the
14-second mark in the raw data. However, it has not dealt with the big shift
occurring during the final second. In the frequency domain, we can see that
the filter has reduced the power in the high-frequency components, without
affecting the low-frequency components.

This shows what the rolling median is good for: removing brief "spikes"
(e.g. a keypoint abruptly jumping to a different location for a frame or two)
and high-frequency "jitter" (often present due to pose estimation
working on a per-frame basis).

.. GENERATED FROM PYTHON SOURCE LINES 165-172

Choosing parameters for the rolling filter
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
We can control the behaviour of the rolling filter via three parameters:
``window``, ``min_periods`` and ``statistic``, the last of which was
mentioned above. To better understand the effect of these parameters,
let's use a dataset that contains missing values.

.. GENERATED FROM PYTHON SOURCE LINES 172-176

.. code-block:: Python

    ds_mouse = sample_data.fetch_dataset("SLEAP_single-mouse_EPM.analysis.h5")
    print(ds_mouse)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    <xarray.Dataset> Size: 1MB
    Dimensions:      (time: 18485, space: 2, keypoints: 6, individuals: 1)
    Coordinates:
      * time         (time) float64 148kB 0.0 0.03333 0.06667 ... 616.1 616.1 616.1
      * space        (space)
<xarray.Dataset> Size: 1MB
    Dimensions:      (time: 18485, space: 2, keypoints: 6, individuals: 1)
    Coordinates:
      * time         (time) float64 148kB 0.0 0.03333 0.06667 ... 616.1 616.1 616.1
      * space        (space) <U1 8B 'x' 'y'
      * keypoints    (keypoints) <U9 216B 'snout' 'left_ear' ... 'tail_end'
      * individuals  (individuals) <U4 16B 'id_0'
    Data variables:
        position     (time, space, keypoints, individuals) float32 887kB nan ... ...
        confidence   (time, keypoints, individuals) float32 444kB nan nan ... 0.7607
    Attributes:
        source_software:  SLEAP
        ds_type:          poses
        fps:              30.0
        time_unit:        seconds
        source_file:      /home/runner/.movement/data/poses/SLEAP_single-mouse_EP...
        frame_path:       /home/runner/.movement/data/frames/single-mouse_EPM_fra...


.. GENERATED FROM PYTHON SOURCE LINES 195-206

The report informs us that the raw data contains NaN values, most of which
occur at the ``snout`` and ``tail_end`` keypoints. After filtering, the
number of NaNs has increased. This is because the default behaviour of the
rolling filter is to propagate NaN values, i.e. if any value in the rolling
window is NaN, the output will also be NaN.

To modify this behaviour, you can set the value of the ``min_periods``
parameter to an integer value. This parameter determines the minimum number
of non-NaN values required in the window for the output to be non-NaN.
For example, setting ``min_periods=2`` means that two non-NaN values in the
window are sufficient for the median to be calculated. Let's try this.

.. GENERATED FROM PYTHON SOURCE LINES 206-219

.. code-block:: Python

    ds_mouse_smooth.update(
        {
            "position": rolling_filter(
                ds_mouse.position,
                window,
                min_periods=2,
                statistic="median",
                print_report=True,
            )
        }
    )

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Missing points (marked as NaN) in input:

    keypoints                  snout           left_ear          right_ear             centre          tail_base            tail_end
    individuals
    id_0         4494/18485 (24.31%)  513/18485 (2.78%)  533/18485 (2.88%)  490/18485 (2.65%)  704/18485 (3.81%)  2496/18485 (13.5%)

    Missing points (marked as NaN) in output:

    keypoints                 snout           left_ear          right_ear             centre          tail_base             tail_end
    individuals
    id_0         4455/18485 (24.1%)  487/18485 (2.63%)  507/18485 (2.74%)  465/18485 (2.52%)  673/18485 (3.64%)  2428/18485 (13.13%)

.. raw:: html

<xarray.Dataset> Size: 1MB
    Dimensions:      (time: 18485, space: 2, keypoints: 6, individuals: 1)
    Coordinates:
      * time         (time) float64 148kB 0.0 0.03333 0.06667 ... 616.1 616.1 616.1
      * space        (space) <U1 8B 'x' 'y'
      * keypoints    (keypoints) <U9 216B 'snout' 'left_ear' ... 'tail_end'
      * individuals  (individuals) <U4 16B 'id_0'
    Data variables:
        position     (time, space, keypoints, individuals) float32 887kB nan ... ...
        confidence   (time, keypoints, individuals) float32 444kB nan nan ... 0.7607
    Attributes:
        source_software:  SLEAP
        ds_type:          poses
        fps:              30.0
        time_unit:        seconds
        source_file:      /home/runner/.movement/data/poses/SLEAP_single-mouse_EP...
        frame_path:       /home/runner/.movement/data/frames/single-mouse_EPM_fra...


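The effect of ``min_periods`` described above can be reproduced on a toy
track. This is a pure-Python sketch rather than ``movement``'s actual
implementation: ``rolling_median`` and ``count_nans`` are hypothetical
helpers, and the way the window is truncated at the edges is an assumption.

.. code-block:: Python

    import math
    from statistics import median


    def rolling_median(values, window, min_periods=None):
        """Centred rolling median (odd ``window``) that skips NaNs.

        The output is NaN unless the window holds at least ``min_periods``
        valid values. With ``min_periods=None`` every value in the window
        must be valid, so a single NaN propagates to the output.
        """
        half = window // 2
        out = []
        for i in range(len(values)):
            chunk = values[max(0, i - half): i + half + 1]
            valid = [v for v in chunk if not math.isnan(v)]
            required = len(chunk) if min_periods is None else min_periods
            out.append(median(valid) if len(valid) >= required else math.nan)
        return out


    def count_nans(values):
        """Count the NaN entries in a list of floats."""
        return sum(math.isnan(v) for v in values)


    nan = math.nan
    track = [1.0, 2.0, 3.0, nan, 5.0, 6.0, 7.0, 8.0, nan, nan, 11.0, 12.0, 13.0]

    strict = rolling_median(track, window=3)                  # NaNs propagate
    lenient = rolling_median(track, window=3, min_periods=2)  # two valid values suffice

    print(count_nans(track), count_nans(strict), count_nans(lenient))  # 3 7 2

The toy track shows the same pattern as the reports: the strict mode turns 3
missing frames into 7, while ``min_periods=2`` bridges the single-frame gap
and leaves only the two-frame gap unfilled.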
.. GENERATED FROM PYTHON SOURCE LINES 220-226

With ``min_periods=2``, the number of NaN values in the output is lower than
in the input for every keypoint.
Let's visualise the effects of the rolling median filter in the time and
frequency domains. Here we focus on the first 80 seconds for the ``snout``
keypoint. You can adjust the ``keypoint`` and ``time_range`` arguments to
explore other parts of the data.

.. GENERATED FROM PYTHON SOURCE LINES 226-231

.. code-block:: Python

    plot_raw_and_smooth_timeseries_and_psd(
        ds_mouse, ds_mouse_smooth, keypoint="snout", time_range=slice(0, 80)
    )

.. image-sg:: /examples/images/sphx_glr_smooth_002.png
   :alt: Time Domain, Frequency Domain
   :srcset: /examples/images/sphx_glr_smooth_002.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 232-236

The smoothing once again reduces the power of high-frequency components, but
the resulting time series stays quite close to the raw data.

What happens if we increase the ``window`` to 2 seconds (60 frames)?

.. GENERATED FROM PYTHON SOURCE LINES 236-250

.. code-block:: Python

    window = int(2 * ds_mouse.fps)
    ds_mouse_smooth.update(
        {
            "position": rolling_filter(
                ds_mouse.position,
                window,
                min_periods=2,
                statistic="median",
                print_report=True,
            )
        }
    )

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Missing points (marked as NaN) in input:

    keypoints                  snout           left_ear          right_ear             centre          tail_base            tail_end
    individuals
    id_0         4494/18485 (24.31%)  513/18485 (2.78%)  533/18485 (2.88%)  490/18485 (2.65%)  704/18485 (3.81%)  2496/18485 (13.5%)

    Missing points (marked as NaN) in output:

    keypoints               snout          left_ear         right_ear            centre         tail_base           tail_end
    individuals
    id_0         795/18485 (4.3%)  80/18485 (0.43%)  80/18485 (0.43%)  80/18485 (0.43%)  80/18485 (0.43%)  239/18485 (1.29%)

.. raw:: html

<xarray.Dataset> Size: 1MB
    Dimensions:      (time: 18485, space: 2, keypoints: 6, individuals: 1)
    Coordinates:
      * time         (time) float64 148kB 0.0 0.03333 0.06667 ... 616.1 616.1 616.1
      * space        (space) <U1 8B 'x' 'y'
      * keypoints    (keypoints) <U9 216B 'snout' 'left_ear' ... 'tail_end'
      * individuals  (individuals) <U4 16B 'id_0'
    Data variables:
        position     (time, space, keypoints, individuals) float32 887kB nan ... ...
        confidence   (time, keypoints, individuals) float32 444kB nan nan ... 0.7607
    Attributes:
        source_software:  SLEAP
        ds_type:          poses
        fps:              30.0
        time_unit:        seconds
        source_file:      /home/runner/.movement/data/poses/SLEAP_single-mouse_EP...
        frame_path:       /home/runner/.movement/data/frames/single-mouse_EPM_fra...


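Each filter call above derives ``window`` from a duration in seconds as
``int(seconds * fps)``. The sketch below wraps that conversion in a
hypothetical helper, ``seconds_to_frames``, which is not part of
``movement``. Because ``int`` truncates, a floating-point product that lands
just below a whole number would lose a frame, so the helper rounds instead.
That is a design choice of this sketch, and for the durations used in this
example both approaches give the same result.

.. code-block:: Python

    def seconds_to_frames(duration_s, fps):
        """Convert a window duration in seconds to a whole number of frames."""
        return max(1, round(duration_s * fps))


    # Window sizes used in this example (wasp: 40 fps, mouse: 30 fps).
    print(seconds_to_frames(0.1, 40.0))  # 4
    print(seconds_to_frames(0.1, 30.0))  # 3
    print(seconds_to_frames(2, 30.0))    # 60

    # Truncation pitfall: 0.29 * 100 is stored as 28.999999999999996.
    print(int(0.29 * 100), seconds_to_frames(0.29, 100))  # 28 29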
.. GENERATED FROM PYTHON SOURCE LINES 251-256

With the 2-second window, the number of NaN values has decreased even
further. That's because the chance of finding at least 2 valid values within
a 2-second window (i.e. 60 frames) is quite high.
Let's plot the results for the same keypoint and time range
as before.

.. GENERATED FROM PYTHON SOURCE LINES 256-260

.. code-block:: Python

    plot_raw_and_smooth_timeseries_and_psd(
        ds_mouse, ds_mouse_smooth, keypoint="snout", time_range=slice(0, 80)
    )

.. image-sg:: /examples/images/sphx_glr_smooth_003.png
   :alt: Time Domain, Frequency Domain
   :srcset: /examples/images/sphx_glr_smooth_003.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 261-269

We see that the filtered time series is much smoother and it has even
"bridged" over some small gaps. That said, it often deviates from the raw
data, in ways that may not be desirable, depending on the application.
Here, our choice of ``window`` may be too large.
In general, you should choose a ``window`` that is small enough to
preserve the original data structure, but large enough to remove
"spikes" and high-frequency noise. Always inspect the results to ensure
that the filter is not removing important features.

.. GENERATED FROM PYTHON SOURCE LINES 271-287

Smoothing with a Savitzky-Golay filter
--------------------------------------
Here we apply the :func:`movement.filtering.savgol_filter` function
(a wrapper around :func:`scipy.signal.savgol_filter`) to the ``position``
data variable.
The Savitzky-Golay filter is a polynomial smoothing filter that can be
applied to time series data on a rolling window basis.
A polynomial with a degree specified by ``polyorder`` is fitted to each
data segment, whose length is given by ``window``.
The value of the polynomial at the midpoint of each ``window`` is then
used as the output value.

Let's try it on the mouse dataset, this time using a 0.2-second
window (i.e. 6 frames) and the default ``polyorder=2`` for smoothing.
As before, we first compute the corresponding number of observations
to be used as the ``window`` size.

.. GENERATED FROM PYTHON SOURCE LINES 287-293

.. code-block:: Python

    window = int(0.2 * ds_mouse.fps)
    ds_mouse_smooth.update(
        {"position": savgol_filter(ds_mouse.position, window, print_report=True)}
    )

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Missing points (marked as NaN) in input:

    keypoints                  snout           left_ear          right_ear             centre          tail_base            tail_end
    individuals
    id_0         4494/18485 (24.31%)  513/18485 (2.78%)  533/18485 (2.88%)  490/18485 (2.65%)  704/18485 (3.81%)  2496/18485 (13.5%)

    Missing points (marked as NaN) in output:

    keypoints                  snout           left_ear         right_ear             centre           tail_base             tail_end
    individuals
    id_0         5810/18485 (31.43%)  895/18485 (4.84%)  905/18485 (4.9%)  839/18485 (4.54%)  1186/18485 (6.42%)  3801/18485 (20.56%)

.. raw:: html

<xarray.Dataset> Size: 1MB
    Dimensions:      (time: 18485, space: 2, keypoints: 6, individuals: 1)
    Coordinates:
      * time         (time) float64 148kB 0.0 0.03333 0.06667 ... 616.1 616.1 616.1
      * space        (space) <U1 8B 'x' 'y'
      * keypoints    (keypoints) <U9 216B 'snout' 'left_ear' ... 'tail_end'
      * individuals  (individuals) <U4 16B 'id_0'
    Data variables:
        position     (time, space, keypoints, individuals) float32 887kB nan ... ...
        confidence   (time, keypoints, individuals) float32 444kB nan nan ... 0.7607
    Attributes:
        source_software:  SLEAP
        ds_type:          poses
        fps:              30.0
        time_unit:        seconds
        source_file:      /home/runner/.movement/data/poses/SLEAP_single-mouse_EP...
        frame_path:       /home/runner/.movement/data/frames/single-mouse_EPM_fra...


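The description of the Savitzky-Golay filter above (fit a polynomial to each
window and keep its value at the midpoint) is equivalent to a weighted moving
average with fixed coefficients. The sketch below hard-codes the well-known
coefficients for a 5-point window and ``polyorder=2``, namely
(-3, 12, 17, 12, -3) / 35. It is an illustration only: ``savgol_5pt`` is a
hypothetical helper, the window size differs from the one used above, and
edge frames are simply left as NaN, unlike :func:`scipy.signal.savgol_filter`.

.. code-block:: Python

    import math

    # Least-squares quadratic fit over 5 equally spaced points,
    # evaluated at the centre point.
    COEFFS = [-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35]


    def savgol_5pt(values):
        """Smooth interior points with the 5-point quadratic kernel."""
        out = [math.nan] * len(values)
        for i in range(2, len(values) - 2):
            window = values[i - 2: i + 3]
            out[i] = sum(c * v for c, v in zip(COEFFS, window))
        return out


    # A quadratic signal is reproduced (up to rounding) by a quadratic fit.
    parabola = [float(t * t) for t in range(12)]
    smoothed = savgol_5pt(parabola)
    print(all(math.isclose(s, p) for s, p in zip(smoothed[2:-2], parabola[2:-2])))  # True

    # A single NaN contaminates every window that contains it.
    gappy = parabola.copy()
    gappy[5] = math.nan
    print(sum(math.isnan(v) for v in savgol_5pt(gappy)[2:-2]))  # 5

In this sketch one missing frame becomes five missing frames in the output,
one for each window position that overlaps it.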
.. GENERATED FROM PYTHON SOURCE LINES 294-300

The report shows that the number of NaN values has increased after
Savitzky-Golay filtering. This is for the same reason as with the rolling
filter (in its default mode), i.e. if there is at least one NaN value in the
window, the output will be NaN.
Unlike the rolling filter, the Savitzky-Golay filter does not provide a
``min_periods`` parameter to control this behaviour. Let's visualise the
effects in the time and frequency domains.

.. GENERATED FROM PYTHON SOURCE LINES 300-304

.. code-block:: Python

    plot_raw_and_smooth_timeseries_and_psd(
        ds_mouse, ds_mouse_smooth, keypoint="snout", time_range=slice(0, 80)
    )

.. image-sg:: /examples/images/sphx_glr_smooth_004.png
   :alt: Time Domain, Frequency Domain
   :srcset: /examples/images/sphx_glr_smooth_004.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 305-307

Once again, the power of high-frequency components has been reduced, but more
missing values have been introduced.

.. GENERATED FROM PYTHON SOURCE LINES 309-310

Now let's apply the same Savitzky-Golay filter to the wasp dataset.

.. GENERATED FROM PYTHON SOURCE LINES 310-316

.. code-block:: Python

    window = int(0.2 * ds_wasp.fps)
    ds_wasp_smooth.update(
        {"position": savgol_filter(ds_wasp.position, window, print_report=True)}
    )

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    No missing points (marked as NaN) in input.
    No missing points (marked as NaN) in output.

.. raw:: html

<xarray.Dataset> Size: 61kB
    Dimensions:      (time: 1085, space: 2, keypoints: 2, individuals: 1)
    Coordinates:
      * time         (time) float64 9kB 0.0 0.025 0.05 0.075 ... 27.05 27.07 27.1
      * space        (space) <U1 8B 'x' 'y'
      * keypoints    (keypoints) <U7 56B 'head' 'stinger'
      * individuals  (individuals) <U12 48B 'individual_0'
    Data variables:
        position     (time, space, keypoints, individuals) float64 35kB 1.086e+03...
        confidence   (time, keypoints, individuals) float64 17kB 0.05305 ... 0.0
    Attributes:
        source_software:  DeepLabCut
        ds_type:          poses
        fps:              40.0
        time_unit:        seconds
        source_file:      /home/runner/.movement/data/poses/DLC_single-wasp.predi...
        frame_path:       /home/runner/.movement/data/frames/single-wasp_frame-10...


.. GENERATED FROM PYTHON SOURCE LINES 317-320

.. code-block:: Python

    plot_raw_and_smooth_timeseries_and_psd(
        ds_wasp, ds_wasp_smooth, keypoint="stinger"
    )

.. image-sg:: /examples/images/sphx_glr_smooth_005.png
   :alt: Time Domain, Frequency Domain
   :srcset: /examples/images/sphx_glr_smooth_005.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 321-329

This example shows two important limitations of the Savitzky-Golay filter.
First, the filter can introduce artefacts around sharp boundaries. For
example, focus on what happens around the sudden drop in position
during the final second. Second, the power spectral density (PSD) appears to
have large periodic drops at certain frequencies. Both of these effects vary
with the choice of ``window`` and ``polyorder``. You can read more about
these and other limitations of the Savitzky-Golay filter in
`this paper `_.

.. GENERATED FROM PYTHON SOURCE LINES 332-340

Combining multiple smoothing filters
------------------------------------
We can also combine multiple smoothing filters by applying them
sequentially. For example, we can first apply the rolling median filter with
a small ``window`` to remove "spikes" and then apply the Savitzky-Golay filter
with a larger ``window`` to further smooth the data.
Between the two filters, we can interpolate over small gaps to avoid the
excessive proliferation of NaN values. Let's try this on the mouse dataset.

.. GENERATED FROM PYTHON SOURCE LINES 340-364

.. code-block:: Python

    # First, we will apply the rolling median filter.
    window = int(0.1 * ds_mouse.fps)
    ds_mouse_smooth.update(
        {
            "position": rolling_filter(
                ds_mouse.position, window, min_periods=2, statistic="median"
            )
        }
    )

    # Next, let's linearly interpolate over gaps smaller
    # than 1 second (30 frames).
    ds_mouse_smooth.update(
        {"position": interpolate_over_time(ds_mouse_smooth.position, max_gap=30)}
    )

    # Finally, let's apply the Savitzky-Golay filter
    # over a 0.4-second window (12 frames).
    window = int(0.4 * ds_mouse.fps)
    ds_mouse_smooth.update(
        {"position": savgol_filter(ds_mouse_smooth.position, window)}
    )

.. raw:: html

<xarray.Dataset> Size: 1MB
    Dimensions:      (time: 18485, space: 2, keypoints: 6, individuals: 1)
    Coordinates:
      * time         (time) float64 148kB 0.0 0.03333 0.06667 ... 616.1 616.1 616.1
      * space        (space) <U1 8B 'x' 'y'
      * keypoints    (keypoints) <U9 216B 'snout' 'left_ear' ... 'tail_end'
      * individuals  (individuals) <U4 16B 'id_0'
    Data variables:
        position     (time, space, keypoints, individuals) float32 887kB nan ... ...
        confidence   (time, keypoints, individuals) float32 444kB nan nan ... 0.7607
    Attributes:
        source_software:  SLEAP
        ds_type:          poses
        fps:              30.0
        time_unit:        seconds
        source_file:      /home/runner/.movement/data/poses/SLEAP_single-mouse_EP...
        frame_path:       /home/runner/.movement/data/frames/single-mouse_EPM_fra...


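The role of the interpolation step between the two filters can be seen on a
toy track. The sketch below is pure Python and does not call ``movement``:
``fill_small_gaps`` is a hypothetical helper, and its rule (fill a run of
NaNs only when it is at most ``max_gap`` frames long and has valid neighbours
on both sides) is an assumption that may not match the exact ``max_gap``
semantics of ``interpolate_over_time``.

.. code-block:: Python

    import math


    def fill_small_gaps(values, max_gap):
        """Linearly interpolate interior NaN runs of at most ``max_gap`` frames."""
        out = list(values)
        i, n = 0, len(out)
        while i < n:
            if not math.isnan(out[i]):
                i += 1
                continue
            start = i  # first NaN of the run
            while i < n and math.isnan(out[i]):
                i += 1
            end = i  # first valid index after the run (or n at the end)
            bounded = start > 0 and end < n
            if bounded and (end - start) <= max_gap:
                left, right = out[start - 1], out[end]
                span = end - (start - 1)
                for k in range(start, end):
                    out[k] = left + (right - left) * (k - (start - 1)) / span
        return out


    nan = math.nan
    track = [0.0, nan, 2.0, nan, nan, nan, 6.0, nan]

    # The 1-frame gap is filled; the 3-frame gap and the trailing NaN are kept.
    print(fill_small_gaps(track, max_gap=2))

Filling only short gaps keeps a later filter from spreading each isolated NaN
across a whole window, while longer stretches of missing data stay missing
rather than being invented.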
.. GENERATED FROM PYTHON SOURCE LINES 365-368

A record of all applied operations is stored in the ``log`` attribute of the
``ds_mouse_smooth.position`` data array. Let's inspect it to summarise
what we've done.

.. GENERATED FROM PYTHON SOURCE LINES 368-371

.. code-block:: Python

    print(ds_mouse_smooth.position.log)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [
      {
        "operation": "rolling_filter",
        "datetime": "2025-09-23 10:00:56.409888",
        "window": "3",
        "statistic": "'median'",
        "min_periods": "2",
        "print_report": "False"
      },
      {
        "operation": "interpolate_over_time",
        "datetime": "2025-09-23 10:00:56.434722",
        "method": "'linear'",
        "max_gap": "30",
        "print_report": "False"
      },
      {
        "operation": "savgol_filter",
        "datetime": "2025-09-23 10:00:56.438751",
        "window": "12",
        "polyorder": "2",
        "print_report": "False"
      }
    ]

.. GENERATED FROM PYTHON SOURCE LINES 372-374

Now let's visualise the difference between the raw data and the final
smoothed result.

.. GENERATED FROM PYTHON SOURCE LINES 374-382

.. code-block:: Python

    plot_raw_and_smooth_timeseries_and_psd(
        ds_mouse,
        ds_mouse_smooth,
        keypoint="snout",
        time_range=slice(0, 80),
    )

.. image-sg:: /examples/images/sphx_glr_smooth_006.png
   :alt: Time Domain, Frequency Domain
   :srcset: /examples/images/sphx_glr_smooth_006.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 383-385

Feel free to play around with the parameters of the applied filters and to
also look at other keypoints and time ranges.

.. GENERATED FROM PYTHON SOURCE LINES 387-391

.. seealso::
  :ref:`examples/filter_and_interpolate:Filtering multiple data variables`
  in the
  :ref:`sphx_glr_examples_filter_and_interpolate.py` example.

.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 1.654 seconds)

.. _sphx_glr_download_examples_smooth.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: binder-badge

      .. image:: images/binder_badge_logo.svg
        :target: https://mybinder.org/v2/gh/neuroinformatics-unit/movement/gh-pages?filepath=notebooks/examples/smooth.ipynb
        :alt: Launch binder
        :width: 150 px

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: smooth.ipynb <smooth.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: smooth.py <smooth.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: smooth.zip <smooth.zip>`

.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_