Cosmos-UK Soil Moisture (UKCEH)#

This notebook loads and visualises daily hydrometeorological and soil data from the public 2013-2019 COSMOS-UK dataset.

Sensor description#

Since 2013 the UK Centre for Ecology & Hydrology (UKCEH) has established the world’s most spatially dense national network of cosmic-ray neutron sensors (CRNSs) to monitor soil moisture across the UK. The Cosmic-ray Soil Moisture Observing System for the UK (COSMOS-UK) delivers field-scale volumetric water content (VWC) measurements for around 50 sites in near-real time. In addition to measuring field-scale (or local) soil moisture, the network collects a large number of hydrometeorological and soil variables, including VWC measured by point-scale (or site) soil moisture sensors.

This notebook explores a subset of 4 of the 51 stations available in the public COSMOS-UK dataset. These stations were the first sites to prototype COSMOS sensors in the UK (see Evans et al. (2016) for further details). Three of them are situated in human-intervened areas (grassland and cropland) and one is in a woodland land cover site.

A video available on the UKCEH YouTube channel summarises the concept of cosmic-ray neutron sensors and how they provide non-invasive soil moisture measurements at field scale.


  • Fetch COSMOS-UK dataset files through intake.

  • Inspect the available metadata with information about the sites, their locations and other site-specific attributes.

  • Explore relationships between daily mean soil moisture and potential evapotranspiration derived from the meteorological measurements at the site.

  • Analyse yearly change of daily mean soil moisture observations.

  • Compare local and site soil moisture measurements at daily resolution.



  • Alejandro Coca-Castro (author), The Alan Turing Institute, @acocac

  • Doran Khamis (reviewer), UK Centre for Ecology & Hydrology, @dorankhamis

  • Matt Fry (reviewer), UK Centre for Ecology & Hydrology, @mattfry-ceh

Dataset originator/creator#

  • UK Centre for Ecology & Hydrology (creator)

  • Natural Environment Research Council (support)

Dataset reference and documentation#

  • S. Stanley, V. Antoniou, A. Askquith-Ellis, L.A. Ball, E.S. Bennett, J.R. Blake, D.B. Boorman, M. Brooks, M. Clarke, H.M. Cooper, N. Cowan, A. Cumming, J.G. Evans, P. Farrand, M. Fry, O.E. Hitt, W.D. Lord, R. Morrison, G.V. Nash, D. Rylett, P.M. Scarlett, O.D. Swain, M. Szczykulska, J.L. Thornton, E.J. Trill, A.C. Warwick, and B. Winterbourn. Daily and sub-daily hydrometeorological and soil data (2013-2019) [COSMOS-UK]. 2021. doi:10.5285/b5c190e4-e35d-40ea-8fbe-598da03a1185.

Further references#

  • J. G. Evans, H. C. Ward, J. R. Blake, E. J. Hewitt, R. Morrison, M. Fry, L. A. Ball, L. C. Doughty, J. W. Libre, O. E. Hitt, D. Rylett, R. J. Ellis, A. C. Warwick, M. Brooks, M. A. Parkes, G. M. H. Wright, A. C. Singer, D. B. Boorman, and A. Jenkins. Soil water content in southern England derived from a cosmic-ray soil moisture observing system – COSMOS-UK. Hydrological Processes, 30:4987–4999, 2016. doi:10.1002/hyp.10929.

  • M. Zreda, W. J. Shuttleworth, X. Zeng, C. Zweck, D. Desilets, T. Franz, and R. Rosolem. COSMOS: the cosmic-ray soil moisture observing system. Hydrology and Earth System Sciences, 16(11):4079–4099, 2012. doi:10.5194/hess-16-4079-2012.


Data from COSMOS-UK up to the end of 2019 are available for download from the UKCEH Environmental Information Data Centre (EIDC). The data are accompanied by documentation that describes the site-specific instrumentation, data and processing including quality control. The dataset is available for download under the terms of the Open Government Licence.

COSMOS-UK work was supported by the Natural Environment Research Council award number NE/R016429/1 as part of the UK-SCAPE programme delivering National Capability.

Load libraries#

import os
import pandas as pd
import intake
import holoviews as hv
import panel as pn
import matplotlib.pyplot as plt
from bokeh.models.formatters import DatetimeTickFormatter
from datetime import datetime

import hvplot.pandas
import hvplot.xarray  # noqa

import pooch

import warnings

pd.options.display.max_columns = 10

Set project structure#

notebook_folder = './notebook'
# create the working folder if it does not exist yet
if not os.path.exists(notebook_folder):
    os.makedirs(notebook_folder)

Fetch and load data#

Let’s download the sample data. We use pooch to fetch and unzip them directly from a Zenodo repository.

Load an intake catalog for the downloaded data

# set catalogue location
catalog_file = os.path.join(notebook_folder, 'catalog.yaml')

# write an intake catalogue describing the downloaded files
with open(catalog_file, 'w') as f:
    f.write('''
sources:
  data_siteid:
    driver: csv
    parameters:
      stationid:
        description: five letter code for the COSMOS-UK site
        type: str
        default: CHIMN
      resolution:
        description: temporal resolution
        type: str
        default: Daily
        allowed:
          - Daily
          - Hourly
          - SH
    args:
      urlpath: "{{ CATALOG_DIR }}/data/COSMOS-UK_HydroSoil_{{resolution}}_2013-2019/COSMOS-UK_{{stationid}}_HydroSoil_{{resolution}}_2013-2019.csv"
      csv_kwargs:
        na_values: [-9999]
        parse_dates: ['DATE_TIME']

  data_all:
    driver: csv
    parameters:
      resolution:
        description: temporal resolution
        type: str
        default: Daily
        allowed:
          - Daily
          - Hourly
          - SH
    args:
      urlpath: "{{ CATALOG_DIR }}/data/COSMOS-UK_HydroSoil_{{resolution}}_2013-2019/COSMOS-UK_*.csv"
      csv_kwargs:
        na_values: [-9999]
        parse_dates: ['DATE_TIME']

  metadata_sites:
    driver: csv
    args:
      urlpath: "{{ CATALOG_DIR }}/data/COSMOS-UK_SiteMetadata_2013-2019.csv"
      csv_kwargs:
        header: 0
        parse_dates: ['START_DATE', 'END_DATE']

  metadata_measurements:
    driver: csv
    parameters:
      resolution:
        description: temporal resolution
        type: str
        default: Daily
        allowed:
          - Daily
          - Hourly
          - SH
    args:
      urlpath: "{{ CATALOG_DIR }}/data/COSMOS-UK_HydroSoil_{{resolution}}_2013-2019_Metadata.csv"
      csv_kwargs:
        header: 0

  location:
    driver: intake_xarray.image.ImageSource
    parameters:
      stationid:
        description: five letter code for the COSMOS-UK site
        type: str
        default: CHIMN
    args:
      urlpath: "{{stationid}}.jpg"
    storage_options: {'anon': True}
''')

cat = intake.open_catalog(catalog_file)
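
The `na_values` and `parse_dates` entries in the catalogue are keyword arguments for reading the CSV files. As a minimal sketch of their effect, assuming they are passed through to `pandas.read_csv`, the snippet below reads a small, hypothetical two-row extract laid out like a daily COSMOS-UK file. The values are made up for illustration and are not taken from the dataset.

```python
import io

import pandas as pd

# Hypothetical extract in the layout of a daily COSMOS-UK file (made-up values)
sample_csv = io.StringIO(
    "DATE_TIME,SITE_ID,COSMOS_VWC,PE\n"
    "2014-01-01,CHIMN,45.2,0.4\n"
    "2014-01-02,CHIMN,-9999,0.5\n"
)

# -9999 is treated as missing and DATE_TIME is parsed into timestamps
df = pd.read_csv(sample_csv, na_values=[-9999], parse_dates=['DATE_TIME'])

print(df.dtypes)
print(df)
```

With these options the sentinel value becomes `NaN`, so it does not distort means or plots computed later.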

Load metadata#

Here we load COSMOS-UK metadata into memory. The metadata contains multiple columns about the sites, their locations and other site-specific attributes.

metadata = cat.metadata_sites().read()
0 Chimney Meadows CHIMN 2013-10-02 NaT 436113 ... 0.095 0.027 0.0016 0.011 0.0018
1 Sheepdrove SHEEP 2013-10-24 NaT 436039 ... 0.140 0.059 0.0086 0.027 0.0060
2 Waddesdon WADDN 2013-11-04 NaT 472548 ... 0.160 0.034 0.0023 0.021 0.0021
3 Wytham Woods WYTH1 2013-11-26 2016-10-01 445738 ... 0.190 0.028 0.0044 0.017 0.0047

4 rows × 19 columns

For this example, we explore a subset of four stations, all of which started operating in 2013. Only the Wytham Woods station has ceased, with an END_DATE of 1 October 2016 (2016-10-01 in the table above). This station is situated in broadleaf woodland, which also hosts Environmental Change Network (ECN) and FLUXNET monitoring sites. The dataframe below contains each site name, ID and corresponding land cover. CHIMN and WADDN are situated in improved grassland, and SHEEP is in arable and horticulture.

0 Chimney Meadows CHIMN Improved grassland
1 Sheepdrove SHEEP Arable and horticulture (previously improved g...
2 Waddesdon WADDN Improved grassland
3 Wytham Woods WYTH1 Broadleaf woodland
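
The subset above can be reproduced with plain pandas column selection. The sketch below rebuilds a small stand-in for the metadata table from the values printed in this notebook. `SITE_ID`, `START_DATE` and `END_DATE` appear in the code above, while `SITE_NAME` and `LAND_COVER` are assumed column names.

```python
import pandas as pd

# Stand-in for the site metadata, using values shown in the tables above.
# SITE_NAME and LAND_COVER are assumed column names.
sites = pd.DataFrame({
    'SITE_NAME': ['Chimney Meadows', 'Sheepdrove', 'Waddesdon', 'Wytham Woods'],
    'SITE_ID': ['CHIMN', 'SHEEP', 'WADDN', 'WYTH1'],
    'START_DATE': pd.to_datetime(['2013-10-02', '2013-10-24', '2013-11-04', '2013-11-26']),
    'END_DATE': pd.to_datetime([None, None, None, '2016-10-01']),
    'LAND_COVER': ['Improved grassland', 'Arable and horticulture',
                   'Improved grassland', 'Broadleaf woodland'],
})

# keep only the descriptive columns
summary = sites[['SITE_NAME', 'SITE_ID', 'LAND_COVER']]

# stations with an end date are no longer operating
closed_sites = sites.loc[sites['END_DATE'].notna(), 'SITE_ID'].tolist()

print(summary)
print(closed_sites)
```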

A key feature of COSMOS-UK stations is their capability of monitoring field-scale soil moisture. The CRNS VWC value is an average soil moisture measurement (%) across an estimated, variable footprint with a radius of up to 200 m and an estimated, variable measurement depth of approximately 0.1 to 0.8 m. The measurement depth depends on the soil moisture content as well as on lattice water and soil organic matter water equivalent (see Cooper et al. 2021): the greater the actual soil water content, the shallower the penetration depth. Let’s explore the notional footprint of the analysed stations from the CEH COSMOS-UK website.

# set station selector
station_list = metadata.SITE_ID.tolist()

target_station = pn.widgets.Select(name = 'Station', options = station_list)

def plot_footprint(station):
    # load the footprint image of the selected station as an RGB array
    location_da = cat.location(stationid=station).to_dask()
    p = location_da.hvplot.rgb(x='x', y='y', bands='channel', data_aspect=1, flip_yaxis=True, xaxis=False, yaxis=None, hover=False)
    return p

# redraw the footprint whenever a different station is selected
plot_stations = pn.Row(
    pn.Column(pn.Spacer(height=5), target_station, background='#f0f0f0', sizing_mode="fixed"),
    pn.bind(plot_footprint, target_station),
    width_policy='max', height_policy='max',
)

plot_stations


Load daily data#

Here we load COSMOS-UK daily data into memory. The daily data are the most highly processed level and are derived from the subhourly data. Note that only certain variables are provided at this level.

site_daily_all = cat.data_all(resolution='Daily').read()
['DATE_TIME', 'SITE_ID', 'COSMOS_VWC', 'D86_75M', 'SWE', 'SNOW', 'ALBEDO', 'PE']

To understand the meaning of the above columns, note that the COSMOS-UK dataset includes a separate metadata file for each time resolution. Let’s explore the metadata for daily measurements. The dataframe below gives further details of each variable, including the unit, aggregation and data type. For instance, soil moisture at daily resolution is the daily mean derived from the CRNS.

metadata_daily = cat.metadata_measurements(resolution='Daily').read()
0 PE Potential evapotranspiration 1 Day mm Daily total Derived
1 COSMOS_VWC Soil moisture (CRNS VWC) 1 Day % Daily mean Derived
2 D86_75M Effective depth of CRNS (D86 at 75m) 1 Day cm Daily mean Derived
3 SWE Snow water equivalent (from CRNS) 1 Day mm Value at midday Derived
4 ALBEDO Albedo 1 Day UNITLESS Mean between 10:00 and 14:00 Derived
5 SNOW Snow days 1 Day UNITLESS Daily value Derived


The plot below shows two time series from the daily COSMOS-UK dataset: soil moisture and potential evapotranspiration (PE). PE is the potential evaporation from soils plus transpiration by plants (evapotranspiration), and it assumes there is always adequate moisture to meet the evapotranspiration demand. The relationship between the two variables can be examined in their daily aggregated series in the plot below. Each station also covers a different time span, with WYTH1 having the shortest record.

# set station selector
station_list = metadata.SITE_ID.tolist()

target_station = pn.widgets.Select(name = 'Station', options = station_list)

# set formatter for dates
formatter = DatetimeTickFormatter(months='%b %Y')

def plot_pe_vwc(station):
    daily_dataset = cat.data_siteid(resolution='Daily', stationid=station).read()
    daily_dataset.dropna(subset = ['COSMOS_VWC','PE'], inplace=True) # remove rows with missing values
    p1=daily_dataset.hvplot(x='DATE_TIME', y=['COSMOS_VWC'], xformatter=formatter, xlabel = 'Date', ylabel = 'Volumetric Water Content (%)', color='blue', title='Soil Moisture (CRNS VWC)', line_width=0.8, fontscale=1.2, padding=0.2)
    p2=daily_dataset.hvplot(x='DATE_TIME', y=['PE'], xformatter=formatter, xlabel = 'Date', ylabel = 'Potential Evapotranspiration (mm)', color='red', title='Potential Evapotranspiration (1 day)', line_width=0.8, fontscale=1.2, padding=0.2)

    # stack the two time series vertically
    return (p1 + p2).cols(1)

# redraw both time series whenever a different station is selected
plot_scatterplot = pn.Row(
    pn.Column(pn.Spacer(height=5), target_station, background='#f0f0f0', sizing_mode="fixed"),
    pn.bind(plot_pe_vwc, target_station),
    width_policy='max', height_policy='max',
)

plot_scatterplot



To explore further the seasonal dynamics of the above variables, let’s generate correlation charts grouped by season. The highest values of PE are in the summer followed by spring, fall and winter. The forest site, WYTH1, has higher soil moisture values than the human-intervened places, CHIMN and WADDN (improved grassland), and SHEEP (arable and horticulture site).

def season(df):
    """Add a season column based on the month of DATE_TIME"""
    seasons = {3: 'spring',  4: 'spring',  5: 'spring',
               6: 'summer',  7: 'summer',  8: 'summer',
               9: 'fall',   10: 'fall',   11: 'fall',
              12: 'winter',  1: 'winter',  2: 'winter'}
    return df.assign(season=pd.to_datetime(df['DATE_TIME']).dt.month.map(seasons))

site_daily_all = season(site_daily_all)

# order the seasons chronologically before plotting
custom_dict = {'winter': 0, 'spring': 1, 'summer': 2, 'fall': 3}
plot_season = site_daily_all.sort_values('season', key=lambda x: x.map(custom_dict)).hvplot.scatter(
    x='COSMOS_VWC', y='PE',
    row='season', col='SITE_ID', alpha=0.2, ylabel='PE (mm)', xlabel='VWC (%)',
    fontsize = {'title': 15, 'xticks': 9, 'yticks': 9, 'labels':11}, shared_axes=True,
)

plot_season
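
The seasonal scatter plots can be complemented with a number per panel. The sketch below computes the Pearson correlation between VWC and PE within each season. It runs on synthetic data generated for illustration only (a hypothetical `DEMO` site with an assumed seasonal cycle plus noise), so the resulting values say nothing about the real stations.

```python
import numpy as np
import pandas as pd

# Synthetic daily series for a hypothetical site: PE peaks mid-year,
# VWC dips mid-year, both with random noise (illustration only)
rng = np.random.default_rng(0)
dates = pd.date_range('2014-01-01', '2015-12-31', freq='D')
phase = 2 * np.pi * dates.dayofyear.to_numpy() / 365.25
demo = pd.DataFrame({
    'DATE_TIME': dates,
    'SITE_ID': 'DEMO',
    'COSMOS_VWC': 35 + 10 * np.cos(phase) + rng.normal(0, 2, len(dates)),
    'PE': 2.0 - 1.8 * np.cos(phase) + rng.normal(0, 0.3, len(dates)),
})

# same month-to-season mapping as in the notebook
seasons = {3: 'spring', 4: 'spring', 5: 'spring',
           6: 'summer', 7: 'summer', 8: 'summer',
           9: 'fall', 10: 'fall', 11: 'fall',
           12: 'winter', 1: 'winter', 2: 'winter'}
demo['season'] = demo['DATE_TIME'].dt.month.map(seasons)

# Pearson correlation between VWC and PE within each season
season_order = ['winter', 'spring', 'summer', 'fall']
corr_by_season = (
    demo.groupby('season')[['COSMOS_VWC', 'PE']]
        .apply(lambda g: g['COSMOS_VWC'].corr(g['PE']))
        .reindex(season_order)
)

print(corr_by_season)
```

Applied to `site_daily_all`, the same pattern could group by both `SITE_ID` and `season` to mirror the grid of panels.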


The heatmap below allows us to discover temporal patterns in the daily means of soil moisture. We observe that 2018 contains the most sustained period of low VWC values.

plot_heatmap = site_daily_all.hvplot.heatmap(
    x='DATE_TIME', y='SITE_ID', C='COSMOS_VWC',
    title='Time series of CRNS soil moisture',
    ylabel='Site ID',
    fontsize = {'title': 15, 'xticks': 12, 'yticks': 15},
)

plot_heatmap
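
One way to quantify "consecutive low values" is to measure the longest unbroken run of days below a threshold. The sketch below is one possible approach, not part of the original analysis. The helper `longest_run_below`, the 20 % threshold and the sample values are all hypothetical.

```python
import pandas as pd

def longest_run_below(series, threshold):
    """Length of the longest run of consecutive values below threshold."""
    if series.empty:
        return 0
    below = series < threshold
    # a new run starts wherever the below/not-below state changes
    run_id = (below != below.shift()).cumsum()
    # summing booleans gives the run length for "below" runs and 0 otherwise
    return int(below.groupby(run_id).sum().max())

# Hypothetical daily VWC values (%), for illustration only
vwc = pd.Series(
    [30, 18, 17, 25, 16, 15, 14, 28],
    index=pd.date_range('2018-07-01', periods=8, freq='D'),
)

print(longest_run_below(vwc, threshold=20))
```

Running the helper per site and per year (for example with `groupby`) would give a table that can be compared against the heatmap.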

Load sub-hourly data#

The subhourly data contain all preprocessed weather and soil variables, except those from the CRNS. Let’s explore the columns of the subhourly dataset of one of the stations, SHEEP.

subhourly_dataset = cat.data_siteid(resolution='SH', stationid='SHEEP').read()

As with the daily observations, the metadata file for the subhourly resolution lists the variables' long names, their resolution, units, aggregation details and data types. In this case, most of the variables are measured. For soil moisture, the measurements are provided by time domain transmissometry (TDT) sensors. These sensors provide point measurements of soil moisture at different depths, as is common in in-situ soil moisture sensing.

metadata_subhourly = cat.metadata_measurements(resolution='SH').read()
0 G1 Soil heat flux 1 30 Minute Wm-2 Mean over preceding 30 mins Measured
1 G2 Soil heat flux 2 30 Minute Wm-2 Mean over preceding 30 mins Measured
2 LWIN Incoming longwave radiation 30 Minute Wm-2 Mean over preceding 30 mins Measured
3 LWOUT Outgoing longwave radiation 30 Minute Wm-2 Mean over preceding 30 mins Measured
4 PA Atmospheric pressure 30 Minute hPa Mean over preceding 30 mins Measured
5 PRECIP Precipitation 30 Minute mm Total over preceding 30 mins Measured
6 Q Absolute humidity 30 Minute gm-3 Mean over preceding 30 mins Derived
7 RH Relative humidity 30 Minute % Mean over preceding 30 mins Measured
8 RN Net radiation 30 Minute Wm-2 Mean over preceding 30 mins Derived
9 SNOWD_DISTANCE_COR Snow depth 30 Minute mm Instantaneous Measured
10 STP_TSOIL10 Soil temperature at depth 10cm 30 Minute Deg Celsius Mean over preceding 30 mins Measured
11 STP_TSOIL2 Soil temperature at depth 2cm 30 Minute Deg Celsius Mean over preceding 30 mins Measured
12 STP_TSOIL20 Soil temperature at depth 20cm 30 Minute Deg Celsius Mean over preceding 30 mins Measured
13 STP_TSOIL5 Soil temperature at depth 5cm 30 Minute Deg Celsius Mean over preceding 30 mins Measured
14 STP_TSOIL50 Soil temperature at depth 50cm 30 Minute Deg Celsius Mean over preceding 30 mins Measured
15 SWIN Incoming shortwave radiation 30 Minute Wm-2 Mean over preceding 30 mins Measured
16 SWOUT Outgoing shortwave radiation 30 Minute Wm-2 Mean over preceding 30 mins Measured
17 TA Air temperature 30 Minute Deg Celsius Mean over preceding 30 mins Measured
18 TDT1_TSOIL Soil temperature at depth 10cm 30 Minute Deg Celsius Instantaneous Measured
19 TDT1_VWC Soil moisture at depth 10cm 30 Minute % Instantaneous Measured
20 TDT10_TSOIL Soil temperature at depth 50cm 30 Minute Deg Celsius Instantaneous Measured
21 TDT10_VWC Soil moisture at depth 50cm 30 Minute % Instantaneous Measured
22 TDT2_TSOIL Soil temperature at depth 10cm 30 Minute Deg Celsius Instantaneous Measured
23 TDT2_VWC Soil moisture at depth 10cm 30 Minute % Instantaneous Measured
24 TDT3_TSOIL Soil temperature at depth 5cm 30 Minute Deg Celsius Instantaneous Measured
25 TDT3_VWC Soil moisture at depth 5cm 30 Minute % Instantaneous Measured
26 TDT4_TSOIL Soil temperature at depth 5cm 30 Minute Deg Celsius Instantaneous Measured
27 TDT4_VWC Soil moisture at depth 5cm 30 Minute % Instantaneous Measured
28 TDT5_TSOIL Soil temperature at depth 15cm 30 Minute Deg Celsius Instantaneous Measured
29 TDT5_VWC Soil moisture at depth 15cm 30 Minute % Instantaneous Measured
30 TDT6_TSOIL Soil temperature at depth 15cm 30 Minute Deg Celsius Instantaneous Measured
31 TDT6_VWC Soil moisture at depth 15cm 30 Minute % Instantaneous Measured
32 TDT7_TSOIL Soil temperature at depth 25cm 30 Minute Deg Celsius Instantaneous Measured
33 TDT7_VWC Soil moisture at depth 25cm 30 Minute % Instantaneous Measured
34 TDT8_TSOIL Soil temperature at depth 25cm 30 Minute Deg Celsius Instantaneous Measured
35 TDT8_VWC Soil moisture at depth 25cm 30 Minute % Instantaneous Measured
36 TDT9_TSOIL Soil temperature at depth 50cm 30 Minute Deg Celsius Instantaneous Measured
37 TDT9_VWC Soil moisture at depth 50cm 30 Minute % Instantaneous Measured
38 UX X component of wind speend 30 Minute ms-1 Mean over preceding 30 mins Measured
39 UY Y component of wind speend 30 Minute ms-1 Mean over preceding 30 mins Measured
40 UZ Z component of wind speend 30 Minute ms-1 Mean over preceding 30 mins Measured
41 WD Wind direction 30 Minute Degrees Mean over preceding 30 mins Measured
42 WS Wind speed 30 Minute ms-1 Mean over preceding 30 mins Measured

Comparison of soil moisture probes#

To compare CRNS (local) and TDT (site) soil moisture measurements at daily resolution, it is necessary to resample the TDT measurements from subhourly to daily. The cell below defines a function that resamples the TDT data and joins them with the daily CRNS data. The function returns an interactive hvplot of the merged observations for a given station ID.

def site_daily(target_station):
    """Timeseries plot showing the daily mean soil moisture by sensor type"""

    # daily CRNS data, indexed by date
    daily_dataset = cat.data_siteid(resolution='Daily', stationid=target_station).read()
    daily_dataset['DATE_TIME'] = pd.to_datetime(daily_dataset['DATE_TIME'])
    daily_dataset = daily_dataset.set_index('DATE_TIME')

    # subhourly TDT data
    subhourly_dataset = cat.data_siteid(resolution='SH', stationid=target_station).read()
    subhourly_dataset['DATE_TIME'] = pd.to_datetime(subhourly_dataset['DATE_TIME'])
    vwc_columns = subhourly_dataset.columns[subhourly_dataset.columns.str.endswith('_VWC')].tolist()

    # aggregate the 30-minute TDT readings to daily means
    daily_aggregate = subhourly_dataset.groupby(subhourly_dataset['DATE_TIME'].dt.floor('D'))[vwc_columns].mean()

    # join CRNS and TDT daily means and drop probes without any data
    daily_joined = daily_dataset.join(daily_aggregate)[vwc_columns + ['COSMOS_VWC']]
    daily_joined = daily_joined.dropna(axis=1, how='all').reset_index()

    # reshape to long format: one row per date and sensor
    daily_joined_long = pd.melt(daily_joined, id_vars='DATE_TIME',
                     var_name="Sensor", value_name="VWC")

    plot_daily = daily_joined_long.hvplot(x='DATE_TIME', y='VWC', by='Sensor',
                            label='Variation in VWC by sensor type',
                            ylabel='Volumetric Water Content (%)',
                            xlabel='Time', xlim=(datetime(2014,1,1), datetime(2019,12,31)))

    return plot_daily.opts(legend_position='top', **settings_lineplots)

settings_lineplots = dict(padding=0.1, height=400, width=700, fontsize={'title': '120%','labels': '120%', 'ticks': '100%'})

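# redraw the comparison whenever a different station is selected
plot_timeseries = pn.Row(
    pn.Column(pn.Spacer(height=5), target_station, background='#f0f0f0', sizing_mode="fixed"),
    pn.bind(site_daily, target_station),
    width_policy='max', height_policy='max',
)

plot_timeseries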
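
The resample-and-join step can be checked in isolation on a tiny example. The sketch below uses made-up 30-minute readings for two hypothetical TDT probes over two days and uses `resample('D')` instead of a `groupby` to compute daily means. The values are chosen so the expected daily means are easy to verify by eye.

```python
import pandas as pd

# Made-up 30-minute readings for two days (48 readings per day)
times = pd.date_range('2015-06-01', periods=96, freq='30min')
tdt = pd.DataFrame({
    'DATE_TIME': times,
    'TDT1_VWC': [30.0] * 48 + [20.0] * 48,
    'TDT2_VWC': [32.0] * 48 + [24.0] * 48,
})

# Made-up daily CRNS values for the same two days
crns = pd.DataFrame({
    'DATE_TIME': pd.to_datetime(['2015-06-01', '2015-06-02']),
    'COSMOS_VWC': [31.0, 23.0],
}).set_index('DATE_TIME')

# aggregate the subhourly readings to daily means, then join on the date index
tdt_daily = tdt.set_index('DATE_TIME').resample('D').mean()
joined = crns.join(tdt_daily)

print(joined)
```

Because both tables share a daily `DatetimeIndex`, the join aligns rows by date without extra keys.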


We conclude that all sites contain at least two TDT probes, and that their time series follow a pattern similar to that of the CRNS. The pattern might differ for other stations in the full COSMOS-UK dataset, which can contain more than two TDT probes.

Soils have a complex porous structure, which means moisture can be non-uniformly distributed horizontally and vertically. Site sensors such as TDT probes, even when only a few metres apart, measure “extremely local” moisture (a probe can sometimes be trapped in a water pocket, leading to artificially high VWC, or be pressed against a rock and produce artificially low VWC). In contrast, local measurements such as those from the CRNS average over all of this heterogeneity but introduce their own sources of noise (biomass water, surface water, variable depth and horizontal footprint).
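
The contrast between point and field-scale measurements can be summarised with a few derived columns. The sketch below uses made-up daily values for two hypothetical probes, and the column names `TDT_MEAN`, `TDT_RANGE` and `CRNS_MINUS_TDT` are introduced here for illustration. It computes the mean across probes, the spread between them and the difference from the CRNS value.

```python
import pandas as pd

# Made-up daily means (%), for illustration only
daily = pd.DataFrame(
    {
        'TDT1_VWC': [30.0, 28.0, 25.0],
        'TDT2_VWC': [36.0, 30.0, 27.0],
        'COSMOS_VWC': [32.0, 30.0, 27.0],
    },
    index=pd.date_range('2015-06-01', periods=3, freq='D'),
)

tdt_cols = [c for c in daily.columns if c.startswith('TDT')]

# mean across probes and the spread between them on each day
daily['TDT_MEAN'] = daily[tdt_cols].mean(axis=1)
daily['TDT_RANGE'] = daily[tdt_cols].max(axis=1) - daily[tdt_cols].min(axis=1)

# difference between the field-scale and the averaged point-scale value
daily['CRNS_MINUS_TDT'] = daily['COSMOS_VWC'] - daily['TDT_MEAN']

print(daily)
```

A large `TDT_RANGE` points to local heterogeneity between probes, while a persistent `CRNS_MINUS_TDT` offset may reflect the different depths and footprints of the two sensor types.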


This notebook has demonstrated the use of several open-source Python packages to explore the 2013-2019 COSMOS-UK dataset:

  • intake to easily fetch and manipulate daily and subhourly data, their metadata and other data types (remote images).

  • hvplot to propose some interactive visualisations of hydrometeorological and soil data.

  • pandas to resample subhourly data and merge them into a daily dataset of soil moisture.

Citing this Notebook#

Please see CITATION.cff for the full citation information. The citation file can be exported to APA or BibTeX formats.

Additional information#

Dataset: 2013-2019 COSMOS-UK dataset (further details of the version here).

License: The code in this notebook is licensed under the MIT License. The Environmental Data Science book is licensed under the Creative Commons by Attribution 4.0 license. See further details here.

Contact: If you have any suggestions or want to report an issue with this notebook, feel free to create an issue or send a direct message to

Notebook repository version: v1.0.6
Last tested: 2023-09-03