GNSS ML, IoT & Big Data

Machine learning (ML) is a branch of Artificial Intelligence comprising algorithms that extract knowledge and discover patterns between input and output variables. It covers a large number of algorithms with different learning methods. Artificial neural networks, evolutionary algorithms, decision tree classifiers, clustering algorithms and fuzzy logic inference are among the techniques most commonly used to identify particularities and correlations in data.

The volume of data produced in the world is growing rapidly, from 33 zettabytes in 2018 to an expected 175 zettabytes in 2025. Today, 80% of the processing and analysis of data takes place in data centres and 20% in smart connected objects, such as cars or home appliances, and in computing facilities close to the user (‘edge computing’). By 2025 these proportions are likely to be inverted. This abundance of data has led to a new golden age of machine learning (ML), which relies on large volumes of training data to extract knowledge and discover patterns between input and output variables.

In the GNSS space segment, four global constellations are operational, including the European Galileo system. On the ground, thousands of permanent GNSS stations and millions of Internet of Things (IoT) devices, including smartphones, together form a de facto large-scale IoT GNSS receiver. Applying ML to the data produced by this global, permanent GNSS infrastructure is therefore a major opportunity for GNSS science applications.

The GNSS Science Support Centre (GSSC) therefore leverages mainstream Big Data, Cloud, Virtualisation and Container technologies to address key GNSS science use cases through ML science pipelines.

In general, the GNSS navigation chain is composed of a network of GNSS sensors that collect data from the space segment (the core constellation of satellites) and a set of algorithms that process this data to produce a navigation message. How well GNSS data can be interpreted depends on the accuracy of the physical modelling and on its robustness against errors.

One relevant source of these errors is the ionosphere. In this layer, ionizing radiation from the Sun produces free electrons in quantities that affect the propagation of radio signals. Correlating data from crowdsourced devices and from dedicated, ionosphere-capable GNSS receivers would contribute to ML-enhanced Total Electron Content (TEC) maps for modelling and predicting the ionospheric parameters relevant to positioning, navigation and timing (PNT).
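The size of this effect can be illustrated with the standard first-order approximation, in which the ionospheric group delay in metres is 40.3 · STEC / f², with STEC the slant TEC in electrons per square metre and f the carrier frequency in Hz. The sketch below is illustrative only: the function name and the example TEC value are assumptions, and it does not represent any GSSC pipeline.

```python
# First-order ionospheric group delay from slant TEC (illustrative sketch).

TECU = 1.0e16    # electrons per m^2 in one TEC unit
K_IONO = 40.3    # m^3/s^2, first-order ionospheric constant

# Carrier frequencies in Hz (GPS L1 / Galileo E1 and GPS L5 / Galileo E5a)
F_L1 = 1575.42e6
F_L5 = 1176.45e6


def iono_delay_m(stec_tecu: float, freq_hz: float) -> float:
    """Return the first-order code delay in metres for a slant TEC in TECU."""
    return K_IONO * stec_tecu * TECU / freq_hz ** 2


if __name__ == "__main__":
    stec = 20.0  # hypothetical slant TEC in TECU, chosen for illustration
    for name, freq in (("L1/E1", F_L1), ("L5/E5a", F_L5)):
        print(f"{name}: {iono_delay_m(stec, freq):.2f} m")
```

At L1/E1 this amounts to roughly 0.16 m of range error per TEC unit, and the 1/f² dependence is why lower frequencies are affected more strongly.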

Interference and other man-made vulnerabilities are also at the root of many PNT errors. In this domain, raw measurements from crowdsourced devices, combined with ML techniques, can reveal new interference patterns and countermeasures, with potential for introducing adaptive signal processing algorithms.

Moreover, current research has shown the potential of GNSS observations to provide accurate and reliable information for retrieving atmospheric parameters such as water vapour or temperature. Measured or predicted values of these parameters could be provided as inputs to a high-fidelity atmospheric density model to calculate the atmospheric density more precisely. In this field, the information gathered from IoT GNSS devices, combined with ML algorithms, represents another opportunity for a better understanding of weather effects.

Big Data from Space refers to Space and Earth observation data collected by space-borne and ground-based sensors, as well as data from other space domains such as Satellite Navigation. Systems in these domains qualify as “big data” given the sheer volume of sensed data.

For GNSS, the storage of digitized intermediate frequency (IF) data is a clear example of Big Data from Space.

Digitized intermediate frequency (IF) data is the first and most fundamental measurement available after the signal is received at the antenna. Because of its high data rate, digitized IF data cannot be stored systematically and is instead converted to lower-density measurements, such as pseudoranges and code and carrier phase, which have a much lower data rate. The algorithms that derive these observables are, however, specific to each receiver and vendor. The conversion from IF to observables therefore leads to an unrecoverable loss of information.
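The gap in data rate can be made concrete with a back-of-the-envelope comparison. All figures in the sketch below (sampling rate, quantisation, observable record size, number of signals and epoch rate) are assumed values chosen for illustration, not parameters of any particular receiver or network.

```python
# Rough comparison of raw IF storage versus observables (assumed figures).

SECONDS_PER_DAY = 86_400


def if_bytes_per_day(sample_rate_hz: float, bits_per_sample: int,
                     complex_iq: bool = True) -> float:
    """Bytes per day needed to store digitized IF samples from one front end."""
    components = 2 if complex_iq else 1  # I and Q, or real samples only
    return sample_rate_hz * bits_per_sample * components / 8 * SECONDS_PER_DAY


def obs_bytes_per_day(n_signals: int, bytes_per_record: int,
                      epoch_rate_hz: float) -> float:
    """Bytes per day needed to store uncompressed per-signal observables."""
    return n_signals * bytes_per_record * epoch_rate_hz * SECONDS_PER_DAY


if __name__ == "__main__":
    # Assumption: 20 MHz complex sampling at 2 bits per component.
    raw = if_bytes_per_day(20e6, 2)
    # Assumption: 40 tracked signals, 32 bytes per record, 1 Hz output.
    obs = obs_bytes_per_day(40, 32, 1.0)
    print(f"IF data:     {raw / 1e9:.1f} GB/day")
    print(f"Observables: {obs / 1e6:.1f} MB/day")
    print(f"Ratio:       {raw / obs:.0f} : 1")
```

With these assumed figures, a single front end produces on the order of 864 GB of IF samples per day, against roughly 0.1 GB of observables, a difference of several thousand to one.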

The systematic recording of digital IF would allow offline re-processing at any computational speed, using any signal processing technique (e.g. acquisition and tracking algorithms), including those yet to be developed. This might permit the recovery of much more information than can be obtained from observables alone. Short samples of this type of data may be useful for GNSS monitoring or for identifying vulnerabilities, and also for permanent archiving and later processing with innovative techniques or for future scientific applications. For these purposes, some regional and local networks now collect IF data, but only during very limited periods of time, and most of the recorded data is not retained because of the nature of each application.

This activity aims to develop a pilot system demonstrating the scientific potential of the systematic recording of digital IF data. Applications to be prototyped as part of this system would include: assessment of innovative processing techniques, identification of environmental effects, interference and other natural or man-made vulnerabilities, liability aspects, feared-event assessment, and long-term scientific archiving.

GNSS Big Data information services and resources will be available in this area soon.