Even when data is correctly collected, it may still be inaccurate because of noise.
Therefore, captured or processed data very often has to be cleaned, treated, or imputed to obtain more reliable results.
Consequently, data validation methods whose reliability was verified at design time should also be used to validate the data automatically at execution time.
The remainder of this paper is organized as follows: this paragraph concludes Section 1, which provides a short introduction to the problem and a proposal for its mitigation; Section 2 presents the most commonly used data validation methods, along with a critical comparison of their usage scenarios; Section 3 deepens the analysis by presenting a classification of the data validation methods; Section 4 discusses the applicability of these methods, including the degree of trust the data can be expected to provide; finally, Section 5 presents the relevant conclusions.
Missing values at random instants of time may be caused by mechanical problems or power failures of the sensors.
In such cases, data correction methods should be applied, including data imputation and data cleaning.
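As a minimal sketch of the kind of imputation mentioned above, the following function fills missing sensor samples (represented as `None`) by linear interpolation between the nearest valid neighbours; the function name and edge-case handling are illustrative assumptions, not a method prescribed by this paper.

```python
def impute_linear(readings):
    """Return a copy of `readings` with None gaps linearly interpolated.

    Leading and trailing gaps are filled with the nearest valid value.
    """
    valid = [(i, v) for i, v in enumerate(readings) if v is not None]
    if not valid:
        raise ValueError("no valid samples to interpolate from")
    out = list(readings)
    # Interpolate every gap between consecutive valid samples.
    for (i0, v0), (i1, v1) in zip(valid, valid[1:]):
        for j in range(i0 + 1, i1):
            out[j] = v0 + (v1 - v0) * (j - i0) / (i1 - i0)
    # Edge gaps: propagate the nearest valid value outward.
    first_i, first_v = valid[0]
    last_i, last_v = valid[-1]
    for j in range(first_i):
        out[j] = first_v
    for j in range(last_i + 1, len(out)):
        out[j] = last_v
    return out

print(impute_linear([1.0, None, 3.0, None, None, 6.0]))
# → [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
```

More elaborate imputation strategies (e.g., model-based or multivariate imputation) follow the same pattern of replacing missing samples with estimates derived from the valid ones.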
Moreover, the use of the sensors' data to feed higher-level algorithms needs to guarantee a minimum degree of error, where this error is the difference between the output of these applications, built on computationally limited mobile platforms, and the output of a gold standard.
To achieve this minimum degree of error, statistical methods need to be applied to ensure that the output of the mobile application is, to the maximum extent possible, similar to the output given by the relevant gold standard, if and when this is possible.
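The error against a gold standard can be quantified with a standard statistical measure; the sketch below (an illustrative assumption, not a metric prescribed by this paper) uses the root-mean-square error (RMSE) between the mobile application's output and the reference.

```python
import math

def rmse(app_output, gold_standard):
    """Root-mean-square error between two equal-length output sequences."""
    if len(app_output) != len(gold_standard):
        raise ValueError("sequences must have equal length")
    # Mean of squared per-sample differences, then square root.
    return math.sqrt(
        sum((a - g) ** 2 for a, g in zip(app_output, gold_standard))
        / len(gold_standard)
    )

print(rmse([1.0, 2.1, 2.9], [1.0, 2.0, 3.0]))
```

A small RMSE indicates that the mobile application's output closely tracks the gold standard; other measures (e.g., mean absolute error) can be substituted depending on how the error is to be penalized.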
To help developers and researchers and to provide a common ground on data validation algorithms and techniques, this paper presents a review of the most commonly used data validation algorithms, along with their usage scenarios, and proposes a classification for these algorithms.