Author: Haroon Khalil
-
Predictive analytics
Prediction of rainfall data was done with the help of time series analysis. ARIMA is one of the models used for time series analysis. Figure 2.10 shows the normal flow of time series. Figure 2.11 explains the decomposition of additive and multiplicative time series of rainfall data. To make the data stationary, the components such as trend, seasonality,…
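One common way to remove a trend before fitting an ARIMA model is differencing, which is the "I" (integrated) part of ARIMA. The sketch below is a minimal illustration only: the function name `difference` and the sample values are hypothetical and are not taken from the chapter's rainfall dataset or code.

```python
def difference(series, lag=1):
    """Return the lag-differenced series: y[t] - y[t - lag]."""
    if lag < 1 or lag >= len(series):
        raise ValueError("lag must be between 1 and len(series) - 1")
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# Hypothetical values with a steady upward trend (not real rainfall data).
trended = [100.0, 110.0, 120.0, 130.0, 140.0]

# First-order differencing turns a linear trend into a constant series.
print(difference(trended))  # [10.0, 10.0, 10.0, 10.0]
```

Using a larger `lag` (for example, the length of the seasonal cycle) gives seasonal differencing, which targets the seasonality component instead of the trend.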
-
Nature of data: skewness and kurtosis
Figure 2.7 shows the skewness and kurtosis of the dataset. The skewness of the rainfall data is 0.01999941, which indicates a very slight positive skew: the distribution is close to symmetric. The kurtosis is 2.763914. This is less than 3, the kurtosis of a normal distribution, which means the tails are slightly lighter than normal and the data contain few extreme outliers. Table 2.3 explains the overall summary of data with parameters like minimum, maximum,…
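As a rough illustration of how such values can be computed, the sketch below uses moment-based skewness and non-excess (Pearson) kurtosis, for which a normal distribution scores 3. This choice of estimator is an assumption based on the comparison with 3 in the text; the chapter does not state which formula or software produced its figures. The function names and the sample data are hypothetical.

```python
def central_moment(values, k):
    """k-th central moment, dividing by n (population form)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** k for x in values) / n


def skewness(values):
    """Moment-based skewness: m3 / m2^(3/2). Zero for symmetric data."""
    m2 = central_moment(values, 2)
    return central_moment(values, 3) / m2 ** 1.5


def kurtosis(values):
    """Non-excess kurtosis: m4 / m2^2. Equals 3 for a normal distribution."""
    m2 = central_moment(values, 2)
    return central_moment(values, 4) / m2 ** 2


# Hypothetical sample (not the rainfall data): symmetric, so skewness is 0.
sample = [1, 2, 3, 4, 5]
print(skewness(sample))  # 0.0
print(kurtosis(sample))  # about 1.7, i.e. flatter than a normal distribution
```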
-
Results and discussion
Descriptive analytics: Figure 2.6 shows the overall rainfall level of India from 1901 to 2015. Maximum and minimum rainfall values are highlighted. Maximum rainfall was 1480.3 mm, which occurred in 1917, and minimum rainfall was 920.8 mm, which occurred in 2002.
-
Predictive analytics
There are various models that can be used for prediction. A time series is a sequence of data points ordered by time. A time series can be expressed as $Y_t = f(t)$ (2.1), where $Y_t$ is the value of the variable under study at time $t$. The components of time series analysis are Trend: It refers to increasing or…
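To make the trend component concrete, one simple estimator is a centered moving average over the series. The sketch below is illustrative only: `moving_average`, the window size and the sample values are hypothetical, and the chapter does not say that this is the method it used.

```python
def moving_average(series, window):
    """Centered moving average for an odd window; edges are left as None."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    trend = [None] * len(series)
    for t in range(half, len(series) - half):
        trend[t] = sum(series[t - half : t + half + 1]) / window
    return trend


# Hypothetical yearly values (not real rainfall data).
y = [10, 12, 14, 13, 15, 17, 19]
trend = moving_average(y, window=3)
print(trend)  # [None, 12.0, 13.0, 14.0, 15.0, 17.0, None]

# In an additive model, subtracting the trend leaves the remaining components.
detrended = [None if m is None else v - m for v, m in zip(y, trend)]
print(detrended)
```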
-
Model planning
Descriptive analytics: Descriptive analytics is analysis based on descriptive statistical measures such as the mean, median, mode, standard deviation, variability, skewness and kurtosis. Together these describe the central tendency, dispersion and shape of the data.
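The central tendency and dispersion measures listed above can be sketched with Python's standard `statistics` module. The helper name `describe` and the sample values are hypothetical; the chapter does not show its own code for this step.

```python
import statistics


def describe(values):
    """Return basic measures of central tendency and dispersion."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "stdev": statistics.stdev(values),  # sample standard deviation (n - 1)
        "range": max(values) - min(values),
    }


# Hypothetical sample (not the rainfall data).
summary = describe([2, 4, 4, 5, 10])
for name, value in summary.items():
    print(f"{name}: {value}")
```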
-
Data cleansing
The dataset contains null values in some of the rows. The isnull() function is used to identify the null values, and the missing values are filled with the mean value of the corresponding row. Mean imputation is one of the methods used to handle missing values.
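The row-wise mean fill described above can be sketched as follows. The text names isnull(), which suggests a data-frame library, but the chapter does not show its code; this sketch uses plain Python so that it runs without dependencies. The helper names `is_missing` and `fill_row_with_mean` and the sample rows are hypothetical.

```python
import math


def is_missing(value):
    """Treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def fill_row_with_mean(row):
    """Replace missing entries with the mean of the row's present values."""
    present = [v for v in row if not is_missing(v)]
    if not present:
        return list(row)  # nothing to average; leave the row unchanged
    row_mean = sum(present) / len(present)
    return [row_mean if is_missing(v) else v for v in row]


# Hypothetical rows (not real rainfall data), one row per year.
rows = [
    [10.0, None, 30.0],
    [5.0, 7.0, float("nan")],
]
print([fill_row_with_mean(r) for r in rows])
# [[10.0, 20.0, 30.0], [5.0, 7.0, 6.0]]
```

Filling with the row mean keeps each row's average unchanged, but it also reduces the variability of the filled values, which is worth keeping in mind when interpreting dispersion statistics afterwards.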
-
Data collection
The rainfall data were collected from the website of the agriculture department (data.gov.in). The rainfall dataset consists of 115 years of data, from 1901 to 2015. The attributes of the data, such as annual rainfall and rainfall of every season for each year, are provided on the website.
-
Methodology
In this chapter, an illustration of forecasting the rainfall level using the time series analysis method is provided. The process flows from data collection to data cleansing, then model building and finally visualization (Figure 2.5).
-
Review of data analytics
The study of data analytics can be divided into traditional data analytics and big data analytics. The main processes are input, processing and output, and the framework of the analytics is determined on the basis of perspective-oriented and result-oriented concerns. The tools used for processing in the data analytics environment are Apache, SPSS, Storm, Dryad, R,…
-
Literature review
A wide range of statistical methods is used for the various analyses done in the agriculture sector to make today’s agriculture smarter. Model-based technology has proved helpful in increasing agricultural production, and it can also help handle some unexpected problems such as floods, droughts and demand during food shortages. Table 2.2 shows the…