Holt Winter.Rmd

---
title: "STAT4181 HW6 Min Yang"
output:
  pdf_document: default
  html_notebook: default
---
##A.
###1.
```{r}
library(rdatamarket)
l <- dmseries("https://datamarket.com/data/set/22pw/monthly-lake-erie-levels-1921-
1970#!ds=22pw&display=line")
str(l)
```
###2.
```{r}
lH <- HoltWinters(l)
plot(lH)
```

###3.
```{r}
lHM <- HoltWinters(l, seasonal = "multiplicative")
plot(lHM)
```

```{r}
lH$alpha
```
```{r}
cat("the sse of additive seasonality HW model is", lH$SSE)
```
```{r}
lHM$alpha
```
```{r}
cat("the sse of multiplicative seasonality HW model is", lHM$SSE)
```
```{r}
ER <-as.ts(l)
ER.HW <- decompose(ER)
plot(ER.HW)
```

Since the additive seasonality model has a lower SSE, it is more appropriate for my dataset. 

###4.
```{r}
library(forecast)
library(astsa)
```

```{r}
sarimamodel <- sarima(l,p = 0,d = 1,q = 2,P=2,D = 0,Q = 0,S = 12,details=FALSE)

TT <- length(l)
T0 <- TT-24

eps <- coredata(l)
eps[1:T0] <- 0
for(t in (T0+1):TT){ #at every time step

#evaluate previous prediction
if(t!=(T0+1)){
eps[t] <- coredata(prev_pred)-coredata(l[t])
}

#fit model to past data
temp_model <- arima(l[1:t],order=c(0,1,2),seasonal = list(order=c(2,0,0),period=12))

#predict from the model
prev_pred <- predict(temp_model,n_ahead=1)$pred
}

eps <- eps[(T0+1):TT]
mse <- mean(eps^2)
mse
```

```{r}
#HoltWinters
TT <- length(l)
T0 <- TT-24

eps <- coredata(l)
eps[1:T0] <- 0
for(t in (T0+1):TT){ #at every time step

#evaluate previous prediction
if(t!=(T0+1)){
eps[t] <- coredata(prev_pred)-coredata(l[t])
}

#fit model to past data
temp_model <- HoltWinters(l[1:t],beta=FALSE,gamma=FALSE)

#predict from the model
prev_pred <- predict(temp_model,n_ahead=1)
}

eps <- eps[(T0+1):TT]
mse <- mean(eps^2)
mse
```
```{r}
ER <- ts(l[1:576],frequency = 12)
ER.HW <- HoltWinters(ER)
ER.pred <- predict(ER.HW,n.ahead = 24)
plot(l[577:600],ylim=c(0,20))
lines(as.numeric(ER.pred),col=2)
```

Since SARIMA model has a smaller MSE, it is more accurate than Holt Winters Model. 

##B.
```{r}
x_t <- sin(pi*1:200*2/100)
plot.ts(x_t)
```
```{r}
y_t <- sin(pi*1:200*10/100)
plot.ts(y_t)
```
```{r}
 z_t <- sin(pi*1:200*20/100)
plot.ts(z_t)
```

###2.
```{r}
 Px <- abs(2*fft(x_t)/200)^2
Fr <- 0:199/200
plot(Fr, Px, type="o", xlab="frequency, x_t", ylab="periodogram")
n<-length(x_t)
Per   <- Mod(fft(x_t-mean(x_t)))^2/n
Freq  <- (1:n -1)/n
plot(Freq[1:100], Per[1:100], type='h', lwd=3, ylab="Periodogram", xlab="Frequency")
u     <- which.max(Per[1:100])     # 3 freq=3/200=.015 cycles/period
uu    <- which.max(Per[1:100][-u]) # 84 freq=84/200=.42 cycles/period
1/Freq[3]; 1/Freq[85]           # period = period/cycle
text(.07, 43, "100 period cycle") 
```

There is a maximum of 100 period cycle, 0.015 cycle/period at 3, which is expected as shown in the graph there is a peak there.

```{r}
 P <- abs(2*fft(y_t)/200)^2
Fr <- 0:199/200
plot(Fr, P, type="o", xlab="frequency, y_t", ylab="periodogram")
```
```{r}
n<-length(y_t)
Per   <- Mod(fft(y_t-mean(y_t)))^2/n
Freq  <- (1:n -1)/n
plot(Freq[1:100], Per[1:100], type='h', lwd=3, ylab="Periodogram", xlab="Frequency")
u     <- which.max(Per[1:100])     # 11 freq=11/200=.055 cycles/period
uu    <- which.max(Per[1:100][-u]) # 14 freq=14/200=.07 cycles/period
1/Freq[11]; 1/Freq[15]           # period = period/cycle
text(.1, 43, "20 period cycle") 

```

There is a maximum of 20 period cycle, 0.055 cycle/period at 11, which is expected as shown in the graph there is a peak there.
```{r}
 P <- abs(2*fft(z_t)/200)^2
Fr <- 0:199/200
plot(Fr, P, type="o", xlab="frequency, z_t", ylab="periodogram")
```
```{r}
n<-length(z_t)
Per   <- Mod(fft(z_t-mean(z_t)))^2/n
Freq  <- (1:n -1)/n
plot(Freq[1:100], Per[1:100], type='h', lwd=3, ylab="Periodogram", xlab="Frequency")
u     <- which.max(Per[1:100])     # 21 freq=21/200=.105 cycles/period
uu    <- which.max(Per[1:100][-u]) # 52 freq=52/200=.26 cycles/period
1/Freq[21]; 1/Freq[53]           # period = period/cycle
text(.15, 50, "10 period cycle") 
```

There is a maximum of 10 period cycle, 0.105 cycle/period at 21, which is expected as shown in the graph there is a peak there.

###3.
```{r}
 set.seed(1)
X_t <- x_t + rnorm(200,sd=0.5)
Y_t <- y_t + rnorm(200,sd=0.5)
Z_t <- z_t + rnorm(200,sd=0.5)
Px <- abs(2*fft(X_t)/200)^2
Fr <- 0:199/200
plot(Fr, Px, type="o", xlab="frequency, X_t", ylab="periodogram")
```

```{r}
n<-length(X_t)
Per   <- Mod(fft(X_t-mean(X_t)))^2/n
Freq  <- (1:n -1)/n
plot(Freq[1:100], Per[1:100], type='h', lwd=3, ylab="Periodogram", xlab="Frequency")
u     <- which.max(Per[1:100])     # 3 freq=3/200=.015 cycles/period
uu    <- which.max(Per[1:100][-u]) # 84 freq=84/200=.42 cycles/period
1/Freq[3]; 1/Freq[85]           # period = period/cycle
text(.07, 43, "100 period cycle")
u
uu
```

```{r}
 Py <- abs(2*fft(Y_t)/200)^2
Fr <- 0:199/200
plot(Fr, Py, type="o", xlab="frequency, Y_t", ylab="periodogram")
```
```{r}
n<-length(Y_t)
Per   <- Mod(fft(Y_t-mean(Y_t)))^2/n
Freq  <- (1:n -1)/n
plot(Freq[1:100], Per[1:100], type='h', lwd=3, ylab="Periodogram", xlab="Frequency")
u     <- which.max(Per[1:100])     # 11 freq=11/200=.055 cycles/period
uu    <- which.max(Per[1:100][-u]) # 14 freq=14/200=.07 cycles/period
1/Freq[11]; 1/Freq[15]           # period = period/cycle
text(.1, 43, "20 period cycle") 
u
uu
```

```{r}
 Pz <- abs(2*fft(Z_t)/200)^2
Fr <- 0:199/200
plot(Fr, Pz, type="o", xlab="frequency, z_t", ylab="periodogram")
```
```{r}
n<-length(Z_t)
Per   <- Mod(fft(Z_t-mean(Z_t)))^2/n
Freq  <- (1:n -1)/n
plot(Freq[1:100], Per[1:100], type='h', lwd=3, ylab="Periodogram", xlab="Frequency")
u     <- which.max(Per[1:100])     # 21 freq=21/200=.105 cycles/period
uu    <- which.max(Per[1:100][-u]) # 52 freq=52/200=.26 cycles/period
1/Freq[21]; 1/Freq[53]           # period = period/cycle
text(.15, 50, "10 period cycle")
u
uu
```

Similar to question 2, there are same peaks as shown before.
###4.
```{r}
 s_t <- X_t + Y_t + Z_t
n <- length(s_t)
plot.ts(s_t, ylab="s_t", xlab="number")
```

```{r}
 Per <- Mod(fft(s_t-mean(s_t)))^2/n
Freq  <- (1:n -1)/n
plot(Freq[1:100], Per[1:100], type='h', lwd=3, ylab="Periodogram", xlab="Frequency
")
```
```{r}
n<-length(s_t)
Per   <- Mod(fft(s_t-mean(s_t)))^2/n
Freq  <- (1:n -1)/n
plot(Freq[1:100], Per[1:100], type='h', lwd=3, ylab="Periodogram", xlab="Frequency")
u     <- which.max(Per[1:100])     # 11 freq=21/200=.105 cycles/period
uu    <- which.max(Per[1:100][-u]) # 20 freq=52/200=.1 cycles/period
1/Freq[11]; 1/Freq[21]           # period = period/cycle
text(.05, 50, "20 period cycle") 
text(.15, 45, "10 period cycle") 
u
uu
```

There are two peaks at 11 and 20, which give 20 period cycle (0.105 cycle/period) and 10 period cycle (0.1 cycle/period). These peaks match with those we find in 3-B.