-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Wine quality example #104
Comments
SGD seems to be comparable to full GLM here:
I get
|
What would be the criterias to use SGD for classification on others classification methods? |
Note that SGD is not a classification method per se. It is a method to fit models, including classification models. However, it is a noisy method and in many cases it will perform worse than a deterministic method, such as maximum likelihood (the glm() output is the maximum likelihood estimate in the example above). So, in the example above you wouldn't want to use SGD because it is small enough to use glm(). |
I am trying to SGD for classification of a huge dataset and wanted first to try it on the example you provided about the wine quality. But with the wine quality, something doesn't work for me. When I am trying to predict using the sgd.theta all the results are false and don't predict the quality at all.
Here is the code:
Wine quality (Cortez et al., 2009): Logistic regression
set.seed(42)
#data("winequality")
#dat <- winequality
dat <- read.csv("../Data/winequality-red.csv", header = TRUE, sep = ";", stringsAsFactors = FALSE)
dat$quality <- as.numeric(dat$quality > 5) # transform to binary
test.set <- sample(1:nrow(dat), size=nrow(dat)/8, replace=FALSE)
dat.test <- dat[test.set, ]
dat <- dat[-test.set, ]
sgd.theta <- sgd(quality ~ ., data=dat,
model="glm", model.control=binomial(link="logit"),
sgd.control=list(reltol=1e-5, npasses=200),
lr.control=c(scale=1, gamma=1, alpha=30, c=1))
sgd.theta
sprintf("Mean squared error: %0.3f", mean((theta - as.numeric(sgd.theta$coefficients))^2))
sum(dat[190,]*sgd.theta$coefficients)
Output>
Maybe am I missing something?
The text was updated successfully, but these errors were encountered: