Logistic regression and non-linear transformations with Python coding
Description
1. Non-linear transformation. The file Voices contains data related to the
voices of 1000 different humans. The first column represents the intensity
of the voice. The second column represents the tone of the voice. The
final column says whether the voice belongs to a young adult, 18 to 30
years old (1 = yes, -1 = no). Note that both intensity and tone are
normalized variables, ranging from -1 to 1.
(a) Use a non-linear circular hypothesis to classify the data (slide 4). To
do so:
- Determine by inspection r, the radius of the circle.
- Compute the predicted output y = sign(r^2 - intensity^2 - tone^2).
- Compute Ein.
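The steps in part (a) can be sketched in Python as follows. This is a minimal illustration, not the course template: the function names are made up, the data is a synthetic stand-in for the Voices file, r = 0.6 is an arbitrary radius rather than one read off the real data, and Ein is assumed to be the fraction of misclassified training points.

```python
import numpy as np

def circular_hypothesis(intensity, tone, r):
    """Predict +1 inside the circle of radius r and -1 outside it."""
    score = r**2 - intensity**2 - tone**2
    # np.sign would return 0 exactly on the boundary; here those points go to +1.
    return np.where(score >= 0, 1, -1)

def in_sample_error(y_pred, y_true):
    """Fraction of training points that are misclassified."""
    return np.mean(y_pred != y_true)

# Synthetic stand-in for the Voices file (NOT the real data).
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(1000, 2))            # columns: intensity, tone
y = np.where(X[:, 0]**2 + X[:, 1]**2 <= 0.6**2, 1, -1)

y_hat = circular_hypothesis(X[:, 0], X[:, 1], r=0.6)
print("Ein:", in_sample_error(y_hat, y))
```

With the real file, a scatter plot of tone against intensity, coloured by label, is one way to pick r by inspection before calling `circular_hypothesis`.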
(b) Transform the data to its linear version (slide 5), and fit a linear
regression on the transformed data to calculate w. Compute the Ein
obtained with the linear regression and compare it with the error
obtained with the non-linear hypothesis. Draw your conclusions.
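One possible way to carry out part (b) is sketched below. It assumes the transformation on slide 5 is z = (1, intensity^2, tone^2), which is the usual choice for a circular boundary but is not confirmed by this description. The helper names are hypothetical and the data is again synthetic.

```python
import numpy as np

def transform(X):
    """Map (intensity, tone) to z = (1, intensity^2, tone^2) (assumed transform)."""
    return np.column_stack([np.ones(len(X)), X[:, 0]**2, X[:, 1]**2])

def fit_linear_regression(Z, y):
    """Least-squares weights w that minimize ||Z w - y||^2."""
    w, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return w

# Synthetic stand-in for the Voices file (NOT the real data).
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(1000, 2))
y = np.where(X[:, 0]**2 + X[:, 1]**2 <= 0.6**2, 1.0, -1.0)

Z = transform(X)
w = fit_linear_regression(Z, y)
y_hat = np.where(Z @ w >= 0, 1, -1)               # classify with sign(w . z)
print("w:", w)
print("Ein:", np.mean(y_hat != y))
```

Under this transform, the boundary w . z = 0 is a circle whenever the two squared-term weights are equal, so -w[0] / w[1] can be read as an estimate of r^2 and compared with the radius chosen by inspection.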
2. Logistic regression. The file Weekly consists of 1,089 weekly returns
for 21 years, from the beginning of 1990 until the end of 2010. For each
week, we have recorded the percentage returns for each of the five previous
trading weeks, Lag1 through Lag5. We have also recorded Volume (the
number of shares traded on the previous week, in billions), Today (the
percentage return on the week in question) and Direction (whether the
market was Up or Down on this week).
(a) Use the full data set to perform a logistic regression (slide 39) with
Direction as the response and the five lag variables plus Volume
as predictors. The learning rate η and the number of iterations are
already defined in the template as 0.1 and 200, respectively. Report
the coefficients (w). Draw your conclusions. Do any of the predictors
appear to be significant?
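A minimal sketch of the fitting step follows. It assumes the algorithm on slide 39 is batch gradient descent on the cross-entropy error, with Direction encoded as +1 (Up) and -1 (Down). The function name, the synthetic predictors and the `true_w` vector are illustrative assumptions, not part of the template or the Weekly file.

```python
import numpy as np

def fit_logistic_regression(X, y, eta=0.1, n_iter=200):
    """Batch gradient descent for logistic regression with labels in {-1, +1}."""
    n_samples, n_features = X.shape
    Xb = np.column_stack([np.ones(n_samples), X])   # prepend the bias column
    w = np.zeros(n_features + 1)
    for _ in range(n_iter):
        margins = y * (Xb @ w)
        # Gradient of (1/N) * sum ln(1 + exp(-y_n * w . x_n)).
        grad = -(Xb * (y / (1.0 + np.exp(margins)))[:, None]).mean(axis=0)
        w -= eta * grad
    return w

# Synthetic stand-in for Lag1..Lag5 and Volume (NOT the real Weekly data).
rng = np.random.default_rng(1)
X = rng.normal(size=(1089, 6))
true_w = np.array([-0.5, 0.3, 0.0, 0.0, 0.0, 0.2])  # made-up coefficients
prob_up = 1.0 / (1.0 + np.exp(-(X @ true_w)))
y = np.where(rng.uniform(size=1089) < prob_up, 1, -1)

w = fit_logistic_regression(X, y)
print("w (bias first):", w)
```

The magnitudes of the fitted coefficients are only comparable across predictors if the predictors are on similar scales, so standardizing the columns before fitting may make the comparison easier.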
(b) Show how Ein evolves with the iterations (slide 37, second equation).
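To track Ein across iterations, one option is to record the error before each weight update. The sketch below assumes the second equation on slide 37 is the cross-entropy error Ein(w) = (1/N) Σ ln(1 + exp(-y_n w·x_n)) with labels in {-1, +1}. The function names and the synthetic data are illustrative only.

```python
import numpy as np

def cross_entropy_error(Xb, y, w):
    """Ein(w) = mean of ln(1 + exp(-y * w.x)), computed in a numerically stable way."""
    return np.mean(np.logaddexp(0.0, -y * (Xb @ w)))

def gradient_descent_with_history(Xb, y, eta=0.1, n_iter=200):
    """Run batch gradient descent and record Ein before every update."""
    w = np.zeros(Xb.shape[1])
    history = []
    for _ in range(n_iter):
        history.append(cross_entropy_error(Xb, y, w))
        grad = -(Xb * (y / (1.0 + np.exp(y * (Xb @ w))))[:, None]).mean(axis=0)
        w -= eta * grad
    return w, np.array(history)

# Synthetic stand-in data (NOT the real Weekly file).
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 6))
y = np.where(X[:, 0] + 0.5 * rng.normal(size=500) > 0, 1, -1)
Xb = np.column_stack([np.ones(len(X)), X])          # bias column first

w, history = gradient_descent_with_history(Xb, y)
print("Ein at iterations 1, 50, 100, 200:", history[[0, 49, 99, 199]])
```

The `history` array can then be plotted against the iteration index, for example with matplotlib, to show how Ein evolves.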
(c) Report the probability output g(x) for the first 10 data points of your
training set (use the sigmoid function, slide 30).
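The probability output can be sketched as below, assuming g(x) = sigmoid(w·x) with a bias term stored first in w. The weights and data here are placeholders chosen for illustration; in the assignment they would come from the fitted model and the Weekly file.

```python
import numpy as np

def sigmoid(s):
    """Logistic function 1 / (1 + exp(-s))."""
    return 1.0 / (1.0 + np.exp(-s))

def predict_proba(X, w):
    """Return g(x) = sigmoid(w . x) for each row of X, with the bias stored in w[0]."""
    Xb = np.column_stack([np.ones(len(X)), X])
    return sigmoid(Xb @ w)

# Placeholder data and weights (hypothetical, NOT fitted on the real data).
rng = np.random.default_rng(2)
X = rng.normal(size=(20, 6))
w = np.array([0.1, -0.2, 0.3, 0.0, 0.05, -0.1, 0.2])

probs = predict_proba(X[:10], w)                    # first 10 data points
print(probs)
```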
(d) Compute the overall fraction of correct predictions (using the training
set). Hint: Use 0.5 as a threshold for the probability g(x) to obtain
a binary output.
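The thresholding step in the hint can be illustrated with a short sketch. It assumes labels are encoded as +1 (Up) and -1 (Down), and that a probability of exactly 0.5 is assigned to Up. The function name and the four example values are made up for the illustration.

```python
import numpy as np

def accuracy(prob, y, threshold=0.5):
    """Fraction of points whose thresholded prediction matches the label."""
    y_pred = np.where(prob >= threshold, 1, -1)
    return np.mean(y_pred == y)

# Tiny hand-made example: predictions are [1, -1, 1, -1].
prob = np.array([0.9, 0.4, 0.55, 0.2])
y = np.array([1, -1, -1, -1])
print(accuracy(prob, y))  # 0.75
```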