[Machine Learning] Exam 2(Week 3)

This post summarizes material from Professor Andrew Ng's Machine Learning course on Coursera.

※ Solutions to the assignment appear below. If you do not want to see them, do not scroll down.

The Week 3 assignment consists of the following files.

plotData.m - plot the classification data

sigmoid.m - compute the sigmoid function

costFunction.m - compute the cost function (and its gradient) for Logistic Regression

predict.m - return the predictions of Logistic Regression (each value must be 0 or 1)

costFunctionReg.m - compute the cost and gradient of regularized Logistic Regression

The code is available in the GitHub repository below. ~~I will work through the solutions later.~~

https://github.com/junstar92/Coursera/tree/master/MachineLearning/ex2

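The solutions follow.

[plotData.m]

This function simply plots the data read from 'ex2data1.txt'. The solution code is already provided in the assignment file, so it is enough to follow it as given. It was a good opportunity to learn a little about the plot function.

The code is as follows.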

 function plotData(X, y)
%PLOTDATA Plots the data points X and y into a new figure 
%   PLOTDATA(x,y) plots the data points with + for the positive examples
%   and o for the negative examples. X is assumed to be a Mx2 matrix.
 
% Create New Figure
figure; hold on;
 
% ====================== YOUR CODE HERE ======================
% Instructions: Plot the positive and negative examples on a
%               2D plot, using the option 'k+' for the positive
%               examples and 'ko' for the negative examples.
%
pos = find(y==1); neg = find(y == 0);
 
plot(X(pos, 1), X(pos, 2), 'k+', 'LineWidth', 2, 'MarkerSize', 7);
plot(X(neg, 1), X(neg, 2), 'ko', 'MarkerFaceColor', 'y', 'MarkerSize', 7);
 
 
% =========================================================================
 
 
hold off;
 
end

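The two find calls return index vectors: pos is a column vector holding the row indices where y is 1, and neg holds the row indices where y is 0.

The remaining arguments of the two plot calls ('LineWidth', 'MarkerSize', 'MarkerFaceColor') are worth reviewing later with the help plot command.

In 'k+' and 'ko', 'k' means blacK, '+' is a plus-shaped marker, and 'o' is a circle marker. In other words, the positive examples are drawn as black plus signs and the negative examples as black circles (filled with yellow by 'MarkerFaceColor', 'y').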
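As a small illustration of what find returns, the sketch below uses made-up labels and features (not the contents of ex2data1.txt). It also shows that logical indexing selects the same rows, which is an alternative I am adding here, not something the assignment requires.

```matlab
% Hypothetical toy data: 5 examples, 2 features each
y = [1; 0; 0; 1; 1];
X = [10 11; 20 21; 30 31; 40 41; 50 51];

pos = find(y == 1);   % row indices of the positive examples: [1; 4; 5]
neg = find(y == 0);   % row indices of the negative examples: [2; 3]

X_pos = X(pos, :);              % rows 1, 4 and 5 of X
X_pos_logical = X(y == 1, :);   % same rows, selected with a logical mask

disp(isequal(X_pos, X_pos_logical));   % prints 1
```

Either form can be passed to plot; find just makes the selected row numbers explicit.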

The result is a figure in which the two classes appear as plus signs and circles.

[sigmoid.m]

$g(z) = \frac{1}{1 + e^{-z}}$

The task is to write code that computes the sigmoid function above.

The code is simple and can be written as follows.

 function g = sigmoid(z)
%SIGMOID Compute sigmoid function
%   g = SIGMOID(z) computes the sigmoid of z.
 
% You need to return the following variables correctly 
g = zeros(size(z));
 
% ====================== YOUR CODE HERE ======================
% Instructions: Compute the sigmoid of each value of z (z can be a matrix,
%               vector or scalar).
 
g = 1./(1 + exp(-z));
 
 
 
% =============================================================
 
end

One thing to watch out for: when dividing by 1 + exp(-z), use './' so that the division is applied element by element.
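To see the element-wise behavior, here is a minimal sketch that defines the same formula as an anonymous function (so it runs without sigmoid.m) and applies it to a scalar and a matrix. The input values are arbitrary.

```matlab
% Same formula as sigmoid.m, written as an anonymous function
sigmoid = @(z) 1 ./ (1 + exp(-z));

disp(sigmoid(0));             % 0.5

Z = [-100 0 100; -1 0 1];     % arbitrary 2 x 3 input
G = sigmoid(Z);               % same size as Z, one value per entry
disp(size(G));                % 2 3

% The identity g(-z) = 1 - g(z) holds entry by entry
disp(max(max(abs(sigmoid(-Z) - (1 - G)))) < 1e-12);   % prints 1
```

With a plain '/' the expression would be treated as matrix right division, which for non-scalar z either fails on mismatched dimensions or, for a column vector, can return a result that is not the element-wise sigmoid.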

[costFunction.m]

The task is to write code that returns the value of the Logistic Regression cost function.

The hypothesis function for a single example x can be written as follows.

$h_\theta(x) = g(\theta^Tx)\text{, } g(z) = \frac{1}{1 + e^{-z}}$

Here,

$x = \begin{bmatrix} x_0 = 1 \\ x_1 \\ \vdots \\ x_n \end{bmatrix} \text{, } \theta = \begin{bmatrix} \theta_0 \\ \theta_1 \\ \vdots \\ \theta_n \end{bmatrix} \rightarrow \theta^T = \begin{bmatrix} \theta_0 & \theta_1 & \cdots & \theta_n \end{bmatrix}$

However, the X used in Ex2 is a matrix with one training example per row,

$X = \begin{bmatrix} x_0^{(1)} & x_1^{(1)} & \cdots & x_n^{(1)} \\ \vdots & \vdots & \ddots & \vdots \\ x_0^{(m)} & x_1^{(m)} & \cdots & x_n^{(m)} \end{bmatrix}$

where the superscript is the example index and the subscript is the feature index.

The hypothesis for all m examples can therefore be written in vectorized form as

$h_\theta(X) = g(X\theta)$
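The following sketch checks the vectorized form against a per-example loop. The matrix X and the vector theta are made-up values chosen only for illustration.

```matlab
sigmoid = @(z) 1 ./ (1 + exp(-z));

% Hypothetical data: m = 3 examples, intercept column plus 2 features
X = [1 2 3; 1 4 5; 1 6 7];
theta = [0.5; -0.25; 0.1];

% Per-example form: h = g(theta' * x) for each row of X
h_loop = zeros(size(X, 1), 1);
for i = 1:size(X, 1)
    x_i = X(i, :)';                      % (n+1) x 1 column vector
    h_loop(i) = sigmoid(theta' * x_i);
end

% Vectorized form: one matrix-vector product for all rows
h_vec = sigmoid(X * theta);

disp(max(abs(h_loop - h_vec)));          % difference is at rounding level
```

Row i of X * theta is exactly theta' * x_i, which is why the two forms agree.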

The cost function of Logistic Regression differs from that of linear regression and is given by

$J(\theta) = \frac{1}{m}\sum_{i = 1}^{m}Cost(h_\theta(x^{(i)}), y^{(i)})$

where

$Cost(h_\theta(x), y) = \begin{cases} -\log(h_\theta(x)) & \text{if } y = 1 \\ -\log(1 - h_\theta(x)) & \text{if } y = 0 \end{cases}$

Because y is always either 0 or 1, the two cases can be merged into a single expression, and the cost function finally becomes

$J(\theta) = -\frac{1}{m}\sum_{i = 1}^{m}\left [ y^{(i)}\log(h_\theta(x^{(i)})) + (1 - y^{(i)})\log(1 - h_\theta(x^{(i)})) \right ]$

Differentiating the cost function with respect to $\theta_j$ gives

$\frac{\partial}{\partial\theta_j}J(\theta) = \frac{1}{m}\sum_{i = 1}^{m}(h_\theta(x^{(i)}) - y^{(i)})x_j^{(i)}$
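For readers who want to see where the gradient and its sign come from, here is a short derivation sketch. It uses the abbreviation $h = h_\theta(x) = g(\theta^Tx)$ for a single example, which is my shorthand and not notation from the course.

```latex
\begin{align*}
g'(z) &= g(z)\,(1 - g(z))
  \quad\Rightarrow\quad
  \frac{\partial h}{\partial \theta_j} = h\,(1 - h)\,x_j \\
\frac{\partial}{\partial \theta_j}\Big[\, y\log h + (1 - y)\log(1 - h) \,\Big]
  &= \frac{y}{h}\,h(1 - h)\,x_j - \frac{1 - y}{1 - h}\,h(1 - h)\,x_j \\
  &= \big( y(1 - h) - (1 - y)h \big)\,x_j
   = (y - h)\,x_j \\
\frac{\partial J}{\partial \theta_j}
  &= -\frac{1}{m}\sum_{i=1}^{m} \big( y^{(i)} - h_\theta(x^{(i)}) \big)\,x_j^{(i)}
   = \frac{1}{m}\sum_{i=1}^{m} \big( h_\theta(x^{(i)}) - y^{(i)} \big)\,x_j^{(i)}
\end{align*}
```

The leading minus sign of J is absorbed when $(y - h)$ is flipped to $(h - y)$, so the final expression carries a positive $\frac{1}{m}$.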

Written as code, the equations above look like this.

 function [J, grad] = costFunction(theta, X, y)
%COSTFUNCTION Compute cost and gradient for logistic regression
%   J = COSTFUNCTION(theta, X, y) computes the cost of using theta as the
%   parameter for logistic regression and the gradient of the cost
%   w.r.t. to the parameters.
 
% Initialize some useful values
m = length(y); % number of training examples
 
% You need to return the following variables correctly 
J = 0;
grad = zeros(size(theta));
 
% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta.
%               You should set J to the cost.
%               Compute the partial derivatives and set grad to the partial
%               derivatives of the cost w.r.t. each parameter in theta
%
% Note: grad should have the same dimensions as theta
%
 
%J = (-1/m)*sum(y.*log(sigmoid(X*theta)) + (1-y).*log(1-sigmoid(X*theta)));
J = (-1/m)*(y'*log(sigmoid(X*theta)) + (1-y)'*log(1-sigmoid(X*theta)));
h_theta = sigmoid(X*theta);
error = h_theta - y;
grad = (1/m)*(X'*error);
 
% =============================================================
 
end

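J is implemented in two ways here: the commented-out line and the line right below it.

y is an m x 1 vector, and log(sigmoid(X*theta)) is also an m x 1 vector (X is m x (n+1) and theta is (n+1) x 1).

The commented-out version takes the element-wise product of y (m x 1) and log(sigmoid(X*theta)) (m x 1) and then sums all the elements,

while the active version multiplies the transpose of y (1 x m) by log(sigmoid(X*theta)), which produces the same sum as an inner product. The (1-y) term is handled in the same way.

Both versions give the same result.

For the partial derivatives of J, the variable error is the $h_\theta(x^{(i)}) - y^{(i)}$ part, and the final gradient is computed on the last line as (1/m)*(X'*error).

Running ex2 confirms that the results match the expected values.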
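One way to gain confidence in the gradient without the exercise script is a finite-difference check. The sketch below is self-contained: cost and gradFn are anonymous-function restatements of the formulas above, and the data, theta, and epsilon are made-up values, not part of the assignment.

```matlab
sigmoid = @(z) 1 ./ (1 + exp(-z));

% Cost and analytic gradient, restated from the formulas above
cost = @(theta, X, y) (-1/length(y)) * ...
    (y' * log(sigmoid(X*theta)) + (1 - y)' * log(1 - sigmoid(X*theta)));
gradFn = @(theta, X, y) (1/length(y)) * (X' * (sigmoid(X*theta) - y));

% Hypothetical toy data: 4 examples, intercept column plus 2 features
X = [1 0.5 1.2; 1 -1.0 0.3; 1 1.5 -0.7; 1 0.2 0.8];
y = [1; 0; 1; 0];
theta = [0.1; -0.2; 0.3];

% Central-difference approximation of each partial derivative
epsilon = 1e-4;
numGrad = zeros(size(theta));
for j = 1:numel(theta)
    e = zeros(size(theta));
    e(j) = epsilon;
    numGrad(j) = (cost(theta + e, X, y) - cost(theta - e, X, y)) / (2*epsilon);
end

disp(max(abs(numGrad - gradFn(theta, X, y))));   % should be very small

% With theta = 0 every hypothesis value is 0.5, so J equals log(2)
disp(cost(zeros(3, 1), X, y));                   % about 0.6931
```

If the analytic gradient had the wrong sign or a missing factor, the difference printed by the first disp would be large instead of near zero.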

[predict.m]

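The task is to plug the theta obtained with the costFunction implemented above into the hypothesis function and predict the output. This is binary Logistic Regression, so we know that the label y is either 0 or 1. The prediction function should therefore output 1 when the hypothesis value is 0.5 or greater and 0 when it is less than 0.5.

The round function does exactly this: round(0.5) returns 1, which matches the >= 0.5 threshold.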

 function p = predict(theta, X)
%PREDICT Predict whether the label is 0 or 1 using learned logistic 
%regression parameters theta
%   p = PREDICT(theta, X) computes the predictions for X using a 
%   threshold at 0.5 (i.e., if sigmoid(theta'*x) >= 0.5, predict 1)
 
m = size(X, 1); % Number of training examples
 
% You need to return the following variables correctly
p = zeros(m, 1);
 
% ====================== YOUR CODE HERE ======================
% Instructions: Complete the following code to make predictions using
%               your learned logistic regression parameters. 
%               You should set p to a vector of 0's and 1's
%
 
p = round(sigmoid(X*theta));
 
% =========================================================================
 
 
end
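The sketch below compares round with an explicit >= 0.5 threshold and computes an accuracy percentage. The data and theta are made up so that one example lands exactly on the 0.5 boundary; the accuracy formula is a common idiom and is shown here as an assumption about how one might evaluate the predictions.

```matlab
sigmoid = @(z) 1 ./ (1 + exp(-z));

% Hypothetical data: the middle row gives X*theta = 0, so sigmoid = 0.5
X = [1 -2; 1 0; 1 3];
theta = [0; 1];
y = [0; 1; 1];

prob = sigmoid(X * theta);          % predicted probabilities
p_round = round(prob);              % approach used in predict.m
p_thresh = double(prob >= 0.5);     % explicit threshold

disp(isequal(p_round, p_thresh));   % prints 1

% Percentage of examples whose prediction equals the label
accuracy = mean(double(p_round == y)) * 100;
disp(accuracy);                     % 100
```

The explicit comparison states the decision rule directly, which can be easier to adapt if a threshold other than 0.5 is ever needed.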

[costFunctionReg.m]

The task is to implement the cost function of regularized Logistic Regression.

The regularized cost function J is

$J(\theta) = -\frac{1}{m}\sum_{i = 1}^{m}\left [ y^{(i)}\log(h_\theta(x^{(i)})) + (1 - y^{(i)})\log(1 - h_\theta(x^{(i)})) \right ] + \frac{\lambda}{2m}\sum_{j = 1}^{n}\theta_j^2$

Its partial derivatives are

$\frac{\partial}{\partial\theta_0}J(\theta) = \frac{1}{m}\sum_{i = 1}^{m}(h_\theta(x^{(i)}) - y^{(i)})x_0^{(i)}$

$\frac{\partial}{\partial\theta_j}J(\theta) = \frac{1}{m} \left [ \sum_{i = 1}^{m}(h_\theta(x^{(i)}) - y^{(i)})x_j^{(i)} + \lambda\theta_j \right] \text{ (j = 1, 2, ..., n)}$

These can be vectorized as follows, where $\tilde{\theta}$ denotes $\theta$ with $\theta_0$ set to 0 (tempTheta in the code), so that the intercept is not regularized.

$J(\theta) = -\frac{1}{m} (y^T\log(g(X\theta)) + (1 - y)^T\log(1 - g(X\theta))) + \frac{\lambda}{2m}\tilde{\theta}^T\tilde{\theta}$

$\nabla_\theta J(\theta) = \frac{1}{m}X^T(g(X\theta) - y) + \frac{\lambda}{m}\tilde{\theta}$

The code for the assignment is below.

 function [J, grad] = costFunctionReg(theta, X, y, lambda)
%COSTFUNCTIONREG Compute cost and gradient for logistic regression with regularization
%   J = COSTFUNCTIONREG(theta, X, y, lambda) computes the cost of using
%   theta as the parameter for regularized logistic regression and the
%   gradient of the cost w.r.t. to the parameters. 
 
% Initialize some useful values
m = length(y); % number of training examples
 
% You need to return the following variables correctly 
J = 0;
grad = zeros(size(theta));
 
% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta.
%               You should set J to the cost.
%               Compute the partial derivatives and set grad to the partial
%               derivatives of the cost w.r.t. each parameter in theta
tempTheta = theta;
tempTheta(1) = 0;
 
% compute J without regularization
J = (-1/m) * sum(y.*log(sigmoid(X*theta))+(1-y).*log(1-sigmoid(X*theta)));
% add the regularization term to J
J = J + ((lambda)/(2*m))*(sum(tempTheta.^2));
 
temp = sigmoid(X*theta);
error = temp - y;
grad = (1/m) * (X' * error) +(lambda/m)*tempTheta;
 
% =============================================================
 
end

The regularized cost function above can also be written in a single statement.

 J = (-1/m) * sum(y.*log(sigmoid(X*theta))+(1-y).*log(1-sigmoid(X*theta))) ...
        + ((lambda)/(2*m))*(tempTheta'*tempTheta);
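To make the role of tempTheta concrete, the sketch below computes the cost and gradient with and without the penalty on made-up data. The names baseJ and baseGrad are introduced here for illustration only; the intercept is deliberately given a large value to show that it does not contribute to the penalty.

```matlab
sigmoid = @(z) 1 ./ (1 + exp(-z));

% Hypothetical toy data: 4 examples, intercept column plus 2 features
X = [1 0.5 1.2; 1 -1.0 0.3; 1 1.5 -0.7; 1 0.2 0.8];
y = [1; 0; 1; 0];
theta = [2.0; -0.2; 0.3];     % large intercept on purpose
m = length(y);
lambda = 10;

tempTheta = theta;
tempTheta(1) = 0;             % exclude the intercept from the penalty

h = sigmoid(X * theta);
baseJ = (-1/m) * (y' * log(h) + (1 - y)' * log(1 - h));   % unregularized cost
baseGrad = (1/m) * (X' * (h - y));                        % unregularized gradient

J = baseJ + (lambda/(2*m)) * (tempTheta' * tempTheta);
grad = baseGrad + (lambda/m) * tempTheta;

% Penalty is (10/8) * (0.2^2 + 0.3^2) = 0.1625; theta(1) = 2.0 plays no part
disp(J - baseJ);
% The intercept component of the gradient is unchanged by lambda
disp(grad(1) - baseGrad(1));  % 0
```

Setting lambda to 0 in this sketch makes J and grad equal to baseJ and baseGrad, which is a quick sanity check for a regularized implementation.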

별준
