Machine Learning Experiment 6

In this experiment we need to build a simple softmax classification model, so let's first review the basics.

We use softmax to turn the linear scores into class probabilities, and take the most probable class as the predicted label.
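Written out to match the predict(X, W) function below: each sample's scores are z = XW, and softmax converts them into class probabilities,

$$\hat{y}_{ik} = \frac{e^{z_{ik}}}{\sum_{j=1}^{c} e^{z_{ij}}}, \qquad \text{predicted class for sample } i = \arg\max_{k}\, \hat{y}_{ik}$$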

The loss function is defined as follows:
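(Written to match the cost_gradient implementation below: the cross-entropy between the one-hot labels Y and the softmax outputs, averaged over the n samples.)

$$J(W) = -\frac{1}{n}\sum_{i=1}^{n}\sum_{k=1}^{c} Y_{ik}\,\log \hat{Y}_{ik}, \qquad \hat{Y} = \mathrm{softmax}(XW)$$

Its gradient with respect to W takes the compact form

$$\nabla_W J = \frac{1}{n}\, X^{\top}\left(\hat{Y} - Y\right)$$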

Pay close attention to the definition of the loss function!!!

With the loss function and its gradient in hand, we can build the model:

def cost_gradient(W, X, Y, n):
    G = (1 / n) * np.dot(X.T, predict(X, W) - Y)   # gradient of the cross-entropy loss
    j = -np.sum(Y * np.log(predict(X, W))) / n     # averaged cross-entropy loss

    return (j, G)

The complete code is as follows:

# -*- coding: utf-8 -*-
import numpy as np
import matplotlib.pyplot as plt

def Softmax(z):
    exp_scores = np.exp(z)
    return exp_scores / np.sum(exp_scores, axis=1, keepdims=True)

def predict(X, W):
    return Softmax(np.dot(X, W))

def cost_gradient(W, X, Y, n):
    G = (1 / n) * np.dot(X.T, predict(X, W) - Y)
    j = -np.sum(Y * np.log(predict(X, W))) / n

    return (j, G)

def train(W, X, Y, n, lr, iterations):
    J = np.zeros([iterations, 1])

    for i in range(iterations):
        (J[i], G) = cost_gradient(W, X, Y, n)
        W = W - lr * G

    return (W, J)

def error(W, X, Y):
    Y_hat = predict(X, W)   ###### Output Y_hat by the trained model
    pred = np.argmax(Y_hat, axis=1)
    label = np.argmax(Y, axis=1)

    return (1 - np.mean(np.equal(pred, label)))

iterations = 150000   ###### Training loops
lr = 0.00025          ###### Learning rate

data = np.loadtxt('SR.txt', delimiter=',')

n = data.shape[0]
X = np.concatenate([np.ones([n, 1]),                   # bias column
                    np.expand_dims(data[:, 0], axis=1),
                    np.expand_dims(data[:, 1], axis=1),
                    np.expand_dims(data[:, 2], axis=1)],
                   axis=1)
Y = data[:, 3].astype(np.int32)
c = np.max(Y) + 1
Y = np.eye(c)[Y]          # one-hot encode the integer labels

W = np.random.random([X.shape[1], c])

(W, J) = train(W, X, Y, n, lr, iterations)

plt.figure()
plt.plot(range(iterations), J)

print(error(W, X, Y))
# 0.021333333333333315

[Figure: training loss curve]
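A quick aside on the Y = np.eye(c)[Y] line in the script: indexing an identity matrix with the integer labels is a compact way to one-hot encode them. A tiny made-up illustration (the labels here are arbitrary, just to show the shape):

import numpy as np

labels = np.array([0, 2, 1])   # hypothetical integer class labels
one_hot = np.eye(3)[labels]    # row i is the one-hot vector for labels[i]
print(one_hot)
# [[1. 0. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]]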

After training, the model's error rate is around 0.0213 (about 97.9% accuracy). Note that this is measured on the same data used for training, since the script never holds out a test set (a quick hold-out sketch follows below). Next, I want to design a small optimization routine to find the best learning rate; common hyperparameter search methods are grid search and random search.
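For an honest test-set number, a minimal hold-out sketch could look like this, reusing X, Y, n, c, lr, iterations, train and error from the script above (the 80/20 split ratio is just an assumption for illustration):

idx = np.random.permutation(n)     # shuffle the sample indices
split = int(0.8 * n)               # assumed 80/20 train/test split
X_tr, Y_tr = X[idx[:split]], Y[idx[:split]]
X_te, Y_te = X[idx[split:]], Y[idx[split:]]

W0 = np.random.random([X.shape[1], c])
(W0, J0) = train(W0, X_tr, Y_tr, X_tr.shape[0], lr, iterations)
print(error(W0, X_te, Y_te))       # error on the held-out 20%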

Code implementation of the learning-rate grid search:

# -*- coding: utf-8 -*-
import numpy as np
import matplotlib.pyplot as plt

def Softmax(z):
    exp_scores = np.exp(z - np.max(z, axis=1, keepdims=True))  # subtract the row max to avoid numerical overflow
    return exp_scores / np.sum(exp_scores, axis=1, keepdims=True)

def predict(X, W):
    return Softmax(np.dot(X, W))

def cost_gradient(W, X, Y, n):
    G = (1 / n) * np.dot(X.T, predict(X, W) - Y)
    j = -np.sum(Y * np.log(predict(X, W))) / n

    return (j, G)

def train(W, X, Y, n, lr, iterations):
    J = np.zeros([iterations, 1])

    for i in range(iterations):
        (J[i], G) = cost_gradient(W, X, Y, n)
        W = W - lr * G

    return (W, J)

def error(W, X, Y):
    Y_hat = predict(X, W)
    pred = np.argmax(Y_hat, axis=1)
    label = np.argmax(Y, axis=1)

    return (1 - np.mean(np.equal(pred, label)))

# Hyperparameter settings
iterations = 200000  # number of training iterations
learning_rates = [0.0001, 0.00025, 0.0005, 0.001, 0.002]  # candidate learning rates

# Load the data
data = np.loadtxt('SR.txt', delimiter=',')
n = data.shape[0]
X = np.concatenate([np.ones([n, 1]),
                    np.expand_dims(data[:, 0], axis=1),
                    np.expand_dims(data[:, 1], axis=1),
                    np.expand_dims(data[:, 2], axis=1)],
                   axis=1)
Y = data[:, 3].astype(np.int32)
c = np.max(Y) + 1
Y = np.eye(c)[Y]

# Initialize the weights (kept separately so every candidate starts from the same point)
W_init = np.random.random([X.shape[1], c])

# Grid search over the candidate learning rates
best_lr = None
best_error = float('inf')
for lr in learning_rates:
    (W, J) = train(W_init.copy(), X, Y, n, lr, iterations)
    err = error(W, X, Y)
    print(f"Learning rate {lr}: Error = {err}")
    if err < best_error:
        best_lr = lr
        best_error = err

print(f"Best learning rate: {best_lr}, Best error: {best_error}")

# Retrain from the same initial weights with the best learning rate
(W, J) = train(W_init.copy(), X, Y, n, best_lr, iterations)

# Plot the loss curve
plt.figure()
plt.plot(range(iterations), J)
plt.xlabel('Iterations')
plt.ylabel('Loss')
plt.title('Loss over Iterations with Best Learning Rate')
plt.show()

# Print the final error
print(f"Final error with best learning rate: {error(W, X, Y)}")
# Output
Learning rate 0.0001: Error = 0.02733333333333332
Learning rate 0.00025: Error = 0.021333333333333315
Learning rate 0.0005: Error = 0.021333333333333315
Learning rate 0.001: Error = 0.021333333333333315
Learning rate 0.002: Error = 0.020000000000000018
Best learning rate: 0.002, Best error: 0.020000000000000018

Final error with best learning rate: 0.021333333333333315

Honestly, the result still feels pretty mediocre...

It can probably only be optimized to about this level.
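Random search was mentioned above as an alternative to grid search but not implemented. A rough sketch of what it might look like, reusing train, error, W_init, X, Y, n and iterations from the script above (the log-uniform range 1e-4 to 1e-2 and the 10 trials are arbitrary choices for illustration):

# Hypothetical random search over the learning rate (log-uniform between 1e-4 and 1e-2)
num_trials = 10
best_lr, best_err = None, float('inf')
for _ in range(num_trials):
    lr = 10 ** np.random.uniform(-4, -2)            # sample an exponent, then convert to a learning rate
    (W_trial, _) = train(W_init.copy(), X, Y, n, lr, iterations)
    err = error(W_trial, X, Y)
    print(f"Learning rate {lr:.6f}: Error = {err}")
    if err < best_err:
        best_lr, best_err = lr, err
print(f"Best learning rate: {best_lr}, Best error: {best_err}")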

