Machine Learning Experiment 6

In this experiment we need to build a simple softmax classification model, so let's first review the basics.

We use softmax to turn the linear scores into class probabilities, and take the most probable class as the predicted label.
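Written out to match the predict(X, W) function below: each sample's scores are z = XW, and softmax converts them into class probabilities,

$$\hat{y}_{ik} = \frac{e^{z_{ik}}}{\sum_{j=1}^{c} e^{z_{ij}}}, \qquad \text{predicted class for sample } i = \arg\max_{k}\, \hat{y}_{ik}$$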

The loss function is defined as follows:
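(Written to match the cost_gradient implementation below: the cross-entropy between the one-hot labels Y and the softmax outputs, averaged over the n samples.)

$$J(W) = -\frac{1}{n}\sum_{i=1}^{n}\sum_{k=1}^{c} Y_{ik}\,\log \hat{Y}_{ik}, \qquad \hat{Y} = \mathrm{softmax}(XW)$$

Its gradient with respect to W takes the compact form

$$\nabla_W J = \frac{1}{n}\, X^{\top}\left(\hat{Y} - Y\right)$$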

Pay close attention to the definition of the loss function!!!

With the loss function and its gradient in hand, we can build the model:

def cost_gradient(W, X, Y, n):
    G = (1 / n) * np.dot(X.T, predict(X, W) - Y)   # gradient of the cross-entropy loss
    j = -np.sum(Y * np.log(predict(X, W))) / n     # averaged cross-entropy loss

    return (j, G)

The complete code is as follows:

# -*- coding: utf-8 -*-
import numpy as np
import matplotlib.pyplot as plt

def Softmax(z):
    exp_scores = np.exp(z)
    return exp_scores / np.sum(exp_scores, axis=1, keepdims=True)

def predict(X, W):
    return Softmax(np.dot(X, W))

def cost_gradient(W, X, Y, n):
    G = (1 / n) * np.dot(X.T, predict(X, W) - Y)
    j = -np.sum(Y * np.log(predict(X, W))) / n

    return (j, G)

def train(W, X, Y, n, lr, iterations):
    J = np.zeros([iterations, 1])

    for i in range(iterations):
        (J[i], G) = cost_gradient(W, X, Y, n)
        W = W - lr * G

    return (W, J)

def error(W, X, Y):
    Y_hat = predict(X, W)   ###### Output Y_hat by the trained model
    pred = np.argmax(Y_hat, axis=1)
    label = np.argmax(Y, axis=1)

    return (1 - np.mean(np.equal(pred, label)))

iterations = 150000   ###### Training loops
lr = 0.00025          ###### Learning rate

data = np.loadtxt('SR.txt', delimiter=',')

n = data.shape[0]
X = np.concatenate([np.ones([n, 1]),                   # bias column
                    np.expand_dims(data[:, 0], axis=1),
                    np.expand_dims(data[:, 1], axis=1),
                    np.expand_dims(data[:, 2], axis=1)],
                   axis=1)
Y = data[:, 3].astype(np.int32)
c = np.max(Y) + 1
Y = np.eye(c)[Y]          # one-hot encode the integer labels

W = np.random.random([X.shape[1], c])

(W, J) = train(W, X, Y, n, lr, iterations)

plt.figure()
plt.plot(range(iterations), J)

print(error(W, X, Y))
# 0.021333333333333315

[Figure: training loss curve]
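A quick aside on the Y = np.eye(c)[Y] line in the script: indexing an identity matrix with the integer labels is a compact way to one-hot encode them. A tiny made-up illustration (the labels here are arbitrary, just to show the shape):

import numpy as np

labels = np.array([0, 2, 1])   # hypothetical integer class labels
one_hot = np.eye(3)[labels]    # row i is the one-hot vector for labels[i]
print(one_hot)
# [[1. 0. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]]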

After training, the model's error rate is around 0.0213 (about 97.9% accuracy). Note that this is measured on the same data used for training, since the script never holds out a test set (a quick hold-out sketch follows below). Next, I want to design a small optimization routine to find the best learning rate; common hyperparameter search methods are grid search and random search.
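For an honest test-set number, a minimal hold-out sketch could look like this, reusing X, Y, n, c, lr, iterations, train and error from the script above (the 80/20 split ratio is just an assumption for illustration):

idx = np.random.permutation(n)     # shuffle the sample indices
split = int(0.8 * n)               # assumed 80/20 train/test split
X_tr, Y_tr = X[idx[:split]], Y[idx[:split]]
X_te, Y_te = X[idx[split:]], Y[idx[split:]]

W0 = np.random.random([X.shape[1], c])
(W0, J0) = train(W0, X_tr, Y_tr, X_tr.shape[0], lr, iterations)
print(error(W0, X_te, Y_te))       # error on the held-out 20%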

Code implementation of the learning-rate grid search:

# -*- coding: utf-8 -*-
import numpy as np
import matplotlib.pyplot as plt

def Softmax(z):
    exp_scores = np.exp(z - np.max(z, axis=1, keepdims=True))  # subtract the row max to avoid numerical overflow
    return exp_scores / np.sum(exp_scores, axis=1, keepdims=True)

def predict(X, W):
    return Softmax(np.dot(X, W))

def cost_gradient(W, X, Y, n):
    G = (1 / n) * np.dot(X.T, predict(X, W) - Y)
    j = -np.sum(Y * np.log(predict(X, W))) / n

    return (j, G)

def train(W, X, Y, n, lr, iterations):
    J = np.zeros([iterations, 1])

    for i in range(iterations):
        (J[i], G) = cost_gradient(W, X, Y, n)
        W = W - lr * G

    return (W, J)

def error(W, X, Y):
    Y_hat = predict(X, W)
    pred = np.argmax(Y_hat, axis=1)
    label = np.argmax(Y, axis=1)

    return (1 - np.mean(np.equal(pred, label)))

# Hyperparameter settings
iterations = 200000  # number of training iterations
learning_rates = [0.0001, 0.00025, 0.0005, 0.001, 0.002]  # candidate learning rates

# Load the data
data = np.loadtxt('SR.txt', delimiter=',')
n = data.shape[0]
X = np.concatenate([np.ones([n, 1]),
                    np.expand_dims(data[:, 0], axis=1),
                    np.expand_dims(data[:, 1], axis=1),
                    np.expand_dims(data[:, 2], axis=1)],
                   axis=1)
Y = data[:, 3].astype(np.int32)
c = np.max(Y) + 1
Y = np.eye(c)[Y]

# Initialize the weights (kept separately so every candidate starts from the same point)
W_init = np.random.random([X.shape[1], c])

# Grid search over the candidate learning rates
best_lr = None
best_error = float('inf')
for lr in learning_rates:
    (W, J) = train(W_init.copy(), X, Y, n, lr, iterations)
    err = error(W, X, Y)
    print(f"Learning rate {lr}: Error = {err}")
    if err < best_error:
        best_lr = lr
        best_error = err

print(f"Best learning rate: {best_lr}, Best error: {best_error}")

# Retrain from the same initial weights with the best learning rate
(W, J) = train(W_init.copy(), X, Y, n, best_lr, iterations)

# Plot the loss curve
plt.figure()
plt.plot(range(iterations), J)
plt.xlabel('Iterations')
plt.ylabel('Loss')
plt.title('Loss over Iterations with Best Learning Rate')
plt.show()

# Print the final error
print(f"Final error with best learning rate: {error(W, X, Y)}")
# Output
Learning rate 0.0001: Error = 0.02733333333333332
Learning rate 0.00025: Error = 0.021333333333333315
Learning rate 0.0005: Error = 0.021333333333333315
Learning rate 0.001: Error = 0.021333333333333315
Learning rate 0.002: Error = 0.020000000000000018
Best learning rate: 0.002, Best error: 0.020000000000000018

Final error with best learning rate: 0.021333333333333315

Honestly, the result still feels pretty mediocre...

It can probably only be optimized to about this level.
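Random search was mentioned above as an alternative to grid search but not implemented. A rough sketch of what it might look like, reusing train, error, W_init, X, Y, n and iterations from the script above (the log-uniform range 1e-4 to 1e-2 and the 10 trials are arbitrary choices for illustration):

# Hypothetical random search over the learning rate (log-uniform between 1e-4 and 1e-2)
num_trials = 10
best_lr, best_err = None, float('inf')
for _ in range(num_trials):
    lr = 10 ** np.random.uniform(-4, -2)            # sample an exponent, then convert to a learning rate
    (W_trial, _) = train(W_init.copy(), X, Y, n, lr, iterations)
    err = error(W_trial, X, Y)
    print(f"Learning rate {lr:.6f}: Error = {err}")
    if err < best_err:
        best_lr, best_err = lr, err
print(f"Best learning rate: {best_lr}, Best error: {best_err}")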

