How to implement multivariate logistic regression with TensorFlow

A detailed walkthrough of implementing the logistic regression algorithm with TensorFlow
This article walks through implementing the logistic regression algorithm with TensorFlow and uses it to predict the probability of low birth weight. It is shared here as a practical reference for anyone who needs it.

# Logistic Regression
#----------------------------------
# This function shows how to use TensorFlow to
# solve logistic regression.
# y = sigmoid(Ax + b)
# We will use the low birth weight data, specifically:
# y = 0 or 1 = low birth weight
# x = demographic and medical history data
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
import requests
from tensorflow.python.framework import ops
import os.path
import csv
ops.reset_default_graph()
# Create graph
sess = tf.Session()
# Obtain and prepare data for modeling
# name of data file
birth_weight_file = 'birth_weight.csv'
# download data and create data file if file does not exist in current directory
if not os.path.exists(birth_weight_file):
    birthdata_url = 'https://github.com/nfmcclure/tensorflow_cookbook/raw/master/01_Introduction/07_Working_with_Data_Sources/birthweight_data/birthweight.dat'
    birth_file = requests.get(birthdata_url)
    birth_data = birth_file.text.split('\r\n')
    birth_header = birth_data[0].split('\t')
    birth_data = [[float(x) for x in y.split('\t') if len(x)>=1] for y in birth_data[1:] if len(y)>=1]
    with open(birth_weight_file, 'w') as f:
        writer = csv.writer(f)
        writer.writerow(birth_header)
        writer.writerows(birth_data)
# read birth weight data into memory
birth_data = []
with open(birth_weight_file, newline='') as csvfile:
    csv_reader = csv.reader(csvfile)
    birth_header = next(csv_reader)
    for row in csv_reader:
        birth_data.append(row)
birth_data = [[float(x) for x in row] for row in birth_data]
# Pull out target variable
y_vals = np.array([x[0] for x in birth_data])
# Pull out predictor variables (not id, not target, and not birthweight)
x_vals = np.array([x[1:8] for x in birth_data])
# set for reproducible results
seed = 42  # the original post never defines `seed`; any fixed integer works
np.random.seed(seed)
tf.set_random_seed(seed)
# Split data into train/test = 80%/20%
train_indices = np.random.choice(len(x_vals), round(len(x_vals)*0.8), replace=False)
test_indices = np.array(list(set(range(len(x_vals))) - set(train_indices)))
x_vals_train = x_vals[train_indices]
x_vals_test = x_vals[test_indices]
y_vals_train = y_vals[train_indices]
y_vals_test = y_vals[test_indices]
# Normalize by column (min-max norm)
# Scaling all features to the [0, 1] range (min-max scaling) helps logistic regression converge better
def normalize_cols(m):
    col_max = m.max(axis=0)
    col_min = m.min(axis=0)
    return (m - col_min) / (col_max - col_min)
x_vals_train = np.nan_to_num(normalize_cols(x_vals_train))
x_vals_test = np.nan_to_num(normalize_cols(x_vals_test))
# Define the TensorFlow computational graph
# Declare batch size
batch_size = 25
# Initialize placeholders
x_data = tf.placeholder(shape=[None, 7], dtype=tf.float32)
y_target = tf.placeholder(shape=[None, 1], dtype=tf.float32)
# Create variables for linear regression
A = tf.Variable(tf.random_normal(shape=[7,1]))
b = tf.Variable(tf.random_normal(shape=[1,1]))
# Declare model operations
model_output = tf.add(tf.matmul(x_data, A), b)
# Declare loss function (Cross Entropy loss)
loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=model_output, labels=y_target))
# Declare optimizer
my_opt = tf.train.GradientDescentOptimizer(0.01)
train_step = my_opt.minimize(loss)
# Train model
# Initialize variables
init = tf.global_variables_initializer()
sess.run(init)
# Actual Prediction
# Besides the loss, we also want to track the classifier's accuracy on the training and test sets,
# so we create prediction and accuracy operations
prediction = tf.round(tf.sigmoid(model_output))
predictions_correct = tf.cast(tf.equal(prediction, y_target), tf.float32)
accuracy = tf.reduce_mean(predictions_correct)
# Training loop
# Run the training loop, recording the loss and accuracy values
loss_vec = []
train_acc = []
test_acc = []
for i in range(1500):
    rand_index = np.random.choice(len(x_vals_train), size=batch_size)
    rand_x = x_vals_train[rand_index]
    rand_y = np.transpose([y_vals_train[rand_index]])
    sess.run(train_step, feed_dict={x_data: rand_x, y_target: rand_y})
    temp_loss = sess.run(loss, feed_dict={x_data: rand_x, y_target: rand_y})
    loss_vec.append(temp_loss)
    temp_acc_train = sess.run(accuracy, feed_dict={x_data: x_vals_train, y_target: np.transpose([y_vals_train])})
    train_acc.append(temp_acc_train)
    temp_acc_test = sess.run(accuracy, feed_dict={x_data: x_vals_test, y_target: np.transpose([y_vals_test])})
    test_acc.append(temp_acc_test)
    if (i+1) % 300 == 0:
        print('Loss = ' + str(temp_loss))
# Display model performance
# Plot the loss and the accuracies
plt.plot(loss_vec, 'k-')
plt.title('Cross Entropy Loss per Generation')
plt.xlabel('Generation')
plt.ylabel('Cross Entropy Loss')
plt.show()
# Plot train and test accuracy
plt.plot(train_acc, 'k-', label='Train Set Accuracy')
plt.plot(test_acc, 'r--', label='Test Set Accuracy')
plt.title('Train and Test Accuracy')
plt.xlabel('Generation')
plt.ylabel('Accuracy')
plt.legend(loc='lower right')
plt.show()

Results:
Loss = 0.845124
Loss = 0.658061
Loss = 0.471852
Loss = 0.643469
Loss = 0.672077

[Figure: cross-entropy loss over the 1,500 iterations]
[Figure: training and test set accuracy over the 1,500 iterations]
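The listing above uses the TensorFlow 1.x graph API (tf.Session, tf.placeholder), which TensorFlow 2.x removes from the top-level namespace. As a rough sketch only, and not part of the original article, the same model could be written with the Keras API along these lines, reusing x_vals_train and x_vals_test from the script above; the batch size and learning rate are carried over from that script, while the epoch count here is an arbitrary assumption:

import tensorflow as tf

# Hypothetical TF2/Keras equivalent: a single dense unit with a sigmoid output
# is exactly y = sigmoid(Ax + b).
model = tf.keras.Sequential([
    tf.keras.layers.Dense(1, activation='sigmoid', input_shape=(7,))
])
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.01),
              loss='binary_crossentropy',
              metrics=['accuracy'])
model.fit(x_vals_train, y_vals_train, batch_size=25, epochs=200,
          validation_data=(x_vals_test, y_vals_test))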
TensorFlow Machine Learning Cookbook, Chapter 3, Recipes 6-8: Lasso and Ridge Regression, Elastic Net Regression, and Logistic Regression
Guiding questions:
1. How do we implement lasso and ridge regression?
2. How do we implement elastic net regression?
3. How do we implement logistic regression?
4. How does logistic regression turn linear regression into a binary classifier?
Implementing Lasso and Ridge Regression
There are also ways to limit the influence of coefficients on the regression output. These methods are called regularization methods and two of the most common regularization methods are lasso and ridge regression. We cover how to implement both of these in this recipe.
Getting ready
Lasso and ridge regression are very similar to regular linear regression, except that we add regularization terms to limit the slopes (or partial slopes) in the formula. There may be multiple reasons for this, but a common one is that we wish to restrict the features that have an impact on the dependent variable. This can be accomplished by adding a term to the loss function that depends on the value of our slope, A.
For lasso regression, we must add a term that greatly increases our loss function if the slope, A, gets above a certain value. We could use TensorFlow's logical operations, but they do not have a gradient associated with them. Instead, we will use a continuous approximation to a step function, called the continuous heavy step function, that is scaled up and over to the regularization cut off we choose. We will show how to do lasso regression shortly.
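As a quick illustration, and not part of the recipe's code, the penalty built in step 2 below works out to 99 / (1 + exp(-100 * (A - cutoff))). A small NumPy sketch shows how it behaves around the 0.9 cutoff used in this recipe:

import numpy as np

# Illustration only: the continuous heavy step penalty used in step 2 below,
# evaluated at a few slope values around the 0.9 cutoff.
cutoff = 0.9

def lasso_penalty(a):
    return 99. / (1. + np.exp(-100. * (a - cutoff)))

for a in [0.5, 0.85, 0.9, 0.95, 1.2]:
    print('slope = %.2f  penalty = %.4f' % (a, lasso_penalty(a)))
# The penalty is ~0 well below the cutoff, 49.5 at the cutoff, and ~99 above it,
# so slopes past 0.9 are punished heavily while smaller slopes are barely affected.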
For ridge regression, we just add a term to the loss function: the scaled L2 norm of the slope coefficient. This modification is simple and is shown in the There's more… section at the end of this recipe.
How to do it…
1.We will use the iris dataset again and set up our script the same way as before. We first load the libraries, start a session, load the data, declare the batch size, create the placeholders, variables, and model output as follows:
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
from sklearn import datasets
from tensorflow.python.framework import ops
ops.reset_default_graph()
sess = tf.Session()
iris = datasets.load_iris()
x_vals = np.array([x[3] for x in iris.data])
y_vals = np.array([y[0] for y in iris.data])
batch_size = 50
learning_rate = 0.001
x_data = tf.placeholder(shape=[None, 1], dtype=tf.float32)
y_target = tf.placeholder(shape=[None, 1], dtype=tf.float32)
A = tf.Variable(tf.random_normal(shape=[1,1]))
b = tf.Variable(tf.random_normal(shape=[1,1]))
model_output = tf.add(tf.matmul(x_data, A), b)
2.We add the loss function, which is a modified continuous heavyside step function. We also set the cutoff for lasso regression at 0.9. This means that we want to restrict the slope coefficient to be less than 0.9. Use the following code:
lasso_param = tf.constant(0.9)
heavyside_step = tf.truediv(1., tf.add(1., tf.exp(tf.multiply(-100., tf.subtract(A, lasso_param)))))
regularization_param = tf.multiply(heavyside_step, 99.)
loss = tf.add(tf.reduce_mean(tf.square(y_target - model_output)), regularization_param)
3.We now initialize our variables and declare our optimizer, as follows:
init = tf.global_variables_initializer()
sess.run(init)
my_opt = tf.train.GradientDescentOptimizer(learning_rate)
train_step = my_opt.minimize(loss)
4.We will run the training loop a fair bit longer because it can take a while to converge. We can see that the slope coefficient is less than 0.9. Use the following code:
loss_vec = []
for i in range(1500):
    rand_index = np.random.choice(len(x_vals), size=batch_size)
    rand_x = np.transpose([x_vals[rand_index]])
    rand_y = np.transpose([y_vals[rand_index]])
    sess.run(train_step, feed_dict={x_data: rand_x, y_target: rand_y})
    temp_loss = sess.run(loss, feed_dict={x_data: rand_x, y_target: rand_y})
    loss_vec.append(temp_loss[0])
    if (i+1) % 300 == 0:
        print('Step #' + str(i+1) + ' A = ' + str(sess.run(A)) + ' b = ' + str(sess.run(b)))
        print('Loss = ' + str(temp_loss))
Step #300 A = [[ 0.]] b = [[ 2.]]
Loss = [[ 6.]]
Step #600 A = [[ 0.8200165]] b = [[ 3.]]
Loss = [[ 2.]]
Step #900 A = [[ 0.]] b = [[ 4.]]
Loss = [[ 0.]]
Step #1200 A = [[ 0.]] b = [[ 4.]]
Loss = [[ 0.]]
Step #1500 A = [[ 0.]] b = [[ 4.6360755]]
Loss = [[ 0.]]
How it works…
We implement lasso regression by adding a continuous heavyside step function to the loss function of linear regression. Because of the steepness of the step function, we have to be careful with the step size. Too big of a step size and it will not converge. For ridge regression, see the necessary change in the next section.
There's more…
For ridge regression, we change the loss function to look like the following code:
ridge_param = tf.constant(1.)
ridge_loss = tf.reduce_mean(tf.square(A))
loss = tf.expand_dims(tf.add(tf.reduce_mean(tf.square(y_target - model_output)), tf.multiply(ridge_param, ridge_loss)), 0)
Implementing Elastic Net Regression
Elastic net regression is a type of regression that combines lasso regression with ridge regression by adding L1 and L2 regularization terms to the loss function.
Getting ready
Implementing elastic net regression should be straightforward after the previous two recipes, so we will implement this as multiple linear regression on the iris dataset, instead of sticking to the two-dimensional data as before. We will use petal length, petal width, and sepal width to predict sepal length.
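In the notation of the code below, and with both regularization strengths set to 1.0 as in this recipe, the loss being minimized is roughly:

    loss = mean((y_target - model_output)^2) + elastic_param1 * mean(|A|) + elastic_param2 * mean(A^2)

that is, the usual squared-error term plus the lasso (L1) and ridge (L2) penalties on the slope vector A.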
How to do it…
1.First we load the necessary libraries and initialize a graph, as follows:
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
from sklearn import datasets
sess = tf.Session()
2.Now we will load the data. This time, each element of x data will be a list of three values instead of one. Use the following code:
iris = datasets.load_iris()
x_vals = np.array([[x[1], x[2], x[3]] for x in iris.data])
y_vals = np.array([y[0] for y in iris.data])
3.Next we declare the batch size, placeholders, variables, and model output. The only difference here is that we change the size specifications of the x data placeholder to take three values instead of one, as follows:
batch_size = 50
learning_rate = 0.001
x_data = tf.placeholder(shape=[None, 3], dtype=tf.float32)
y_target = tf.placeholder(shape=[None, 1], dtype=tf.float32)
A = tf.Variable(tf.random_normal(shape=[3,1]))
b = tf.Variable(tf.random_normal(shape=[1,1]))
model_output = tf.add(tf.matmul(x_data, A), b)
4.For elastic net, the loss function has the L1 and L2 norms of the partial slopes. We create these terms and then add them into the loss function, as follows:
elastic_param1 = tf.constant(1.)
elastic_param2 = tf.constant(1.)
l1_a_loss = tf.reduce_mean(tf.abs(A))
l2_a_loss = tf.reduce_mean(tf.square(A))
e1_term = tf.multiply(elastic_param1, l1_a_loss)
e2_term = tf.multiply(elastic_param2, l2_a_loss)
loss = tf.expand_dims(tf.add(tf.add(tf.reduce_mean(tf.square(y_target - model_output)), e1_term), e2_term), 0)
5.Now we can initialize the variables, declare our optimizer, and run the training loop and fit our coefficients, as follows:
init = tf.global_variables_initializer()
sess.run(init)
my_opt = tf.train.GradientDescentOptimizer(learning_rate)
train_step = my_opt.minimize(loss)
loss_vec = []
for i in range(1000):
    rand_index = np.random.choice(len(x_vals), size=batch_size)
    rand_x = x_vals[rand_index]
    rand_y = np.transpose([y_vals[rand_index]])
    sess.run(train_step, feed_dict={x_data: rand_x, y_target: rand_y})
    temp_loss = sess.run(loss, feed_dict={x_data: rand_x, y_target: rand_y})
    loss_vec.append(temp_loss[0])
    if (i+1) % 250 == 0:
        print('Step #' + str(i+1) + ' A = ' + str(sess.run(A)) + ' b = ' + str(sess.run(b)))
        print('Loss = ' + str(temp_loss))
6.Here is the output of the code:
Step #250 A = [[ 0.]
[ 0.1055888 ]
[ 1.]] b = [[ 1.]]
Loss = [ 2.]
Step #500 A = [[ 0.]
[ 1.]] b = [[ 1.]]
Loss = [ 1.8032167]
Step #750 A = [[ 0.]
[ 0.102514 ]
[ 1.]] b = [[ 1.]]
Loss = [ 1.]
Step #1000 A = [[ 0.6777274 ]
[ 0.8403284 ]] b = [[ 2.]]
Loss = [ 1.]
7.Now we can observe the loss over the training iterations to be sure that it converged, as follows:
plt.plot(loss_vec, 'k-')
plt.title('Loss per Generation')
plt.xlabel('Generation')
plt.ylabel('Loss')
plt.show()
Figure 10: Elastic net regression loss plotted over the 1,000 training iterations
How it works…
Elastic net regression is implemented here as well as multiple linear regression. We can see that with these regularization terms in the loss function the convergence is slower than in prior sections. Regularization is as simple as adding in the appropriate terms in the loss functions.
Implementing Logistic Regression
For this recipe, we will implement logistic regression to predict the probability of low birthweight.
Getting ready
Logistic regression is a way to turn linear regression into a binary classification. This is accomplished by transforming the linear output through a sigmoid function that scales the output between zero and one. The target is a zero or a one, which indicates whether a data point is in one class or the other. Since we are predicting a number between zero and one, the prediction is classified into class 1 if it is above a specified cutoff value, and class 0 otherwise. For the purpose of this example, we will specify that cutoff to be 0.5, which makes the classification as simple as rounding the output.
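As a tiny illustrative sketch, and not part of the recipe, the decision rule just described, a sigmoid followed by a 0.5 cutoff, is the same as rounding the sigmoid output:

import numpy as np

# Illustration only: a 0.5 cutoff on the sigmoid output reduces to rounding it.
def sigmoid(z):
    return 1. / (1. + np.exp(-z))

logits = np.array([-2.0, -0.1, 0.3, 4.0])
probs = sigmoid(logits)   # roughly [0.12, 0.48, 0.57, 0.98]
preds = np.round(probs)   # [0., 0., 1., 1.]  -> predicted class labels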
The data we will use for this example will be the low birthweight data that is obtained through the University of Massachusetts Amherst statistical dataset repository (https://www.umass.edu/statdata/statdata/). We will be predicting low birthweight from several other factors.
How to do it…
1.We start by loading the libraries, including the requests library, because we will access the low birth weight data through a hyperlink. We will also initiate a session:
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
import requests
from sklearn import datasets
from sklearn.preprocessing import normalize
from tensorflow.python.framework import ops
ops.reset_default_graph()
sess = tf.Session()
2.Next we will load the data through the requests module and specify which features we want to use. We have to be specific because one feature is the actual birth weight and we don't want to use this to predict if the birthweight is greater or less than a specific amount. We also do not want to use the ID column as a predictor either:
birthdata_url = 'https://www.umass.edu/statdata/statdata/data/lowbwt.dat'
birth_file = requests.get(birthdata_url)
birth_data = birth_file.text.split('\r\n')[5:]
birth_header = [x for x in birth_data[0].split(' ') if len(x)>=1]
birth_data = [[float(x) for x in y.split(' ') if len(x)>=1] for y in birth_data[1:] if len(y)>=1]
y_vals = np.array([x[1] for x in birth_data])
x_vals = np.array([x[2:9] for x in birth_data])
3.First we split the dataset into test and train sets:
train_indices = np.random.choice(len(x_vals), round(len(x_vals)*0.8), replace=False)
test_indices = np.array(list(set(range(len(x_vals))) - set(train_indices)))
x_vals_train = x_vals[train_indices]
x_vals_test = x_vals[test_indices]
y_vals_train = y_vals[train_indices]
y_vals_test = y_vals[test_indices]
4.Logistic regression converges better when the features are scaled between 0 and 1 (min-max scaling). So next we will scale each feature:
def normalize_cols(m):
    col_max = m.max(axis=0)
    col_min = m.min(axis=0)
    return (m - col_min) / (col_max - col_min)
x_vals_train = np.nan_to_num(normalize_cols(x_vals_train))
x_vals_test = np.nan_to_num(normalize_cols(x_vals_test))
Note that we split the dataset into train and test before we scaled the dataset. This is an important distinction to make. We want to make sure that the training set does not influence the test set at all. If we scaled the whole set before splitting, then we cannot guarantee that they don't influence each other.
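Note also that this excerpt skips the model-definition step: the training loop below uses batch_size, the x_data and y_target placeholders, train_step, loss, and accuracy without showing their definitions. A minimal bridge, mirroring the complete listing at the top of this page (the batch size of 25 and learning rate of 0.01 are taken from that listing, not from this excerpt), would be:

batch_size = 25
# Placeholders, variables, and the linear model output
x_data = tf.placeholder(shape=[None, 7], dtype=tf.float32)
y_target = tf.placeholder(shape=[None, 1], dtype=tf.float32)
A = tf.Variable(tf.random_normal(shape=[7, 1]))
b = tf.Variable(tf.random_normal(shape=[1, 1]))
model_output = tf.add(tf.matmul(x_data, A), b)
# Sigmoid cross-entropy loss and gradient-descent optimizer
loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=model_output, labels=y_target))
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(loss)
# Accuracy: round the sigmoid output and compare with the targets
prediction = tf.round(tf.sigmoid(model_output))
predictions_correct = tf.cast(tf.equal(prediction, y_target), tf.float32)
accuracy = tf.reduce_mean(predictions_correct)
# Initialize variables
init = tf.global_variables_initializer()
sess.run(init)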
5.Now we can start our training loop and record the loss and accuracies:
loss_vec = []
train_acc = []
test_acc = []
for i in range(1500):
    rand_index = np.random.choice(len(x_vals_train), size=batch_size)
    rand_x = x_vals_train[rand_index]
    rand_y = np.transpose([y_vals_train[rand_index]])
    sess.run(train_step, feed_dict={x_data: rand_x, y_target: rand_y})
    temp_loss = sess.run(loss, feed_dict={x_data: rand_x, y_target: rand_y})
    loss_vec.append(temp_loss)
    temp_acc_train = sess.run(accuracy, feed_dict={x_data: x_vals_train, y_target: np.transpose([y_vals_train])})
    train_acc.append(temp_acc_train)
    temp_acc_test = sess.run(accuracy, feed_dict={x_data: x_vals_test, y_target: np.transpose([y_vals_test])})
    test_acc.append(temp_acc_test)
6.Here is the code to look at the plots of the loss and accuracies:
plt.plot(loss_vec, 'k-')
plt.title('Cross Entropy Loss per Generation')
plt.xlabel('Generation')
plt.ylabel('Cross Entropy Loss')
plt.show()
plt.plot(train_acc, 'k-', label='Train Set Accuracy')
plt.plot(test_acc, 'r--', label='Test Set Accuracy')
plt.title('Train and Test Accuracy')
plt.xlabel('Generation')
plt.ylabel('Accuracy')
plt.legend(loc='lower right')
plt.show()
How it works…
Here is the loss over the iterations and train and test set accuracies. Since the dataset is only 189 observations, the train and test accuracy plots will change owing to the random splitting of the dataset:
Figure 11: Cross-entropy loss plotted over the course of 1,500 iterations
Figure 12: Test and train set accuracy plotted over 1,500 generations.