001LinearRegression 16

Linear Regression
Python TensorFlow
郭忠義
jykuo@ntut.edu.tw
Department of Computer Science and Information Engineering, National Taipei University of Technology
Linear Regression
• Linear regression
• Find a straight line H = W*X + b that fits the blue line, Y = X.
• Training data: X = [1, 2, 3], Y = [1, 2, 3].
• Build the line equation H = W*X + b, then train to adjust W and b.

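Linear Regression
• Univariate linear regression, also called simple linear regression (SLR), is the simplest regression model, yet it is very widely used.
• Given a set of samples (y_i, x_i), i = 1, 2, ..., n, estimate the slope W and intercept b that give the smallest error in the model (written in the notation of H = W*X + b):
  $$ y_i = W x_i + b + \varepsilon_i $$
• The usual method is least squares, whose objective is to minimize the residual sum of squares:
  $$ Q(W, b) = \sum_{i=1}^{n} (y_i - W x_i - b)^2 $$
• To find the extremum by differentiation, take the first-order partial derivatives of Q with respect to W and b and set them to 0:
  $$ \frac{\partial Q}{\partial W} = -2 \sum_{i=1}^{n} x_i (y_i - W x_i - b) = 0 $$
  $$ \frac{\partial Q}{\partial b} = -2 \sum_{i=1}^{n} (y_i - W x_i - b) = 0 $$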
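Linear Regression
• Setting the two partial derivatives to zero gives a system of two linear equations in the two unknowns W and b. It can be solved with Cramer's rule, which gives:
  $$ W = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left(\sum x_i\right)^2} $$
  $$ b = \frac{\sum x_i^2 \sum y_i - \sum x_i \sum x_i y_i}{n \sum x_i^2 - \left(\sum x_i\right)^2} $$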
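The closed-form least-squares solution can be checked numerically. The following is a minimal sketch, assuming the standard normal equations for a line y = W*x + b; the function name `fit_line` is hypothetical and is not part of the slides.

```python
import numpy as np

def fit_line(x, y):
    """Closed-form least-squares fit of y ~ W*x + b using Cramer's rule."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    sx, sy = x.sum(), y.sum()
    sxx, sxy = (x * x).sum(), (x * y).sum()
    # Determinant of the 2x2 normal-equation system
    det = n * sxx - sx ** 2
    if det == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    W = (n * sxy - sx * sy) / det
    b = (sxx * sy - sx * sxy) / det
    return W, b

# Training data from the slides: the fitted line should be Y = X
W, b = fit_line([1, 2, 3], [1, 2, 3])
print(W, b)
```

Because the three training points lie exactly on Y = X, the fit returns a slope of 1 and an intercept of 0, which is the target the iterative training on the later slides approaches.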
sklearn
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.linear_model import LinearRegression

def linearTest(x, y):
    # Convert to column arrays of shape (n_samples, 1)
    x = np.array(x).reshape(len(x), 1)
    y = np.array(y).reshape(len(y), 1)
    clf = LinearRegression()
    clf.fit(x, y)
    pre = clf.predict(x)
    # Plot the samples, the fitted line, and each residual
    plt.scatter(x, y, s=100)
    plt.plot(x, pre, "r-", linewidth=4)
    for idx, m in enumerate(x):
        plt.plot([m, m], [y[idx], pre[idx]], 'g-')
    plt.show()
    print("Coefficient", clf.coef_)
    print("Intercept", clf.intercept_)
    # Mean squared error: square each residual before averaging
    print(np.mean((y - pre) ** 2))
    print(clf.predict([[5.0]]))

def linearIris():
    hua = load_iris()
    # Columns 0 and 1 of the iris data are sepal length and sepal width
    x = [n[0] for n in hua.data]
    y = [n[1] for n in hua.data]
    linearTest(x, y)

x = [1, 2, 3, 4, 5]
y = [1, 5, 9, 12, 14]
linearTest(x, y)

x = np.random.rand(100).astype(np.float32)
y = x * 0.1 + 0.3
linearTest(x, y)
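TensorFlow
• Linear regression
• Training data: X = [1, 2, 3], Y = [1, 2, 3]. Placeholders are used so that the training values can be supplied later.
• Build the line equation H = W*X + b, initializing W and b with random values.
import tensorflow as tf  # written for the TensorFlow 1.x graph-mode API

# Trainable parameters, initialized from a normal distribution
W = tf.Variable(tf.random_normal([1]), name='weight')
b = tf.Variable(tf.random_normal([1]), name='bias')
# Placeholders: the training data is fed in at run time
X = tf.placeholder(tf.float32, shape=[None])
Y = tf.placeholder(tf.float32, shape=[None])
print(W, b, X, Y)
# Our H (hypothesis) XW+b
H = X * W + b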
TensorFlow
• Linear regression
• Build the loss (cost) function using the mean squared error.
• Optimizer: gradient descent with a learning rate of 0.01.
# cost/loss function
cost = tf.reduce_mean(tf.square(H - Y))
# Minimize
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)
train = optimizer.minimize(cost)
TensorFlow
• Linear regression
• Run the computation graph in a session and initialize all variables.
• Train for 2001 steps, feeding the training data for the placeholders X and Y through feed_dict.
• Every 200 steps, print the loss value and the current values of W and b.
# Launch the graph in a session.
sess = tf.Session()
# Initializes all variables in the graph.
sess.run(tf.global_variables_initializer())
# Fit the line, run 2001
for step in range(2001):
    # feed in training data x, y
    cost_val, W_val, b_val, _ = sess.run([cost, W, b, train],
                                         feed_dict={X: [1, 2, 3], Y: [1, 2, 3]})
    if step % 200 == 0:
        print(step, cost_val, W_val, b_val)
TensorFlow
• Example: linear regression
• Result: W approaches 1 and b approaches 0.
<tf.Variable 'weight_5:0' shape=(1,) dtype=float32_ref> <tf.Variable 'bias_5:0'
shape=(1,) dtype=float32_ref> Tensor("Placeholder_10:0", shape=(?,),
dtype=float32) Tensor("Placeholder_11:0", shape=(?,), dtype=float32)
0 32.188885 [-1.4213595] [-0.47519705]
200 0.0133846365 [0.865631] [0.30545235]
400 0.0051109153 [0.91696805] [0.18875107]
600 0.0019515958 [0.9486913] [0.11663658]
800 0.0007452139 [0.9682944] [0.07207426]
1000 0.00028456055 [0.9804079] [0.04453757]
1200 0.00010865821 [0.9878932] [0.02752146]
1400 4.1490843e-05 [0.9925188] [0.01700653]
1600 1.584322e-05 [0.99537706] [0.01050906]
1800 6.0499187e-06 [0.99714327] [0.00649399]
2000 2.310245e-06 [0.9982347] [0.00401292]
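The same training loop can be sketched without TensorFlow. This is an illustrative re-implementation in plain NumPy, not the slides' code: the gradients are derived by hand from cost = mean((W*X + b - Y)^2), and the starting values of W and b are copied from the step-0 line of the printed run (an assumption made for reproducibility, since the original draws them at random).

```python
import numpy as np

X = np.array([1.0, 2.0, 3.0])
Y = np.array([1.0, 2.0, 3.0])

# Starting point taken from the step-0 line of the printed run (assumption;
# the TensorFlow version initializes W and b randomly).
W, b = -1.4213595, -0.47519705
learning_rate = 0.01

for step in range(2001):
    H = W * X + b                 # hypothesis
    error = H - Y
    cost = np.mean(error ** 2)    # mean squared error
    if step % 200 == 0:
        print(step, cost, W, b)
    # Gradients of the mean squared error with respect to W and b
    grad_W = 2.0 * np.mean(error * X)
    grad_b = 2.0 * np.mean(error)
    # Plain gradient-descent update
    W -= learning_rate * grad_W
    b -= learning_rate * grad_b
```

With a learning rate of 0.01 the parameters move steadily toward W = 1 and b = 0, the same trend as in the printed TensorFlow run.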
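3. TensorFlow
• Differences between logistic regression and linear regression
• When the data being analyzed are nominal or ordinal variables, for example whether death or illness occurred in medical statistics, the dependent variable takes only two values, and ordinary linear regression is not suited to such categorical data. Applying a logit transformation to the dependent variable resolves many of these problems.
• A linear regression coefficient is interpreted as "when the independent variable increases by one unit, by how many units the dependent variable increases".
• A logistic regression coefficient is interpreted as "when the independent variable increases by one unit, by what factor the odds of outcome 1 relative to outcome 0 are multiplied".
  • The odds are the probability that the event occurs divided by the probability that it does not. The factor by which the odds change when the independent variable increases by one unit is the odds ratio (OR).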
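The odds-ratio interpretation can be verified with a few lines of arithmetic. The sketch below assumes a one-variable logistic model p = sigmoid(beta0 + beta1 * x); the coefficient values are hypothetical and chosen only for illustration.

```python
import math

def sigmoid(z):
    """Logistic function mapping a real number to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def odds(p):
    """Odds of an event: probability it occurs over probability it does not."""
    return p / (1.0 - p)

beta0, beta1 = -1.0, 0.7   # hypothetical intercept and coefficient

x = 2.0
p_before = sigmoid(beta0 + beta1 * x)
p_after = sigmoid(beta0 + beta1 * (x + 1.0))   # x increased by one unit

# The ratio of the two odds equals exp(beta1), whatever the starting x
odds_ratio = odds(p_after) / odds(p_before)
print(odds_ratio, math.exp(beta1))
```

The probability itself does not change by a fixed factor; only the odds do, which is why the coefficient is read as an odds ratio.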
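3. TensorFlow
• Differences between logistic regression and linear regression

Item | Linear regression | Logistic regression
Variables follow a normal distribution | Yes | No
Dependent variable | Continuous numeric | Categorical or continuous numeric
Independent and dependent variables | Linear relationship | Nonlinear relationship
What is analyzed | Relationship between the dependent and independent variables | Relationship between the probability that the dependent variable takes a given value and the independent variables

The function value lies in [0, 1].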
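3. TensorFlow
• Example: logistic regression
• Learn a 0/1 classification model from the features of the training data. Prediction function:
  $$ z = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \dots + \theta_n x_n = \theta^T x $$
  $$ h(z) = \frac{1}{1 + e^{-z}}, \qquad h_\theta(x) = \frac{1}{1 + e^{-\theta^T x}} $$
• The linear function is mapped through the sigmoid function:
  $$ y = \frac{1}{1 + e^{-z}} $$
  h_θ(x) < 0.5: the sample is assigned to class 0
  h_θ(x) > 0.5: the sample is assigned to class 1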
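The prediction rule can be sketched as follows. This is a minimal illustration, assuming a parameter vector whose first entry is the intercept theta_0; the parameter values and the helper name `predict` are hypothetical. The slides do not say how a value of exactly 0.5 is handled, so this sketch assigns it to class 0.

```python
import numpy as np

def sigmoid(z):
    """Map a real-valued score to the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def predict(theta, x):
    """Return (h, label): the class-1 probability and the predicted class."""
    x = np.concatenate(([1.0], x))   # prepend 1 so theta[0] acts as theta_0
    h = sigmoid(theta @ x)           # h_theta(x) = sigmoid(theta^T x)
    return h, int(h > 0.5)           # assumption: h == 0.5 goes to class 0

theta = np.array([-3.0, 1.0, 0.5])   # hypothetical parameters

print(predict(theta, np.array([1.0, 1.0])))   # score z = -1.5
print(predict(theta, np.array([4.0, 2.0])))   # score z = 2.0
```

A negative score z gives h below 0.5 and therefore class 0; a positive score gives h above 0.5 and class 1.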
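3. TensorFlow
• Advantages: low computational cost; easy to understand and implement.
• Disadvantages: prone to underfitting; classification accuracy is limited.
• Probability of the observations (maximum likelihood):
• Loss (cost) function: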
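The formulas for these two items are not given in the text. As a sketch, assuming each label y in {0, 1} follows a Bernoulli distribution with parameter h_θ(x), the standard forms are written below; m (the number of training samples) and the superscript (i) (the sample index) are notation introduced here.

```latex
% Probability of a single observation
P(y \mid x; \theta) = h_\theta(x)^{y} \, \bigl(1 - h_\theta(x)\bigr)^{1 - y}

% Likelihood of m independent observations
L(\theta) = \prod_{i=1}^{m} h_\theta(x^{(i)})^{y^{(i)}} \,
            \bigl(1 - h_\theta(x^{(i)})\bigr)^{1 - y^{(i)}}

% Loss: negative mean log-likelihood (cross-entropy)
J(\theta) = -\frac{1}{m} \sum_{i=1}^{m}
            \Bigl[ y^{(i)} \log h_\theta(x^{(i)})
                 + (1 - y^{(i)}) \log\bigl(1 - h_\theta(x^{(i)})\bigr) \Bigr]
```

Maximizing L(θ) is equivalent to minimizing J(θ), since J is the negative logarithm of L divided by m.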
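3. TensorFlow
• Example: softmax regression
• For multi-class problems, logistic regression generalizes to softmax regression.
• Hypothesis function
• Loss function
• To address the numerical problems caused by the parameter redundancy of softmax regression, a weight decay term can be added.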
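The parameter redundancy and the effect of weight decay can be illustrated numerically. The sketch below assumes the usual softmax hypothesis (class probabilities proportional to exp(theta_j^T x)) and a cross-entropy loss with an L2 penalty of (weight_decay / 2) * sum(theta^2); the data, the parameters, and the function names are hypothetical.

```python
import numpy as np

def softmax(scores):
    """Row-wise softmax; the row maximum is subtracted for numerical stability."""
    shifted = scores - scores.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def softmax_loss(theta, X, y, weight_decay=0.0):
    """Mean cross-entropy plus (weight_decay / 2) * sum(theta ** 2)."""
    probs = softmax(X @ theta.T)                 # shape (m, k)
    m = X.shape[0]
    data_loss = -np.mean(np.log(probs[np.arange(m), y]))
    return data_loss + 0.5 * weight_decay * np.sum(theta ** 2)

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))          # 6 hypothetical samples, 3 features
y = np.array([0, 1, 2, 0, 1, 2])     # hypothetical class labels
theta = rng.normal(size=(3, 3))      # one parameter row per class

# Redundancy: subtracting the same vector psi from every class row
# leaves every predicted probability unchanged.
psi = rng.normal(size=3)
same_probs = np.allclose(softmax(X @ theta.T), softmax(X @ (theta - psi).T))
print(same_probs)

# Without weight decay both parameter sets have the same loss;
# with weight decay the penalty term tells them apart.
print(softmax_loss(theta, X, y), softmax_loss(theta - psi, X, y))
print(softmax_loss(theta, X, y, 1e-2), softmax_loss(theta - psi, X, y, 1e-2))
```

Because infinitely many parameter sets produce identical predictions, the unpenalized loss has no unique minimizer; the weight decay term makes the objective strictly convex and so selects a single solution.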
