Gradient descent is one of the most popular optimization algorithms in machine learning and deep learning. It is an iterative optimization algorithm for finding a local minimum of a function: to find a local minimum with gradient descent, one takes steps proportional to the negative of the gradient of the function at the current point, as in the short sketch below.
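Here is a minimal NumPy sketch of that update rule on a toy quadratic. The function, starting point, and learning rate below are made up purely for illustration and are not part of the tutorial's data or code.

import numpy as np

# Toy example (illustration only): minimize f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
def grad_f(w):
    return 2.0 * (w - 3.0)

w = 0.0               # arbitrary starting point
learning_rate = 0.1   # step size
for step in range(50):
    w = w - learning_rate * grad_f(w)   # step proportional to the negative gradient

print(w)  # approaches the minimum at w = 3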
In this tutorial we will use the gradient descent optimization algorithm to train a simple linear regression model. Our example data is in CSV format with the columns "height weight age projects salary". Assuming there is a correlation between projects and salary, we will try to predict salary given the number of projects completed. You can download the data using this link: "https://drive.google.com/file/d/1Gx0riTlJHt9o_VyokrKNbj384AhwXpAW/view?usp=sharing"
Initial Setup
First and foremost, we need to load the necessary libraries.
from __future__ import print_function

import math                              # basic mathematical operations
from IPython import display              # display utilities for IPython/Colab
from matplotlib import cm                # colormap reference
from matplotlib import gridspec          # plot layout
from matplotlib import pyplot as plt     # plotting
import numpy as np
import pandas as pd
from sklearn import metrics
import tensorflow as tf
from tensorflow.python.data import Dataset

from google.colab import drive           # to load data directly from Google Drive
drive.mount('/content/gdrive')           # mount the drive

tf.logging.set_verbosity(tf.logging.ERROR)
pd.options.display.max_rows = 10
pd.options.display.float_format = '{:.1f}'.format
Loading Dataset
Load the dataset as a pandas DataFrame and check its stats.
dataframe = pd.read_csv("/content/gdrive/My Drive/Colab Notebooks/TENSOR_FLOW/train_dataset.csv", sep=",")
dataframe.head()
     height  weight  age  projects  salary
0    -114.3    34.2   15      1015   66900
1    -114.5    34.4   19      1129   80100
2    -114.6    33.7   17       333   85700
3    -114.6    33.6   14       515   73400
4    -114.6    33.6   20       624   65500
dataframe.describe()
height weight age projects salary
count 17000.0 17000.0 17000.0 17000.0 17000.0
mean -119.6 35.6 28.6 1429.6 207300.9
std 2.0 2.1 12.6 1147.9 115983.8
min -124.3 32.5 1.0 3.0 14999.0
25% -121.8 33.9 18.0 790.0 119400.0
50% -118.5 34.2 29.0 1167.0 180400.0
75% -118.0 37.7 37.0 1721.0 265000.0
max -114.3 42.0 52.0 35682.0 500001.0
dataframe = dataframe.reindex(np.random.permutation(dataframe.index))  # randomize row order so SGD doesn't train on sorted data
dataframe["salary"] /= 1000.0  # scale salary to thousands so it learns more easily with typical learning rates
dataframe.head()
height weight age projects salary
11381 -121.2 38.9 19 1206 192.6
4865 -118.1 34.1 50 636 500.0
3442 -117.9 33.8 35 1435 200.8
14934 -122.2 37.8 52 409 189.6
14925 -122.2 37.8 52 1659 107.9
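Before building a model, you may want to sanity-check the assumption from the introduction that projects and salary are correlated. A quick, optional check (not part of the original notebook) is pandas' built-in correlation:

# Pearson correlation between projects and salary (sanity check, illustration only).
print(dataframe["projects"].corr(dataframe["salary"]))

A value near zero would suggest that projects alone is a weak predictor of salary, which is worth keeping in mind when judging the RMSE numbers later on.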
Build our First Model
We wish to predict salary, which will be our label. We'll use projects as our input feature. To train our model, we'll use the LinearRegressor interface provided by the TensorFlow Estimator API. This API takes care of a lot of the low-level model plumbing and exposes convenient methods for performing model training, evaluation, and inference.
Step 1: Define Features and Configure Feature Columns
In TensorFlow, we indicate a feature’s data type using a construct called a feature column. Feature columns store only a description of the feature data; they do not contain the feature data itself.
To start, we’re going to use just one numeric input feature, projects.
my_feature = dataframe[["projects"]]
feature_columns = [tf.feature_column.numeric_column("projects")]
feature_columns
[NumericColumn(key='projects', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None)]
Step 2: Define the Target
Next, we'll define our target, which is salary. Again, we can pull it from our dataframe:
targets = dataframe["salary"]
targets
11381 192.6
4865 500.0
3442 200.8
14934 189.6
14925 107.9
…
7869 269.2
3770 192.9
11859 194.6
10158 167.7
14422 500.0
Name: salary, Length: 17000, dtype: float64
Step 3: Configure the LinearRegressor
Next, we'll configure a linear regression model using LinearRegressor. We'll train this model using the GradientDescentOptimizer, which implements Mini-Batch Stochastic Gradient Descent (SGD). The learning_rate argument controls the size of the gradient step. To be safe, we also apply gradient clipping to our optimizer via clip_gradients_by_norm; clipping ensures the magnitude of the gradients does not become too large during training, which can cause gradient descent to fail.
my_optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.0000001)
my_optimizer = tf.contrib.estimator.clip_gradients_by_norm(my_optimizer, 5.0)
linear_regressor = tf.estimator.LinearRegressor(
    feature_columns=feature_columns,
    optimizer=my_optimizer
)
Step 4: Define the Input Function
To import our salary data into our LinearRegressor, we need to define an input function, which instructs TensorFlow how to preprocess the data, as well as how to batch, shuffle, and repeat it during model training.
First, we’ll convert our pandas feature data into a dict of NumPy arrays. We can then use the TensorFlow Dataset API to construct a dataset object from our data, and then break our data into batches of batch_size, to be repeated for the specified number of epochs (num_epochs).
NOTE: When the default value of num_epochs=None is passed to repeat(), the input data will be repeated indefinitely.
Next, if shuffle is set to True, we’ll shuffle the data so that it’s passed to the model randomly during training. The buffer_size argument specifies the size of the dataset from which shuffle will randomly sample.
Finally, our input function constructs an iterator for the dataset and returns the next batch of data to the LinearRegressor.
def my_input_fn(features, targets, batch_size=1, shuffle=True, num_epochs=None):
    # Convert pandas data into a dict of np arrays.
    features = {key: np.array(value) for key, value in dict(features).items()}

    # Construct a dataset, and configure batching/repeating.
    ds = Dataset.from_tensor_slices((features, targets))  # warning: 2GB limit
    ds = ds.batch(batch_size).repeat(num_epochs)

    # Shuffle the data, if specified.
    if shuffle:
        ds = ds.shuffle(buffer_size=10000)

    # Return the next batch of data.
    features, labels = ds.make_one_shot_iterator().get_next()
    return features, labels
Step 5: Train the Model
We can now call train() on our linear_regressor to train the model. We'll wrap my_input_fn in a lambda so we can pass in my_feature and targets as arguments (see TensorFlow's documentation on input functions for more details), and to start, we'll train for 100 steps.
_ = linear_regressor.train(
    input_fn=lambda: my_input_fn(my_feature, targets),
    steps=100
)
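Having trained for 100 steps, a natural next question is how well this first model fits the training data. The snippet below is one way to check, sketched from the same pieces used later in train_model: it reuses my_input_fn for a single unshuffled pass, collects predictions from the trained linear_regressor, and computes the root mean squared error (RMSE) against the targets. The names prediction_input_fn and root_mean_squared_error here are just illustrative.

# Sketch: evaluate the model we just trained (train_model below does the same thing once per period).
prediction_input_fn = lambda: my_input_fn(my_feature, targets, num_epochs=1, shuffle=False)

predictions = linear_regressor.predict(input_fn=prediction_input_fn)
predictions = np.array([item['predictions'][0] for item in predictions])

root_mean_squared_error = math.sqrt(metrics.mean_squared_error(predictions, targets))
print("RMSE (on training data): %0.2f" % root_mean_squared_error)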
Tweak the Model Hyperparameters and Optimize the Model
For this exercise, we’ve put all the above code in a single function for convenience. You can call the function with different parameters to see the effect.
In this function, we’ll proceed in 10 evenly divided periods so that we can observe the model improvement at each period.
For each period, we’ll compute and graph training loss. This may help you judge when a model is converged, or if it needs more iterations.
We’ll also plot the feature weight and bias term values learned by the model over time. This is another way to see how things converge.
def train_model(learning_rate, steps, batch_size, input_feature="projects"):
    periods = 10
    steps_per_period = steps / periods

    my_feature = input_feature
    my_feature_data = dataframe[[my_feature]]
    my_label = "salary"
    targets = dataframe[my_label]

    # Feature columns.
    feature_columns = [tf.feature_column.numeric_column(my_feature)]

    # Input functions.
    training_input_fn = lambda: my_input_fn(my_feature_data, targets, batch_size=batch_size)
    prediction_input_fn = lambda: my_input_fn(my_feature_data, targets, num_epochs=1, shuffle=False)

    # Linear regressor object.
    my_optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
    my_optimizer = tf.contrib.estimator.clip_gradients_by_norm(my_optimizer, 5.0)
    linear_regressor = tf.estimator.LinearRegressor(
        feature_columns=feature_columns,
        optimizer=my_optimizer
    )

    # Plot setup.
    plt.figure(figsize=(15, 6))
    plt.subplot(1, 2, 1)
    plt.title("Learned Line by Period")
    plt.ylabel(my_label)
    plt.xlabel(my_feature)
    sample = dataframe.sample(n=300)
    plt.scatter(sample[my_feature], sample[my_label])
    colors = [cm.coolwarm(x) for x in np.linspace(-1, 1, periods)]

    # Training.
    print("Training model...")
    print("RMSE (on training data):")
    root_mean_squared_errors = []
    for period in range(0, periods):
        linear_regressor.train(
            input_fn=training_input_fn,
            steps=steps_per_period
        )
        predictions = linear_regressor.predict(input_fn=prediction_input_fn)
        predictions = np.array([item['predictions'][0] for item in predictions])
        root_mean_squared_error = math.sqrt(
            metrics.mean_squared_error(predictions, targets))
        print("  period %02d : %0.2f" % (period, root_mean_squared_error))
        root_mean_squared_errors.append(root_mean_squared_error)

        y_extents = np.array([0, sample[my_label].max()])
        weight = linear_regressor.get_variable_value('linear/linear_model/%s/weights' % input_feature)[0]
        bias = linear_regressor.get_variable_value('linear/linear_model/bias_weights')
        x_extents = (y_extents - bias) / weight
        x_extents = np.maximum(np.minimum(x_extents,
                                          sample[my_feature].max()),
                               sample[my_feature].min())
        y_extents = weight * x_extents + bias
        plt.plot(x_extents, y_extents, color=colors[period])
    print("Model training finished.")

    plt.subplot(1, 2, 2)
    plt.ylabel('RMSE')
    plt.xlabel('Periods')
    plt.title("Root Mean Squared Error vs. Periods")
    plt.tight_layout()
    plt.plot(root_mean_squared_errors)

    calibration_data = pd.DataFrame()
    calibration_data["predictions"] = pd.Series(predictions)
    calibration_data["targets"] = pd.Series(targets)
    display.display(calibration_data.describe())

    print("Final RMSE (on training data): %0.2f" % root_mean_squared_error)
Training: Achieve an RMSE of 180 or Below
Tweak the model hyperparameters to improve loss and better match the target distribution. If, after 5 minutes or so, you're having trouble beating an RMSE of 180, keep experimenting with different combinations of learning_rate, steps, and batch_size; one possible combination is shown below.
train_model(
    learning_rate=0.00002,
    steps=500,
    batch_size=3
)
Training model…
RMSE (on training data):
period 00 : 0.27
period 01 : 0.27
period 02 : 0.27
period 03 : 0.24
period 04 : 0.27
period 05 : 0.27
period 06 : 0.27
period 07 : 0.18
period 08 : 0.18
period 09 : 0.18
Model training finished.
predictions targets
count 17000.0 17000.0
mean 0.1 0.2
std 0.1 0.1
min 0.0 0.0
25% 0.0 0.1
50% 0.1 0.2
75% 0.1 0.3
max 2.2 0.5
