Regression using TensorFlow and the Gradient Descent Optimizer

Gradient descent is one of the most popular optimization algorithms used in machine learning and deep learning. It is an iterative optimization algorithm for finding a local minimum of a function: one takes steps proportional to the negative of the gradient of the function at the current point.
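To make the update rule concrete, here is a minimal sketch (a toy example of our own, not part of the tutorial) that minimizes f(w) = (w - 3)^2 by repeatedly stepping against the gradient:

# Gradient descent on f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w = 0.0                # initial guess
learning_rate = 0.1    # step size

for step in range(100):
    gradient = 2 * (w - 3)            # gradient of f at the current point
    w = w - learning_rate * gradient  # step in the negative gradient direction

print(w)  # approaches the minimum at w = 3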

In this tutorial we will use the gradient descent optimization algorithm. In our example we have data in CSV format with the columns “height”, “weight”, “age”, “projects”, and “salary”. Assuming there is a correlation between projects and salary, we will try to predict salary from the number of projects completed. You can download the data from this link: “https://drive.google.com/file/d/1Gx0riTlJHt9o_VyokrKNbj384AhwXpAW/view?usp=sharing”

Initial Setup

First and foremost, we need to load the necessary libraries.

from __future__ import print_function

import math ## For basic mathematical operations

from IPython import display ## Display utilities for IPython
from matplotlib import cm ## Colormap reference
from matplotlib import gridspec ## Plot setup
from matplotlib import pyplot as plt ## Plot setup
import numpy as np 
import pandas as pd
from sklearn import metrics
import tensorflow as tf
from tensorflow.python.data import Dataset

from google.colab import drive ## Loading data directly from Google Drive
drive.mount('/content/gdrive') ## Mounting drive

tf.logging.set_verbosity(tf.logging.ERROR)
pd.options.display.max_rows = 10
pd.options.display.float_format = '{:.1f}'.format

Loading Dataset

Load the dataset as a pandas DataFrame and check its summary statistics.

dataframe = pd.read_csv("/content/gdrive/My Drive/Colab Notebooks/TENSOR_FLOW/train_dataset.csv", sep=",")
dataframe.head()
	height	weight	age	projects	salary
0	-114.3	34.2	15	1015	66900
1	-114.5	34.4	19	1129	80100
2	-114.6	33.7	17	333	85700
3	-114.6	33.6	14	515	73400
4	-114.6	33.6	20	624	65500
dataframe.describe()

		height	weight	age	projects	salary
	count	17000.0	17000.0	17000.0	17000.0	17000.0
	mean	-119.6	35.6	28.6	1429.6	207300.9
	std	2.0	2.1	12.6	1147.9	115983.8
	min	-124.3	32.5	1.0	3.0	14999.0
	25%	-121.8	33.9	18.0	790.0	119400.0
	50%	-118.5	34.2	29.0	1167.0	180400.0
	75%	-118.0	37.7	37.0	1721.0	265000.0
	max	-114.3	42.0	52.0	35682.0	500001.0
Next, we’ll randomize the data and scale salary to be in units of thousands, so it can be learned a little more easily with the learning rates we’ll use below.

dataframe = dataframe.reindex(np.random.permutation(dataframe.index))
dataframe["salary"] /= 1000.0
dataframe.head()
height  weight  age projects    salary
11381 -121.2 38.9 19 1206 192.6
4865 -118.1 34.1 50 636 500.0
3442 -117.9 33.8 35 1435 200.8
14934 -122.2 37.8 52 409 189.6
14925 -122.2 37.8 52 1659 107.9

Build our First Model

We wish to predict salary, which will be our label. We’ll use projects as our input feature. To train our model, we’ll use the LinearRegressor interface provided by the TensorFlow Estimator API. This API takes care of a lot of the low-level model plumbing and exposes convenient methods for performing model training, evaluation, and inference.

Step 1: Define Features and Configure Feature Columns

In TensorFlow, we indicate a feature’s data type using a construct called a feature column. Feature columns store only a description of the feature data; they do not contain the feature data itself.

To start, we’re going to use just one numeric input feature, projects.

my_feature = dataframe[["projects"]]
feature_columns = [tf.feature_column.numeric_column("projects")]

feature_columns

[NumericColumn(key='projects', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None)]

Step 2: Define the Target

Next, we’ll define our target, which is salary. Again, we can pull it from our dataframe:

targets = dataframe["salary"]
targets
11381    192.6
4865     500.0
3442     200.8
14934    189.6
14925    107.9
         ...
7869     269.2
3770     192.9
11859    194.6
10158    167.7
14422    500.0
Name: salary, Length: 17000, dtype: float64

Step 3: Configure the LinearRegressor

Next, we’ll configure a linear regression model using LinearRegressor. We’ll train this model using the GradientDescentOptimizer, which implements Mini-Batch Stochastic Gradient Descent (SGD). The learning_rate argument controls the size of the gradient step.

# Use gradient descent as the optimizer for training the model.
my_optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.0000001)

# Clip gradients to a maximum norm of 5.0, so their magnitude cannot blow up
# during training, which can cause gradient descent to fail.
my_optimizer = tf.contrib.estimator.clip_gradients_by_norm(my_optimizer, 5.0)


linear_regressor = tf.estimator.LinearRegressor(
    feature_columns=feature_columns,
    optimizer=my_optimizer
)

Step 4: Define the Input Function

To import our salary data into our LinearRegressor, we need to define an input function, which instructs TensorFlow how to preprocess the data, as well as how to batch, shuffle, and repeat it during model training.

First, we’ll convert our pandas feature data into a dict of NumPy arrays. We can then use the TensorFlow Dataset API to construct a dataset object from our data, and then break our data into batches of batch_size, to be repeated for the specified number of epochs (num_epochs).

NOTE: When the default value of num_epochs=None is passed to repeat(), the input data will be repeated indefinitely.

Next, if shuffle is set to True, we’ll shuffle the data so that it’s passed to the model randomly during training. The buffer_size argument specifies the size of the dataset from which shuffle will randomly sample.

Finally, our input function constructs an iterator for the dataset and returns the next batch of data to the LinearRegressor.

def my_input_fn(features, targets, batch_size=1, shuffle=True, num_epochs=None):
    """Defines the input pipeline: converts, batches, shuffles, and repeats the data."""

    # Convert pandas data into a dict of np arrays.
    features = {key: np.array(value) for key, value in dict(features).items()}
    
    # Construct a dataset, and configure batching/repeating.
    ds = Dataset.from_tensor_slices((features,targets)) # warning: 2GB limit
    ds = ds.batch(batch_size).repeat(num_epochs)
    
    # Shuffle the data, if specified.
    if shuffle:
      ds = ds.shuffle(buffer_size=10000)
    
    # Return the next batch of data.
    features, labels = ds.make_one_shot_iterator().get_next()
    return features, labels
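As a quick sanity check (our addition, not part of the original exercise), you can call the input function outside the Estimator and inspect one batch in a TF 1.x session:

# Build one batch of two examples and evaluate it via a session (TF 1.x graph mode).
features_batch, labels_batch = my_input_fn(my_feature, targets, batch_size=2)
with tf.Session() as sess:
    f, l = sess.run([features_batch, labels_batch])
    print(f["projects"], l)  # two project counts and their matching salaries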

Step 5: Train the Model

We can now call train() on our linear_regressor to train the model. We’ll wrap my_input_fn in a lambda so we can pass in my_feature and targets as arguments (see the TensorFlow input function tutorial for more details). To start, we’ll train for 100 steps.

_ = linear_regressor.train(
    input_fn=lambda: my_input_fn(my_feature, targets),
    steps=100
)
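The original exercise follows training with an evaluation pass. Here is a sketch of it, using the objects defined above: run a single-epoch, unshuffled pass over the training data and report the RMSE.

# Create an input function for predictions: one pass over the data, no shuffling.
prediction_input_fn = lambda: my_input_fn(my_feature, targets, num_epochs=1, shuffle=False)

# Run predictions and compute the root mean squared error against the targets.
predictions = linear_regressor.predict(input_fn=prediction_input_fn)
predictions = np.array([item['predictions'][0] for item in predictions])
root_mean_squared_error = math.sqrt(metrics.mean_squared_error(predictions, targets))
print("Root Mean Squared Error (on training data): %0.2f" % root_mean_squared_error)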

Tweak the Model Hyperparameters and Optimize the Model

For this exercise, we’ve put all the above code in a single function for convenience. You can call the function with different parameters to see the effect.

In this function, we’ll proceed in 10 evenly divided periods so that we can observe the model improvement at each period.

For each period, we’ll compute and graph training loss. This may help you judge when a model is converged, or if it needs more iterations.

We’ll also plot the feature weight and bias term values learned by the model over time. This is another way to see how things converge.

def train_model(learning_rate, steps, batch_size, input_feature="projects"):
 
  periods = 10
  steps_per_period = steps / periods

  my_feature = input_feature
  my_feature_data = dataframe[[my_feature]]
  my_label = "salary"
  targets = dataframe[my_label]

  # Create feature columns.
  feature_columns = [tf.feature_column.numeric_column(my_feature)]
  
  # Create input functions for training and for prediction.
  training_input_fn = lambda: my_input_fn(my_feature_data, targets, batch_size=batch_size)
  prediction_input_fn = lambda: my_input_fn(my_feature_data, targets, num_epochs=1, shuffle=False)
  
  # Create a linear regressor object.
  my_optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
  my_optimizer = tf.contrib.estimator.clip_gradients_by_norm(my_optimizer, 5.0)
  linear_regressor = tf.estimator.LinearRegressor(
      feature_columns=feature_columns,
      optimizer=my_optimizer
  )

  # Set up to plot the state of our model's line each period.
  plt.figure(figsize=(15, 6))
  plt.subplot(1, 2, 1)
  plt.title("Learned Line by Period")
  plt.ylabel(my_label)
  plt.xlabel(my_feature)
  sample = dataframe.sample(n=300)
  plt.scatter(sample[my_feature], sample[my_label])
  colors = [cm.coolwarm(x) for x in np.linspace(-1, 1, periods)]

  # Train the model inside a loop so that we can periodically assess loss metrics.
  print("Training model...")
  print("RMSE (on training data):")
  root_mean_squared_errors = []
  for period in range(0, periods):
    # Train the model, starting from the prior state.
    linear_regressor.train(
        input_fn=training_input_fn,
        steps=steps_per_period
    )
    
    # Take a break and compute predictions.
    predictions = linear_regressor.predict(input_fn=prediction_input_fn)
    predictions = np.array([item['predictions'][0] for item in predictions])

    # Compute loss.
    root_mean_squared_error = math.sqrt(
        metrics.mean_squared_error(predictions, targets))
    print("  period %02d : %0.2f" % (period, root_mean_squared_error))
    root_mean_squared_errors.append(root_mean_squared_error)
   
    
    # Track the weights and bias over time so we can plot the learned line.
    y_extents = np.array([0, sample[my_label].max()])
    weight = linear_regressor.get_variable_value('linear/linear_model/%s/weights' % input_feature)[0]
    bias = linear_regressor.get_variable_value('linear/linear_model/bias_weights')

    x_extents = (y_extents - bias) / weight
    x_extents = np.maximum(np.minimum(x_extents,
                                      sample[my_feature].max()),
                           sample[my_feature].min())
    y_extents = weight * x_extents + bias
    plt.plot(x_extents, y_extents, color=colors[period]) 
  print("Model training finished.")

  
  # Output a graph of loss metrics over periods.
  plt.subplot(1, 2, 2)
  plt.ylabel('RMSE')
  plt.xlabel('Periods')
  plt.title("Root Mean Squared Error vs. Periods")
  plt.tight_layout()
  plt.plot(root_mean_squared_errors)

  
  # Output a table with calibration data.
  calibration_data = pd.DataFrame()
  calibration_data["predictions"] = pd.Series(predictions)
  calibration_data["targets"] = pd.Series(targets)
  display.display(calibration_data.describe())

  print("Final RMSE (on training data): %0.2f" % root_mean_squared_error)

Training: Achieve an RMSE of 180 or Below

Tweak the model hyperparameters to improve loss and better match the target distribution. If, after 5 minutes or so, you’re having trouble beating an RMSE of 180, check the solution for a possible combination. (Note that the RMSE is reported in the same scaled units as the targets.)

train_model(
    learning_rate=0.00002,
    steps=500,
    batch_size=3
)
Training model...
RMSE (on training data):
  period 00 : 0.27
  period 01 : 0.27
  period 02 : 0.27
  period 03 : 0.24
  period 04 : 0.27
  period 05 : 0.27
  period 06 : 0.27
  period 07 : 0.18
  period 08 : 0.18
  period 09 : 0.18
Model training finished.
	predictions	targets
count	17000.0	17000.0
mean	0.1	0.2
std	0.1	0.1
min	0.0	0.0
25%	0.0	0.1
50%	0.1	0.2
75%	0.1	0.3
max	2.2	0.5
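If you are still short of the target after a few tries, more steps and a larger batch size are the usual knobs to turn. For example, one combination worth experimenting with (results will vary from run to run):

train_model(
    learning_rate=0.00002,
    steps=1000,
    batch_size=5
)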


