Logistic Regression with Keras


This article explains what Logistic Regression is, its intuition, and how we can use Keras layers to implement it.

What is Logistic Regression?

Despite its name, logistic regression is used for classification when the dependent variable is binary. It applies the logistic (sigmoid) function to estimate how likely a given data point is to belong to a class.

source: https://dataaspirant.com/how-logistic-regression-model-works/

For example, a penguin wants to know how likely it is to be happy based on its daily activities.

The intuition behind Logistic Regression

Is it feasible to use linear Regression for classification problems?

First, take a balanced binary dataset with one input feature and find the best-fit line for it using linear regression. Then set a threshold: if y > 0.5, the predicted class is 1; if y <= 0.5, the data point belongs to class 0. See the figure below.

The regression line and the threshold intersect at x = 19.5. For x > 19.5 the model predicts class 0, and for x <= 19.5 it predicts class 1.

On balanced data like this, linear regression performs well. But what if the data is imbalanced?

Now apply linear regression to imbalanced data and analyze the predictions.

Linear regression doesn't perform well on the data points shown above: for x < 24 the model predicts class 1, so it misclassifies the points in that range whose true label is 0.
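The effect can be reproduced on a small scale. The sketch below fits a least-squares line to a made-up balanced dataset, then to the same data with extra class-0 points added far to the right, and compares where each line crosses the 0.5 threshold. The data values and the helper names (fit_line, boundary, count_errors) are illustrative assumptions, not the dataset from the figures.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def boundary(xs, ys, threshold=0.5):
    """x value at which the fitted line crosses the threshold."""
    slope, intercept = fit_line(xs, ys)
    return (threshold - intercept) / slope


def count_errors(xs, ys, threshold=0.5):
    """Number of training points the thresholded line gets wrong."""
    slope, intercept = fit_line(xs, ys)
    preds = [1 if slope * x + intercept > threshold else 0 for x in xs]
    return sum(p != y for p, y in zip(preds, ys))


# Made-up balanced data: class 1 for small x, class 0 for large x
balanced_x = [1, 2, 3, 4, 6, 7, 8, 9]
balanced_y = [1, 1, 1, 1, 0, 0, 0, 0]

# Same data plus six extra class-0 points far to the right (imbalanced)
imbalanced_x = balanced_x + [20, 21, 22, 23, 24, 25]
imbalanced_y = balanced_y + [0] * 6

balanced_boundary = boundary(balanced_x, balanced_y)
imbalanced_boundary = boundary(imbalanced_x, imbalanced_y)
balanced_errors = count_errors(balanced_x, balanced_y)
imbalanced_errors = count_errors(imbalanced_x, imbalanced_y)

print(balanced_boundary, balanced_errors)
print(imbalanced_boundary, imbalanced_errors)
```

On this toy data the extra class-0 points pull the decision boundary to the right, past x = 6, so a class-0 point that was classified correctly before now falls on the class-1 side.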

Two problems arise when using linear regression for classification:

  • It is sensitive to imbalanced datasets, as we saw above.
  • It outputs unbounded continuous values rather than probabilities between 0 and 1. We decided that the model predicts class 1 if y > 0.5, but suppose that for a particular feature value the output is y = 105. Such a number cannot be read as a probability, so it tells us nothing about how likely the class is.

This is where logistic regression comes in. It applies a logistic function that limits the output to values between 0 and 1. This logistic function is the sigmoid.
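As a minimal sketch of that squashing step, the snippet below passes a few raw linear scores, including the y = 105 case mentioned above, through a sigmoid and applies the 0.5 threshold. The function names and the sample scores are assumptions made for illustration.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_class(z, threshold=0.5):
    """Class 1 if the squashed score exceeds the threshold, else class 0."""
    return 1 if sigmoid(z) > threshold else 0


# Raw linear scores, including the y = 105 case from the text
scores = [-105.0, -2.0, 0.0, 2.0, 105.0]
probabilities = [sigmoid(z) for z in scores]
classes = [predict_class(z) for z in scores]

for z, p, c in zip(scores, probabilities, classes):
    print(z, p, c)
```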

Sigmoid curve with threshold y = 0.5:

This function gives the likelihood that a data point belongs to a class. The hypothesis of Logistic Regression is given below:
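In its standard form the hypothesis can be written as follows. The symbols w (weight vector), b (bias) and x (feature vector) are notation assumed here, not taken from the original figure:

```latex
h_{w,b}(x) = \sigma\left(w^{\top} x + b\right) = \frac{1}{1 + e^{-\left(w^{\top} x + b\right)}}
```

The output of h is read as the estimated probability that x belongs to class 1, and the predicted class is 1 when that value exceeds the 0.5 threshold.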

To optimize the weights, a gradient descent based optimizer such as Adam, SGD, or RMSprop is used.
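To make the optimization step concrete, here is a rough sketch of plain batch gradient descent for a one-feature logistic regression. It uses the gradient of the binary cross-entropy loss described under Cost Function. The toy data (centered around zero), the learning rate, the epoch count and the function names are all assumptions for illustration; optimizers such as Adam add further refinements that are not shown here.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train(xs, ys, lr=0.1, epochs=1000):
    """Fit weight w and bias b with plain batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(xs, ys):
            # For sigmoid + binary cross-entropy, d(loss)/dz = p - y
            error = sigmoid(w * x + b) - y
            grad_w += error * x
            grad_b += error
        # Step against the average gradient
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


# Toy data (illustrative): label 1 for negative x, label 0 for positive x
xs = [-4.0, -3.0, -2.0, -1.0, 1.0, 2.0, 3.0, 4.0]
ys = [1, 1, 1, 1, 0, 0, 0, 0]

w, b = train(xs, ys)
predictions = [1 if sigmoid(w * x + b) > 0.5 else 0 for x in xs]
print(w, b, predictions)
```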

Cost Function

In logistic regression, using mean squared error as the loss function tends to give poorer results. Combined with the sigmoid, it is non-convex in the weights and can have many local minima, so gradient descent may fail to reach the global minimum.

source: https://towardsdatascience.com/logistic-regression-detailed-overview-46c4da4303bc

So, we will use binary cross-entropy, which is convex, as the loss function. It is given below:
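In its standard form, binary cross-entropy averaged over m training examples can be written as below. Here h(x) is the sigmoid output of the model (the predicted probability of class 1), y is the true label (0 or 1), and the symbols J, m, w and b are notation assumed for this sketch:

```latex
J(w, b) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log h_{w,b}\left(x^{(i)}\right) + \left(1 - y^{(i)}\right) \log\left(1 - h_{w,b}\left(x^{(i)}\right)\right) \right]
```

Only one of the two terms is active for each example: the first when y = 1 and the second when y = 0. In both cases the loss grows as the predicted probability moves away from the true label.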

Let’s look into the implementation:

sklearn.linear_model provides a LogisticRegression class, which you can also use to build the model. Here, however, we look at an implementation of logistic regression using Keras.


source: https://towardsdatascience.com/a-logistic-regression-from-scratch-3824468b1f88

In the above architecture, the number of features (four here) and the corresponding number of weights depend on the dataset you are working with.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Input

number_of_features = 4  # in practice, X_train.shape[1]

# Binary logistic regression: one sigmoid unit outputs P(class = 1)
model = Sequential()
model.add(Input(shape=(number_of_features,)))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
# y should hold 0/1 labels; note the keyword is `epochs`, not `epoch`
# model.fit(x, y, epochs=10, validation_data=(x_val, y_val))

If the dataset has more than two classes, use an output layer with one unit per class and softmax activation, together with categorical cross-entropy as the loss.
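To show what the multi-class loss computes, the following plain-Python sketch evaluates softmax and categorical cross-entropy for a single sample. The scores, the three-class setup and the function names are hypothetical and chosen only for illustration; this is not Keras code.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(one_hot_label, probs, eps=1e-12):
    """Loss for one sample: minus the log-probability of the true class."""
    return -sum(t * math.log(p + eps) for t, p in zip(one_hot_label, probs))


# Hypothetical scores for a 3-class problem; the true class is index 0
probs = softmax([2.0, 1.0, 0.1])
loss = categorical_cross_entropy([1, 0, 0], probs)
print(probs, loss)
```

In Keras, this corresponds to a final Dense layer with one unit per class and activation='softmax', compiled with loss='categorical_crossentropy' for one-hot labels or loss='sparse_categorical_crossentropy' for integer labels.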

Note: For large datasets, the Adam optimizer is usually a good default choice.

Thank you for reading! If you have any questions, please leave a comment.

