Boosting is a supervised machine learning technique that combines multiple low-accuracy ("weak") models into a single high-accuracy model, primarily to reduce bias and variance. Boosting algorithms have recently gained enormous popularity in data science. AdaBoost is a classic example of a boosting algorithm. Its main advantages are low generalization error, ease of implementation, compatibility with a wide range of base classifiers, and few parameters to tune. However, the data need special attention, as the algorithm is sensitive to outliers.
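To make the "many weak models, one strong model" idea concrete, here is a minimal sketch (assuming scikit-learn is installed and using a synthetic dataset from `make_classification`) comparing a single depth-1 decision tree, a classic weak learner, against an AdaBoost ensemble of 50 such trees:

```python
# Sketch of the boosting idea: one weak learner vs. an AdaBoost ensemble.
# The dataset and seed (42) are illustrative choices, not from the tutorial.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# A decision "stump" (depth-1 tree) on its own is a weak learner.
stump = DecisionTreeClassifier(max_depth=1).fit(X_train, y_train)

# AdaBoost trains up to 50 stumps sequentially, reweighting the samples
# each round so later stumps focus on previously misclassified points.
boosted = AdaBoostClassifier(n_estimators=50, random_state=42).fit(X_train, y_train)

weak_acc = stump.score(X_test, y_test)
ensemble_acc = boosted.score(X_test, y_test)
print("stump accuracy:   ", weak_acc)
print("ensemble accuracy:", ensemble_acc)
```

On most runs the ensemble clearly outperforms the single stump, which is the whole point of boosting.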
Building the Model in Python
Let's first install the required scikit-learn library in Python using pip.
# For Linux
$ sudo pip install scikit-learn
from sklearn.ensemble import AdaBoostClassifier
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn import metrics
Loading the iris dataset
There are 4 features (sepal length, sepal width, petal length, petal width) and a target with three classes of flower: Setosa, Versicolour, and Virginica.
iris = datasets.load_iris()
X = iris.data
y = iris.target
print(X)
[[5.1 3.5 1.4 0.2]
 [4.9 3.  1.4 0.2]
 [4.7 3.2 1.3 0.2]
 [4.6 3.1 1.5 0.2]
 ...
 [6.3 2.5 5.  1.9]
 [6.5 3.  5.2 2. ]
 [6.2 3.4 5.4 2.3]
 [5.9 3.  5.1 1.8]]
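As a quick sanity check, the dataset object also carries the feature and class names, so we can confirm what the columns and target values mean (a small sketch using the standard `load_iris` attributes):

```python
# Inspect the iris dataset's metadata: column names, class names, and shape.
from sklearn import datasets

iris = datasets.load_iris()
print(iris.feature_names)  # sepal/petal length and width, in cm
print(iris.target_names)   # the three flower species
print(iris.data.shape)     # (150, 4): 150 samples, 4 features each
```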
Splitting the dataset
For proper model evaluation we need separate training and testing slices of the data.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
print("X_train:", len(X_train), "; X_test:", len(X_test), "; y_train:", len(y_train), "; y_test:", len(y_test))
X_train: 105 ; X_test: 45 ; y_train: 105 ; y_test: 45
This gives 70% training and 30% test data.
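Note that this split is random on every run, so accuracy numbers will vary. Passing `random_state` makes the split reproducible, and `stratify=y` keeps the class proportions equal in both slices (a sketch; the seed value 42 is an arbitrary choice):

```python
# Reproducible, class-balanced 70/30 split of the iris data.
from sklearn import datasets
from sklearn.model_selection import train_test_split

iris = datasets.load_iris()
X, y = iris.data, iris.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y)
print(len(X_train), len(X_test))  # 105 45
```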
Building the AdaBoost model
Let's build the AdaBoost model using scikit-learn, with the default base estimator, a decision tree classifier.
# Create AdaBoost classifier object
Adbc = AdaBoostClassifier(n_estimators=50, learning_rate=1.5)
# Train the AdaBoost model
model = Adbc.fit(X_train, y_train)
# Predict the response for the test dataset
y_pred = model.predict(X_test)
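Since AdaBoost builds its estimators sequentially, we can watch the test accuracy evolve as estimators are added using `staged_score` (a sketch; the split and model are seeded here so the run is reproducible, unlike the unseeded split above):

```python
# Accuracy after each boosting round, via AdaBoostClassifier.staged_score.
from sklearn import datasets
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

iris = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=1)

model = AdaBoostClassifier(n_estimators=50, learning_rate=1.5,
                           random_state=1).fit(X_train, y_train)

# staged_score yields one test accuracy per fitted estimator.
staged = list(model.staged_score(X_test, y_test))
print("after 1 estimator:   ", staged[0])
print("after all estimators:", staged[-1])
```

Boosting may also stop early (fewer than `n_estimators` rounds) if a round fits the training data perfectly.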
Evaluating the model
print("Accuracy:", metrics.accuracy_score(y_test, y_pred))
Accuracy: 0.8888888888888888
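Beyond plain accuracy, a confusion matrix shows which species the model mixes up (a sketch; the split and model are seeded here so the matrix is reproducible):

```python
# Per-class evaluation of the AdaBoost model on the iris test slice.
from sklearn import datasets, metrics
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

iris = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=1)

model = AdaBoostClassifier(n_estimators=50, learning_rate=1.5,
                           random_state=1).fit(X_train, y_train)
y_pred = model.predict(X_test)

# Rows are true classes, columns are predicted classes.
cm = metrics.confusion_matrix(y_test, y_pred)
print(cm)
print(metrics.classification_report(y_test, y_pred,
                                    target_names=iris.target_names))
```

Misclassifications typically fall between versicolor and virginica, which overlap in feature space, while setosa separates cleanly.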
For more tutorials and details about AdaBoost, please see the official scikit-learn AdaBoost documentation.