Machine Learning Algorithms - A Review


International Journal of Science and Research (IJSR)
ISSN: 2319-7064
ResearchGate Impact Factor (2018): 0.28 | SJIF (2018): 7.426
Volume 9 Issue 1, January 2020, www.ijsr.net
Licensed Under Creative Commons Attribution CC BY
Paper ID: ART20203995, DOI: 10.21275/ART20203995

Batta Mahesh

Abstract: Machine learning (ML) is the scientific study of algorithms and statistical models that computer systems use to perform a specific task without being explicitly programmed. Learning algorithms are used in many applications that we make use of daily. Every time a web search engine like Google is used to search the internet, one of the reasons it works so well is that a learning algorithm has learned how to rank web pages. These algorithms are used for various purposes such as data mining, image processing, and predictive analytics, to name a few. The main advantage of using machine learning is that, once an algorithm learns what to do with data, it can do its work automatically. In this paper, a brief review of machine learning algorithms and the future prospects of their vast applications is presented.

Keywords: Algorithm, Machine Learning, Pseudo Code, Supervised learning, Unsupervised learning, Reinforcement learning

1. Introduction

Since their evolution, humans have used many types of tools to accomplish various tasks in a simpler way. The creativity of the human brain led to the invention of different machines. These machines made human life easier by enabling people to meet various needs, including travel, industry, and computing. Machine learning is one among them.

According to Arthur Samuel, machine learning is the field of study that gives computers the ability to learn without being explicitly programmed. Arthur Samuel was famous for his checkers-playing program. Machine learning (ML) is used to teach machines how to handle data more efficiently. Sometimes, after viewing the data, we cannot interpret it or extract information from it. In that case, we apply machine learning. With the abundance of datasets available, the demand for machine learning is on the rise. Many industries apply machine learning to extract relevant data. The purpose of machine learning is to learn from the data. Many studies have been done on how to make machines learn by themselves without being explicitly programmed. Many mathematicians and programmers apply several approaches to solve this problem, which involves huge data sets.

Machine learning relies on different algorithms to solve data problems. Data scientists like to point out that there is no single one-size-fits-all type of algorithm that is best for solving a problem. The kind of algorithm employed depends on the kind of problem you wish to solve, the number of variables, the kind of model that would suit it best, and so on. Here is a quick look at some of the commonly used algorithms in machine learning (ML).

Supervised Learning

Supervised learning is the machine learning task of learning a function that maps an input to an output based on example input-output pairs. It infers a function from labelled training data consisting of a set of training examples. Supervised machine learning algorithms are those that need external assistance, in the form of labelled examples. The input dataset is divided into a train dataset and a test dataset. The train dataset has an output variable which needs to be predicted or classified. All algorithms learn some kind of pattern from the training dataset and apply it to the test dataset for prediction or classification. The workflow of supervised machine learning algorithms is given in the figure below. The most famous supervised machine learning algorithms are discussed here.

Figure: Supervised learning Workflow
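The train/test workflow described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the paper's method: the toy data, the 75/25 split, and the nearest-centroid classifier (the hypothetical helpers `train_test_split`, `fit_centroids` and `predict`) were chosen to keep the example self-contained.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle labelled rows and split them into train and test lists."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def fit_centroids(train):
    """Learn one mean vector (centroid) per class from the training rows."""
    sums, counts = {}, {}
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for j, value in enumerate(features):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def predict(centroids, features):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(features, c))
    return min(centroids, key=lambda label: dist(centroids[label]))

# Hypothetical toy data: two well-separated classes in two dimensions.
rng = random.Random(1)
data = [([rng.gauss(0, 0.5), rng.gauss(0, 0.5)], "a") for _ in range(40)]
data += [([rng.gauss(5, 0.5), rng.gauss(5, 0.5)], "b") for _ in range(40)]

train, test = train_test_split(data)          # divide the input dataset
model = fit_centroids(train)                  # learn patterns from the train set
accuracy = sum(predict(model, x) == y for x, y in test) / len(test)
print(f"test accuracy: {accuracy:.2f}")       # evaluate on the held-out test set
```

The point of the sketch is the separation of roles: the model only ever sees `train` while it is being fitted, and `test` is used solely to check how well the learned pattern carries over to unseen examples.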

Decision Tree

A decision tree is a graph that represents choices and their results in the form of a tree. The nodes in the graph represent an event or choice, and the edges of the graph represent the decision rules or conditions. Each tree consists of nodes and branches. Each node represents attributes in a group that is to be classified, and each branch represents a value that the node can take.

Figure: Decision Tree

Decision Tree Pseudo Code:

    def decisionTreeLearning(examples, attributes, parent_examples):
        if len(examples) == 0:
            # return most probable answer as there is no training data left
            return pluralityValue(parent_examples)
        elif len(attributes) == 0:
            return pluralityValue(examples)
        elif (all examples classify the same):
            return their classification
        # choose the most promising attribute to condition on
        A = max(attributes, key=lambda a: importance(a, examples))
        tree = Tree(root=A)
        for value in A.values():
            exs = [e for e in examples if e.A == value]
            # pass on the attributes without A (list.remove() would return None)
            remaining = [a for a in attributes if a != A]
            subtree = decisionTreeLearning(exs, remaining, examples)
            # note: an implementation should probably wrap the trivial-case
            # returns into trees for consistency
            tree.addSubtreeAsBranch(subtree, label=(A, value))
        return tree

Naive Bayes

Naive Bayes is a classification technique based on Bayes' theorem with an assumption of independence among predictors. In simple terms, a Naive Bayes classifier assumes that the presence of a particular feature in a class is unrelated to the presence of any other feature. Naive Bayes mainly targets the text classification industry. It is mainly used for clustering and classification purposes and depends on the conditional probability of an event happening.

Figure: Naive Bayes

Pseudo Code of Naive Bayes

Input: Training dataset T; F = (f1, f2, f3, ..., fn) // values of the predictor variables in the testing dataset.
Output: A class of the testing dataset.
Steps:
1) Read the training dataset T;
2) Calculate the mean and standard deviation of the predictor variables in each class;
3) Repeat: calculate the probability of fi using the Gaussian density equation in each class; until the probability of all predictor variables (f1, f2, f3, ..., fn) has been calculated;
4) Calculate the likelihood for each class;
5) Get the greatest likelihood.

Support Vector Machine

Another widely used state-of-the-art machine learning technique is the Support Vector Machine (SVM). In machine learning, support-vector machines are supervised learning models with associated learning algorithms that analyze data used for classification and regression analysis. In addition to performing linear classification, SVMs can efficiently perform non-linear classification using what is called the kernel trick, implicitly mapping their inputs into high-dimensional feature spaces. Basically, an SVM draws margins between the classes. The margins are drawn in such a fashion that the distance between the margin and the classes is maximum, hence minimizing the classification error.

Figure: Support Vector Machine

Pseudo Code of Support Vector Machine

    initialize y_i = Y_I for i in I
    repeat
        compute SVM solution w, b for data set with imputed labels
        compute outputs f_i = <w, x_i> + b for all x_i in positive bags
        set y_i = sgn(f_i) for every i in I, Y_I = 1
        for (every positive bag B_I)
            if (sum over i in I of (1 + y_i)/2 == 0)
                compute i* = arg max over i in I of f_i
                set y_i* = 1
            end
        end
    while (imputed labels have changed)
    output (w, b)

Unsupervised Learning

These algorithms are called unsupervised because, unlike supervised learning above, there are no correct answers and there is no teacher. Algorithms are left to their own devices to discover and present the interesting structure in the data. Unsupervised learning algorithms learn a few features from the data. When new data is introduced, they use the previously learned features to recognize the class of the data. Unsupervised learning is mainly used for clustering and feature reduction.

Figure: Unsupervised Learning

Principal Component Analysis

Principal component analysis is a statistical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components. In this way the dimension of the data is reduced to make the computations faster and easier. It is used to explain the variance-covariance structure of a set of variables through linear combinations. It is often used as a dimensionality-reduction technique.

Figure: Principal Component Analysis

K-Means Clustering

K-means is one of the simplest unsupervised learning algorithms that solve the well-known clustering problem. The procedure follows a simple and easy way to classify a given data set through a certain number of clusters. The main idea is to define k centers, one for each cluster. These centers should be placed in a cunning way, because different locations cause different results. So, the better choice is to place them as far away from each other as possible. The next step is to take each point belonging to the given data set and associate it with the nearest center. When no point is pending, the first step is completed and an early grouping is done. At this point we need to re-calculate k new centroids as the barycenters of the clusters resulting from the previous step. The assignment and re-calculation steps are then repeated until the centroids no longer move.

Figure: Pseudo Code of K-Means Clustering

Figure: K-Means Clustering
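The K-means procedure above can be sketched as follows. This is a minimal illustration under stated assumptions, not a reproduction of the paper's pseudo code figure: the function names are hypothetical, points are tuples of numbers, and the initial centers are chosen with a simple farthest-first rule as one possible reading of "as far away from each other as possible".

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def farthest_first_centers(points, k):
    """Start from the first point, then repeatedly add the point that is
    farthest from all centers chosen so far (an assumed initialisation)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers

def kmeans(points, k, max_iterations=100):
    centers = farthest_first_centers(points, k)
    for _ in range(max_iterations):
        # Assignment step: attach every point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centers[c]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean (barycenter) of its cluster.
        # An empty cluster keeps its previous center.
        new_centers = [
            tuple(sum(col) / len(members) for col in zip(*members)) if members else centers[c]
            for c, members in enumerate(clusters)
        ]
        if new_centers == centers:  # centers stopped moving: converged
            break
        centers = new_centers
    return centers, clusters

# Hypothetical toy data: two obvious groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, clusters = kmeans(points, k=2)
print(centers)
print(clusters)
```

The result of K-means depends on the starting centers, which is why the initialisation is isolated in its own function here; swapping in a different rule only requires replacing `farthest_first_centers`.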

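Principal component analysis, described earlier in this section, can also be illustrated in code. The sketch below is a toy under stated assumptions: it recovers only the first principal component, and it uses power iteration on the sample covariance matrix instead of a full eigendecomposition. The function name and the sample data are hypothetical, and the data is assumed to be non-constant so that the normalisation never divides by zero.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (direction, variance along it, projected scores) for the
    leading principal component of a list of equal-length numeric rows."""
    n, d = len(rows), len(rows[0])
    # Center the data: PCA works on deviations from the mean.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: repeatedly multiply by cov and renormalise. This
    # converges to the dominant eigenvector as long as the start vector is
    # not orthogonal to it (assumed here).
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Variance explained by this direction is v^T cov v.
    variance = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
    # One-dimensional representation of each observation.
    scores = [sum(r[j] * v[j] for j in range(d)) for r in centered]
    return v, variance, scores

# Hypothetical data lying exactly on the line y = 2x.
rows = [(1, 2), (2, 4), (3, 6), (4, 8)]
direction, variance, scores = first_principal_component(rows)
print(direction, variance)
```

Because the two variables in the toy data are perfectly correlated, a single component captures all of the variance, and `scores` is a one-dimensional replacement for the original two-dimensional observations.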
Semi-Supervised Learning

Semi-supervised machine learning is a combination of supervised and unsupervised machine learning methods. It can be fruitful in those areas of machine learning and data mining where unlabeled data is already present and getting labeled data is a tedious process. With more common supervised machine learning methods, you train a machine learning algorithm on a "labeled" dataset in which each record includes the outcome information. Some of the semi-supervised learning algorithms are discussed below.

Transductive SVM

Transductive support vector machines (TSVM) have been widely used as a means of treating partially labeled data in semi-supervised learning. There has been some mystery around them because their foundation in generalization is not well understood. A TSVM is used to label the unlabeled data in such a way that the margin between the labeled and unlabeled data is maximum. Finding an exact solution with a TSVM is an NP-hard problem.

Generative Models

A generative model is one that can generate data. It models both the features and the class (i.e. the complete data). If we model P(x,y), we can use this probability distribution to generate data points, and hence all algorithms modeling P(x,y) are generative. One labeled example per component is enough to confirm the mixture distribution.

Self-Training

In self-training, a classifier is trained with a portion of labeled data. The classifier is then fed with unlabeled data. The unlabeled points and their predicted labels are added together to the training set. This procedure is then repeated. Since the classifier is teaching itself, the method is called self-training.

Reinforcement Learning

Reinforcement learning is an area of machine learning concerned with how software agents ought to take actions in an environment in order to maximize some notion of cumulative reward. Reinforcement learning is one of three basic machine learning paradigms, alongside supervised learning and unsupervised learning.

Figure: Reinforcement Learning

Multitask Learning

Multi-task learning is a sub-field of machine learning that aims to solve multiple different tasks at the same time by taking advantage of the similarities between different tasks. This can improve the learning efficiency and also act as a regularizer. Formally, if there are n tasks (conventional deep learning approaches aim to solve just 1 task using 1 particular model), where these n tasks or a subset of them are related to each other but not exactly identical, Multi-Task Learning (MTL) will help in improving the learning of a particular model by using the knowledge contained in all the n tasks.

Ensemble Learning

Ensemble learning is the process by which multiple models, such as classifiers or experts, are strategically generated and combined to solve a particular computational intelligence problem. Ensemble learning is primarily used to improve the performance of a model, or to reduce the likelihood of an unfortunate selection of a poor one. Other applications of ensemble learning include assigning a confidence to the decision made by the model, selecting optimal features, data fusion, incremental learning, non-stationary learning and error-correcting.

Boosting

The term "boosting" refers to a family of algorithms which convert weak learners to strong learners. Boosting is a technique in ensemble learning which is used to decrease bias and variance. Boosting is based on the question posed by Kearns and Valiant: "Can a set of weak learners create a single strong learner?" A weak learner is defined to be a classifier that is only slightly correlated with the true classification; a strong learner is a classifier that is arbitrarily well-correlated with the true classification.

Figure: Boosting Pseudo code

Bagging

Bagging, or bootstrap aggregating, is applied where the accuracy and stability of a machine learning algorithm need to be increased. It is applicable in classification and regression. Bagging also decreases variance and helps in handling overfitting.
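As a rough sketch of bagging, the following code trains several one-dimensional decision stumps on bootstrap samples and combines them by majority vote. The stump learner, the function names, and the toy data are assumptions made for illustration; this is not the paper's pseudo code.

```python
import random
from collections import Counter

def fit_stump(sample):
    """Base learner (assumed): pick the threshold and orientation on a single
    numeric feature that give the fewest errors on the sample."""
    best = None
    for threshold, _ in sample:
        for left, right in ((0, 1), (1, 0)):
            errors = sum((left if x <= threshold else right) != y for x, y in sample)
            if best is None or errors < best[0]:
                best = (errors, threshold, left, right)
    _, threshold, left, right = best
    return lambda x: left if x <= threshold else right

def bagging(data, n_models=25, seed=0):
    """Train n_models stumps, each on its own bootstrap sample of the data,
    and return a predictor that takes a majority vote over them."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Bootstrap: draw len(data) examples with replacement.
        bootstrap = [rng.choice(data) for _ in data]
        models.append(fit_stump(bootstrap))

    def predict(x):
        votes = Counter(model(x) for model in models)
        return votes.most_common(1)[0][0]

    return predict

# Hypothetical toy data: label 0 for x in 0..9, label 1 for x in 10..19.
data = [(x, 0) for x in range(10)] + [(x, 1) for x in range(10, 20)]
predict = bagging(data)
print(predict(3), predict(16))
```

Each stump sees a slightly different sample and therefore places its threshold slightly differently; averaging their votes is what reduces the variance of the combined predictor.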

Figure: Pseudo code of Bagging

Neural Networks

A neural network is a series of algorithms that endeavors to recognize underlying relationships in a set of data through a process that mimics the way the human brain operates. In this sense, neural networks refer to systems of neurons, either organic or artificial in nature. Neural networks can adapt to changing input, so the network generates the best possible result without needing to redesign the output criteria. The concept of neural networks, which has its roots in artificial intelligence, is swiftly gaining popularity in the development of trading systems.

Figure: Neural Networks

An artificial neural network behaves the same way. It works on three layers. The input layer takes input. The hidden layer processes the input. Finally, the output layer sends the calculated output.

Supervised Neural Network

In a supervised neural network, the output for each input is already known. The predicted output of the neural network is compared with the actual output. Based on the error, the parameters are changed, and the input is then fed into the neural network again. Supervised learning is used in feed-forward neural networks.

Figure: Supervised Neural Network

Unsupervised Neural Network

The neural network has no prior clue about the output for a given input. The main job of the network is to categorize the data according to some similarities. The neural network checks the correlation between various inputs and groups them.

Figure: Unsupervised Neural Network

Reinforced Neural Network

Reinforcement learning refers to goal-oriented algorithms, which learn how to attain a complex objective (goal) or maximize along a particular dimension over many steps; for example, maximize the points won in a game over many moves. They can start from a blank slate, and under the right conditions they achieve superhuman performance. Like a child incentivized by spankings and candy, these algorithms are penalized when they make the wrong decisions and rewarded when they make the right ones; this is reinforcement.

Figure: Reinforced Neural Network

Instance-Based Learning

Instance-based learning refers to a family of techniques for classification and regression which produce a class label or prediction based on the similarity of the query to its nearest neighbor(s) in the training set. In explicit contrast to other methods such as decision trees and neural networks, instance-based learning algorithms do not create an abstraction from specific instances. Rather, they simply store all the data, and at query time derive an answer from an examination of the query's nearest neighbor(s).

K-Nearest Neighbor

The k-nearest neighbors (KNN) algorithm is a simple, supervised machine learning algorithm that can be used to solve both classification and regression problems. It is easy to implement and understand, but has the major drawback of becoming significantly slower as the size of the data in use grows.

Figure: Pseudo code of KNN

2. Conclusion

Machine learning can be supervised or unsupervised. If you have a smaller amount of data that is clearly labelled for training, opt for supervised learning. Unsupervised learning would generally give better performance and results for large data sets. If you have a huge data set easily available, go for deep learning techniques. You have also learned about reinforcement learning and deep reinforcement learning. You now know what neural networks are, along with their applications and limitations. This paper surveys various machine learning algorithms. Today each and every person is using machine learning, knowingly or unknowingly, from getting a recommended product while shopping online to updating photos on social networking sites. This paper gives an introduction to most of the popular machine learning algorithms.
