
MACHINE LEARNING

CO 1 - SESSION 2

• Introduction
• Well-posed learning problems
• Terminology
• Applications
Introduction
• ML is a branch of AI concerned with the design and
development of algorithms for the “construction and study
of systems that can learn from data.”
• ML enables computers to learn and behave more
intelligently: a model is built by iterating over past data
so that it is ready to make predictions on future data.
Ex. Humans learn from past experiences.
Robots learn from past data.
Introduction
A few examples:
• A small house costs $70,000 and a big house costs $160,000.
How do we predict the cost of a medium-sized house?

• Detecting spam emails (e.g. by words like “cheap”, “free”, etc.):
what is the probability that a mail containing the word
“cheap” is spam? We categorize based on the word. If 80%
of the mails containing “cheap” are spam, we conclude that a mail
with the word “cheap” is probably spam. (Naive Bayes)

• Recommending apps: suppose we have data with the columns age,
gender, and the apps each person downloaded. For a new person with
a given age and gender, which app should be recommended?
Class question: which is the better feature, age or gender?
Answer: age (> 20 or < 20), then gender, so it is a decision tree.
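The spam example above can be sketched in a few lines of Python. This is only a rough illustration: the `emails` list and the function name `spam_rate_given_word` are hypothetical and invented here, and the code computes just the single-word frequency the slide describes, which is one building block of Naive Bayes rather than a full classifier.

```python
def spam_rate_given_word(emails, word):
    """Fraction of emails containing `word` that are labeled spam.

    `emails` is a list of (text, is_spam) pairs. Returns None when no
    email contains the word.
    """
    labels = [is_spam for text, is_spam in emails
              if word in text.lower().split()]
    if not labels:
        return None
    return sum(labels) / len(labels)


# Hypothetical labeled inbox, invented for illustration only.
emails = [
    ("cheap watches for sale", True),
    ("free and cheap pills", True),
    ("cheap loans approved today", True),
    ("win a cheap holiday", True),
    ("found cheap flights for our trip", False),
    ("meeting moved to friday", False),
]

rate = spam_rate_given_word(emails, "cheap")
print(rate)  # 0.8: four of the five mails containing "cheap" are spam
```

With this toy data the estimate matches the 80% figure used on the slide; on real mail the estimate would depend entirely on the labeled examples available.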
Introduction
o Acceptance at a university based on test and grade.

Student   Test    Grade   Acceptance
1         9/10    8/10    Yes
2         3/10    4/10    No
3         7/10    6/10    ?
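One possible way to fill in the "?" for student 3 is to copy the decision of the most similar known student. The sketch below assumes a 1-nearest-neighbor rule with Euclidean distance; the slide does not name a method, so this choice and the function name `nearest_neighbor_label` are assumptions made for illustration.

```python
import math

# Students 1 and 2 from the table: (test, grade) out of 10, with the known outcome.
labeled = [((9, 8), "Yes"), ((3, 4), "No")]


def nearest_neighbor_label(point, labeled):
    """Return the label of the labeled example closest to `point`."""
    _, label = min(labeled, key=lambda item: math.dist(point, item[0]))
    return label


student_3 = (7, 6)
prediction = nearest_neighbor_label(student_3, labeled)
print(prediction)  # Yes: (7, 6) is closer to (9, 8) than to (3, 4)
```

With only two labeled students this is a toy; a different learning method could reasonably give a different answer.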
Introduction
Other examples
“Is this cancer?”
“What is the market value of this house?”
“Which of these people are good friends with each other?”
“Will this rocket engine explode on take off?”
“Will this person like this movie?”

All of these problems are excellent targets for an ML
project, and in fact ML has been applied to each of
them with great success.
Well-posed learning problems
ML Definitions
1) Arthur Samuel (1959): “[Machine Learning is the] field of study
that gives computers the ability to learn without being explicitly
programmed.”

2) Tom Mitchell gave a “well-posed” definition that has proven more useful
to engineering types: “A computer program is said to learn from experience
E with respect to some task T and some performance measure P, if its
performance on T, as measured by P, improves with experience E.”

E.g. if you want your program to predict traffic patterns at a
busy intersection (task T), you can run it through a machine learning
algorithm with data about past traffic patterns (experience E). If it has
successfully “learned”, it will then do better at predicting future traffic
patterns, as judged by its prediction accuracy (performance measure P).
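The traffic example can be made concrete with a small sketch. Everything here is an assumption for illustration: the hourly vehicle counts are invented, the "model" is simply a per-hour average, and mean absolute error stands in for the performance measure P. The point is only that P improves as the experience E grows.

```python
def train(history):
    """Experience E: (hour, vehicle_count) observations -> per-hour averages."""
    by_hour = {}
    for hour, count in history:
        by_hour.setdefault(hour, []).append(count)
    per_hour = {hour: sum(c) / len(c) for hour, c in by_hour.items()}
    overall = sum(count for _, count in history) / len(history)
    return per_hour, overall


def predict(model, hour):
    """Task T: predict the vehicle count for a given hour."""
    per_hour, overall = model
    return per_hour.get(hour, overall)  # unseen hour: fall back to overall mean


def mean_abs_error(model, test):
    """Performance measure P: average absolute error (lower is better)."""
    return sum(abs(predict(model, h) - c) for h, c in test) / len(test)


# Hypothetical observations, invented for illustration only.
history = [(8, 120), (9, 90), (17, 130), (8, 124), (12, 60), (17, 126)]
test = [(8, 122), (12, 62), (17, 128)]

error_little = mean_abs_error(train(history[:2]), test)  # little experience
error_more = mean_abs_error(train(history), test)        # more experience
print(error_more < error_little)  # True
```

In Mitchell's terms, the program "learns" here because its error on the task drops once it has seen more past data.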
Terminology
In machine learning we use the following terms quite frequently.
Features: distinct characteristics that can be used to describe
each item in a quantitative manner.

COLOUR: RED
TYPE: FRUIT
SHAPE: ROUND
Terminology
Sample: an item to process (e.g. to classify).
Ex: document classification
Terminology
Picture:

Sound/audio: identifying and classifying the sounds of various
animals in a forest.
Video: identifying a particular person in a crowd.
Terminology
Feature vector: an N-dimensional vector of features that represents
some object.
For instance, a life insurance company might be interested in
obtaining the vector of variables (blood pressure, heart rate,
height, weight, cholesterol level, smoker, gender) to infer the life
expectancy of a potential customer.
A farmer might be interested in determining the ripeness of fruit
based on (size, weight, spectral data).
An engineer might want to find dependencies in (voltage,
current) pairs.
Feature extraction: preparation of the feature vector.
Training set: a set of data used to discover potentially predictive
relationships.
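Feature extraction for the life insurance example might look like the following sketch. The field names, units, and the 0/1 encoding of `smoker` and `gender` are assumptions chosen for illustration, and the customer record is invented.

```python
# Order of the entries in the feature vector (names are hypothetical).
FEATURE_NAMES = ["blood_pressure", "heart_rate", "height_cm", "weight_kg",
                 "cholesterol", "smoker", "gender_female"]


def extract_features(record):
    """Turn one customer record (a dict) into a numeric feature vector."""
    return [
        float(record["blood_pressure"]),
        float(record["heart_rate"]),
        float(record["height_cm"]),
        float(record["weight_kg"]),
        float(record["cholesterol"]),
        1.0 if record["smoker"] else 0.0,              # yes/no encoded as 1/0
        1.0 if record["gender"] == "female" else 0.0,  # category encoded as 1/0
    ]


# Invented record, for illustration only.
customer = {
    "blood_pressure": 120, "heart_rate": 70, "height_cm": 175,
    "weight_kg": 68, "cholesterol": 190, "smoker": False, "gender": "female",
}

vector = extract_features(customer)
print(vector)  # a 7-dimensional feature vector
```

A training set would then be a list of such vectors, each paired with the quantity to be predicted.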
APPLICATIONS
Web page ranking:
• That is, the process of submitting a query to a search engine, which then
finds webpages relevant to the query and which returns them in their
order of relevance.
• Example: the query results for “machine learning”. The search
engine returns a sorted list of webpages given a query.
• To achieve this goal, a search engine needs to ‘know’ which pages are
relevant and which pages match the query.
• Such knowledge can be gained from several sources: the link structure
of webpages, their content, the frequency with which users will follow
the suggested links in a query, or from examples of queries in
combination with manually ranked webpages.
• Increasingly, machine learning, rather than guesswork and clever
engineering, is used to automate the process of designing a good search
engine.
APPLICATIONS
Collaborative filtering:
• Internet bookstores such as Amazon or video rental sites such
as Netflix use collaborative filtering extensively to entice users to
purchase additional goods (or rent more movies).
• The problem is quite similar to the one of web page ranking.
As before, we want to obtain a sorted list (in this case of
articles).
• The key difference is that an explicit query is missing and
instead we can only use past purchase and viewing decisions
of the user to predict future viewing and purchase habits.
• The key side information here is the decisions made by
similar users, hence the collaborative nature of the process.
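A minimal sketch of the idea: rank items the user has not bought yet by how similar the users who bought them are. It assumes Jaccard similarity on purchase sets, which is one simple choice among many and not necessarily what any particular retailer uses. The users, items, and function names are invented.

```python
def jaccard(a, b):
    """Similarity of two purchase sets: size of overlap / size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(target, purchases):
    """Return unseen items, best first, scored by similarity of their buyers."""
    mine = purchases[target]
    scores = {}
    for user, items in purchases.items():
        if user == target:
            continue
        similarity = jaccard(mine, items)
        if similarity == 0.0:
            continue  # ignore users with nothing in common
        for item in items - mine:
            scores[item] = scores.get(item, 0.0) + similarity
    return sorted(scores, key=lambda item: (-scores[item], item))


# Invented purchase histories, for illustration only.
purchases = {
    "alice": {"book_a", "book_b"},
    "bob": {"book_a", "book_b", "book_c"},
    "carol": {"book_d"},
}

suggestions = recommend("alice", purchases)
print(suggestions)  # ['book_c']
```

No query is involved: the sorted list comes only from past purchases, with bob's history counting because it overlaps with alice's.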
APPLICATIONS
Automatic translation of documents:
• At one extreme, we could aim at fully understanding a text
before translating it, using a curated set of rules crafted by a
computational linguist well versed in the two languages we
would like to translate between.
• This is a rather difficult task, in particular given that text is not
always grammatically correct, nor is the document
understanding part itself a trivial one.
• Instead, we could simply use examples of translated
documents, such as the proceedings of the Canadian
parliament or other multilingual entities (United Nations,
European Union, Switzerland) to learn how to translate
between the two languages.
APPLICATIONS
Security applications:
• Many security applications, e.g. for access control, use face
recognition as one of their components. That is, given the photo (or
video recording) of a person, recognize who this person is.
• In other words, the system needs to classify the faces into one of many
categories (Alice, Bob, Charlie) or decide that it is an unknown face.
• A similar, yet conceptually quite different problem is that of verification.
Here the goal is to verify whether the person in question is who they
claim to be.
• Note that, unlike before, this is now a yes/no question. To deal
with different lighting conditions, facial expressions, whether a person is
wearing glasses, hairstyle, etc., it is desirable to have a system which
learns which features are relevant for identifying a person.
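The difference between identification (one of many categories, or unknown) and verification (yes/no) can be sketched as follows. The code assumes each face has already been reduced to a short feature vector and that closeness is measured by Euclidean distance. The two-dimensional vectors, the threshold, and the function names are all hypothetical, and a real system would learn the features from data.

```python
import math

# Hypothetical enrolled feature vectors, invented for illustration only.
enrolled = {"Alice": (0.0, 0.0), "Bob": (1.0, 0.0), "Charlie": (0.0, 1.0)}
THRESHOLD = 0.3  # assumed maximum distance that still counts as a match


def identify(face, enrolled, threshold=THRESHOLD):
    """Identification: which enrolled person is this, or is it unknown?"""
    best = min(enrolled, key=lambda name: math.dist(face, enrolled[name]))
    return best if math.dist(face, enrolled[best]) <= threshold else "unknown"


def verify(face, claimed_name, enrolled, threshold=THRESHOLD):
    """Verification: is this person who they claim to be (yes/no)?"""
    return math.dist(face, enrolled[claimed_name]) <= threshold


face = (0.9, 0.1)
print(identify(face, enrolled))          # Bob
print(verify(face, "Alice", enrolled))   # False
print(identify((5.0, 5.0), enrolled))    # unknown
```

Identification compares the face against every enrolled person, while verification compares it against the claimed identity only.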
APPLICATIONS
Named entity recognition:
• That is, the problem of identifying entities, such as
places, titles, names, actions, etc. from documents.
• Such steps are crucial in the automatic digestion and
understanding of documents.
• Some modern e-mail clients, such as Apple’s Mail.app,
now ship with the ability to identify addresses in
mails and file them automatically in an address
book.
APPLICATIONS
Recognition applications
• Speech recognition (annotate an audio sequence with text, such as
the system shipping with Microsoft Vista)
• Handwriting recognition (annotate a sequence of
strokes with text, a feature common to many PDAs)
• Detection of failure in jet engines
• Direct marketing (companies use past purchase behavior to
guesstimate whether you might be willing to purchase even
more)
• Face recognition
