
COIT20250: E-Business Systems

Web Analytics and e-commerce

• Henry Zegarra
• SaiKrishna Kondapaka
• Nazeer Ahmed
• Chaitanaya Sanjay

Data Science, Data Analysis, Big Data, Data Mining and Analytics

• Data Science: a field or discipline about processes, systems and algorithms; extracts insights from data.
• Data Analysis: an activity, technique or process about collection, processing, cleaning and modeling; insights from data.
• Big Data: a wide term for data sets, also used for the use of predictive analytics; data sets are growing rapidly (2.5 exabytes every day).
• Data Mining: a particular data analysis technique; focus on modeling and on knowledge and pattern discovery; critics call the term a misnomer and a buzzword.
• Analytics: about automating insights into data sets; uses queries, aggregation procedures, algorithms and data mining techniques; insights from data.

• Security analytics: security events
• Risk analytics: portfolio analysis
• Web analytics: marketing optimization
Web analytics
Web Analytics is the objective tracking, collection, measurement, reporting and
analysis of quantitative Internet data to optimize websites and web marketing

(DAA - Digital Analytics Association, formerly named the Web Analytics Association)
Web analytics
How Companies Use it

Marketing Performance Measurement and Optimization

Calculate the return on the marketing investment
Determine the outcomes of campaigns

Visitor Data
Segmentation of customers
Building a picture of who site visitors are

Content Optimization
Knowing how much time visitors spend on the website
Determining which pages visitors stay on the least
Web Analytics

Avinash Kaushik
Is an Indian entrepreneur, author and public speaker.
He promotes his vision of Web Analytics 2.0
and the principle of aggregation of marginal gains.
Works as an advisor and associate instructor at
some universities in the US and Canada.
Awarded in 2009 the ‘Harry V Roberts Statistical Advocate of the Year Award’ from the
American Statistical Association.
Awarded in 2011 the ‘Most Influential Industry Contributor’ award from the Web
Analytics Association.
Paradox of Data
The adoption in 2012 of Web Analytics
in Fortune 500 was:

For small companies there are plenty of tools available, many of them for free
(Google, Yahoo!, etc.).

Data collection is not a problem any more; for example, Yahoo! Web
Analytics has approximately 110 standard reports.

Even with a lot of data, people still get a very small number of insights.
Web Analytics 2.0
It is a concept, approach or mindset defined and
promoted by Avinash Kaushik, that proposes a set of
activities to help get insights.
• Clickstream
What the site’s analytics tool is collecting, storing and processing
• Multiple Outcomes
Focus on defining and measuring all of the objective outcomes
of the site
• Experimentation and Testing
A/B testing the design of your website, including text,
graphics, buttons, banner ads
• Voice of Customer
Ask website visitors through surveys.
• Competitive Intelligence
Analyse competitors, their campaigns, products and features
that are impacting your site’s performance (could be either
up or down)
Measuring Campaigns and Troubleshooting Traffic Sources

▪ Google Analytics (GA) has an entire set of Acquisition reports, dedicated to categorizing users’
sources of traffic to the site.

▪ Did they come from a search engine, a link on a social media site, or a paid advertisement?

➢ Traffic Sources in GA
▪ On the first hit in a user’s session, GA looks at the browser’s Referrer value (the URL of the
previous page) to determine where the user came from to arrive at the site.

Based on this value, it assigns values for two of the dimensions used in the Acquisition
reports, Medium and Source.

A medium represents a general category or type of traffic, while the source identifies a
specific site within that category.

GA categorizes traffic by default into the following mediums and sources:

▪ Organic search traffic
▪ Referral traffic
▪ Direct traffic
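
The default categorization above can be sketched in a few lines of Python. This is only a simplified illustration, not GA's actual logic: the function name `classify_traffic` and the three-entry search engine list are assumptions made for the example (GA's real default list is much longer), and the labels follow the way GA reports show direct traffic, with source "(direct)" and medium "(none)".

```python
from urllib.parse import urlparse

# Hypothetical, heavily abbreviated list of search engines (illustration only).
SEARCH_ENGINES = {"google", "bing", "yahoo"}


def classify_traffic(referrer):
    """Return a (medium, source) pair for the first hit of a session."""
    if not referrer:
        # No referrer value: typed URL, bookmark, and similar cases.
        return ("(none)", "(direct)")
    host = (urlparse(referrer).hostname or "").lower()
    if not host:
        return ("(none)", "(direct)")
    # Treat the visit as organic search if any part of the host name
    # matches a known search engine, e.g. "www.google.com" -> "google".
    for label in host.split("."):
        if label in SEARCH_ENGINES:
            return ("organic", label)
    # Any other site that links to us counts as a referral.
    return ("referral", host)


print(classify_traffic("https://www.google.com/search?q=shoes"))
print(classify_traffic("https://news.example.org/post"))
print(classify_traffic(""))
```

The medium is the general category (organic, referral, none) and the source is the specific site within it, which mirrors the Medium and Source dimensions described above.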
▪ Beyond the default categorization performed by GA, you can influence these classifications and
supply additional detail about how you’d like to label traffic sources to better reflect your site and
its audiences.

➢ Adding Organic Search Engines:

o GA recognizes a wide list of organic search engines by default.
o If a particular search engine is not in that list, you can add its domain to be counted as a
search engine (rather than as a referral).
This is typically used for the following situations:
• Language- or country-specific search engines relevant to your site not included in GA’s default list.
• Industry-specific or other niche search engines relevant to your site.

➢ Ignoring Certain Referrers

GA also gives you the option to ignore certain referring sites (treating them as direct).
This is most common in the following situations:
• Certain types of third-party sites, such as PayPal.
It’s typical for a user to leave your site, go to PayPal (to complete a transaction), and then return to
your site for the final confirmation message. You’re not interested in counting the return as a
referral from PayPal.
• Cross-domain tracking
You can specify domains to treat as direct in GA’s Admin area in the property settings under
Tracking Info ➤ Referral Exclusion List. Any referrals from sites added to this list will be treated as
direct traffic.
➢ Campaign Tracking:
• For links to your site that you control, you can
specify exactly the value you’d like GA to use for
the medium and source (as well as additional
traffic source dimensions).
• This could include many types of marketing and
advertising links:
• Paid search and display advertisements
• Social posts and paid social advertisements
• Links in email marketing, such as a newsletter
or promotion
• Links from partner or affiliate sites
• Links in offline advertising, such as print, TV, or
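
As a rough sketch of campaign tracking, the Python below appends campaign values to a link you control. It uses GA's `utm_source`, `utm_medium` and `utm_campaign` query parameters; the helper name `tag_campaign_url`, the example URL and the campaign values are made up for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def tag_campaign_url(url, source, medium, campaign):
    """Append utm_source, utm_medium and utm_campaign to a landing-page URL."""
    parts = urlparse(url)
    # Keep any query parameters the URL already carries.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query += [
        ("utm_source", source),
        ("utm_medium", medium),
        ("utm_campaign", campaign),
    ]
    return urlunparse(parts._replace(query=urlencode(query)))


# Hypothetical newsletter link.
print(tag_campaign_url("https://www.example.com/sale", "newsletter", "email", "spring_promo"))
```

A link tagged this way tells GA exactly which source and medium to record, instead of leaving it to infer them from the referrer.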
Troubleshooting Traffic Sources

• Sometimes, traffic source information can go missing. Let’s examine the causes of incorrect traffic
source data and see how you can avoid pitfalls.

➢ Redirects
• Redirects are a valuable tool to enforce consistency in URLs on a website,
• to provide alternative (usually shorter) URLs, and
• to ensure that historical links continue to work.
• Be careful about how redirects are used on your site, to ensure that you don’t lose data
about how a user arrived at the site.

You need redirects to do both of the following:

• The redirect preserves the HTTP Referrer header.

- The Referrer header (spelled Referer in HTTP itself) is sent by the browser when a link is followed
and tells the receiving site which URL was the previous page.
- It is the signal GA uses to assign the source in referral and organic search traffic.
- Server-side redirects (also called 301 or 302 redirects) typically preserve the Referrer header.

• The redirect preserves any query parameters in place on the original URL.
These parameters should be visible in the URL of the final destination page.
• You can check for the appropriate behavior using your browser’s developer tools on a redirected URL.
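
The query parameter check can also be scripted. The sketch below compares the URL before and after a redirect and lists parameters that were dropped or changed. The helper name `lost_parameters` and the URLs are hypothetical, and it works on two URLs you have already observed (for example in the browser's network panel) rather than following the redirect itself.

```python
from urllib.parse import parse_qs, urlparse


def lost_parameters(original_url, final_url):
    """Return names of query parameters missing or changed after a redirect."""
    before = parse_qs(urlparse(original_url).query)
    after = parse_qs(urlparse(final_url).query)
    return sorted(name for name, values in before.items() if after.get(name) != values)


# Hypothetical redirect that drops one campaign parameter.
print(lost_parameters(
    "http://example.com/promo?utm_source=news&utm_medium=email",
    "https://www.example.com/offers/promo?utm_source=news",
))
```

An empty list means the redirect kept every parameter; any name in the list is traffic source detail that GA would not see on the destination page.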
➢ Self-Referrals:
• One of the most common traffic source problems in GA
is seeing self-referrals: your own website appears as a
referral source. Obviously this isn’t intended. When a user
follows a link from one page on your website to another page,
that shouldn’t count as a referral; it’s just navigating
through the website!
• Why do self-referrals happen? The two most common
reasons are untagged pages and incorrect
cross-domain or subdomain tracking.
➢ Untagged Pages

• When a user lands on a page and begins a session, GA assigns the source, medium,
and other traffic source dimensions. However, suppose you have a situation where the user lands on a
page where no GA tag fires. What happens?
• Since no GA tag fired, no session has yet begun.
• If the user continues to navigate to a second page (this one with a GA tag), GA
begins a session and says, “OK, where did this user come from?” In this case, it’s
from another page on your site, and GA assigns the medium “referral” and the
source as your own domain.

• In the Acquisition ➤ Traffic Sources ➤ Referrals report, you can drill down into
self-referrals to see the pages they originate from.
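
The untagged-page scenario can be simulated with a small Python sketch. It is a deliberately simplified model, not GA's implementation: every non-empty referrer is treated as a referral (organic search is ignored), and the function name `session_source` and all URLs are invented for the example.

```python
from urllib.parse import urlparse


def session_source(page_views, external_referrer):
    """Return the (medium, source) assigned when the first GA tag fires.

    page_views is a list of (url, has_ga_tag) pairs in visit order.
    Returns None if no tagged page is ever viewed.
    """
    previous = external_referrer
    for url, has_ga_tag in page_views:
        if has_ga_tag:
            if not previous:
                return ("(none)", "(direct)")
            return ("referral", urlparse(previous).hostname)
        # Untagged page: no session starts, but this page becomes
        # the referrer of the next page the user opens.
        previous = url
    return None


visit = [
    ("https://www.example.com/landing", False),   # no GA tag here
    ("https://www.example.com/products", True),   # first tag fires here
]
print(session_source(visit, "https://partner.example.net/review"))
```

With the landing page untagged, the session is attributed to `www.example.com`, a self-referral, and the real source (`partner.example.net`) is lost. Tagging the landing page would attribute the session to the partner site instead.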

➢ Incorrect Cross-Domain or Subdomain Tracking

• When there are multiple domains or subdomains that you’d like to measure as a
single site, you need to ensure that GA has consistent cookie values across these
domains. If it doesn’t, it will treat each site as separate, with a separate session on
each, and you’ll see referrals between those domains.
Tracking Users Across Devices

Google Analytics reports a metric called Users, which sounds like it counts the
number of people using the site.

• Google Analytics typically counts users with the client ID, an identifier stored in a
cookie that is particular to a specific browser and device. For sites where users log
in or you can otherwise identify them, GA supports using a user ID instead for a
more accurate count of users across devices.

• Privacy concerns, policies, regulations, and appropriate disclosures to your site’s
users are important considerations when collecting user ID data; review them.

• User ID features are enabled in GA at the property level, choosing whether or not to use session
unification (counting hits from before the user logs in). Within that property, user
ID–enabled views can be created, which show only data with an associated user ID
along with additional cross-device reports.

• In GTM, the user ID is captured from the website, typically by inserting the user
ID value into the data layer. GA tags in GTM are altered to include this user ID.
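
The difference between counting by client ID and by user ID can be shown with a toy data set. The hit records and field names below are made up for illustration; they are not GA's export format.

```python
# Hypothetical hits: one logged-in person on two devices, plus one anonymous visitor.
hits = [
    {"client_id": "c-laptop", "user_id": "u42"},
    {"client_id": "c-phone", "user_id": "u42"},
    {"client_id": "c-tablet", "user_id": None},
]


def count_by_client_id(hits):
    """Count distinct browser/device cookies (the default Users metric)."""
    return len({hit["client_id"] for hit in hits})


def count_by_user_id(hits):
    """Count distinct logged-in users, ignoring hits without a user ID."""
    return len({hit["user_id"] for hit in hits if hit["user_id"] is not None})


print(count_by_client_id(hits))  # three cookies
print(count_by_user_id(hits))    # one identified person
```

The client ID count treats the laptop and phone as two users, while the user ID count recognizes them as one person. The anonymous tablet visit is left out of the user ID count, just as a user ID-enabled view shows only data with an associated user ID.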
Importing Data into Google Analytics
Data import allows you to fill in the data using files uploaded directly to
GA. This can be useful in situations such as the following:

• The data isn’t available to the site at the time the hit occurs—for
example, because it’s stored in a separate system. You can upload data
from such systems to GA.

• The data is extensive, and including it directly on the site would be a
development burden, or would exceed the character limits for data
included in a hit. You can reduce the data sent from the site to certain key
dimensions and fill in other values later.

• The data is sensitive and you wouldn’t want to include it on the site,
such as certain kinds of user or product data.
• Data Import Process

• The basic process for data import works like this:

1. You create a data set associated with a property in GA, to configure the dimensions and metrics that will be imported.

2. You upload a text file with the data to be imported. GA takes this and processes it into the data in reports.

3. You update the data set as necessary to update the data going forward.

• Data Import Types

There are three basic categories of data import available in Google Analytics:

1. Hit data (directly import hits to GA) :

a. Refund Data

2. Extended data (import several kinds of dimension or metric values to be applied to existing hits in GA) :
a. User Data Import
b. Campaign Data Import
c. Geographical Data Import
d. Content Data Import
e. Product Data Import
f. Custom Data Import

3. Summary data (import metrics for data already aggregated in certain dimensions).
a. Cost Data Import
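
To make the idea of extended data concrete, the sketch below joins an uploaded text file to existing hits on a key dimension. It only illustrates the concept: the column names (`sku`, `category`, `brand`), the file contents and the helper names are hypothetical and do not follow GA's actual upload schema.

```python
import csv
import io

# Hypothetical upload file: a key column plus the values to fill in.
UPLOAD = """sku,category,brand
A100,Shoes,Acme
B200,Hats,Globex
"""


def load_import(text, key):
    """Parse the uploaded CSV into a lookup table keyed on one column."""
    reader = csv.DictReader(io.StringIO(text))
    return {row[key]: {k: v for k, v in row.items() if k != key} for row in reader}


def extend_hits(hits, imported, key):
    """Add the imported values to every hit whose key matches."""
    return [{**hit, **imported.get(hit[key], {})} for hit in hits]


hits = [{"sku": "A100", "revenue": 59.0}, {"sku": "Z999", "revenue": 5.0}]
extended = extend_hits(hits, load_import(UPLOAD, "sku"), "sku")
print(extended)
```

The site only has to send the key (here the SKU) with each hit; the remaining product details are filled in later from the upload. Hits whose key is missing from the file are left unchanged.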
BigQuery for Big Data Analysis

• BigQuery is a tool that allows you to store, query, and
extract data.
• It’s a separate tool from GA and GTM, part of Google’s
Cloud Platform, but you can access it with a Google account
login in the same way as other Google tools.
• Although there is a web interface for using BigQuery,
it is primarily for exploration and testing of queries.
• Typically, BigQuery would be used as a service that acts as a
source of information for other applications, much like any
other database.
• Like any other database, it supports a query language to
select subsets of data based on criteria of interest, and
it returns the values for further processing in a report,
visualization, statistical model, or other application.
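
The "query, then process further" pattern can be sketched as follows. Note that this example does not use BigQuery at all: Python's built-in `sqlite3` module stands in for it so the code runs without a cloud account, and the `sessions` table and its rows are invented for illustration.

```python
import sqlite3

# In-memory database standing in for a real analytics data store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sessions (medium TEXT, pageviews INTEGER)")
conn.executemany(
    "INSERT INTO sessions VALUES (?, ?)",
    [("organic", 5), ("referral", 2), ("organic", 3), ("(none)", 1)],
)


def pageviews_by_medium(connection, min_pageviews):
    """Select sessions meeting a criterion and total their pageviews per medium."""
    rows = connection.execute(
        "SELECT medium, SUM(pageviews) FROM sessions "
        "WHERE pageviews >= ? GROUP BY medium ORDER BY medium",
        (min_pageviews,),
    )
    return dict(rows.fetchall())


# The returned dictionary could feed a report or a chart.
print(pageviews_by_medium(conn, 2))
```

The application sends a query that selects a subset of the data, and the database returns values the application can pass on to a report, visualization or model.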
Kaushik, A. (2010). Web analytics 2.0. Indianapolis, IN: Wiley.
Chaffey, D. (2015). Digital business and e-commerce management.
Weber, J. (2015). Practical Google Analytics and Google Tag Manager for Developers. Berkeley, CA: Apress.
Ermolenko, M. (2016). What is the difference between Data Science, Data Analysis, Big Data, Data Analytics, Data Mining and Machine Learning?. [online] [Accessed 13 Jan. 2016].
Wikipedia, (2016). Data Science. [online] [Accessed 15 Jan. 2016].
Wikipedia, (2016). Data Analysis. [online] [Accessed 13 Jan. 2016].
Wikipedia, (2016). Big Data. [online] [Accessed 14 Jan. 2016].
Wikipedia, (2016). Data Mining. [online] [Accessed 14 Jan. 2016].
Wikipedia, (2016). Analytics. [online] [Accessed 15 Jan. 2016].
Wikipedia, (2016). Web Analytics. [online] [Accessed 15 Jan. 2016].
WAA Standards Committee. (2008). Web analytics definitions. Washington, DC: Web Analytics Association.
Smith, S. and Media, D. (2016). How Do Companies Use Web Analytics?. [online] [Accessed 14 Jan. 2016].