Description
This presentation covers e-commerce and the technical topics involved in it, including click-streams, the challenges of tracking with click-stream data, and the specific dimensions of a click-stream.
ELECTRONIC COMMERCE
WHAT IS E-COMMERCE?
• Electronic commerce, commonly known as e-commerce, refers to the buying and selling of products or services over electronic systems such as the Internet and other computer networks.
• It includes the entire online process of developing, marketing, selling, delivering, servicing, and paying for products and services.
CLICK-STREAM
• A click-stream is the recording of the parts of the screen a computer user clicks on while web browsing.
• As the user clicks anywhere in the webpage or application, the action is logged on the client or inside the web server.
• A series of page requests generates signals; the graphical representation of these signals is called a click-stream record.
WEB SERVER INTERACTION
• The visitor clicks a button or hypertext link containing a Uniform Resource Locator (URL) to access a particular Web site.
• The browser fetches a document in Hypertext Markup Language (HTML) format, websitepage.html.
• The browser issues a second request to the server, and the server responds by returning the specified image.
• The browser also requests an image from Banner-ad.com, and the server at Banner-ad.com interprets the request for the image in a special way.
• The Banner-ad.com server first issues a cookie request to the visitor's browser, asking for the contents of any cookie that Banner-ad.com may previously have placed on the visitor's PC.
• The original HTML document, websitepage.html, also has a hidden field containing a request to retrieve a specific document from Profiler.com.
• Profiler.com can respond via an alternative path, alerting Website.com that the visitor is currently logged onto Website.com and viewing a specific page.
CHALLENGES OF TRACKING WITH CLICK-STREAM DATA
IDENTIFYING THE VISITOR ORIGIN
• There is no easy way to determine from a log whether the site is set as a browser's home page.
• The site may be reached as a result of a clickthrough, that is, a deliberate click on a text or graphic link from another site.
• Capturing this click-stream data is crucial for verifying the efficacy of marketing programs, but it is difficult to do reliably.
IDENTIFYING THE SESSION
Each session (visit) should have its own unique identity tag, the session ID. There are several ways to establish it:
• A session can be consolidated by collating time-contiguous log entries from the same host or ID.
• The Web server can place a session-level cookie into the visitor's Web browser.
• The session ID can be returned to the Web server as a query string appended to a subsequent URL.
• The Web site may establish a persistent cookie on the visitor's PC.
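The first approach, collating time-contiguous log entries from the same host, could be sketched roughly as follows. The 30-minute inactivity threshold, the `sessionize` function, and the sample log entries are all assumptions made for illustration; the slides do not specify a threshold or a log format.

```python
from datetime import datetime, timedelta

# Hypothetical inactivity threshold; the source does not name one.
SESSION_GAP = timedelta(minutes=30)


def sessionize(entries, gap=SESSION_GAP):
    """Group (host, timestamp, url) log entries into sessions.

    Requests from the same host stay in one session while consecutive
    requests are no more than `gap` apart.
    """
    sessions = []        # list of (host, [entries])
    open_session = {}    # host -> index of that host's latest session
    for host, ts, url in sorted(entries, key=lambda e: e[1]):
        idx = open_session.get(host)
        if idx is not None and ts - sessions[idx][1][-1][1] <= gap:
            sessions[idx][1].append((host, ts, url))
        else:
            sessions.append((host, [(host, ts, url)]))
            open_session[host] = len(sessions) - 1
    return sessions


# Made-up log entries for illustration only.
log = [
    ("10.0.0.1", datetime(2024, 1, 1, 9, 0), "/home"),
    ("10.0.0.1", datetime(2024, 1, 1, 9, 5), "/products"),
    ("10.0.0.2", datetime(2024, 1, 1, 9, 6), "/home"),
    ("10.0.0.1", datetime(2024, 1, 1, 11, 0), "/home"),
]
sessions = sessionize(log)
for number, (host, hits) in enumerate(sessions, start=1):
    print(number, host, [url for _, _, url in hits])
```

Grouping by host alone inherits the weaknesses described in the following slides: several visitors behind one proxy share a host, which is why the cookie-based approaches are usually preferred when they are available.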
IDENTIFYING THE VISITOR
• Web visitors wish to be anonymous.
• If we request a visitor's identity, he or she is likely to lie about it.
• We can't be sure which family member is visiting our site.
• We can't assume that an individual is always at the same computer.
PROXY SERVER
• A proxy server may deliver outdated content.
• A proxy may satisfy a content request without properly notifying the originating server.
• If the visitor comes through a proxy, the Web site will not know who made the page request unless a cookie is present.
• This kind of proxy is called a forward proxy. Liberal use of expiration dates and no-proxy tags in the Web site's HTML helps reduce its effect on tracking.
• A proxy placed in front of the enterprise's Web servers to offload requests for frequently accessed content is called a reverse proxy.
BROWSER CACHES
• Browser caches also introduce uncertainties in our attempts to track all the events that occur during a visitor session.
• If the user clicks the Back button and the page is served from the cache, no record of the event is sent to the server, so the event is not recorded.
• A visitor may open multiple browser windows on the same Web site, but the Web server has no way to know this.
SPECIFIC DIMENSIONS FOR THE CLICKSTREAM
• No single dimensional schema will use all the dimensions at once, but it is useful to have a portfolio of dimensions waiting to be used.
• Page Dimension
• Event Dimension
• Session Dimension
• Referral Dimension
PAGE DIMENSION
• The page dimension describes the page context for a Web page event.
• Our definition of a page must be flexible enough to handle everything from the current, mostly static page delivery to highly dynamic page delivery.
• If the nominal width of a single row is 100 bytes and we have a Web site with 100,000 pages, then the unindexed data size is 100 bytes × 100,000 rows = 10 MB. If indexing adds a further factor of 3 (about 30 MB of indexes), then the total size of this dimension is about 40 MB.
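The sizing arithmetic for the page dimension can be checked with a few lines of Python. This is only a sketch: the helper name `dimension_size_bytes` is hypothetical, and it assumes decimal megabytes (1 MB = 1,000,000 bytes) and that "a factor of 3" means index space equal to three times the raw data.

```python
def dimension_size_bytes(row_width_bytes, row_count, index_factor=0):
    """Return (unindexed size, total size including indexes) in bytes."""
    raw = row_width_bytes * row_count
    # Assumption: indexes take index_factor times the raw data size.
    return raw, raw * (1 + index_factor)


raw, total = dimension_size_bytes(100, 100_000, index_factor=3)
print(raw / 1e6, "MB unindexed")
print(total / 1e6, "MB with indexes")
```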
EVENT DIMENSION
• The event dimension describes what happened on a particular page at a particular point in time.
• The main events of interest are open page, refresh page, click link, and enter data.
• Each field in an XML document can be labeled with a user-defined tag.
SESSION DIMENSION
• The session dimension provides one or more levels of diagnosis for the visitor's session as a whole.
• If the nominal width of a single row is 80 bytes and we have 10,000 identified session combinations, then the unindexed data size is 80 bytes × 10,000 rows = 0.8 MB.
REFERRAL DIMENSION
• The referral dimension describes how the customer arrived at the current page.
• Web server logs usually provide this information.
• If the referrer is a search engine, then the search string is usually specified.
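As a rough illustration of how a referral dimension row might be derived from the referrer field of a log entry, the sketch below extracts the referring host and, where present, the search string. The parameter names in `SEARCH_PARAMS`, the function `describe_referrer`, and the example URL are assumptions; real search engines vary in how, and whether, they expose the search string.

```python
from urllib.parse import urlsplit, parse_qs

# Assumption: the search string arrives in one of these query parameters.
SEARCH_PARAMS = ("q", "query", "p")


def describe_referrer(referrer):
    """Return the referring host and search string (if any) for a referrer URL."""
    if not referrer or referrer == "-":
        # Many log formats write "-" when no referrer was sent.
        return {"host": None, "search_string": None}
    parts = urlsplit(referrer)
    query = parse_qs(parts.query)
    search = None
    for name in SEARCH_PARAMS:
        if name in query:
            search = query[name][0]
            break
    return {"host": parts.hostname, "search_string": search}


# Hypothetical referrer URL for illustration.
info = describe_referrer("https://search.example.com/search?q=red+running+shoes")
print(info)
```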
DESIGN OF A CLICK-STREAM FACT TABLE FOR COMPLETE SESSIONS
• The dimensions appropriate for this fact table are calendar date, time of day, customer, page, session, and referrer.
• The facts we add are session seconds, pages visited, orders placed, order quantity, and order dollar amount.
• This design is an interesting compromise between a high-level summary of Web site activity and the details provided by a fact table for each page event.
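One way to picture a row of the session-grain fact table is as a record holding the six dimension values and the five numeric facts listed above. The sketch below builds such a row from the page events of one session. The event field names, the `SessionFact` and `build_session_fact` names, and the choice to record the session's first page in the page dimension are assumptions for illustration; a real warehouse would store surrogate keys rather than raw values.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class SessionFact:
    # Dimension values (a real warehouse would store surrogate keys).
    calendar_date: str
    time_of_day: str
    customer: str
    page: str
    session: str
    referrer: str
    # Numeric facts named in the text.
    session_seconds: int
    pages_visited: int
    orders_placed: int
    order_quantity: int
    order_dollar_amount: float


def build_session_fact(session_label, events):
    """Collapse one session's page events into a single fact row."""
    events = sorted(events, key=lambda e: e["ts"])
    first, last = events[0], events[-1]
    orders = [e for e in events if e.get("order_quantity", 0) > 0]
    return SessionFact(
        calendar_date=first["ts"].date().isoformat(),
        time_of_day=first["ts"].time().isoformat(),
        customer=first["customer"],
        page=first["page"],  # assumption: the first page of the session
        session=session_label,
        referrer=first["referrer"],
        session_seconds=int((last["ts"] - first["ts"]).total_seconds()),
        pages_visited=len(events),
        orders_placed=len(orders),
        order_quantity=sum(e["order_quantity"] for e in orders),
        order_dollar_amount=sum(e.get("order_amount", 0.0) for e in orders),
    )


# Made-up events for a single session.
events = [
    {"ts": datetime(2024, 1, 1, 10, 0, 0), "customer": "C1", "page": "/home",
     "referrer": "search.example.com"},
    {"ts": datetime(2024, 1, 1, 10, 2, 30), "customer": "C1", "page": "/product/42",
     "referrer": "search.example.com"},
    {"ts": datetime(2024, 1, 1, 10, 5, 0), "customer": "C1", "page": "/checkout",
     "referrer": "search.example.com", "order_quantity": 2, "order_amount": 59.98},
]
fact = build_session_fact("productive", events)
print(fact)
```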
[Figure: Clickstream schema at session grain]
THE DESIGN OF A CLICKSTREAM FACT TABLE FOR INDIVIDUAL PAGE EVENTS
• Here we define the granularity to be the individual page event in each customer session.
• This ultimate level of detail is the most accurate and complete record of customer behavior.
• The page dimension refers to the individual page whose events we are recording. This is the main difference in grain between this fact table and the first one we built.
• In this fact table we can see all the pages accessed by the customers.
• The size problems with this table can be addressed by sampling.
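The slides do not say how the sampling should be done. One possible approach, sketched here under that caveat, is to sample whole sessions deterministically by hashing the session ID, so that every page event of a kept session survives and each session's behavior stays intact. The function name `keep_session`, the sample rate, and the example events are hypothetical.

```python
import hashlib


def keep_session(session_id, sample_rate=0.1):
    """Decide deterministically whether a session belongs to the sample."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < sample_rate


# Made-up (session_id, page) events for illustration.
events = [("s1", "/home"), ("s1", "/cart"), ("s2", "/home"), ("s3", "/home")]
sample = [e for e in events if keep_session(e[0], sample_rate=0.5)]
print(sample)
```

Because the decision depends only on the session ID, the same sessions are selected on every load, and a session is never split between the sample and the discarded rows.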
[Figure: Click-stream schema at the page event grain]
THE DESIGN OF AGGREGATE CLICKSTREAM FACT TABLES
• Many of the business questions we would like to ask would force us to summarize millions of rows from these tables.
• Much smaller (and faster) fact tables can usefully summarize visitor behavior, for example by correlating demographics with productive sessions.
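A minimal sketch of such an aggregate, assuming session-grain rows that carry a demographic label and a session type, is shown below. The field names, the labels "productive" and "browsing", and the sample rows are hypothetical; they only illustrate how many session rows collapse into a handful of summary rows.

```python
from collections import defaultdict


def aggregate_sessions(session_rows):
    """Summarize session rows by (demographic, session type)."""
    totals = defaultdict(
        lambda: {"sessions": 0, "session_seconds": 0, "order_dollar_amount": 0.0}
    )
    for row in session_rows:
        bucket = totals[(row["demographic"], row["session_type"])]
        bucket["sessions"] += 1
        bucket["session_seconds"] += row["session_seconds"]
        bucket["order_dollar_amount"] += row["order_dollar_amount"]
    return dict(totals)


# Made-up session rows for illustration.
rows = [
    {"demographic": "18-34", "session_type": "productive",
     "session_seconds": 300, "order_dollar_amount": 50.0},
    {"demographic": "18-34", "session_type": "productive",
     "session_seconds": 120, "order_dollar_amount": 25.0},
    {"demographic": "35-54", "session_type": "browsing",
     "session_seconds": 60, "order_dollar_amount": 0.0},
]
summary = aggregate_sessions(rows)
for key, facts in sorted(summary.items()):
    print(key, facts)
```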
[Figure: Aggregated click-stream schema summarized by session characteristics]
INTEGRATING THE CLICKSTREAM DATA MART INTO THE ENTERPRISE DATA WAREHOUSE
• Using the bus matrix design method, we see which dimensions must be conformed across all the data marts.
• We see that the clickstream data mart has a significant overlap with the other data marts.
• The matrix method lists the data marts down the left side of the matrix and the dimensions used by the data marts across the top of the matrix.
• The matrix describes data marts, not individual fact tables.
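The bus matrix idea can be illustrated with a small mapping from data marts (rows) to the dimensions they use (columns). The marts and dimensions listed below are hypothetical stand-ins, not the contents of the matrix in the original slides. A dimension used by more than one mart is a candidate for conforming.

```python
# Hypothetical bus matrix: data mart -> dimensions it uses.
bus_matrix = {
    "clickstream": {"date", "time_of_day", "customer", "page", "session", "referrer"},
    "sales": {"date", "customer", "product"},
    "customer_support": {"date", "customer", "product"},
}


def conformed_dimensions(matrix):
    """Return the dimensions shared by two or more data marts."""
    usage = {}
    for mart, dimensions in matrix.items():
        for dimension in dimensions:
            usage.setdefault(dimension, set()).add(mart)
    return {dim: marts for dim, marts in usage.items() if len(marts) > 1}


shared = conformed_dimensions(bus_matrix)
for dimension in sorted(shared):
    print(dimension, sorted(shared[dimension]))
```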
[Figure: Data warehouse bus matrix for a Web retailer]