
ABSTRACT

Title of dissertation: Mining of Business Data
Shu Zhang, Doctor of Philosophy, 2009

Dissertation directed by: Associate Professor Wolfgang Jank, AMSC and R. H. Smith School of Business

Applying statistical tools to help understand business processes and make informed business decisions has attracted an enormous amount of research interest in recent years. In this dissertation, we develop and apply data mining techniques to two sources of data: online bidding data from eBay and offline sales transaction data from a grocery product distributor. We mine the online auction data to develop forecasting models and bidding strategies, and we mine the offline sales transaction data to investigate sales people's price formation process.

We start by discussing bidders' bidding strategies in online auctions. Conventional bidding strategies do not help bidders select an auction to bid on. We propose an automated and data-driven strategy which consists of a dynamic forecasting model for the auction closing price and a bidding framework built around this model to determine the best auction to bid on and the best bid amount. One important component of our bidding strategy is a good forecasting model. We investigate forecasting alternatives in three ways. Firstly, we develop model selection strategies for online auctions (Chapter 3). Secondly, we propose a novel functional K-nearest neighbor (KNN) forecaster for real-time forecasting of online auctions (Chapter 4). The forecaster uses information from other auctions and weighs their contribution by their relevance in terms of auction features. It improves predictive performance compared to several competing models across various levels of data heterogeneity. Thirdly, we develop a Beta model (Chapter 5) for capturing auction price paths and find that this model has advantageous forecasting capability.

Apart from online transactions, we also employ data mining techniques to understand offline transactions, where sales representatives (salesreps) serve as the medium for interacting with customers and quoting prices. We investigate the mental models behind salesreps' decision making, and find that price recommendation makes salesreps concentrate on cost-related information.

In summary, the dissertation develops various data mining techniques for business data. Our study is of great importance for understanding auction price formation processes, forecasting auction outcomes, optimizing bidding strategies, and identifying key factors in sales people's decision making. These techniques not only advance our understanding of business processes, but also help design business infrastructure.

Mining of Business Data

by Shu Zhang

Dissertation submitted to the Faculty of the Graduate School of the University of Maryland, College Park, in partial fulfillment of the requirements for the degree of Doctor of Philosophy, 2009

Advisory Committee:
Professor Wolfgang Jank, Chair/Advisor
Professor Wedad J. Elmaghraby
Professor Abram Kagan
Professor Itir Z. Karaesmen Aydin
Professor Mahesh Kumar
Professor Paul J. Smith

© Copyright by Shu Zhang 2009

To Lijun, my parents, and my brother


Acknowledgments
I owe my gratitude to all the people who have brought laughter and joy into my graduate life. Without you by my side, this thesis would never have been possible, and I would never have had so many cherishable moments over the past four years that I will remember forever.

First and foremost, I would like to thank my advisor, Dr. Wolfgang Jank, for his advice, support, and encouragement in research. I started as a rookie in the field of eCommerce research, and his mentoring has led me all the way here. It has been my honor and a great pleasure to work with and learn from such an extraordinary individual.

I am grateful to Dr. Paul Smith for his support and help in all aspects of my graduate study. He has brought up many brilliant ideas that helped improve this work significantly. Not only has he helped me in research with his incredible knowledge of statistics, he has also always been there to guide me through the depressing days when I got stuck in research.

The chapters about pricing strategies would not have been part of this dissertation without the help of Dr. Karaesmen and Dr. Elmaghraby, who have always opened their doors to me when I had questions. I have benefited a lot from their great passion for research and their extraordinary understanding of and creative thoughts about pricing.

I would like to thank Dr. Kagan and Dr. Kumar for their kind support and guidance, and for reading my dissertation. I have learned a lot from them. It is a great pleasure to have them on my committee. Thanks are also due to Dr. Shmueli, who has given me a tremendous amount of help and encouragement. It has been a wonderful experience to have such a talented and exceptional mentor in my graduate life.

I owe my deepest thanks to my family: my brother and parents, who have always been on my side and supported me from faraway China; and my beloved husband, Lijun, who takes great care of me every day. Their love has been my unlimited source of energy and confidence.

It is impossible to list everyone who has helped me in these few pages. I just want to thank you all. I am so lucky to have you in my life!


Table of Contents
List of Tables
List of Figures
List of Abbreviations

1 Introduction
  1.1 Introduction to Online Auctions
    1.1.1 Online Auctions
    1.1.2 eBay Data Structure
    1.1.3 Online Auction Literature
  1.2 Introduction to Pricing in B2B and B2C and Literature Review
  1.3 Contributions of the Dissertation
    1.3.1 Data Driven Bidding Strategy
    1.3.2 Model Selection for Improved Forecasting
    1.3.3 Weighted Forecasting of Closing Prices
    1.3.4 A Flexible Model for Price Dynamics in Online Auctions
    1.3.5 Decision Making in H2H Transactions
    1.3.6 Summary of Dissertation Contributions

2 An Automated and Data-Driven Bidding Strategy for Online Auctions
  2.1 Introduction
  2.2 Data Description
  2.3 A Model for Forecasting Competing Auctions
    2.3.1 Functional Data Analysis and Price Dynamics
    2.3.2 Capturing Competition
    2.3.3 Variable Selection
    2.3.4 Model Updating
  2.4 Estimation and Prediction Results
    2.4.1 Model Estimation
    2.4.2 Model Fit and Varying Time Intervals
    2.4.3 Prediction Performance
    2.4.4 Comparison with Alternative Models
  2.5 An Automated Data-Driven Bidding Decision Rule
    2.5.1 Decision Framework
    2.5.2 Experimental Set-Up
      2.5.2.1 Early Bidding Strategy
      2.5.2.2 Last-Minute Bidding Strategy
      2.5.2.3 Our Automated Data-Driven Bidding Strategy
    2.5.3 Simulation Results
      2.5.3.1 Effect of the Prediction Window
    2.5.4 Practical Considerations
  2.6 Conclusions

3 Model Selection for Improved Forecasting
  3.1 Introduction
  3.2 Create Features to Capture Important Information
  3.3 Model Selection for Auctions Markets
    3.3.1 Model Selection for Time Windows
    3.3.2 Variable Pre-Selection and Multicollinearity
  3.4 Results
    3.4.1 Model Selection
    3.4.2 Prediction Accuracy
    3.4.3 The Winner
  3.5 Conclusion

4 Real-Time Forecasting of Online Auctions via Functional K-Nearest Neighbors
  4.1 Introduction
  4.2 Data
    4.2.1 Palm PDA Auctions
    4.2.2 Laptop Auctions
  4.3 A Functional Model for Capturing Price Growth Patterns
    4.3.1 The Beta Model
    4.3.2 Model Estimation
    4.3.3 Kullback-Leibler Distance
  4.4 Functional K-Nearest Neighbors (fKNN)
    4.4.1 Overview
    4.4.2 Choice of Distance Metric
      4.4.2.1 Static and Time-Varying Data
      4.4.2.2 Dynamics (Functional Data)
      4.4.2.3 Optimal Distance Metric
    4.4.3 Choice of K
    4.4.4 Forecasting Scheme
    4.4.5 Comparison With Alternate Methods
  4.5 Results
    4.5.1 Selecting the Optimal K and the Optimal Distance Metric
    4.5.2 Robustness of Optimal D and K to the Time Horizon
      4.5.2.1 Robustness of Optimal K
      4.5.2.2 Robustness of Optimal D
    4.5.3 Comparison With Alternate Prediction Methods
  4.6 Conclusions

5 A Flexible Model for Price Dynamics in Online Auctions
  5.1 Introduction
  5.2 Models for Auction Price Paths
    5.2.1 Smoothing Splines
    5.2.2 Monotone Splines
    5.2.3 Parametric Growth Models
      5.2.3.1 The Exponential Model
      5.2.3.2 The Logarithmic Model
      5.2.3.3 The Logistic Model
      5.2.3.4 The Reflected-Logistic Model
      5.2.3.5 The 4-Member Growth Model Family
    5.2.4 Comparison
  5.3 A New Model for Auction Price Paths: The Beta Model
    5.3.1 Fitting the Beta Model
      5.3.1.1 Beta-Fitting Algorithm
    5.3.2 Properties of the Beta Model
      5.3.2.1 Representing Price Dynamics
      5.3.2.2 Characterizing Growth Patterns
      5.3.2.3 Characterizing Bid Timing
  5.4 Empirical Comparison
    5.4.1 Model-Fit Comparison
    5.4.2 Forecasting Accuracy Comparison
  5.5 Conclusion

6 Pricing and Sales Person Decision Making: An Exploratory Analysis
  6.1 Introduction
  6.2 Data Processing
    6.2.1 Data Description
    6.2.2 Data Reduction
  6.3 Mining Mental Models
    6.3.1 Building Mental Models
      6.3.1.1 Effect of Price Recommendation
    6.3.2 Model Selection
      6.3.2.1 Model Fit Comparison
      6.3.2.2 Prediction Capability Comparison
    6.3.3 Model Interpretation
  6.4 Effects of Existence of Price Recommendation
  6.5 Conclusions

7 Future Research
  7.1 Data Driven Bidding Strategy
  7.2 Model Selection for Improved Forecasting
  7.3 Weighted Forecasting of Closing Prices
  7.4 A Flexible Model for Price Dynamics in Online Auctions
  7.5 Pricing and Sales Person Decision Making

A Data Sets Used in the Study
  A.1 eBay Bids Level Data
    A.1.1 Palm Pilot M515 PDA Data
    A.1.2 Laptop Data
  A.2 Transactions Data from A Grocery Products Distributor

B Simulation Results for Alternate Bidding Schemes
  B.1 Alternate Bidding Heuristics
  B.2 Robustness of Last-Minute Bidding

C Model Selection Results for Palm Data Set from eBay


List of Tables

2.1 Candidate competition features
2.2 Candidate information for the forecasting model
2.3 Percentage of significant time points
2.4 Average BIC computed across all time periods
2.5 Comparison of different bidding strategies
2.6 Tradeoff between the width of the prediction window and expected surplus
3.1 Variable index
3.2 Best models according to BIC and AIC
3.3 Selected models and prediction accuracy
4.1 Summary of information sources characterizing online auctions
4.2 Optimal choice of K for different distance metrics D and time horizons ΔT
5.1 Correspondence between the Beta model and the four growth models
5.2 Beta distribution summary statistics and their auction meaning
6.1 Summary of all models
A.1 Description of the Palm auctions
A.2 Summary statistics of the laptop auctions
A.3 Description statistics for transaction data from a grocery products distributor
B.1 Alternative bidding heuristics
B.2 Robustness of last-minute bidding to different increments
C.1 1-Parameter Models
C.2 2-Parameter Models
C.3 2-Parameter Models (Continued)
C.4 3-Parameter Models
C.5 3-Parameter Models (Continued)
C.6 4-Parameter Models
C.7 4-Parameter Models (Continued)
C.8 5-Parameter Models
C.9 6-Parameter Models

List of Figures

1.1 Bidding data for an eBay Palm PDA auction
2.1 Snapshot of the live price curves during eBay auctions
2.2 Smooth price curves (left panel) and corresponding velocities (right panel) for two sample auctions
2.3 Illustration of auction competition
2.4 Pairwise relationships between competition features
2.5 The illustration of the update scheme in the forecasting model
2.6 Estimated coefficients of model parameters, together with 95% confidence bounds
2.7 Model fit and prediction accuracy for different time intervals
2.8 Prediction accuracy for competing models
2.9 Illustration of bidding strategies
2.10 Process of automated bidding and alternatives
3.1 Illustration of the modeling task
3.2 Distribution of model selection criterion
3.3 Pairwise mean correlation among all variables
3.4 Pairwise minimum and maximum correlation among all variables
3.5 Prediction accuracy of the 6 top models
3.6 Prediction accuracy of the five best models
4.1 Typical price paths from Beta distribution
4.2 Residual comparison for fitting three models
4.3 Illustration of the forecasting idea
4.4 Illustration of forecasting scheme
4.5 Optimal values of K and D
4.6 Relative prediction accuracy for different values of K at different time horizons ΔT
4.7 Comparison of different distance metrics
4.8 Comparison of different forecasting methods
4.9 Optimal values of K for the Palm PDA data at ΔT = 15 min
5.1 Illustration of the three existing smoothing methods
5.2 Beta CDF (top panel) and corresponding PDF (bottom panel) with different shape parameters (α, β)
5.3 Residuals for all four models
5.4 Comparison of forecasting accuracy for different time horizons
6.1 Comparison of R² and BIC
6.2 Comparison of RMSE
6.3 Model comparison for transactions with or without price recommendation

List of Abbreviations
MAPE: Mean Absolute Percentage Error
WTP: Willingness To Pay
FDA: Functional Data Analysis
KNN: K Nearest Neighbors
fKNN: Functional K Nearest Neighbors
KL Distance: Kullback-Leibler Distance
B2C: Business to Consumer
B2B: Business to Business
H2H: Human to Human
DST: Decision Support Tool
Salesrep: Sales Representative
Triplet: Salesrep-Customer-Product Triplet


Chapter 1 Introduction
Due to the availability of rich, high-quality data, employing data mining tools to solve business problems has gained great popularity in research in recent years. Business data typically comes from two sources: online transactions, such as the bidding histories of eBay online auctions, and offline transactions, such as sales transaction data from a grocery product distributor. In this dissertation, we first develop several forecasting models and bidding strategies for online auction data. Then we investigate sales representatives' price formation process, in particular the impact of price recommendation on that process, using sales transaction data.

1.1 Introduction to Online Auctions

1.1.1 Online Auctions
Online auctions constitute a significant marketplace, allowing consumers and businesses to sell, buy, and bid on a variety of goods. People shop for consumer electronics on uBid (ubid.com), for consumer loans on Prosper (prosper.com), and for almost everything on eBay (ebay.com). eBay is one of the major online marketplaces and currently the biggest consumer-to-consumer online auction site. Founded in 1995, eBay Inc. has attracted over 200 million registered users and touts net revenue of $8.37 billion for the year 2008, despite the economic recession. Dispersed across twenty thousand categories, several million items are listed on any given day. In fact, [6] refers to online auctions as "one of the most successful forms of electronic commerce".

Typically in an online auction, the opening price is set by the seller or the auction house, and bidders submit bids online. There are various auction formats: online auctions can be ascending (e.g., eBay auctions) or descending (e.g., Dutch flower auctions, where the price is bid down); first price or second price (i.e., whether the final price equals the highest or the second-highest bid); with a fixed or soft closing time (i.e., whether the auction duration extends with the arrival of new bids); for single items or for bundles. On eBay, most auctions are second-price ascending auctions for single items with a fixed duration. The seller sets the opening price, and bidders place ascending bids until the auction end time is reached. At that time, the winner is the highest bidder, and s/he pays the second-highest bid (plus an increment).

Online auctions differ from their offline counterparts in their longer duration (typically several days), the anonymity of participants, the low barriers to entry, their global reach, and their around-the-clock availability. These conditions lead to a highly dynamic environment, where bidders engage in competitive behavior that is motivated by both psychological effects and economic reasoning. Auctions allow bidders to adjust their behavior based on the previous progress of the auction of interest and of competing auctions, which in turn contributes to the dynamic changes in auction progression and price.
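The second-price mechanics described above can be sketched in a few lines of code. This is an illustrative simplification (the function name, bid increment, and bidder labels are hypothetical), assuming each bidder submits a single maximum proxy bid:

```python
def close_second_price_auction(proxy_bids, increment=0.50):
    """Settle an eBay-style second-price auction.

    proxy_bids: dict mapping bidder id -> maximum (proxy) bid.
    The highest bidder wins but pays the second-highest bid plus one
    bid increment, capped at the winner's own maximum.
    """
    if not proxy_bids:
        return None, None  # no bids: item unsold
    ranked = sorted(proxy_bids.items(), key=lambda kv: kv[1], reverse=True)
    winner, top = ranked[0]
    if len(ranked) == 1:
        return winner, top  # a lone bidder pays (at most) the opening price
    second = ranked[1][1]
    return winner, min(top, second + increment)

winner, price = close_second_price_auction({"A": 157.50, "B": 155.00, "C": 125.95})
# winner is "A"; price is min(157.50, 155.00 + 0.50) = 155.50
```

Note that the winner's own maximum stays hidden throughout; only the second-highest bid (plus the increment) determines what is actually paid.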


1.1.2 eBay Data Structure
Empirical research on online auctions has been flourishing in recent years due to the important role that these auctions play in the marketplace and the availability of large amounts of high-quality bid data from eBay (as well as Yahoo!, OnSale, uBid, etc.). eBay makes public a vast amount of rich bidding data that includes all the bidding information as well as information about the bidders, the seller, and the product being auctioned. A typical example of the bid data for a single auction is shown in Figure 1.1. From the bid data, we can determine the price as shown on eBay at any time during the ongoing auction.¹ We use two eBay bidding-history data sets throughout the research. One includes the complete bidding records for 380 auctions of new Palm Pilot M515 handheld PDAs that took place on eBay between March and May 2003; the other contains information on 4,965 laptop auctions that took place on eBay between May and June 2004. For details about the two data sets, please see Appendix A.
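Because the price displayed at any moment equals the second-highest bid submitted so far, the "live" price path can be reconstructed from the raw bid records. A minimal sketch (the function name and data layout are hypothetical, and bid increments are ignored):

```python
def live_price_path(bids):
    """Reconstruct the price displayed after each bid.

    bids: list of (timestamp, bid_amount) tuples in chronological order.
    eBay shows the second-highest proxy bid seen so far (the current
    highest bid stays hidden), so we track the running top two bids.
    """
    path, seen = [], []
    for t, amount in bids:
        seen.append(amount)
        seen.sort(reverse=True)
        displayed = seen[1] if len(seen) > 1 else seen[0]
        path.append((t, displayed))
    return path

# A $90 proxy bid placed at t=2 stays hidden until a $75 bid arrives at t=3.
print(live_price_path([(1, 60.0), (2, 90.0), (3, 75.0)]))
# [(1, 60.0), (2, 60.0), (3, 75.0)]
```

This also illustrates why the bid records can contain bids below the eventual highest bid: such bids are exactly what moves the displayed price.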

1.1.3 Online Auction Literature
¹ On eBay, the price shown at any point in time is the second-highest bid at that point rather than the highest bid. Thus, the bid data might include bids that are lower than the highest bid.

Statistical and data mining techniques have been extremely instrumental in gaining insights into auction processes, and we describe some of the major contributions to the online auction literature to date. One important stream of research has focused on various auction features and

[Figure 1.1: Bidding data for an eBay Palm PDA auction. Screenshot of an eBay bid-history page for a "PALM M515 COLOR PDA" auction (19 bids, first bid $60.00, final price $157.50), listing each bidder ID, bid amount, and bid timestamp in descending order of bid amount.]

their impact on the closing prices. In fact, a seller's reputation [6], an auction's duration [37], opening and reserve prices [74], and an item's shipping costs [38] have all been shown to affect the final price (see also [89]). The statistical tools used for this type of research are mainly classical regression models, and the results from such analyses help answer sellers' questions about which auction settings or listing enhancements are worth the extra fee, and improve the design of the online market.

Besides auction features, there has recently been much interest in understanding the dynamics of the price formation process, in an attempt to better capture, understand, and forecast price in online auctions. Novel statistical approaches have been developed to gain a deeper understanding of price dynamics. Using descriptive statistics, [52; 10] have shown that price dynamics can be very heterogeneous, even for auctions of the same product. [51] have shown this for auctions of new Palm PDA handheld devices sold on eBay; [21] found similar behavior in auctions for contemporary Indian art. [51] further segmented auctions based on price dynamics: "steady auctions" are those with constant dynamics, "low-energy auctions" are those with late dynamics, and "bazaar auctions" see mostly early activity. [86] developed a three-stage non-homogeneous Poisson process for capturing bid timing, and showed its flexibility in capturing the bid timing for various items and auction durations. [97] introduced a single class of functional differential equation models that captures a wide range of auction price paths and dynamics. Finally, [96] developed real-time forecasting models for ongoing auctions that use as input the price path and its dynamics up to the time of prediction. They show that the inclusion of the dynamic information significantly improves predictive accuracy compared to models that exclude such information.

Researchers also study the interplay between auction features and dynamics. [84] illustrated the effect of auction features (such as the opening bid) on auction dynamics, and found that higher opening bids result in lower price dynamics. [55] developed model-based regression trees to relate differential equation models for different auction dynamics to auction features.

Auction dynamics reflect unobservable dynamic behavior, such as competition between bidders within an auction and across auctions. The fact that millions of auctions take place simultaneously, many of them selling the same or similar items, introduces competition among both the sellers and the bidders of the products, which results in cross-dependencies among auctions and their outcomes. Consequently, adequately capturing and modeling the price path can be used for studying the effects of competition. [44] developed visualizations of the price formation process and its dynamics to study the effect of concurrency among online auctions. [21] investigated the relationship between within-auction and between-auction competition on price dynamics and showed that price dynamics are good proxies for the harder-to-measure competition. Although the presence and importance of competition are broadly noticed by many scholars, quantified study of its effect is rather limited, mainly due to the lack of measures of competition in this dynamic environment. In this dissertation, we set out to quantify the competition between simultaneous auctions and to use this information in forecasting auction outcomes and designing bidding strategies in this competitive environment.

There exist two well-documented (and well-researched) bidding strategies for online auctions: early bidding and last-minute bidding. [8] have shown that early bidders often discourage potential competitors from entering the same auction by signaling their commitment early in the auction. In contrast, last-minute bidders [83; 86] wait until the very last moment to avoid being out-bid. However, neither strategy takes into account the effect of competition, and thus neither offers bidders any guidance for selecting the right auction, out of many simultaneous auctions, to bid on. We build a forecasting model which accounts for competition among simultaneous auctions, and develop a bidding strategy around the model that can determine not only the best auction to bid on but also the right bid amount.

An alternative way to capture competition is to assign heavier weights to auctions with a high level of competition when estimating the model and making forecasts. This is in contrast to conventional methods, where information from each auction is weighted equally during model estimation. Examples of conventional methods include [96], which used regression-based models to forecast an auction's final price in a dynamic fashion (see also [32; 59]), and [16], which employed a classification and regression tree method for forecasting. In this dissertation, we develop a method for computing weights for each auction based on auction similarity (and therefore competition level) and for making weighted forecasts.
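The similarity-weighting idea can be sketched as a kernel-weighted average of closing prices from comparable auctions. This toy version is our own illustration, not the dissertation's estimator (the function name, Gaussian kernel, and feature encoding are all assumptions):

```python
import math

def weighted_forecast(target_features, past_auctions, bandwidth=1.0):
    """Forecast a closing price as a similarity-weighted average.

    past_auctions: list of (feature_vector, closing_price) pairs.
    Auctions more similar to the target auction (smaller feature-space
    distance) receive heavier weights in the forecast.
    """
    weights, prices = [], []
    for features, price in past_auctions:
        # Squared Euclidean distance between feature vectors
        d2 = sum((a - b) ** 2 for a, b in zip(target_features, features))
        # Gaussian kernel: weight decays smoothly with dissimilarity
        weights.append(math.exp(-d2 / (2.0 * bandwidth ** 2)))
        prices.append(price)
    total = sum(weights)
    return sum(w * p for w, p in zip(weights, prices)) / total

history = [((1.0, 0.0), 100.0),   # very similar auction: dominates
           ((5.0, 5.0), 200.0)]   # dissimilar auction: nearly ignored
print(weighted_forecast((1.0, 0.0), history))  # close to 100.0
```

Equal weights recover the conventional unweighted average; shrinking the bandwidth moves the estimator toward a nearest-neighbor forecast, which is the intuition behind the functional KNN forecaster of Chapter 4.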


1.2 Introduction to Pricing in B2B and B2C and Literature Review
Business transactions are normally divided into four groups: business-to-consumer (B2C), business-to-business (B2B), business-to-public-administration, and consumer-to-public-administration. We only discuss the first two types of transactions here because the latter two depend heavily on government policies, which is essentially a different research area. A business can ensure profitability and longevity by utilizing an appropriate pricing strategy. [69] stated that improved pricing can yield a 20%-35% reduction in waste or unused inventory, a 2%-4% increase in corporate revenues, and a 1%-3% increase in profit. However, with increasing production size and customer population, setting the right price has become a non-trivial task. Pricing in B2C settings usually involves setting prices for hundreds or even thousands of products/services over hundreds of stores nationally and/or internationally (e.g., retail stores, airlines, and hotels). Such a complicated task is typically handled by a decision support tool (DST). A DST collects vast amounts of data and employs data mining and optimization routines to uncover the holy grail of pricing, the customer's willingness-to-pay (WTP), based on which an optimized price is computed. DSTs have proven extremely helpful in ensuring profitability in B2C business. For instance, with the help of DSTs, Marriott's annual profit increase for individual hotels totaled $86 million after the rollout of their in-house developed pricing and revenue management system in 2004 [76].

Customer WTP is often endogenously determined by many factors, some observable and some not. The observable traits, such as a customer's purchase history or the price of the same products from competing companies [67], can be captured and incorporated into a DST. The non-observable part, however, cannot be quantified. Such non-observable factors speak to how a customer perceives/internalizes a price quote and reacts to it. For instance, [65] introduce the concept of fairness, and find that customers consider it unfair to exploit shifts in demand by raising prices. [64] discuss the notion of anchoring and how customers make adjustments under uncertain market conditions. [94] study the framing of the price quote, and find that customers respond differently to price quotes framed differently by salesreps. While both observable and unobservable factors may exist and hence be useful in determining customer WTP in B2C markets, the relatively small dollar spend of each customer coupled with the large number of customers present in the market generally implies that DSTs can make reasonable pricing decisions while ignoring the unobservable traits. The situation changes as we turn to B2B markets, where the characteristics of each customer matter to pricing. In such settings, sales representatives (hereafter referred to as "salesreps") are entrusted with determining the impact of the unobservable customer traits on each customer's WTP, and with managing the (relatively) large accounts of, and relations with, several business customers. For example, salesreps must assess whether a customer will find a price to be fair (whether or not it is a price justified by current market conditions), how and on what the customer anchors his willingness to pay (e.g., the past price paid or possibly a competitor's current price), the strength of the relationship between salesrep and customer and hence whether a customer will trust a quoted price as being reasonable, how the customer

reacts to price increases, and so on. To emphasize the human involvement, we hereafter refer to B2B settings as H2H (Human-to-Human). With such intangible pieces of information about customers, salesreps are often considered experts at quoting appropriate prices. However, studies show that being an "expert" does not always imply better decisions [93]. No matter how experienced salesreps are, they are all human beings who are subject to their own decision biases and judgment heuristics (e.g., memory bias [93], satisficing behavior [77], status quo bias [66]), which leaves room to improve pricing. For instance, salesreps' decisions are generally affected by irrelevant information [40], so providing them with only the most relevant information may lead to more appropriate price quotes. A DST, on the other hand, can gather information across hundreds of salesreps, and is able to make better aggregate predictions about WTP and demand. Hence, a DST price recommendation may provide a valuable reference point on which salesreps can anchor their price decisions. However, it is not clear whether price recommendations benefit salesreps in H2H markets as they do in B2C business. While there is a large amount of literature on pricing in economics, marketing, and operations management for B2C markets (e.g., [94; 20; 101]), surprisingly little research has been done on H2H pricing, and even less so on the behavior of salesreps in this context. With limited understanding of what salesreps anchor on when making price quotes, it is difficult to improve pricing in B2B settings. Given this lack of research, we set out to study how salesreps form prices and respond to price recommendations in H2H markets. The results will aid the design of DSTs that counter salesrep biases and improve profitability for companies.

1.3 Contributions of the Dissertation
Applying statistical tools to help make informed business decisions has attracted enormous amounts of research interest in recent years. Because of the huge amounts of information available, distinguishing useful from noisy information and drawing informed conclusions from data becomes a non-trivial task and requires novel statistical tools. In this dissertation, we develop and apply data mining techniques to two sources of business data: online auction data and H2H transaction data. We develop prediction models and bidding strategies in the online auction setting, and investigate the impact of DST price recommendations on sales representatives' pricing decisions. This dissertation has resulted in several papers under review at statistics and business journals; another paper is forthcoming at the end of the summer.

1.3.1 Data Driven Bidding Strategy
Bidders participating in online auctions often face many complicated bidding decisions. They have to decide whether to bid early or late, whether to place a single bid or multiple updates, and whether to bid high or low. Bidding is further complicated by the existence of many auctions that offer the same or similar items simultaneously. All in all, a complete bidding strategy has to include decisions on which auction to bid on and how much to bid. Many bidders rely on two conventional strategies: early bidding and last-minute bidding. Although proven to effectively yield a high winning probability for a carefully


selected auction, neither strategy answers bidders' question of which auction to bid on given thousands of simultaneous auctions. The first contribution of this dissertation is to propose a novel automated and data-driven bidding strategy which provides bidders with a complete decision guide. Our strategy consists of two main components. First, we develop a dynamic, forward-looking forecasting model for price in competing auctions. Then, using the idea of maximizing consumer surplus, we build a bidding framework around this model that determines the best auction to bid on and the best bid amount. We also conduct a simulation study which shows that our strategy results in a much higher surplus than two conventional bidding rules. This research, discussed in Chapter 2, is currently under second-round review at the INFORMS Journal on Computing [57].

1.3.2 Model Selection for Improved Forecasting
One important component of our bidding strategy is a good forecasting model for auction closing prices. Knowing the auction's closing price has several advantages for auction participants. Bidders can use this information to make more informed bidding decisions [57]. Sellers can use predictions to identify times when the market is more favorable for selling their products and to better evaluate the value of their inventory. In this chapter, we investigate forecasting alternatives by developing model-selection strategies for online auctions. Model selection in this setting is different


compared to classical time series analysis. In classical forecasting, one typically wants to forecast a particular time point; in the context of online auctions, one needs to forecast an entire time interval to satisfy bidders' need to bid on any auction that is expected to close in that time window. The second contribution of this dissertation is to extend classical model selection criteria, which are applicable only to a single time point, to the setting where forecasting a time interval is required. We do so by computing an entire distribution of a model selection criterion over the prediction interval. In this chapter, we investigate different ways to summarize the distribution and the impact of different summaries on the prediction task. We find that models selected by the volatility of the distribution of classical AIC or BIC over the prediction window have extremely poor prediction performance, while models selected by the minimum or maximum of that distribution predict very well. This research is discussed in Chapter 3 and has been submitted to the Journal of Business and Economic Statistics for review [58].
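To make the interval-based selection idea concrete, the following sketch (our own illustration, not code from the dissertation; all names and the Gaussian-OLS AIC computation are hypothetical) computes the AIC of each candidate regression model at every time point of a prediction window and summarizes the resulting AIC distribution by its minimum, maximum, and volatility:

```python
import numpy as np

def aic_ols(y, X):
    """Gaussian AIC for an OLS fit: n * log(RSS / n) + 2k."""
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ beta_hat) ** 2)
    n, k = X.shape
    return n * np.log(rss / n) + 2 * k

def aic_over_window(models, data_at):
    """For each candidate model (a named list of predictor columns),
    collect its AIC at every time point of the prediction window and
    summarize the resulting distribution."""
    summaries = {}
    for name, cols in models.items():
        aics = np.array([aic_ols(y, X[:, cols]) for y, X in data_at])
        summaries[name] = {"min": aics.min(),
                           "max": aics.max(),
                           "volatility": aics.std()}
    return summaries
```

A model would then be chosen by ranking candidates on one summary, e.g. the minimum, which the chapter finds to predict well.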

1.3.3 Weighted Forecasting of Closing Prices
Besides studying model selection criteria for regression models, we also investigate forecasting alternatives by developing novel weighted forecasting methods. For the closing price of an ongoing auction, the natural reference points are the final prices of past auctions. Previous forecasting methods, including the ones developed in Chapters 2 and 3, put equal weight on the information from all training auctions when estimating model coefficients and making forecasts. Nevertheless, assuming that more similar auctions contain more relevant information for forecasting, a forecasting method that weighs the information from each auction differently, depending on how similar that auction is to the auction of interest, is more appropriate. For this purpose, we apply the popular weighted prediction method, K-Nearest Neighbors (KNN), to forecasting the closing price of an ongoing auction. One key aspect of the success of KNN is the choice of distance metric with which the distances between samples (i.e., the reciprocals of the sample weights) are measured. This is especially challenging in the context of online auctions because auctions vary on many conceptually different dimensions: static information (e.g., auction starting prices), time-varying information (e.g., number of bids), and functional dynamics information (the dynamics/shapes of the auction price paths). Although standard measures exist for static or time-varying information, measuring the distance between functional dynamic information (e.g., between two curves) is more involved because of its infinite dimensionality. An important contribution of this research is to point out a new research area: developing weighted forecasting models for better forecasts. In the study, we introduce a parametric Beta model to capture auction price paths, which allows measuring the distance between auctions' dynamics in a very parsimonious way via the Kullback-Leibler (KL) distance. Furthermore, we define distance metrics that integrate information of various types, including dynamics. Using the reciprocal of the distance as weights, we find that weighting information unequally yields better forecasts than classical methods such as regression models or trees, and this result holds for auctions of varying levels of heterogeneity. This research, discussed in Chapter 4, has been recommended for publication in the International Journal of Forecasting [102].
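As a concrete sketch of the weighting idea (our own illustration; the dissertation's actual distance metric combines several information types, and the names below are hypothetical), one can compute the closed-form KL divergence between two fitted Beta distributions and use its reciprocal to weight neighbors:

```python
import numpy as np
from scipy.special import betaln, digamma

def beta_kl(a1, b1, a2, b2):
    """Closed-form KL divergence between Beta(a1, b1) and Beta(a2, b2)."""
    return (betaln(a2, b2) - betaln(a1, b1)
            + (a1 - a2) * digamma(a1)
            + (b1 - b2) * digamma(b1)
            + (a2 - a1 + b2 - b1) * digamma(a1 + b1))

def knn_forecast(focal_params, train_params, train_prices, k=3, eps=1e-6):
    """Forecast the focal auction's closing price as the average of the
    k nearest past auctions, weighted by the reciprocal of the KL
    distance between their fitted Beta dynamics and the focal one's."""
    d = np.array([beta_kl(*focal_params, *p) for p in train_params])
    idx = np.argsort(d)[:k]
    w = 1.0 / (d[idx] + eps)
    return float(np.sum(w * np.asarray(train_prices)[idx]) / np.sum(w))
```

Because the Beta model describes each price path with just two parameters, the between-auction distance reduces to a cheap scalar computation rather than a comparison of entire curves.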

1.3.4 A Flexible Model for Price Dynamics in Online Auctions
Besides allowing the measurement of distance between auctions' dynamics via the KL distance, the Beta model developed in Chapter 4 has many other useful properties in the online auction context. We explore those properties in detail and compare the model with existing models for capturing auction price paths. The fourth contribution of this dissertation is to study the characteristics of the parsimonious parametric Beta model and show its advantages over existing methods as a representation of auction price paths. We show that the model can accurately capture price paths and price dynamics of various types, summarize the bid timing distribution, measure pairwise distances between price paths or price dynamics curves, and is computationally efficient. This work is discussed in Chapter 5 and currently under review at the Journal of the Royal Statistical Society (Series C) [56].
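As a rough, self-contained illustration of the idea (a sketch under our own assumptions; the dissertation's Beta model is developed in Chapters 4 and 5, and the bid data below are made up), a normalized price path can be represented by a two-parameter Beta CDF:

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import beta

def beta_path(t, a, b):
    """Normalized price path modeled as a Beta(a, b) CDF over
    normalized auction time t in [0, 1]."""
    return beta.cdf(t, a, b)

# Made-up bid history: times scaled to [0, 1], prices scaled by the close.
t_obs = np.array([0.05, 0.2, 0.4, 0.6, 0.8, 0.95, 1.0])
p_obs = np.array([0.02, 0.08, 0.15, 0.30, 0.55, 0.90, 1.0])

# Two fitted parameters summarize the entire path.
(a_hat, b_hat), _ = curve_fit(beta_path, t_obs, p_obs,
                              p0=[1.0, 1.0], bounds=(1e-3, 50.0))

# The Beta density gives the price velocity (path derivative) in closed form.
velocity = beta.pdf(t_obs, a_hat, b_hat)
```

The parsimony is the point: an entire price path, and its derivative dynamics, collapse to the pair (a, b).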

1.3.5 Decision Making in H2H Transactions
Different from B2C settings, where decision support tools (DSTs) have been adopted and proven to be extremely valuable in aiding firms and improving their profits, in B2B (H2H) transactions sales representatives carry significant responsibility for pricing decisions. Salesreps may rely on many pieces of observable and non-observable


information, such as their personal expertise, knowledge of individual customers, and price recommendations from a DST, to make price quotes. Given these many pieces of related information, especially the DST price recommendation, it is not clear which factors are important in salesreps' mental model, by which we refer to their price formation process. One important contribution of this dissertation is to identify the important factors that determine a salesrep's mental decision model in an H2H setting. We study how sales people adjust price quotes for different products and different customers over time, with special attention to the impact of the DST price recommendation. We use various model selection criteria to identify the most influential factors, and we find that salesreps anchor most on cost-related information, including cost, the sign and size of cost changes, and the type of product (perishable commodities or non-commodities). Furthermore, we find that the price recommendation from the DST, whenever provided, influences salesreps' decisions in a positive way. It serves as a price focal point, without which salesreps are influenced more by unobservable factors and thus make price decisions that are difficult to explain and predict. This work is anticipated to be submitted to a top business journal (such as Management Science) by the end of the summer.

1.3.6 Summary of Dissertation Contributions
To summarize, the contributions of this dissertation are to:

• Propose a novel automated and data-driven bidding strategy which helps bidders make bidding decisions (Chapter 2).


• Investigate various model selection criteria for forecasting over a time interval in the online auction setting (Chapter 3).

• Propose a K-Nearest Neighbor forecaster for forecasting the closing prices of online auctions; introduce a parsimonious model to capture auction price paths that allows measuring distances between auctions' dynamics; and propose a novel distance metric for online auctions that takes into account both static and time-varying features as well as the auction's price dynamics information (Chapter 4).

• Study the characteristics of the Beta model and illustrate its advantages over existing models as a representation of auction price paths (Chapter 5).

• Identify the key factors influencing salesreps' pricing decisions and investigate the impact of the decision support tool on salesreps (Chapter 6).


Chapter 2

An Automated and Data-Driven Bidding Strategy for Online Auctions

2.1 Introduction
The flexibility of time and location as well as the availability of many different products make online auctions an important marketplace. However, bidders participating in this marketplace often face many complicated bidding decisions. They have to decide whether to bid early or late, whether to place a single bid or multiple updates, and whether to bid high or low. Bidding is further complicated by the existence of many auctions that offer the same or similar items simultaneously. In that case, one's bidding strategy has to be expanded to include decisions on which auction to bid on, when to bid on that auction, and how much to bid. There exist two well-documented bidding strategies: early bidding and last-minute bidding. By signaling their commitment early, early bidders [8] discourage competitors from entering the same auction. In contrast, last-minute bidders [83; 86] wait until the very last moment, as the chances of being out-bid decrease with the time left in the auction. However, both bidding strategies suffer from limitations since neither takes into account the effect of competition [39]. In other words, neither strategy considers the information from simultaneous auctions offering similar


products. While there is emerging literature [100] suggesting that bidders should shade their bids in the presence of sequential auctions, the precise amount of the optimal shade on an auction-by-auction basis is not quite clear. In this chapter, we propose a novel automated and data-driven bidding strategy. Our strategy consists of two main components. First, we develop a dynamic, forward-looking forecasting model for price in competing auctions. Then, using the idea of maximizing consumer surplus, we build a bidding framework around this model that determines two decisions: the best auction to bid on and the best bid amount. This work is currently under second-round review at the INFORMS Journal on Computing. The first component of our automated bidding strategy is a dynamic forecasting model for the price in competing auctions. There has been a considerable amount of work on predicting an auction's closing price using static (or pre-auction) information (see, e.g., [6; 37; 38; 74]). One drawback of these approaches is that they only consider information available before the start of the auction and thus ignore the dynamic nature of the auction process. Dynamics have only recently been found to affect the outcome of an online auction [10]. [51] find that auctions selling identical products fall into one of three segments of price dynamics, namely "steady auctions," which experience a constant flow of dynamics, "low-energy auctions," with late dynamics, and "bazaar auctions," which see the largest jumps in dynamics. [84] illustrate the effect of auction parameters (such as the opening bid) on an auction's dynamics and find that higher opening bids result in lower price dynamics. [97] show that an auction's price dynamics can

be characterized well using a single class of functional differential equation models, and [55] extend this idea and develop model-based regression trees to relate differential equation models to auction covariates. Moreover, [96] show that the inclusion of price dynamics into forecasting models significantly improves their predictive capability (see also [7]). In this study, we build upon the ideas developed in previous studies. However, one key difference is that, in contrast to previous studies, we study dynamics in the context of competing auctions. That is, we study the effect of dynamics generated in simultaneous auctions selling the same or a similar product as the auction of interest. We incorporate the dynamic nature of the auction process by employing a modern statistical approach called functional data analysis (FDA). See [79] for a general introduction to FDA or [52] for an illustration of FDA in the context of electronic commerce. Besides incorporating dynamics, our model also explicitly accounts for the information from competing auctions. Competition between auctions has come into the spotlight only recently (e.g., [39; 37; 3; 17]). One problem with competition is the precise quantification of its effect. [53] propose a spatio-temporal model to measure similarity among concurrent auctions. [44] take a functional approach to visualize concurrent auctions and their dynamics. (See also [50] for additional visualizations of concurrent auctions.) In this chapter, we propose several innovative measures for auction competition using the concept of functional data analysis. Our measures can be grouped into three conceptually different classes: measures that capture competition from static (or pre-auction) information, from time-varying information, and from dynamics. We perform variable selection to identify the most

predictive set of competition measures. Our proposed forecasting model incorporates measures of dynamics and competition as predictors. In contrast to [96], our model also takes into account competition; in contrast to [53], we measure competition in ways that are easily scalable and do not rely on spatial methodology. We compare our model's predictive capability to several alternative approaches, and find that our model can predict an auction's price with higher accuracy. In the second component, we build a comprehensive bidding strategy around our forecasting model. The idea is based on maximizing consumer surplus, which refers to the difference between the bidder's willingness to pay (WTP) and the price actually paid. We formulate an automated algorithm for selecting the best auction to bid on and for determining the best bid amount. The best auction provides the bidder with the highest surplus, and the best bid amount equals the predicted closing price. We conduct a simulation study in which we compare our automated, data-driven bidding strategy with early bidding and last-minute bidding. We compare all bidding strategies on two different dimensions: the probability of winning an auction and the surplus extracted. We find that although snipers have the highest probability of winning, our strategy results in a much higher surplus. We also investigate the impact of the prediction window on the resulting surplus. The prediction window is the time frame within which a bidder wants to win an auction. Shorter time frames correspond to bidders who want to win more quickly; longer time frames correspond to bidders who allow more time for search and selection. We find that, as the width of the prediction window increases, surplus goes up
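The selection step can be sketched in a few lines (a minimal illustration of the surplus idea only; the function and its inputs are hypothetical, and the dissertation's actual framework forecasts the closing prices dynamically rather than taking them as given):

```python
def choose_auction(wtp, predicted_close):
    """Among candidate auctions, pick the one whose predicted closing
    price leaves the largest positive surplus (WTP minus price); the
    bid amount is the predicted closing price itself."""
    best_id, best_bid, best_surplus = None, None, 0.0
    for auction_id, price in predicted_close.items():
        surplus = wtp - price
        if surplus > best_surplus:
            best_id, best_bid, best_surplus = auction_id, price, surplus
    return best_id, best_bid, best_surplus

# A WTP of $230 against three forecasted closing prices:
best = choose_auction(230.0, {"a": 210.0, "b": 195.0, "c": 240.0})
# -> ("b", 195.0, 35.0): bid $195 in auction "b" for a $35 expected surplus
```

Auctions whose predicted close exceeds the WTP are simply never selected, which is why the function may return no auction at all.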

while the probability of winning goes down. The chapter unfolds as follows: In Section 2.2, we describe the data; in Section 2.3, we derive our forecasting model. Results of model estimation are discussed in Section 2.4. In Section 2.5, we present the framework for our automated bidding strategy and the results of our simulation study. We conclude with further research directions in Section 2.6.

2.2 Data Description
The data used in this study are the complete bidding records for new Palm M515 handheld devices auctioned between March 14, 2003 and May 25, 2003. The market price at the time of data collection was $230 (based on Amazon.com). Each bidding record includes the auction number, starting and closing times and prices, bids with associated time stamps, and other information, such as auction duration, shipping fee, seller's feedback score, whether the seller is a power seller, whether the product is from an eBay store, and whether the auction's description includes pictures. A summary of this information can be found in Table A.1. Figure 2.1 illustrates the information overload that bidders face. In particular, we see, for each individual auction, the live price curve, that is, the price that bidders see at any given time during the ongoing auction. We can see that the information can be quite overwhelming: the number of concurrent auctions, the variation in prices, and the fact that some auctions are only in their early stages while others are about to end all pose challenges for properly processing the given information.



Figure 2.1: Snapshot of the live price curves during eBay auctions.

Moreover, we see that prices increase unevenly throughout most auctions. They increase fast in some auctions, but much more slowly in others. We refer to this as price dynamics, which will be an important factor in our modeling approach.

2.3 A Model for Forecasting Competing Auctions
Our forecasting model has several features: we model the real-time price process of ongoing auctions using functional data analysis (FDA), which allows us to incorporate information about the dynamics of price; we propose several innovative ways of incorporating competition across concurrent auctions; and we suggest an innovative way to perform model selection[1] and model updating. We describe these features in detail next.
[1] A more complete study of the model selection can be found in the next chapter.


2.3.1 Functional Data Analysis and Price Dynamics
The price process of online auctions is characterized by an extremely dynamic environment. One aspect of this environment is the changing bid density, where the number of bids per time unit changes constantly. The resulting unequally spaced time series of bids renders traditional models (which assume evenly spaced measurements) inadequate. Furthermore, the changing bidding patterns also result in varying price dynamics. By price dynamics we mean the change in price and the rate at which this change occurs. Traditional forecasting models, which do not account for such instantaneous change, fail to accurately predict auction prices [96]. To incorporate this dynamic environment, we take a functional data modeling approach. Functional data analysis [79] uses smoothing methods[2] to recover (or estimate) the underlying price curve from observed bidding data. From the price curve, we then obtain estimates of the price dynamics via its first and second derivatives. Figure 2.2 illustrates the process of generating smooth price curves from observed data (left panel) and estimating the corresponding price velocities (right panel). We see that the smooth curves capture the trend of the price increase implied by the discrete bids; the velocity captures the instantaneous change of the price increase. For more details on FDA in the context of online auctions, refer to [84] or [52]; for more details on the smoothing process, see Section 5.2.1 in Chapter 5.
[2] In particular, we use polynomial smoothing methods (p-splines) in this study.
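A minimal sketch of this recovery step (our own illustration with made-up bid data; we use SciPy's generic smoothing spline as a stand-in for the p-splines used in the dissertation):

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

# Made-up bid history: hours into a 3-day auction, and the live price.
bid_times = np.array([1.0, 5.0, 12.0, 30.0, 48.0, 60.0, 70.0, 71.5, 72.0])
bid_prices = np.array([1.0, 20.0, 45.0, 80.0, 120.0, 150.0, 190.0, 215.0, 230.0])

# Recover a smooth price curve from the unevenly spaced bids; a generic
# smoothing spline stands in for p-splines here.
price_curve = UnivariateSpline(bid_times, bid_prices, k=4, s=200.0)
velocity = price_curve.derivative(1)      # first derivative: price velocity
acceleration = price_curve.derivative(2)  # second derivative

grid = np.linspace(1.0, 72.0, 50)
smooth_price = price_curve(grid)
speed = velocity(grid)
```

Once the curve is an explicit functional object, the velocity and acceleration used as dynamics measures are obtained by simple differentiation rather than by differencing noisy bids.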



Figure 2.2: Smooth price curves (left panel) and corresponding velocities (right panel) for two sample auctions. The dots in the left panel denote the observed bids.

2.3.2 Capturing Competition
One major component of our model is competition. That is, we want to capture the effect of what happens in other, simultaneous auctions. To that end, we must first define meaningful measures of competition. There are many different ways of defining competition measures, and we explore several alternatives below. All measures are driven by the same general principle, which is illustrated in Figure 2.3. We define a focal auction (indicated by the solid line in Figure 2.3) as the auction on which a bidder is deciding whether or not to bid. At time T of decision-making, there are several other auctions taking place simultaneously (indicated by the dotted lines). One meaningful measure of competition is the level of price in the other auctions. In our example, there are four different price levels at time T, varying from high (p1) to low (p4). The price level in the focal auction at that time is p3. Thus, a possible measure of price competition is given by the

average price in concurrent auctions (which we denote by c.price.avg), that is, by the average of p1, p2, and p4. In similar fashion, the average price velocity in concurrent auctions (c.vel.avg) would be defined as the average of the corresponding price velocities, and so on.
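In code, such a measure might look as follows (a hypothetical sketch; the data structure and names are ours, not the dissertation's):

```python
import numpy as np

def competition_measures(focal_id, live, T):
    """Average price (c.price.avg) and price velocity (c.vel.avg) across
    auctions running concurrently with the focal auction at time T.

    `live` maps auction id -> (price_curve, velocity_curve, start, end),
    where the two curves are callables of calendar time."""
    prices, vels = [], []
    for aid, (price, vel, start, end) in live.items():
        if aid != focal_id and start <= T <= end:
            prices.append(price(T))
            vels.append(vel(T))
    c_price_avg = float(np.mean(prices)) if prices else float("nan")
    c_vel_avg = float(np.mean(vels)) if vels else float("nan")
    return c_price_avg, c_vel_avg
```

The remaining measures in Table 2.1 follow the same pattern, replacing the mean by a standard deviation (the "vol" features) or the price curves by other auction attributes.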


Figure 2.3: Illustrating competition. The solid black line denotes the focal auction; the dashed lines denote competing auctions; T denotes the time of decision-making.

In this chapter, we investigate several different competition features and their impact on the price of the focal auction. Table 2.1 categorizes these features by the information that they carry. Static competition features are known at the outset of the auction and do not change during the auction process; examples include the opening price of concurrent auctions (a high opening price in other auctions could discourage bidders and make them participate in the focal auction) or the duration of concurrent auctions (if competing auctions have a shorter duration, then bidders with an immediate desire may be attracted to those auctions). Time-varying competition features change during the auction process, such as the current

price of concurrent auctions (if the price is low in other auctions, bidders may leave the focal auction) or the number of bidders in concurrent auctions (bidders may feel that their chances of winning are higher in auctions with lower competition). Price dynamic competition features capture the effect of changing dynamics in competing auctions (if the price dynamics increase in competing auctions, e.g., due to increased bidding activity in those auctions, then the price speed in the focal auction is likely to slow down). In Figure 2.4, we explore the relationships between some of the competition features from Table 2.1 and the future price in the focal auction. We can see that some features (e.g., the average price and its velocity in competing auctions) have a strong relationship with price, while others (e.g., the average opening bids or the shipping fee in simultaneous auctions) have a rather weak relationship. Pairwise correlation analysis (not reproduced here) also shows that, unsurprisingly, many of the features in Table 2.1 are multicollinear. Thus, a good modeling strategy will start with a suitable variable selection procedure. We will use the initial observations from Figure 2.4 for guidance when selecting the most relevant set of competition features in the next section.

2.3.3 Variable Selection
Many different pieces of information can affect price in online auctions. We differentiate between two main components: information from within the focal auction versus information from other, competing auctions that take place simultaneously.


Table 2.1: Candidate Competition Features

Static features:
c.openbid.avg: Average opening price of concurrent auctions
c.dura.avg: Average duration of concurrent auctions
c.ship.avg: Average shipping fee of concurrent auctions
c.feedback.avg: Average sellers' feedback of concurrent auctions
c.power.avg: Average number of power sellers in concurrent auctions
c.store.avg: Average number of eBay stores in concurrent auctions
c.pic.avg: Average number of pictures in concurrent auctions

Time-varying features:
c.price.avg: Current average price in concurrent auctions
c.price.vol: Current price volatility (stdev) in concurrent auctions
c.price.disc: Price discount (difference) between focal auction and highest concurrent price
c.t.left.avg: Average time left in concurrent auctions
c.t.left.vol: Volatility (stdev) of time left in concurrent auctions
c.nbids.avg: Average number of bids in concurrent auctions
c.nbids.vol: Volatility (stdev) of number of bids in concurrent auctions
c.nbidders.avg: Average number of bidders common to focal and concurrent auctions
c.nbidders.vol: Volatility (stdev) of number of bidders common to focal and concurrent auctions

Price dynamic features:
c.vel.avg: Average price velocity in concurrent auctions
c.vel.vol: Volatility (stdev) of price velocity in concurrent auctions
c.acc.avg: Average price acceleration in concurrent auctions
c.acc.vol: Volatility (stdev) of price acceleration in concurrent auctions

Figure 2.4: Pairwise relationships between some of the competition features from Table 2.1 (measured at time T) and the price (measured at T + 1) in the focal auction. The six panels plot the future price against c.openbid.avg, c.price.avg, c.vel.avg, c.ship.avg, c.t.left.avg, and c.acc.avg.

Within each component, information can be further segmented into static, time-varying, and price dynamic information, similar to Table 2.1. Table 2.2 lists all the different pieces of information that are candidates for our forecasting model; it shows that there are over 30 candidate variables. Thus, an important first step in our modeling effort is the selection of a parsimonious subset of relevant predictors. Variable selection has been researched in the statistics literature for a long time [13], and it is receiving increasing attention today with the availability of more and more data sets featuring larger and larger numbers of variables [30]. A complicating factor in our situation is the time-varying nature of our model. Our goal is to find a model that predicts well at time T + 1, universally across all time periods T = 1, 2, 3, ..., N_T. Classical variable

selection procedures focus on cross-sectional data, that is, on data corresponding to a single time period only. Since our data vary over time, it is quite plausible that one model predicts best at one time T + 1, while another (different) model predicts best at a different time T + 1. Our goal is to find a model that is not geared to a single time period only, but applies rather universally to the eBay market over a longer time window. To that end, we choose a model which has good average performance[3], averaged over all time periods T of interest. We describe this approach next.

Table 2.2: Candidate information for the forecasting model

Information from within the focal auction:
Static information: opening bid, auction duration, shipping fee, seller's feedback, power seller, eBay store, picture
Time-varying information: current price, time left, current number of bids, current number of bidders
Price dynamic information: price velocity, price acceleration

Information from competing auctions:
Static information: c.openbid.avg, c.dura.avg, c.ship.avg, c.feedback.avg, c.power.avg, c.store.avg, c.pic.avg
Time-varying information: c.price.avg, c.price.vol, c.price.disc, c.t.left.avg, c.t.left.vol, c.nbids.avg, c.nbids.vol, c.nbidders.avg, c.nbidders.vol
Price dynamic information: c.vel.avg, c.vel.vol, c.acc.avg, c.acc.vol
³ Note that our decision to use the criterion that has the best "average" performance is rather intuitive. We conduct a more complete study of different criteria in the next chapter.


Our model has the general form

    y_{T+1} = β_T′ x_T                                                      (2.1)

where y_{T+1} denotes the auction price at T+1, x_T = (x_{T1}, ..., x_{Tp})′ is a vector of predictors, and β_T = (β_{T1}, ..., β_{Tp})′ is a vector of coefficients to be estimated from the data. The goal is to select only those predictors that are important for predicting the price y_{T+1}, across all time periods T = 1, 2, 3, ..., N_T. We accomplish this in several steps. In the first step, we run simple regressions (i.e. p = 1) between each individual predictor from Table 2.2 and the response y_{T+1} at each time period T = 1, 2, 3, ..., N_T. We then calculate the percentage of time points at which a predictor is significant (at the 5% significance level). That is, for each predictor x_k = (x_{1k}, ..., x_{N_T k})′, k = 1, ..., p, we compute the average⁴

    p.sig_k := (1/N_T) Σ_T 1{x_{Tk} significant at 5% level}                (2.2)
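As an illustration, eq. (2.2) amounts to counting significant simple-regression slopes across time periods. The sketch below (with our own function name `p_sig` and synthetic data, not the eBay sample) uses `scipy.stats.linregress` for the per-period regression:

```python
import numpy as np
from scipy.stats import linregress

def p_sig(x_by_period, y_by_period, alpha=0.05):
    """Eq. (2.2): fraction of time periods T at which a simple
    regression of y_{T+1} on x_T has a significant slope."""
    hits = sum(linregress(x, y).pvalue < alpha
               for x, y in zip(x_by_period, y_by_period))
    return hits / len(x_by_period)

rng = np.random.default_rng(1)
# synthetic stand-in for the auction panels: 100 periods, 40 auctions each
xs = [rng.normal(size=40) for _ in range(100)]
strong = [2.0 * x + rng.normal(scale=0.5, size=40) for x in xs]  # predictive
noise = [rng.normal(size=40) for _ in range(100)]                # irrelevant
print(p_sig(xs, strong))  # near 1: significant at almost every period
print(p_sig(xs, noise))   # near alpha: significant only by chance
```

A predictor with p.sig close to 1 (such as the current price in Table 2.3) is consistently informative; one near the nominal 5% level is indistinguishable from noise.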

Table 2.3 shows the results for a fine grid of hourly forecasts (i.e. (T+1) − T = 1 hour), which results in N_T = 1,754 different time periods. We can see that the predictors that individually have a strong effect on y_{T+1} (consistently across all time periods T) are the current price, price velocity and acceleration, time left and the number of bids (from within the focal auction) and c.price.avg, c.price.vol, c.price.disc, c.t.left.avg, c.t.left.vol, c.nbids.avg, c.nbids.vol, c.vel.avg, c.vel.vol, c.acc.avg and
⁴ While we use an un-weighted average, a possible alternative would be to weight each time point according to its distance from the close of the auction.


Table 2.3: Percentage of significant time points. The two leftmost columns refer to predictors from within the focal auction; the two rightmost columns refer to predictors from competing auctions.

Focal auction        p.sig     Competing auctions   p.sig
openbid              .199      c.openbid.avg        .193
duration             .032      c.dura.avg           .032
shipping             .039      c.ship.avg           .046
sellerfeed           .055      c.feedback.avg       .044
powerseller          .061      c.power.avg          .076
store                .092      c.store.avg          .104
picture              .028      c.pic.avg            .028
currentprice         1.00      c.price.avg          1.00
timeleft             .775      c.price.vol          .886
numbids              .780      c.price.disc         1.00
numbidders           .197      c.t.left.avg         .771
price velocity       .762      c.t.left.vol         .758
price acceleration   .308      c.nbids.avg          .777
                               c.nbids.vol          .509
                               c.nbidders.avg       .188
                               c.nbidders.vol       .086
                               c.vel.avg            .762
                               c.vel.vol            .624
                               c.acc.avg            .309
                               c.acc.vol            .306

c.acc.vol (from competing auctions). It is interesting that most of these variables relate to price (or price movement) from the focal auction relative to competing auctions. This suggests that information about price and its dynamics effectively captures much of the relevant auction information, such as information about the product, the auction format, the seller and competition between bidders. However, also note that the results so far are based only on simple regressions (p = 1) and thus may not fully reflect the joint effect of a predictor in the presence of other predictor variables. To that end, we investigate pairwise correlations (again, averaged across all time periods T = 1, ..., N_T; correlation table not reported here) and find high collinearity between ten pairs: the current price and c.price.avg, the current price and c.price.vol, the current price and c.price.disc, the current price and time left, the current price and c.t.left.avg, the current price and number of bids, the current price and c.nbids.avg, price velocity and c.vel.avg, price velocity and c.vel.vol, and price acceleration and c.acc.avg. This high collinearity is not surprising since many of these predictors carry similar information, only coded in a slightly different way. We eliminate all highly collinear predictors; next, we derive our final model using the Bayesian Information Criterion (BIC). In similar fashion to eq. (2.2), one can compute the average BIC across all time periods. That is, let avg.BIC := mean_T(BIC(T)), where BIC(T), T = 1, 2, 3, ..., N_T, denotes the Bayesian Information Criterion (e.g. [30]) of a model computed at time period T⁵. By comparing all possible subsets of non-collinear predictors, we arrive
⁵ Note that we calculate the average BIC across only the time points where BIC is applicable. That is, time points where BIC is not available due to insufficient data for modeling are ignored in this definition.


at our final forecasting model as

    y_{T+1} = α_T + β_{1T} current price_T + β_{2T} velocity_T
              + β_{3T} acceleration_T − β_{4T} c.acc.vol_T                  (2.3)

Table 2.4 shows the avg.BIC of our final forecasting model (2.3) compared to several competitor models. We can see that our model results in the lowest avg.BIC. It is also interesting to see that models with only information from competing auctions perform almost as well as models with the corresponding information only from within the focal auction. This is yet another piece of evidence for the tight connectivity of the auction marketplace.

Table 2.4: Average BIC computed across all time periods T. The first row shows the value of avg.BIC for our model in (2.3); the remaining rows show the corresponding values of several competing models.

Model                                                           avg.BIC
Our model from eq. (2.3)                                        -381.59
Full model (all 33 predictors from Table 2.2)                   -147.08
All 13 predictors from the focal auction (Table 2.2)            -319.82
Only 2 focal auction dynamics                                     37.77
Only 4 focal auction time-varying predictors                     -83.87
All 20 predictors from competing auctions (Table 2.2)           -313.52
Only 4 competing auction dynamics                                 37.58
Only 9 competing auction time-varying predictors                 -84.29
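A minimal sketch of the avg.BIC criterion, assuming ordinary least squares with Gaussian errors (BIC up to an additive constant) and synthetic panels; the names `bic_ols` and `avg_bic` are ours, not from the dissertation:

```python
import numpy as np

def bic_ols(X, y):
    """BIC of a Gaussian linear model fit by least squares,
    up to an additive constant: n*log(RSS/n) + k*log(n)."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    return n * np.log(rss / n) + k * np.log(n)

def avg_bic(panels, min_obs=10):
    """Average BIC over time periods, skipping periods with too
    few observations (cf. footnote 5)."""
    vals = [bic_ols(X, y) for X, y in panels if len(y) >= min_obs]
    return float(np.mean(vals))

rng = np.random.default_rng(2)
panels = []
for _ in range(50):  # 50 synthetic time periods, 30 auctions each
    X = np.column_stack([np.ones(30), rng.normal(size=(30, 3))])
    y = X @ np.array([1.0, 2.0, 0.0, -1.0]) + rng.normal(scale=0.3, size=30)
    panels.append((X, y))
print(avg_bic(panels))
```

Comparing avg_bic across candidate predictor subsets, as in Table 2.4, then amounts to selecting the subset with the smallest value.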

A few comments about our final model in (2.3) are in order. It is interesting to see that the model relies only on price-related information. In particular, it is interesting to see that many variables that have been found significant in previous studies have dropped out of our model. For instance, [74] find, among other things,

a significant effect of the seller's rating. One key difference between previous studies and ours is that while they take a static look at online auctions, our model captures the dynamic nature of the auction process. In other words, previous studies typically only look at the static, pre-auction information that is available before the start of the auction (such as the auction length, the opening bid or a seller's rating). In such a static view, the effect of the seller's rating is highly significant (since the seller's reputation and trustworthiness will impact the final price). However, our model is dynamic in the sense that all previous price considerations and bidding decisions have already been factored into the current price and its current dynamics (current price_T, velocity_T, acceleration_T). In that sense, price dynamics reflect the expectations of all bidders about the product, the seller and the bidding competition up to this time point. It is thus not too surprising that all static variables drop from our final model. The effect that captures concurrency is more intriguing. Note that the information from concurrent auctions is captured in a single variable, the volatility of dynamics from competing auctions (c.acc.vol_T). As there has not been much prior research on the effect of concurrent auctions, it is hard to formulate an expectation about c.acc.vol_T. However, the negative sign indicates that higher variance in the price movements of competing auctions results in smaller price advances in the focal auction. In other words, more price activity in different parts of the market will lead to price stalling in the focal auction. We will conduct a more complete study of model selection in Chapter 3.


2.3.4 Model Updating
The goal of our model is to predict price at a future time T+1 using only information from the present (i.e. time T) and the past (T−1, T−2, etc.). We accomplish this by estimating the functional relationship between T−1 and T and then applying this relationship to predict T+1 from T. Figure 2.5 illustrates this updating scheme.

Figure 2.5: Illustration of the updating scheme in the forecasting model (past, present and future time periods T−1, T and T+1).

At time T (present), we wish to make a prediction about the future price at time T+1. Per our model, y_{T+1} is given by β_T′ x_T, where x_T contains information observed in the present (or past). Note that we cannot estimate β_T directly since the response (y_{T+1}) is yet unobserved. We therefore estimate the relationship from the past: we estimate β_{T−1} from the price at T (y_T) and then estimate β_T via β̂_T := β_{T−1}. In that sense, we "roll" the relationship from the past one time period forward.

We also investigated alternate updating approaches (such as estimating β_T via a moving average (MA) of prior relationships, β̂_T := MA{β_{T−1}, β_{T−2}, β_{T−3}, ...}) but did not find significant improvements in model performance.
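The rolling scheme (β̂_T := β_{T−1}) can be sketched as follows, on synthetic price panels; `rolling_forecast` is an illustrative name, and the simulated price process is an assumption for the example only:

```python
import numpy as np

def rolling_forecast(X_by_period, y_by_period):
    """One-step-ahead forecasts: coefficients estimated on
    (x_{T-1}, y_T) are rolled forward to predict y_{T+1} from x_T."""
    preds = {}
    for T in range(1, len(y_by_period) - 1):
        X_prev, y_now = X_by_period[T - 1], y_by_period[T]
        beta_prev, *_ = np.linalg.lstsq(X_prev, y_now, rcond=None)
        preds[T + 1] = X_by_period[T] @ beta_prev  # beta_hat_T := beta_{T-1}
    return preds

rng = np.random.default_rng(3)
n_auctions, n_periods = 50, 30
x = [rng.uniform(100.0, 200.0, size=n_auctions)]
for _ in range(n_periods - 1):
    # synthetic price paths: the next price is a noisy linear function of the current one
    x.append(1.01 * x[-1] + rng.normal(scale=1.0, size=n_auctions))
X_by_period = [np.column_stack([np.ones(n_auctions), xt]) for xt in x]
y_by_period = x  # the response at period T is simply the price at period T
preds = rolling_forecast(X_by_period, y_by_period)
err = float(np.mean(np.abs(preds[5] - x[5]) / x[5]))
print(err)
```

Because the (simulated) relationship between consecutive periods is stable, rolling yesterday's coefficients forward yields accurate one-step-ahead forecasts.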

2.4 Estimation and Prediction Results
In this section we discuss estimation and prediction with our forecasting model in (2.3). We also compare its predictive capabilities to alternate forecasting approaches. To that end, we divide our data set into a training set (80% of the data) and a validation set (the remaining 20%). Since our data vary over time (and since we are primarily interested in making accurate predictions of the future), our training set consists of all auctions that complete during the first 80% of our data's time span (i.e. between March 14 and May 10); the validation set contains all remaining auctions (i.e. between May 11 and May 25). In that sense, we first estimate our model on the training set; results of model estimation and fit are discussed below. We then apply the estimated model to the validation set to gauge its predictive capabilities; this is discussed in the second half of this section.

2.4.1 Model Estimation
Figure 2.6 shows the estimated coefficients for the parameters of our forecasting model (2.3). Recall that we estimate the model at every time point T = 1, 2, 3, ..., N_T in the training set. In our application, we consider time intervals of one hour over the period between March 14 and May 10; hence the coefficients also vary over that time period. Figure 2.6 shows the resulting trend of the coefficients together with 95% confidence bounds.

Figure 2.6: Estimated coefficients of model parameters, together with 95% confidence bounds. The x-axis denotes calendar time; the y-axis denotes the magnitude of the coefficient. The panels show (from top left to bottom right) current price, price velocity, price acceleration and the acceleration volatility of competing auctions (c.acc.vol).

We can see that information from within the focal auction (current price, price velocity and acceleration) has a positive relationship with the future price y_{T+1}; in contrast, information from competing auctions (c.acc.vol) has a negative relationship. In other words, both the current level of price and its dynamics are positive indicators of future price. On the other hand, the volatility of price acceleration in competing auctions is a negative indicator. Price acceleration in competing auctions will be high if many bidders bid in auctions different from the focal auction.

A high volatility in price acceleration may suggest high uncertainty in the marketplace, with some auctions experiencing large price jumps and others experiencing no price movements at all. This high uncertainty results in depressed prices of the focal auction.

2.4.2 Model Fit and Varying Time Intervals
Figure 2.6 shows the estimated model coefficients for one-hour time intervals; that is, for (T+1) − T = 1 hour. Alternatively, one could also consider models with larger time intervals; that is, models that forecast further into the future. Intuitively, since forecasting further into the future is harder, such models should not perform as well. Figure 2.7 (left panel) shows the model fit for the time intervals (T+1) − T = 1, 2, 3, ..., 14 hours. We measure model fit by the average R² value, avg.R² := (1/N_T) Σ_T R²(T), where R²(T), T = 1, 2, 3, ..., N_T, denotes the R² of a model computed at time period T. We can see that, as expected, the model fit decreases as the time intervals get larger. Notice though that even for the largest time interval (14 hours), the value of avg.R² is still larger than 99%.

2.4.3 Prediction Performance
As pointed out above, we estimate the model on the training set; then we gauge its predictive performance on the validation set. We measure predictive performance of a model in terms of its mean absolute percentage error (MAPE). For each time


Figure 2.7: Model fit and prediction accuracy for different time intervals. The x-axis represents the time interval (in hours; ranging from 1 hr to 14 hrs); the y-axis represents the value of avg.R² (left panel) and the value of avg.MAPE (right panel).

period T, T = 1, 2, 3, ..., N_T, in the validation set, we compute

    MAPE(T) = (1/m_{T+1}) Σ_{i=1}^{m_{T+1}} |y_{T+1,i} − ŷ_{T+1,i}| / |y_{T+1,i}|      (2.4)

where y_{T+1,i} and ŷ_{T+1,i} denote the true and predicted values of auction i at time T+1, respectively, and m_{T+1} denotes the number of auctions available at time T+1. We compute the average MAPE across all time periods as avg.MAPE := (1/N_T) Σ_T MAPE(T). In similar fashion to Section 2.4.2, we investigate avg.MAPE for different time intervals, (T+1) − T = 1, 2, 3, ..., 14 hours. The right panel in Figure 2.7 shows the results. Unsurprisingly, we see that as we predict further into the future (i.e. as the time interval gets larger), the predictive performance decreases (i.e. avg.MAPE increases). It is interesting to see that for predictions up to 4 hours into the future, the prediction error is less than 0.1%. For time intervals larger than 4 hours, the prediction


error increases at a faster rate. However, even for predictions as far as 14 hours into the future, the error is still less than 1%. This predictive accuracy is quite remarkable as we will see in the next subsection where we benchmark our approach against several competing approaches. We also want to note that while we cannot claim generalizability to all eBay auctions, there has been prior evidence that real-time forecasting models can provide superior predictive accuracy, especially for books and electronics (see [96]).
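The MAPE and avg.MAPE criteria from eq. (2.4) are straightforward to compute; the sketch below uses our own function names:

```python
import numpy as np

def mape(y_true, y_pred):
    """Eq. (2.4): mean absolute percentage error at one time period."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred) / np.abs(y_true)))

def avg_mape(true_by_period, pred_by_period):
    """avg.MAPE: the MAPE(T) values averaged over all periods T."""
    return float(np.mean([mape(t, p) for t, p in
                          zip(true_by_period, pred_by_period)]))

# two auctions, true prices $200 and $250, predictions $198 and $255
print(mape([200.0, 250.0], [198.0, 255.0]))  # (0.01 + 0.02) / 2 = 0.015
```

An avg.MAPE of 0.001, as reported for 4-hour forecasts, thus corresponds to a 0.1% average price error.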

2.4.4 Comparison with Alternative Models
We benchmark our model against five alternative models: the generalized additive model (GAM), classification and regression trees (CART), neural networks, and two simpler linear models, one purely static and one time-varying.

Figure 2.8: Prediction accuracy for competing models (GAM, CART, neural network, STATIC, TIME-VARYING and DYN&COMP). The x-axis represents the time interval (in hours); the y-axis represents the value of avg.MAPE.

GAMs relax the restrictive linear-model assumption between the response and predictors via a more flexible nonparametric form [41]. CARTs [15] provide a data-driven way to partition the variable space and are thus often viewed as alternatives to formal variable selection. Neural networks also provide a technique that can approximate non-linear functional relationships. In addition, we consider two linear models that use a subset of the variables from Table 2.2: one model that uses only static information from the focal auction and another that uses static and time-varying information from the focal auction; we refer to these two models as "STATIC" and "TIME-VARYING," respectively. The static model corresponds to the information used in many prior eBay studies (e.g. [74]) in that it only considers pre-auction information. The time-varying model accounts for changes due to the process of bidding, but it does not account for price dynamics or competition. Figure 2.8 shows avg.MAPE (similar to Figure 2.7) for time intervals (T+1) − T = 1, 2, 3, ..., 14 hours, for all 6 different models. We refer to our model (2.3) as "DYN&COMP," since it includes dynamics and competition features. We can see that STATIC and CART have the worst prediction performance, with an error uniformly larger than 10%. While our model performs best, GAM, TIME-VARYING and neural nets are competitive, at least for smaller time intervals. In other words, for predicting less than 4 hours into the future, both GAM and TIME-VARYING pose alternatives with prediction errors not too much larger than those of DYN&COMP. However, their predictive performance breaks down for larger time intervals. In fact, the error of GAM is as large as 10% for predicting 14 hours into the future, which is 10 times larger than the corresponding prediction error

of DYN&COMP. While the performance of TIME-VARYING is somewhat better, its prediction error is 4 times as large as that of DYN&COMP for 14-hour time intervals (similar for the neural network). In the next section, we use the excellent forecasting performance of our model to build an automated bidding strategy around it.

2.5 An Automated Data-Driven Bidding Decision Rule
We now discuss the second component of our bidding strategy, building an automated and data-driven decision rule around our forecasting model. The decision rule provides answers to three basic bidding questions: which auction to bid on, when to bid on it and how much.

2.5.1 Decision Framework
Our decision framework is built upon the principle of maximizing consumer surplus (e.g. [11]). Consumer surplus is the difference between a consumer's willingness to pay for an item and the actual price paid, CS = WTP − Price, where CS denotes consumer surplus and WTP denotes willingness to pay. Therefore, the lower the price, the higher a bidder's surplus. For each individual auction, our forecasting model (2.3) provides bidders with that auction's estimated future price; combining this with a bidder's WTP leads to an auction's estimated surplus. For a set of competing auctions, a plausible decision rule is to bid on the auction with the highest estimated surplus. Moreover, in order to avoid a negative surplus, a bidder should only bid on an auction if the predicted


price is lower than his WTP. Note that our forecasting model depends on the length of the time interval (T+1) − T (which we also refer to as the prediction window). Our model can only predict the final price of an auction that ends at or before time T+1. Therefore, longer prediction windows result in a larger number of candidate auctions, that is, in a larger supply of potential auctions to bid on. On the other hand, we have also seen in Section 2.4.3 that a larger time interval leads to an increased prediction error. Therefore, our decision rule faces a trade-off between the supply of candidate auctions and the prediction accuracy for each individual auction. We investigate this trade-off in detail below. Our decision rule picks the auction with the highest estimated surplus, as long as that surplus is positive. After picking an auction, the next two questions concern the time and amount of the bid. Since our forecasting model is based on a fixed time interval, nothing is gained by waiting; so we suggest placing the bid as soon as an auction is picked. Moreover, since our model predicts an auction's closing price as ŷ_{T+1}, we would expect to lose with bids lower than ŷ_{T+1}. Similarly, bids higher than ŷ_{T+1} are expected to overpay. Therefore, we suggest bidding exactly the expected (or predicted) closing price ŷ_{T+1}. In summary, our decision rule picks the auction with the highest predicted surplus, bids the predicted price, and places the bid immediately.
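The decision rule just summarized can be sketched in a few lines; the auction IDs and predicted prices below are hypothetical:

```python
def choose_auction(predicted_prices, wtp):
    """Pick the auction with the highest estimated surplus
    (WTP - predicted price), bidding only if that surplus is
    positive; returns (auction_id, bid_amount) or None."""
    best = None
    for auction_id, price in predicted_prices.items():
        surplus = wtp - price
        if surplus > 0 and (best is None or surplus > best[1]):
            best = (auction_id, surplus)
    if best is None:
        return None  # no candidate auction offers a positive surplus
    auction_id = best[0]
    return auction_id, predicted_prices[auction_id]  # bid the predicted price

# hypothetical candidate auctions closing within the prediction window
preds = {"Auc1": 228.0, "Auc2": 221.5, "Auc3": 236.0}
print(choose_auction(preds, wtp=230.0))  # ('Auc2', 221.5)
```

The bid amount equals the predicted closing price, and per the rule above the bid would be placed immediately.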


2.5.2 Experimental Set-Up
We conduct a simulation study to compare our automated bidding strategy to two alternate (and popular) bidding approaches: early bidding [8] and last-minute bidding [83]. Early bidding is often viewed as a bidder's strong commitment and is intended to deter others from entering the auction. Last-minute bidding is popular because it does not allow much time for other bidders to react. In our simulation, we assume that a bidder's willingness-to-pay (WTP) is drawn from a uniform distribution [2] distributed symmetrically around the market value ($230 at the time of data collection). That is, we assume WTP ∼ Uniform($220, $240). Our experiment then proceeds as follows. We randomly draw a WTP from that distribution. We also draw an auction from the validation set (i.e. we compare the bidding strategies on the same real-world data on which we compared the forecasting models). The bidder then makes a bidding decision (whether or not to bid, and how much to bid) with each of the three bidding strategies outlined below. We repeat this experiment for all auctions in the validation set and for 20 different random draws from a bidder's WTP distribution.
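The WTP draws for the experiment can be sketched as follows (`draw_wtp` is an illustrative name; the random seed is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)

def draw_wtp(market_value=230.0, half_width=10.0):
    """WTP drawn uniformly and symmetrically around the market value,
    i.e. WTP ~ Uniform(220, 240) for the defaults above."""
    return float(rng.uniform(market_value - half_width,
                             market_value + half_width))

# 20 random WTP draws, as in the experiment
wtps = [draw_wtp() for _ in range(20)]
print(min(wtps) >= 220.0 and max(wtps) <= 240.0)  # True
```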

2.5.2.1 Early Bidding Strategy
We assume that early bidders bid at the end of the first auction day [8]. In fact, we find that slightly earlier or later bid times barely affect the outcome of the experiment. The process of early bidding is illustrated in the left panel of Figure 2.9. A bidder compares his WTP with the auction's current price at the end of


Figure 2.9: Illustration of bidding strategies. The left panel illustrates early and last-minute bidding; the right panel illustrates our automated bidding strategy.

the first day (p_early); if his WTP is higher, then he places a bid; otherwise, he does not place a bid and moves on to another auction. If he does place a bid, then the bid amount equals the WTP. Note though that due to eBay's proxy bidding system, which incrementally bids up to the WTP on behalf of the bidder, the final price may be lower than the WTP. As a consequence, the bidder only pays the amount of the second-highest bid plus a pre-specified bid increment (which ranges between $2.5 and $5 in our case). We also investigate alternate bidding heuristics in Appendix B. However, none of these heuristics beat last-minute bidding or our automated bidding strategy.
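The proxy-pricing mechanics described above can be sketched as follows. This is a deliberately simplified model (real eBay increments vary with the price level, and the single-bid case would be governed by the opening bid, both of which we ignore here):

```python
def proxy_price(bids, increment=2.5):
    """Simplified proxy bidding: the highest bidder wins but pays
    the second-highest proxy bid plus one bid increment, capped
    at the winner's own maximum."""
    ranked = sorted(bids, reverse=True)
    if len(ranked) == 1:
        return ranked[0]  # opening-bid logic is omitted in this sketch
    return min(ranked[0], ranked[1] + increment)

# an early bidder whose WTP ($235) exceeds the runner-up bid ($228)
print(proxy_price([235.0, 228.0, 224.0]))  # 230.5: second-highest bid + increment
```

This is why an early bidder who bids his full WTP may still end up paying less than the WTP.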

2.5.2.2 Last-Minute Bidding Strategy
We assume that last-minute bidders place their bid one minute before the auction closes [83]. Last-minute bidding carries the danger that the bid does not go


through due to network congestion, but we will not explicitly consider this disadvantage in our simulations. The process of last-minute bidding is again illustrated in the left panel of Figure 2.9. A bidder compares his WTP with the auction's current price one minute before closing (p_late). Similar to early bidding, if his WTP is higher, then he places a bid; otherwise, he does not place a bid and moves on to another auction. If he does place a bid, then the bid amount is only incrementally higher than the current price, since the chances of being outbid within the last 60 seconds are small. In our simulations, we bid an increment of 2% over the current price p_late. We also study the robustness to different increments in Appendix B and find that bid increments of 1%, 2% or $2 yield comparable results.

2.5.2.3 Our Automated Data-Driven Bidding Strategy
Our automated bidding strategy is conceptually different from early and last-minute bidding. Instead of making a bidding decision for each auction individually, our strategy requires a bidding decision for each time interval. Consider the right panel of Figure 2.9. At time T of decision making, there are four competing auctions, denoted by Auc1 to Auc4, which all close before time T+1. The solid lines correspond to the observed part of the auction history; the dotted lines denote the future (and yet unobserved) price path. Since all auctions close before T+1, our model yields predictions of their final prices (denoted by the solid black circles). Note that Auc4 has the highest predicted price; moreover, its predicted price is higher than the bidder's WTP; hence the bidder will never consider this auction. Auc3


has the smallest predicted price; since that predicted price is also smaller than the bidder's WTP, he places a bid on this auction. He bids the predicted price and he bids immediately, i.e. at time T. If he wins, then the bidder's surplus will be the difference between his WTP and the actual closing price.

2.5.3 Simulation Results
Similar to [8], we compare all bidding strategies on two dimensions: the probability of winning the auction and the average surplus accrued. We compute the probability of winning (p.win) as the number of auctions won divided by the total number of auctions the bidder placed a bid on. We compute the average surplus (avg.sur) as the corresponding difference between the WTP and the actual price paid for an auction. The results are shown in Table 2.5.

Table 2.5: Comparison of different bidding strategies. The first row corresponds to last-minute bidding; the second row corresponds to early bidding; and the last row corresponds to our automated bidding strategy. We report the mean estimates (with standard errors in parentheses).

                      p.win        avg.sur
Last-minute bidding   95% (.5%)    $17.97 ($0.35)
Early bidding         53% (2%)     $18.85 ($0.57)
Automated bidding     61% (1%)     $32.33 ($1.95)

We see that last-minute bidders have the highest probability of winning (95%, compared to 61% for our automated bidding strategy). This is not surprising, since last-minute bidding is geared toward out-witting the competition in the last moment. However, we also see that last-minute bidding accrues a significantly lower surplus compared to our automated bidding strategy ($18 vs. $32). Another way of comparing the two bidding strategies is via their expected surplus, i.e. the product (p.win × avg.sur). We find that last-minute bidding yields an expected surplus of $17.11⁶ while that of our automated bidding strategy is higher: $19.72. Moreover, while early bidders have a probability of 53% of winning the auction, their expected surplus is significantly lower: $9.99 (= 53% × $18.85).

2.5.3.1 E?ect of the Prediction Window
We have pointed out earlier that the length of the prediction window (i.e. the length of the time interval (T+1) − T) affects the outcome of our automated bidding strategy, in that longer windows result in a larger supply of candidate auctions but at the same time reduce the prediction accuracy for each individual auction. The results from the previous section (Table 2.5) are based on a prediction window of 12 hours, and we have seen that it yields an expected surplus of $19.72 for our automated bidding strategy. Longer prediction windows yield a larger number of candidate auctions and as such a larger probability of including an auction with a lower price (and hence a higher surplus). On the other hand, longer prediction windows also lead to less accurate predictions. Less accurate predictions can lead to overpayment (if the predicted price, and hence our bid, is higher than the actual price); overpayment leads to a lower surplus. Less accurate predictions can also lead to a reduced probability of winning the auction (if the predicted price, and hence our bid, is lower than the actual price). Thus,
⁶ For the expected surplus corresponding to other bid-increments, please refer to Appendix B.


a change in the prediction window affects both the probability of winning and the average accrued surplus, and it is not immediately clear how it affects the overall expected surplus. To that end, we repeat the simulation study from Table 2.5 for prediction windows of different lengths. Table 2.6 shows the results.

Table 2.6: Trade-off between the width of the prediction window and expected surplus. The first column denotes the width of the window; the second column denotes the probability of winning; the third column denotes the average accrued surplus; and the last column denotes the expected surplus, i.e. exp.sur = (p.win × avg.sur).

prediction window   p.win     avg.sur   exp.sur
14 hrs              59.01%    $35.29    $20.82
12 hrs              61.44%    $32.33    $19.80
9 hrs               67.82%    $30.99    $21.02
6 hrs               69.43%    $29.60    $20.55
3 hrs               75.77%    $27.21    $20.62

We can see that larger prediction windows result in a larger average surplus, which suggests that the effect of having a larger pool of candidate auctions outweighs the effect of overpayment. But we also see that larger windows result in a smaller probability of winning, since the less accurate predictions more frequently yield bids below the auction's actual closing price and hence an unsuccessful auction. Interestingly, the expected surplus is maximized for a prediction window of 9 hours. While our results do not prove optimality, they suggest a very interesting global optimization problem for future research.
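Using the p.win and avg.sur values reported in Table 2.6, the trade-off can be checked directly. Small rounding differences relative to the table's exp.sur column are expected, since that column was presumably computed from unrounded inputs:

```python
# p.win and avg.sur per prediction window, as reported in Table 2.6
windows = {
    "14hrs": (0.5901, 35.29),
    "12hrs": (0.6144, 32.33),
    "9hrs":  (0.6782, 30.99),
    "6hrs":  (0.6943, 29.60),
    "3hrs":  (0.7577, 27.21),
}
# expected surplus: probability of winning times average accrued surplus
exp_sur = {w: p * s for w, (p, s) in windows.items()}
best = max(exp_sur, key=exp_sur.get)
print(best)  # the 9-hour window maximizes expected surplus
```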


Figure 2.10: Process of automated bidding and alternatives. The pipeline proceeds through Search, Selection, Model and Bidding Decisions; inputs include crawling/data purchase, user preferences, model alternatives and decision alternatives.

2.5.4 Practical Considerations
It is important to understand that our automated bidding approach relies on a number of key ingredients. In this study, we assume that appropriate bidding records are available, and we focus only on deriving a model from these bidding records and subsequently designing a bidding strategy around that model. Before deploying our automated bidding approach, one also needs to put in place methods for searching and selecting the right bidding records (see Figure 2.10). Finding suitable bidding records can be accomplished in several ways, e.g., using automated agents such as web crawlers (e.g. [9]) or by directly purchasing bidding data (from data vendors such as Data Unison). Having a pool of bidding records, the next challenge is to select, from this pool, the right set of most relevant bidding records. One could find this set via, e.g., a vector of desired product features (e.g., "iPod Nano, 8GB, yellow") and then selecting only those bidding records that are most similar to the feature vector. Deriving a suitable similarity metric can be done using, for example, the spatial feature model proposed in [53], or the comprehensive metric proposed

in Chapter 4. Isolating product features from bidding records is made possible by eBay's effort to standardize certain product descriptions (e.g., product descriptions for MP3 players require fields such as brand, product line, storage capacity, color or condition); additional information is often contained in the unstructured descriptive text, which may take more effort to mine. Related to the issue of search and selection is the issue of incorporating individual user preferences or risk tolerance into the bidding process. While some bidders may consider all relevant auctions as potential candidates, others may be more selective and wish to eliminate auctions based on certain constraints (e.g., eliminate auctions with seller ratings lower than a certain threshold, eliminate red iPods, etc.). This can again be accomplished in the selection step (see again Figure 2.10). In fact, when applying our method only to high-reputation sellers, the expected surplus increases to $20.05. We also want to point out that in practice one would apply our method repeatedly until a consumer's demand is satisfied. We assume here that a consumer has demand for only a single unit (for more discussion on multiple units see the next section) and that s/he does not have any time constraints. Then, our automated strategy would place a bid while continuously monitoring the remaining market, which could be done at no extra cost to the bidder using automated agents. Once the outcome of the first bid is known, the strategy would then decide whether and (if the previous bid was unsuccessful) where to place the next bid, and so on. While a bidder could also decide to place more than one bid simultaneously, this runs the risk of winning two auctions, which is undesirable in the case of single-unit demand.

It is also important to note that the method proposed in this manuscript is modular in the sense that individual modules can be exchanged. For instance, one can replace the dynamic forecasting model with alternatives (such as regression models with different sets of variables, GAM, CART, or our K-Nearest Neighbor forecaster proposed in Chapter 4); similarly, one can replace the bidding decisions with an alternate set of rules. All in all, in order for the approach to be deployed, one will ultimately have to rely on agent-based technologies, similar to those currently in place for bid-sniping (e.g. Cniper.com). With such technology in place, our automated bidding strategy will yield not only real monetary benefits in terms of a higher expected surplus, but also less tangible benefits such as the convenience of a truly automated bidding process.

2.6 Conclusions
The increasing popularity of online auctions puts more and more pressure on bidders to make informed bidding decisions in the face of competition. While classic bidding strategies such as early bidding or last-minute bidding are well understood in the academic literature, they do not account for competition originating from simultaneous auctions selling the same or similar items. Moreover, while it is unlikely that every bidder uses early or last-minute bidding in exactly the same way, to date bidders can only augment and adapt these strategies with gut feeling, intuition or experience. We propose a novel automated and data-driven approach that provides bidders with valuable objective information about an auction’s projected price in


the face of competition. Our approach consists of two main components. In the first component, we derive a novel dynamic forecasting model for price in competing auctions. We show that our model outperforms several competitor models. In the second component, we build a comprehensive bidding strategy around our forecasting model, using ideas from maximizing consumer surplus. We find that our strategy outperforms classical bidding strategies such as early bidding or last-minute bidding in terms of expected surplus accrued.

One important issue is the potential effect of a forecasting model on the market as a whole. If every bidder had access to the same model and bid on the same auction (with the lowest forecasted price), then forecasts, and as a consequence bidding decisions, would become unstable. This is very similar to the stock market, where investment houses deploy complex mathematical models to guide investment decisions. In such a scenario there is a risk that, if all investors base their decisions on the same model, the model – and not the investments’ performance – could eventually drive the market. In this research, we are much less ambitious. While, at least in theory, one single model could eventually drive all bidding decisions on eBay, it is unlikely that it ever will. Rather, we view our automated bidding strategy, if ever deployed, as a decision tool that would be made available only to a few, select bidders and would thus not destabilize the market.

There are several ways in which this research can be expanded. We have already pointed to the problem of selecting the optimal prediction window in Section 2.5.3.1. Another way to expand this research is via allowing for closing and continuing auctions. Recall that our current approach only considers auctions that close within the given prediction window. The reason is that our forecasting model is geared to the fixed time interval (T + 1) − T, so we can only predict the final price of auctions that end within that interval. Of course, one can roll the model one additional time period forward to make predictions at T + 2, based on the predicted values at T + 1; however, predictions two time periods into the future (i.e. T → T + 2) are more uncertain than predictions only one step forward (i.e. T → T + 1). It is not quite clear how to discount the additional prediction uncertainty in our decision framework. Another way to expand this research is via allowing for variable and adaptive WTP distributions. In our simulations, we assume that both early and last-minute bidders have the same WTP distribution. It may be that bidders with different strategies also have different product valuations. Moreover, we assume that the WTP distribution remains constant over our prediction window. While this may be realistic for short windows of only several hours, a bidder who wants an item immediately may have a different valuation compared to a bidder who is willing to wait several weeks. All in all, there are many opportunities for future research and we hope to inspire some of it with this study.
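The growing uncertainty of multi-period forecasts can be illustrated with a minimal Monte Carlo sketch. This assumes simple random-walk dynamics purely for illustration (not the dissertation's forecasting model): the forecast spread at T + 2 is roughly √2 times the one-step spread.

```python
import random

random.seed(0)

def simulate_forecast_spread(y_T, sigma, steps, n_sims=20000):
    """Monte Carlo: iterate a one-step random-walk model `steps` times
    and return the standard deviation of the resulting forecasts."""
    finals = []
    for _ in range(n_sims):
        y = y_T
        for _ in range(steps):
            y += random.gauss(0.0, sigma)  # one-step innovation
        finals.append(y)
    mean = sum(finals) / n_sims
    var = sum((f - mean) ** 2 for f in finals) / n_sims
    return var ** 0.5

sd1 = simulate_forecast_spread(100.0, sigma=2.0, steps=1)  # T -> T+1
sd2 = simulate_forecast_spread(100.0, sigma=2.0, steps=2)  # T -> T+2
print(round(sd1, 2), round(sd2, 2))  # spread grows roughly by sqrt(2)
```

Any decision framework that rolls the model forward would have to discount the T + 2 forecast for this wider spread, which is exactly the open question raised above.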


Chapter 3 Model Selection for Improved Forecasting

3.1 Introduction
People participate in online bidding day and night and from all over the world, in a competitive fashion that sometimes results in price advantages for the consumer. However, given a choice between several hundred or thousand identical (or similar) options, all closing at different times, how can a consumer decide – in an efficient manner – which option results in the lowest possible price? We address this question by developing a forecasting model for auction closing prices. Such models could alleviate the bidder’s decision process by, e.g., ranking all available auctions by their predicted price (see Chapter 2 for details). The bidder could then focus his or her bidding efforts only on those, say, K auctions with the K lowest predicted prices, which greatly reduces the number of irrelevant choices and improves the efficiency of the search task. One such model was developed in Chapter 2. To build that model, we first create many features, including static, time-varying, and price-dynamics features for focal and competing auctions, to capture the many different pieces of information from the market that can affect the outcome of an auction; we then conduct model selection to find our final model. However, the criterion on which we based that model selection is rather intuitive.
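The ranking step described above can be sketched in a few lines; the dictionary of predicted closing prices is purely hypothetical:

```python
def top_k_auctions(predictions, k):
    """Rank candidate auctions by forecasted closing price (ascending)
    and return the k cheapest ones for the bidder to focus on."""
    ranked = sorted(predictions.items(), key=lambda kv: kv[1])
    return [auction_id for auction_id, _ in ranked[:k]]

# Hypothetical forecasted closing prices for four open auctions.
predicted = {"a1": 203.5, "a2": 188.0, "a3": 197.2, "a4": 191.9}
print(top_k_auctions(predicted, k=2))  # ['a2', 'a4']
```

The quality of this ranking depends entirely on the forecasting model that produces the predicted prices, which motivates the model selection problem studied in this chapter.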

Model selection is quite challenging in this setting because the forecasting goal differs from that of classical time series analysis. In classical time series analysis, one typically wants to forecast one particular time point of one particular series, such as sales at the end of the fourth quarter or gasoline prices at the beginning of January 2009. This is different in the context of online auctions. In the auction context, one needs to forecast an entire time window of a stream of simultaneous auction processes. For example, suppose a bidder discovers the need for a product, such as a Palm hand-held device, on 4/14 and decides to purchase this item within the next 12 hours. There are many qualifying auctions available in this online market; some may close within the next hour, while others remain open for another 9 or 10 hours. Thus, the bidder needs to predict the outcome of each auction that closes within the next 12 hours. In other words, we need a forecasting model that predicts well not only at the beginning of the 12 hour time window or at its end, but during its entire 12 hour duration. Classical model selection criteria such as AIC or BIC optimize model performance for only one time point and are thus not directly applicable to our situation. In this study, we investigate different approaches to overcome these challenges. We address the problem of model selection for a continuous time interval by computing an entire distribution of a model selection criterion rather than only a point value. In Chapter 2, we intuitively used the average BIC score over the time interval as the selection rule. We now investigate different ways to summarize this distribution for decision making and the impact of different distribution summaries on the prediction task. We find that while the volatility of AIC’s or BIC’s distribution over the

prediction window results in extremely poor performance, their extremes work very well. We also find that both price-dynamics and competition features play a crucial role in forecasting an online auction. This work is currently under review at the Journal of Business and Economic Statistics. This chapter is organized as follows. In the next section, we briefly restate how to capture all potentially relevant information, both from within an auction as well as from simultaneously competing auctions, as we did in Chapter 2. Section 3.3 proposes an idea for performing model selection with the goal of making good forecasts across a continuous time interval rather than only at a single time point. In Section 3.4, we conduct empirical studies using the Palm data set (see Appendix A for a data description) to compare the different approaches, both in terms of the different models they select and in terms of their actual predictive capabilities. We conclude with further remarks in Section 3.5.

3.2 Create Features to Capture Important Information
Many different pieces of information potentially matter for the outcome of an ongoing auction. For example, we discussed ways to capture dynamics and competition information for forecasting in Chapter 2. We now summarize the features that are created to capture this information in order to forecast the outcome of an auction. The outcome of an auction may be affected by what happens within that auction. We therefore create a set of features to capture the information from within


the focal auction, including static information (such as the condition of the product or the rating of the seller), time-varying information (such as the number of bidders and time left), and price-dynamics (e.g. price-velocity and -acceleration). Besides what happens within an auction, the outcome of an auction may also be affected by what happens outside, that is, in simultaneous auctions that all sell the same (or a similar) product and thus compete for the same bidders. For instance, the seller ratings, the current prices or the number of bidders in those simultaneous auctions could all affect the outcome of the focal auction. Therefore, we create another set of features to capture the information from competing auctions. To this end, we use the average of the features in concurrent auctions to capture the average market condition and use the standard deviation to capture the volatility of the market. For example, price competition is given by the average and standard deviation of prices in concurrent auctions, the average price-velocity in concurrent auctions is defined as the average of the corresponding price-velocities, and so on. A complete list of the created features can be found in Table 2.2 in Chapter 2, and a more detailed description of the creation of all features is given in Sections 2.3.1-2.3.2 in Chapter 2. The competition features can be categorized, by the information that they carry, into static competition features, time-varying competition features, and price-dynamics competition features. Moreover, our created features incorporate both price dynamics and competition information, which is necessary for accurate forecasting in this extremely dynamic and competitive environment.
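A minimal sketch of how such competition features might be assembled from a market snapshot follows. The field names and snapshot format are illustrative assumptions, not the dissertation's actual data structures:

```python
import statistics

def competition_features(focal_id, snapshot):
    """For a focal auction, summarize each feature across its concurrent
    auctions by the average (market level) and standard deviation (volatility)."""
    others = [a for a in snapshot if a["id"] != focal_id]
    features = {}
    for name in ("price", "velocity", "n_bids"):
        values = [a[name] for a in others]
        features[f"c.{name}.avg"] = statistics.mean(values)
        features[f"c.{name}.vol"] = statistics.stdev(values)
    return features

# Hypothetical market snapshot at one point in time.
snapshot = [
    {"id": "focal", "price": 120.0, "velocity": 1.5, "n_bids": 7},
    {"id": "x",     "price": 110.0, "velocity": 0.9, "n_bids": 5},
    {"id": "y",     "price": 130.0, "velocity": 2.1, "n_bids": 9},
]
feats = competition_features("focal", snapshot)
print(feats["c.price.avg"], feats["c.price.vol"])
```

In practice these summaries would be recomputed at each time point, since the set of concurrent auctions (and their prices and dynamics) changes as auctions open and close.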

Table 2.2 in Chapter 2 shows that over 30 different variables are candidates for our forecasting model. Thus, an important first step in our modeling efforts will be the selection of a parsimonious subset of relevant variables.

3.3 Model Selection for Auction Markets

Figure 3.1: Illustration of the modeling task. (Schematic: price paths of three auctions that start at different times; one ends at T + 1 hour, one at T + 3 hours, and one at T + 8 hours, all within the decision window that begins at time T.)

Our task is to find a model for the auction market. As pointed out earlier, the market consists of all auctions that sell the same (or a similar) item (during a certain period of time). Take Figure 3.1 for illustration. In that market, we have 3 auctions selling the same item. What complicates the modeling task is that all auctions start (and hence end) at different times: one auction ends in the next hour, another one ends in 3 hours and the third auction ends in a little more than 8 hours. At time T, the bidder wants to make a decision on which of the three auctions to bid on. Since


the bidder’s decision window extends 8 hours into the future, the market model must consider all time points inside that window. The implication for model selection is that we need a model that works well not only at a single time point, but across the entire time window. Thus, our goal is to find a model that best describes price in this market, taking into account the effect of competition between auctions as well as differences in price dynamics across auctions. Forecasting an entire time window is challenging since all statistical models are geared towards a single time point only. Take for illustration the (intentionally simple) time series model

y_{T+1} = y_T + ε_T.    (3.1)

(The same argument would apply for more complex models also.) Model (3.1) implies that, given information up to time point T, we can forecast the response at time (T+1). However, model (3.1) also implies that we can only forecast the response at (T+1), and not at (T + 1/2) or at (T + 2/3). Thus, the best model selected (e.g. using model selection criteria such as AIC or BIC) is optimal only for time steps of length Δ := (T + 1) − T, and not for any time steps that are shorter (or longer). As we have argued above, such a model is not very meaningful for the eBay bidder! We propose to investigate new model selection criteria that can overcome this challenge. Model selection has been researched in the statistics literature for a while [13] and it is receiving increasing attention today with the availability of more and more data sets featuring larger and larger numbers of variables [30]. Our goal


is to find a model that, given a desired time T of decision making, predicts well universally across an entire time window, say, [T, T + Δ], where Δ could be as small as a few hours or as long as a few days, depending on the time frame within which the bidder wants to place a bid. We describe our idea next.

3.3.1 Model Selection for Time Windows
Classical model selection criteria, such as AIC or BIC, are geared towards models such as the one in equation (3.1) and thus produce only pointwise-optimal results. We propose to generalize this idea to apply to entire time windows and thus to produce a distribution of model selection results.

Figure 3.2: Distribution of the model selection criterion. (Schematic: AIC(τ) plotted over the decision window of length Δ, evaluated at time increments T + τ_i, ..., T + τ_k.)

The basic idea is illustrated in Figure 3.2. Suppose we have a model of the form

y_{T+τ} = f(y_T, X_T) + ε_T,    τ ∈ [0, Δ],    (3.2)

where T denotes the time of decision making and τ denotes the time increment which we would like to predict. (While f() could denote any functional relationship between response and predictors, we consider linear models in our application.) Note that we let this time increment vary in the interval [0, Δ], where Δ corresponds to the length of our decision window. For each time increment τ, we can compute a corresponding model selection criterion. Let us assume (for the moment) that we choose Akaike’s information criterion (AIC) for decision making. (Later, we will also consider alternate model selection criteria such as the Bayesian information criterion, BIC.) Then, for each τ ∈ [0, Δ], we compute AIC(τ) and thus obtain a distribution of model selection values. This distribution is indicated by the solid black line in Figure 3.2. Note that we can only compute AIC(τ) for a training set, that is, for a set of data for which we know all values at T as well as at T + τ. Later, we will apply the model to a holdout set, that is, a set of data for which we only know values at T and we wish to predict future values at T + τ. Our objective is to select, among a set of candidate models, a model that performs well across all time increments τ ∈ [0, Δ]. In practice, there are two challenges associated with this objective. First, AIC(τ) is measured over the continuous interval [0, Δ]. Since we cannot evaluate AIC(τ) over a continuous interval, we first select a fine grid 0 ≤ τ_1 < τ_2 < · · · < τ_n ≤ Δ and then compute AIC(τ_i) for all


i ∈ {1, 2, . . . , n}. The second challenge is that requiring a model to perform uniformly better than all other models does not lead to any results in our application. In other words, let M_k, k = 1, . . . , K, denote a set of candidate models and let AIC_{M_k}(τ_i) denote the k-th model’s AIC value at time increment τ_i. Then, in our application, we may not find any model, say M*, for which

AIC_{M*}(τ_i) ≤ AIC_{M_k}(τ_i),    ∀i = 1, . . . , n.

Thus, we resort to an approach where we do not require uniformly better performance, but rather performance that is better as measured by an appropriate summary statistic of AIC(τ). We consider several different summary statistics, such as the average, the extremes, and the variance, to elicit a model that performs well across the entire interval [0, Δ]. More specifically, we compute, over i ∈ {1, 2, . . . , n},

AICavg := (1/n) Σ_{i=1}^{n} AIC(τ_i)    (3.3)
AICmed := Median_i {AIC(τ_i)}
AICmin := Min_i {AIC(τ_i)}
AICmax := Max_i {AIC(τ_i)}
AICsd := Standard Deviation_i {AIC(τ_i)}
AICmean+sd := AICavg + AICsd
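The six summaries in Equation (3.3) can be computed directly once AIC(τ_i) has been evaluated on the grid; the sketch below uses hypothetical AIC values for one candidate model:

```python
import statistics

def summarize_criterion(aic_values):
    """Given AIC(tau_i) evaluated on a grid of time increments,
    compute the six distribution summaries of Equation (3.3)."""
    avg = statistics.mean(aic_values)
    sd = statistics.stdev(aic_values)
    return {
        "avg":     avg,
        "med":     statistics.median(aic_values),
        "min":     min(aic_values),
        "max":     max(aic_values),
        "sd":      sd,
        "mean+sd": avg + sd,
    }

# Hypothetical AIC values of one candidate model at tau = 1..12 hours.
aic_on_grid = [210.4, 208.9, 211.2, 215.0, 213.3, 212.8,
               214.1, 216.5, 213.9, 212.2, 211.7, 210.9]
summary = summarize_criterion(aic_on_grid)
print(summary["max"], summary["min"])  # 216.5 208.9
```

To compare candidate models under, say, the minimax-style criterion, one would compute this summary for each model's AIC grid and pick the model with the smallest "max" entry.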

The rationale for investigating these 6 different summary model selection criteria is as follows. Both AICavg¹ and AICmed consider a model’s central performance and are as such natural candidate selection criteria. However, a model that performs well on average may not be the best model for decision making. To that end, AICmin and AICmax consider a model’s extreme performance. While AICmin points out a model’s best performance, AICmax gauges its worst performance. In that sense, choosing AICmax for model selection finds the model that minimizes the worst loss – very similar in spirit to a minimax criterion. On the other hand, AICmin is the most optimistic selection criterion, which identifies the time increment τ and associated model with the best performance. The next model selection criterion, AICsd, gauges the volatility of a model’s predictive performance over the interval [0, Δ]. The rationale is that models with less volatility will, in general, lead to more stable decision making. The last criterion, AICmean+sd, combines the effect of good average performance (i.e. small AICavg) and little volatility (i.e. small AICsd). Since we have no a-priori knowledge of which of these 6 model selection criteria will perform best, we will let the data speak.

3.3.2 Variable Pre-Selection and Multicollinearity
As we saw in Table 2.2, we have 33 different candidate variables for our forecasting model. Our goal is to select the subset of these 33 variables that results in the best-possible model. There are 2^33 − 1 = 8,589,934,591 different ways of selecting a subset of m, 1 ≤ m ≤ 33, variables – too many to enumerate manually. Moreover, in order
¹ Note that, in contrast to the definition of AICavg in Chapter 2, all time points, including those where AIC (or BIC) is not available due to insufficient data for modeling, are now counted in this definition of AICavg.


to gauge a model selection criterion’s entire distribution as in Figure 3.2, we need to evaluate each candidate model on a fine grid 0 ≤ τ_1 < τ_2 < · · · < τ_n ≤ Δ, which becomes computationally infeasible as n becomes larger. Our goal here is not to derive a computationally efficient algorithm for this task (which would indeed be an interesting challenge for future research), but rather to illustrate the performance of the different model selection summary criteria in Equation 3.3. Therefore, we first preselect a subset of more realistic candidate variables from the total of 33, and then perform exhaustive search on the remaining variables. We select this subset of more realistic candidate variables by considering potential multicollinearity among predictors.

Figure 3.3: Pairwise mean correlation among the 33 variables. The x- and y-axes denote the variable index (between 1 and 33); the color corresponds to the strength (between -1 and 1) of the pairwise correlation.

To that end, we investigate the pairwise correlations of all 33 variables. Note that this correlation may change, depending on the density at which a variable is


measured. Therefore, in similar fashion to above, we first evaluate each variable at different densities, T + τ_i, 0 ≤ τ_i ≤ Δ, and then compute pairwise correlations between pairs of variables at each density level. After that, we summarize the correlations in the same way as in Equation 3.3. Figures 3.3 and 3.4 show image plots of the corresponding pairwise mean, minimum and maximum correlations, respectively.

Figure 3.4: Pairwise minimum (left panel) and maximum (right panel) correlation among the 33 variables. The x- and y-axes denote the variable index (between 1 and 33); the color corresponds to the strength (between -1 and 1) of the pairwise correlation.

We can see that many of the variables are highly correlated. For instance, we find high collinearity between the current price and c.price.avg, the current price and c.price.vol, the current price and c.price.disc, the current price and time left, the current price and c.t.left.avg, the current price and number of bids, the current price and c.nbids.avg, price-velocity and c.vel.avg, price-velocity and c.vel.vol, and price-acceleration and c.acc.avg. This high collinearity is not surprising since many

of these predictors carry similar information, only coded in a slightly different way. We eliminate all highly collinear predictors. In particular, for each pair of collinear predictors, we keep the variable that measures the price or price-dynamics of the focal auction (i.e. current price, price-velocity or price-acceleration). While we could also keep alternate variables, this approach results in the most parsimonious model. In that fashion, we retain 6 variables for further analysis; these 6 variables are listed in Table 3.1.

3.4 Results

3.4.1 Model Selection
Table 3.1 lists the 6 candidate variables that we retain for further analysis, including both information from within the auction (price, price-velocity and -acceleration) and from outside (volatility of the acceleration, time left and number of bids of competing auctions). For ease of notation, we refer to each variable by a number. For instance, variable #1 refers to the price, variable #2 refers to the price-velocity, and so on. We now enumerate all possible models based on these 6 variables; that is, we evaluate a total of 63 different models. For each model, we evaluate the 6 different summary model selection criteria in equation (3.3), using Δ = 12 hours and a grid 0 ≤ τ_1 < τ_2 < · · · < τ_n ≤ Δ of n = 12 different time increments. We use a grid-density of τ_i − τ_{i−1} = 1 hour since, for our data, on average 1 bid arrives every hour (during the last 12 hours of the auction). We do this for both the AIC and BIC model selection criteria. Tables C.1-C.9 (Appendix C) list all the results.
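Enumerating and scoring all 2^6 − 1 = 63 candidate models can be sketched as below; the toy scoring function merely stands in for the summary criteria of Equation (3.3) and is purely illustrative:

```python
from itertools import combinations

def enumerate_models(variables):
    """All non-empty subsets of the candidate variables: with 6 variables
    this yields 2^6 - 1 = 63 candidate models."""
    models = []
    for p in range(1, len(variables) + 1):
        models.extend(combinations(variables, p))
    return models

def select_best(models, criterion):
    """Pick the model minimizing a summary criterion (e.g. AIC_max)."""
    return min(models, key=criterion)

variables = [1, 2, 3, 4, 5, 6]  # variable indices from Table 3.1
models = enumerate_models(variables)
print(len(models))  # 63

# Toy criterion: prefer small models that contain the price (variable 1).
toy_score = lambda m: len(m) + (0 if 1 in m else 10)
print(select_best(models, toy_score))  # (1,)
```

In the actual study, `criterion` would fit each candidate model on the grid of time increments, compute its AIC or BIC distribution, and return one of the six summaries.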

Table 3.1: Variable Index

Var Number   Var Name
1            price
2            price-velocity
3            price-acceleration
4            c.acc.vol
5            c.tleft.vol
6            c.nbids.vol

Table 3.2: Best models according to BIC (top half) and AIC (bottom half). The number of model parameters is denoted by p. The cells marked with an asterisk (shown in bold in the original) correspond to the best model within each column.

p   BICavg    BICsd     BICmed    BICmin    BICmax     BICmean+sd
1   1         2         1         1         1          1
2   1,2       2,3       1,2       1,2       1,2        1,2
3   1,2,3     2,5,6     1,2,3     1,2,3*    1,2,3      1,2,3
4   1,2,3,5   2,3,5,6   1,2,3,5   1,2,3,6   1,2,3,5*   1,2,3,5
5   1-5*      2-6*      1-5*      1-4,6     1-3,5,6    1-5*
6   1-6       1-6       1-6       1-6       1-6        1-6

p   AICavg    AICsd     AICmed    AICmin    AICmax     AICmean+sd
1   1         2         1         1         1          1
2   1,2       2,3       1,2       1,2       1,2        1,2
3   1,2,3     2,5,6     1,2,3     1,2,3     1,2,3      1,2,3
4   1,2,3,5   2,3,5,6   1,2,3,5   1,2,3,6*  1,2,3,5*   1,2,3,5
5   1-5       2-6*      1-5       1-4,6     1-3,5,6    1-5
6   1-6*      1-6       1-6*      1-6       1-6        1-6*


Table 3.2 summarizes the results from Appendix C. In each cell, the table lists the best model for a corresponding combination of (selection criterion) × (number of model parameters p). For instance, the cell corresponding to p = 1 and BICavg says that the best 1-parameter model picked by BICavg is a model with only the price (see again Table 3.1 for the index of all variables). The highlighted cell within each column corresponds to the best model across all values of p (for a specific selection criterion). For instance, the highlighted cell ({1-5}) in the first column of the top half says that BICavg picks a model with all variables (except c.nbids.vol) as the best model across all values of p. We can make several observations in Table 3.2. First, we note that the cells in the top half are identical to the cells in the bottom half; this means that both AIC and BIC pick the same model (given a certain selection criterion and fixed value of p). However, we also note that the highlighted cells in the top and bottom halves are not identical; this implies that while, for a given p, AIC and BIC result in the same model, they select different models across p. For instance, while BICavg picks a 5-parameter model, AICavg picks a 6-parameter model. (Similarly for BICmed and BICmean+sd.) Only BICsd and BICmax select the same models as their AIC counterparts. Looking across rows, it is interesting to note that, for a given p, almost all selection criteria pick the same model – the main exception being BICsd and AICsd. That is, while BICavg, BICmed, BICmin, BICmax and BICmean+sd almost always agree on the same model (and similarly for their AIC counterparts), BICsd consistently disagrees. This seems to suggest that the volatility of the model selection

distribution measures a very different aspect of model performance compared to its center or its extremes. In conclusion, we find that different selection criteria can result in different models, with the volatility of the model selection distribution showing the strongest deviations. However, we also want to point out that we have not yet determined the overall winner. The reason is that we have not yet determined which of the selected models actually results in the best forecasts. To that end, we investigate each model’s predictive performance on a holdout set. We discuss this next.

3.4.2 Prediction Accuracy
We measure each model’s predictive capabilities on a holdout set. To that end, we divide our data into a training set (80% of the data) and a validation set (the remaining 20% of the data). Since our data vary over time (and since we are primarily interested in making accurate predictions of the future), our training set consists of all auctions that complete during the first 80% of our data’s time span (i.e. between March 14 and May 10); the validation set contains all remaining auctions (i.e. between May 11 and May 25). We measure the predictive performance of a model in terms of its mean absolute percentage error (MAPE). For each T + τ_i, 0 ≤ τ_i ≤ Δ, we compute

MAPE_i = (1/m) Σ_{j=1}^{m} |y_{i,j} − ŷ_{i,j}| / |y_{i,j}|    (3.4)

where y_{i,j} and ŷ_{i,j} denote the true and predicted values of auction j at time increment τ_i, respectively, and m denotes the number of auctions available. Figures 3.5 and 3.6 show the results.
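Equation (3.4) translates directly into code; the holdout closing prices and forecasts below are hypothetical:

```python
def mape(actuals, forecasts):
    """Mean absolute percentage error over the m auctions in the holdout
    set, at one time increment tau_i (Equation 3.4)."""
    m = len(actuals)
    return sum(abs(y - yhat) / abs(y) for y, yhat in zip(actuals, forecasts)) / m

# Hypothetical closing prices and forecasts for four holdout auctions.
y_true = [200.0, 150.0, 180.0, 220.0]
y_hat  = [190.0, 153.0, 180.0, 231.0]
print(round(mape(y_true, y_hat), 4))  # 0.03
```

Repeating this computation at each grid point τ_1, ..., τ_n produces the error curves plotted in Figures 3.5 and 3.6.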

Figure 3.5: Prediction accuracy of the 6 top models ({1,2,3,5}, {1,2,3}, {1,2,3,6}, {1-5}, {1-6} and {2-6}). The left panel shows the prediction accuracy for all 6 models; the right panel shows a zoom-in on the 5 best models (leaving out the worst model of the left panel). The x-axis corresponds to the time increment τ_i, 0 ≤ τ_i ≤ 12, at which we make a prediction.

The left panel in Figure 3.5 shows the performance of the 6 best models. We can see that one model (with variables {2-6}) performs extremely poorly relative to the remaining 5 models. For that model, the prediction error is 8% for forecasting only one hour into the future; it increases to almost 32% for forecasting 12 hours into the future. Clearly, that model is not a candidate for the best predictive model; in order to get a better understanding of the remaining 5 models, we zoom in (right panel of Figure 3.5). We can see that, of these 5 models, two (with variables {1,2,3,5} and {1,2,3}) have almost identical performance. We can further distinguish the performance of these two models in Figure 3.6. We can see that model {1,2,3,5} (slightly) outperforms model {1,2,3} for most time increments (especially for small,


e.g. 1 hour, and large, e.g. 12 hour, time increments). We comment further on the difference between these two models below.

Figure 3.6: Prediction accuracy of the 5 best models from Figure 3.5. The left panel shows a zoom-in on the first 4 hours; the right panel shows a zoom-in on the last 3 hours.

3.4.3 The Winner
Let’s recap: Section 3.4.1 illustrated the performance of each individual summary model selection criterion, and we learned that while different model summaries can point to different models, some select the same model. Section 3.4.2 investigated the performance of the top 6 models, and we learned that while there is one clear loser, the two top models perform almost equally well. Now, we connect the analyses from Sections 3.4.1 and 3.4.2 and reveal which summary model selection criterion leads to the best predictive model. Table 3.3 shows the results. The first column denotes the name of each model selection criterion and the second column denotes the best model it selects. (Note


that since some selection criteria select the same model, we list more than one name in the first column.) This information is taken from the model-selection analysis in Section 3.4.1. The remaining 3 columns refer to the prediction-based ranking of each model. Since the goal is to develop a model that predicts well in the short term (i.e. for small time increments) as well as for medium to longer time increments, we present 3 different rankings: one for predicting the next hour, one for predicting 6 hours into the future and one for predicting 12 hours into the future. The rankings are taken from the prediction-accuracy analysis discussed in Section 3.4.2. We can see that BICmax and AICmax dominate: both criteria find the (same) model (with variables {1,2,3,5}) that predicts best for short (1 hour) as well as long (12 hours) time increments. These two criteria are only outdone for the medium time increment (6 hours), for which BICmin selects the best model (one which drops variable #5). It is interesting to see that while the maximum as a distribution summary performs equally well under both AIC and BIC, AICmin performs significantly worse than BICmin (and similarly for the average, median and mean + sd). Using the volatility as selection criterion (i.e. BICsd or AICsd) performs uniformly worst. We can learn from Table 3.3 that the extremes as summaries of the model selection distribution result in the best performance. Both BICmax and AICmax select the best model; we pointed out earlier that choosing the maximum – just like a minimax criterion – protects against the worst loss, so this performance may not come too much as a surprise. It is more surprising, though, that BICmin fares almost equally well; the minimum selects the most optimistic model across all time increments, which seems to suggest that there is not too much heterogeneity across

Table 3.3: Selected Models and Prediction Accuracy

                                                          Prediction Rank
Selection Criteria             Selected Model   1 hour   6 hours   12 hours
BICmax, AICmax                 1,2,3,5          1        2         1
BICmin                         1,2,3            2        1         2
AICmin                         1,2,3,6          3        3         4
BICavg, BICmed, BICmean+sd     1-5              4        5         3
AICavg, AICmed, AICmean+sd     1-6              5        4         5
BICsd, AICsd                   2-6              6        6         6

different time increments (at least for our data). The performance of the central statistics (mean, median) is most surprising since, at least intuitively, one would expect good performance from a model that is selected according to average model quality. The poor performance of the volatility as a summary measure indicates that controlling the variability of a model's quality (around its mean) does not gain much in terms of predictive accuracy.

3.5 Conclusion
In this chapter we consider model selection when the goal is to find a forecasting model that works well across an entire range of time increments, producing an entire distribution of model selection criteria. We investigate different ways of decision-making based on that distribution and find that the extremes lead to the most accurate forecasting models. There are several avenues for future research. As pointed out earlier, efficient


algorithms are necessary to perform model selection. While classical model selection can already be very computationally intensive, searching for models that work well over an entire distribution of time increments multiplies the computational burden. Moreover, it would be interesting to see if different ways of summarizing the distribution of a model selection criterion lead to better models, or if summaries can be combined in a more efficient way.


Chapter 4

Real-Time Forecasting of Online Auctions via Functional K-Nearest Neighbors

4.1 Introduction
Online auctions, such as those on eBay.com, have received a surge of popularity in recent years. This is due in part to their wide accessibility and low participation barriers, and also to the auction mechanism, which engages its participants in stimulating competitive behavior. The popularity of online auctions has led to a growth in related research, and particularly in the desire to predict the outcome of an auction before its close. Knowing the auction's closing price has several advantages for auction participants. Bidders can use this information to make more informed (and perhaps even automated) bidding decisions [57]. Sellers can use predictions to identify times when the market is more favorable to sell their products and to better evaluate the value of their inventory. Different approaches have been proposed to predict the price of an ongoing auction. We used regression-based models to forecast an auction's final price in a dynamic fashion in Chapters 2 and 3 (see also [32; 59; 96]). Common across these models is that they use information from a set of past auctions to predict an ongoing auction of interest. Moreover, for the purpose of model estimation, they weigh the


information from each past auction equally. For instance, if the goal is to predict the price of a laptop auction based on a sample of historical auctions, then estimating a regression-type model will put equal weight on the information from a Dell laptop and from an IBM laptop – which may be inappropriate if the goal is to predict an auction for a Sony laptop. While some of the brand and product differences can be controlled for using appropriate predictor variables, there might still be intrinsic differences that are hard to measure. An alternative to regression-based models, proposed by [16], is a classification and regression tree. However, the authors point out that the prediction can be poor if prices in each final tree-node vary significantly. Moreover, while trees, unlike regression, manage to partition the data in a very flexible way, their predictions, like those of regression, are also based on the un-weighted information in each final node. In this chapter, we propose a novel and flexible approach for forecasting online auction prices based on the ideas of K-Nearest Neighbors (KNN). This work has been recommended for publication in the International Journal of Forecasting with minor revision. KNN is a forecasting approach that weighs the information from each record differently, depending on how similar that record is to the record of interest. For instance, if our goal is to predict the price of an auction for a Sony laptop, then it will put more weight on information from other Sony laptops and it will down-weight the information from, say, Dell or IBM laptops. More specifically, KNN predicts a record based on the weighted average of the K nearest neighbors of that record, where the weight is proportional to the proximity of the neighbor to the predicted record. KNN has been proven to converge to the true value for arbitrarily

distributed samples [91; 23; 63], but studies show that its effectiveness is greatly affected by the choice of the number of neighbors (K) and the choice of distance metric [19; 34; 87; 62]. In the context of online auctions, the choice of the distance metric is challenging because auctions vary on many conceptually different dimensions. In particular, online auctions vary in terms of three types of information: static, time-varying and dynamic information. Static information comprises information that does not change throughout the auction. This includes product characteristics (e.g., brand, product condition), or auction and seller characteristics (e.g., auction length, whether there is a secret reserve price, or whether the seller is a powerseller). Time-varying information updates itself during the auction (e.g., the number of bids or bidders). Both static and time-varying information have been shown to be important for forecasting the auction price because differences in product or bidding characteristics all influence bidders' decisions and hence the final price. Finally, auctions also vary in terms of their dynamic information. Dynamic information refers to the price path and its dynamics. These include the price-speed and the rate at which this speed changes throughout the auction. Auction dynamics are important for forecasting the final price because an auction that experiences fast price movements in the earlier stages will likely see a slow-down in price in later stages; conversely, auctions whose price travels very slowly at the beginning often see price-accelerations towards the end (e.g. [96; 54; 85]). Auction price dynamics can be captured via functional objects such as curves. This means that bids are viewed as a discrete realization of an underlying smooth

price path. Using smoothing methods (see [88]), this price path is recovered from the discrete observations, and the smoothness of the resulting object allows gauging of dynamics via taking derivatives. In this chapter, we propose a novel functional KNN forecaster (fKNN), which combines functional and non-functional data, for forecasting price in online auctions. One challenge with functional methods is the choice of smoother. Typical smoothers include penalized splines (p-splines) or monotone splines (see Section 5.2 in Chapter 5 for more details). However, while p-splines cannot guarantee the monotonic nature of the auction price growth, monotone splines can be computationally burdensome. An alternative is to use a flexible parametric approach that can capture different types of price growth patterns. [45] proposed a set of four parametric growth models for capturing price paths of online auctions (for details, see Section 5.2 in Chapter 5). In Section 4.3, we propose a parsimonious parametric form that generalizes these four growth models. Our parametric model has many appealing features such as monotonicity and computational efficiency. It is particularly important within the context of fKNN since it allows us to measure the distance between auctions' dynamics in a very parsimonious way via the Kullback-Leibler distance [12]. Our fKNN forecaster, which integrates information of various types, uses different distance metrics for each data type. In Section 4.4 we discuss the different distance measures and how they are combined into a single distance metric. We also discuss another important aspect of KNN forecasters, which is the choice of K. While choosing K too small eliminates important information, choosing

K too large results in noise that deteriorates forecast accuracy. The goal is to find a value that best balances signal and noise. [91] found that K can depend on the distribution of the data and that the optimal K often grows with the sample size. In this study, we investigate the optimal value of K as a function of different distance metrics as well as of data size and heterogeneity. The chapter is organized as follows. In Section 4.2, we introduce the two sets of eBay data used in this study and discuss their level of heterogeneity. In Section 4.3, we discuss a flexible parametric model for capturing the price path in online auctions. Section 4.4 investigates the choice of the distance metric (combining distance metrics for static, time-varying and dynamic data) and the optimal choice of K. In Section 4.5, we describe the results of applying the fKNN forecaster to the two datasets, and compare it to some competing approaches. We conclude and discuss possible extensions in Section 4.6.

4.2 Data
We use two datasets from the popular marketplace eBay. The datasets vary in terms of heterogeneity. The first dataset contains auctions that sell an identical product – a Palm Pilot M515 PDA – while the second dataset contains auctions for various laptops. Each dataset is described briefly next and in further detail in Appendix A.


4.2.1 Palm PDA Auctions
Our first dataset includes the complete bidding records for 380 auctions that transacted on eBay between March and May 2003. Each auction sold the same product, namely, a new Palm M515 handheld device. At the time of data collection, the market price of the product was about USD $230 (based on Amazon.com). Each bidding record includes the auction ID, the starting and closing times and prices, all bids with associated time stamps, and other information such as auction duration, shipping fee, seller's feedback score, whether the seller is a power seller, whether the product is from an eBay store, and whether the auction description includes a picture. All these variables contain information that can affect the final price of the auction. The complete summary statistics for these variables are presented in Table A.1 in Appendix A. We now briefly describe what aspect of the auction process the individual variables measure and how they are related to the final price. The opening price is set by the seller and is known to influence the number of bidders the auction attracts. As for the final price, eBay uses second-price auctions where the winner is the highest bidder and s/he pays the second highest bid (plus an increment). Hence the final price is equal to the second highest bid plus an increment. Auctions can vary in their duration (between 3 and 10 days, in our data), with 7-day auctions being the default. In terms of auction competition, the average number of bids is 17.45 and the average number of bidders is almost 9. The average shipping fee, set by the seller, is $15.44. This fee is often perceived as a "hidden cost". Another


piece of relevant information is the seller's feedback score, which is approximately the number of transactions that the seller has completed on eBay. A seller's feedback score often proxies for his/her credibility. In our data the highest seller rating is 27,652. We can also learn from Table A.1 that over 87% of all auctions featured a picture. Pictures carry visual information about products, thus enhancing bidders' confidence in the quality of the item. Power sellers are sellers with consistently high volumes of monthly sales, over 98% positive ratings, and PayPal accounts in good financial standing. We can see that 30% of sellers are power sellers. And lastly, sellers with feedback scores of 20 or higher, verified ID, and PayPal accounts in good financial standing are permitted to open "stores" on eBay. Stores provide easy management of accounts and improved brand boosting when the sellers have multiple items listed. In our data approximately 30% of all auctions are associated with an eBay store.

4.2.2 Laptop Auctions
While the Palm PDA dataset is very homogeneous in terms of the product sold, the second dataset consists of auctions for a collection of laptops, featuring products of many different makes and models. The data contain information on 4,965 laptop auctions that took place on eBay between May and June 2004. Table A.2 in Appendix A summarizes the data. We can see that while some auction variables are similar to those of the Palm PDA


data, others are different. For instance, Buy-It-Now auctions are listings that have the option of a fixed-price transaction and thus forego the auction mechanism. Over 20% of the laptop auctions included that feature. Moreover, a secret reserve price is a floor price below which the seller is not required to sell. This feature is particularly popular for high-value auctions. We can see that roughly 30% of all laptop auctions make use of the secret reserve price feature. The main difference between the Palm PDA data and the laptop data is that the latter include products of a wide variety of makes and models. Table A.2 shows that the data include over 7 different brands, and for each brand laptops differ further in terms of their memory size, screen size, processor speed, whether they are a new or used product, and whether or not they include an Intel chip or a DVD player. All in all, the products sold in these auctions are of a wide variety, which is reflected in the wide range of closing prices (between $445 and $1,000).

4.3 A Functional Model for Capturing Price Growth Patterns
Our fKNN forecaster includes both functional and non-functional data. By functional data we mean a collection of continuous objects such as curves, shapes or images. Examples include measurements of individuals' behavior over time, digitized 2- or 3-dimensional images of the brain, or recordings of 3- or even 4-dimensional movements of objects traveling through space and time. In our context, we consider the price path of an online auction. Such data, although often recorded in discrete fashion, can be thought of as continuous objects represented by functional


relationships. This gives rise to the field of functional data analysis [79]. Functional data consist of a collection of continuous objects. Despite their continuous nature, limitations to human perception and measurement capabilities allow us to observe these objects at discrete time points only. Thus, the first step in functional data analysis is to recover, from the observed data, the underlying continuous functional object. This is usually done with the help of data smoothing. Typical data smoothers include penalized splines or monotone splines [88]. In this chapter, we suggest a novel approach to recover the functional objects via a Beta model. The main advantage of the Beta model is that it allows us to measure distances between two functional objects via the Kullback-Leibler distance. In contrast to penalized splines, it guarantees monotonicity of the resulting functional object, which is important for modeling monotonic price growth behavior in auctions. Compared to monotone splines (which also result in monotonic representations), the Beta model is computationally much more efficient¹. Recently, [45] proposed a family of four growth models for representing auction price paths. Our approach via the Beta model generalizes this idea and includes the four growth models as special cases.

4.3.1 The Beta Model
We model an auction's price path using the Beta cumulative distribution function (CDF). The Beta distribution is a continuous probability distribution defined on the interval [0, 1] with two shape parameters, α and β, that fully determine the
¹ We use the popular R function smooth.monotone in the fda package. An alternative is to use the pcls function (and accompanying functions gam, smoothCon, and mono.con) in the mgcv package, which is computationally more efficient, but from our experience it produces inferior fits.


distribution. Its CDF can be written as

  F(x, α, β) = ∫₀^x [ u^(α−1) (1 − u)^(β−1) / B(α, β) ] du,   (4.1)

where B(α, β) is the beta function² ([1]), a normalization constant in the CDF that ensures F(1, α, β) equals unity. We model auction price paths with the Beta CDF in the following way. Let p denote the sequence of observed prices with associated time-stamps t. Since auctions can be of varying durations, we normalize the time sequence by tn = t/Duration, which yields time-stamps between 0 and 1. Similarly, auctions close at different prices, so we normalize the observed prices by pn = p/ClosingPrice, which yields values of pn between 0 and 1. The goal is then to find the values of α and β that satisfy pn = ∫₀^{tn} u^(α−1) (1 − u)^(β−1) du / B(α, β) for every element of pn and tn.

In the context of real-time forecasting, we only observe price paths up to some time T (with associated price P). We therefore estimate α and β by normalizing the time and price scales to [0, T] and [0, P], respectively (i.e. tn = t/T and pn = p/P). Estimation is done by error minimization (the algorithm for efficiently fitting the Beta model to auction data is described in detail in Section 5.3.1 in Chapter 5). Figure 4.1 shows typical paths produced by the Beta model for different values of α and β. The solid black line represents the case of rapid price growth at the beginning and at the end, but only little growth during the middle; this case would be representative of auctions with intense early and last-moment bidding, but only

² B(α, β) = ∫₀¹ u^(α−1) (1 − u)^(β−1) du

[Figure: four price paths based on the Beta CDF, with shape parameters (α, β) = (0.5, 0.5), (5, 1), (1, 3) and (2, 2); both axes run from 0 to 1.]

Figure 4.1: Typical price paths based on the Beta CDF with varying shape parameters (α, β).

little bidding activity in between – a case that is quite common on eBay. The solid gray line represents auctions that experience little bidding activity during most of the auction, with bidding picking up only towards the end. In contrast, the dotted black line corresponds to auctions with high early activity which levels off as the auction progresses. And lastly, the dashed gray line corresponds to auctions where most of the bidding occurs during the middle part (and not at the beginning or the end), a case that, while rather uncommon, occurs from time to time on eBay. One important consequence of the Beta model is its closed-form representation of price dynamics and acceleration via the first and second order derivatives of the price-time model. Besides this, there are other nice properties of the Beta model which make it advantageous over existing models. We will explore those properties in Chapter 5.
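The Beta-CDF price paths of equation (4.1) are easy to evaluate numerically. The sketch below (plain Python with a midpoint-rule approximation of the incomplete beta integral; function names are ours, not the dissertation's) traces the four path shapes of Figure 4.1 on a normalized time grid:

```python
import math

def log_beta(a, b):
    """log B(a, b) via log-gamma: the normalizing constant in (4.1)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def beta_cdf(x, a, b, n=2000):
    """F(x, a, b): midpoint-rule approximation of equation (4.1).
    The midpoint rule avoids evaluating the integrand at the endpoints,
    where it may be singular (e.g. when a < 1 or b < 1)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    h = x / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        total += u ** (a - 1) * (1 - u) ** (b - 1)
    return total * h / math.exp(log_beta(a, b))

# normalized price paths for the four shapes shown in Figure 4.1
for a, b in [(0.5, 0.5), (5, 1), (1, 3), (2, 2)]:
    path = [round(beta_cdf(t / 10, a, b), 3) for t in range(11)]
    print((a, b), path)
```

With (α, β) = (0.5, 0.5) the path rises quickly at both ends (early and last-moment bidding); with (5, 1) growth is concentrated at the end, mirroring the description in the text.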

4.3.2 Model Estimation
We estimate the Beta model for our auction data in a way that optimizes fit in both the x and y directions. In the auction context, the x direction corresponds to time, and a good fit in that direction is necessary in order to accurately capture points of different bidding activity (e.g., early or last-minute bidding). We also require our model to fit well in the y direction, which corresponds to price. A good fit in the y direction will guarantee accurate forecasts of an auction's final price, which is the main goal of this study. Since we fit the model in both x and y directions, we measure goodness of fit by examining the residual error in both directions. For the ith auction with n bids, we define the residual as

  Resid_i = (1/n) Σ_{k=1}^{n} [ 0.5 (y_k − ŷ_k)² + 0.5 (x_k − x̂_k)² ],

which is the average of the sum of squared errors in both x and y directions. Note that the smaller the residual error, the better our model represents the auction price path. Figure 4.2 illustrates the model fit for the Palm PDA data. The left panel shows the distribution of residuals for the Beta model; the other two panels show the corresponding distributions for the growth models [45] and penalized splines, respectively. We can see that the Beta model results in the best model fit, i.e. in the smallest residual error³. The results are very similar for the laptop auctions.

³ We conduct a more complete comparison of all smoothing models in Chapter 5.
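As a small illustration of the residual measure Resid_i (a sketch with made-up coordinate pairs, not code from the dissertation):

```python
def residual(xy_obs, xy_fit):
    """Resid_i: equally weighted average of squared errors in the
    x (time) and y (price) directions, as defined above."""
    n = len(xy_obs)
    return sum(0.5 * (y - yh) ** 2 + 0.5 * (x - xh) ** 2
               for (x, y), (xh, yh) in zip(xy_obs, xy_fit)) / n

# two observed bids vs. their fitted values (hypothetical numbers)
print(residual([(0.0, 0.0), (1.0, 1.0)], [(0.0, 0.1), (1.0, 0.9)]))  # ≈ 0.005
```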

[Figure: three residual (SSE) density plots – Beta model, 4-member growth model, and p-splines – each with SSE on the horizontal axis (0.00–0.20) and density on the vertical axis (0–80).]

Figure 4.2: Residual comparison for fitting three models: Beta model (left), growth model (middle), and p-splines (right).

4.3.3 Kullback-Leibler Distance
Since the fKNN forecaster uses both functional and non-functional data, we must define distance measures for both data types. While there exist standard measures for the distance between non-functional data (e.g., Euclidean distance), measuring the distance between functional data (e.g., between two curves) is more involved because of infinite dimensionality. One of the main advantages of the Beta model is that it allows us to measure the distance between two auction price paths in a very parsimonious way via the Kullback-Leibler (KL) distance. The KL distance [68] is a non-commutative measure of the difference between two probability distributions. For two distributions X and Y, it measures how Y differs from X. The KL distance is widely used in the field of pattern recognition for feature selection (e.g. [12]) or in physics for determining the states of atoms or other particles (e.g. [75]). In our case, X and Y refer to Beta distributions with parameters (α, β) and (α′, β′), respectively. The KL distance between X and Y


is then given by a very simple function of the Beta parameters [81]:

  D_KL(X, Y) = ln[ B(α′, β′) / B(α, β) ] − (α′ − α) ψ(α) − (β′ − β) ψ(β) + (α′ − α + β′ − β) ψ(α + β),   (4.2)

where B and ψ denote the Beta and digamma function, respectively ([1]). Returning to the four auctions in Figure 4.1, consider the solid black line (Beta(0.5, 0.5)) as the focal auction that we want to forecast. Using equation (4.2), the KL distance to the focal auction is 9.69 from the solid gray line (Beta(5, 1)), 6.40 from the dotted black line (Beta(1, 3)), and 7.10 from the dashed gray line (Beta(2, 2)). While the dashed gray line may, at least visually, not appear very distant from the focal auction, its distribution is in fact very different, as captured by the KL distance.

4.4 Functional K -Nearest Neighbors (fKNN)
In this section we discuss the components of our functional KNN forecaster. We start by explaining the basic forecasting idea and then discuss the two main elements of our fKNN implementation: the choice of a suitable distance metric and the choice of K.

4.4.1 Overview
Our goal is to predict the final price of an ongoing auction. Consider Figure 4.3. The solid line corresponds to the price process of an auction that is observed until time T. The dotted line corresponds to the (future) price path until the close of the auction. Our goal is to predict the closing price. As the closing price is determined by the current price plus the price increment Δf, our forecasting problem is equivalent to predicting Δf. We will therefore use fKNN to estimate Δf based on a training set of completed auctions.

[Figure: an auction's price path from the starting time to the current price at time T, and its future path up to the closing price; the gap between current price and closing price is the increment Δf.]

Figure 4.3: Illustration of the forecasting idea.

In order to estimate Δf, we look for the K most similar auctions in the training set. Consider Figure 4.4 for illustration. In that scenario, we have a training set with 6 auctions, Δ1, …, Δ6. We also have associated distances, D1–D6, between the focal auction and each of the auctions in the training set. If K equals 3, then we will estimate Δf by the weighted average of the 3 nearest auctions, in this case by the weighted average of Δ1, …, Δ3.

[Figure: the focal auction Δf in feature space, surrounded by training auctions Δ1–Δ6 at distances D1–D6; the K = 3 nearest neighbors are highlighted.]

Figure 4.4: Illustration of the forecasting scheme.

More generally, we estimate Δf as

  Δf = [ Σ_{i=1}^{K} Δi / Di ] / [ Σ_{i=1}^{K} 1 / Di ].   (4.3)
As we can see in equation (4.3), the two main elements of this approach are the choice of K and the choice of a distance metric D. We discuss these next.
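The estimator in equation (4.3) can be sketched in a few lines of code (the distances and increments below are made up for illustration; this is not the dissertation's implementation):

```python
def fknn_increment(neighbors, k):
    """Estimate the price increment Δf as the distance-weighted average
    of the k nearest training auctions, per equation (4.3).

    neighbors: list of (distance D_i, increment Δ_i) pairs; distances
    are assumed strictly positive."""
    nearest = sorted(neighbors)[:k]              # k smallest distances
    num = sum(inc / d for d, inc in nearest)     # Σ Δ_i / D_i
    den = sum(1.0 / d for d, inc in nearest)     # Σ 1 / D_i
    return num / den

# hypothetical training auctions: (distance to focal auction, observed increment)
train = [(0.5, 12.0), (1.0, 8.0), (2.0, 20.0),
         (4.0, 3.0), (9.0, 30.0), (10.0, 1.0)]
print(fknn_increment(train, k=3))  # (24 + 8 + 10) / (2 + 1 + 0.5) = 12.0
```

Closer neighbors (small D_i) receive proportionally larger weight, which is exactly the down-weighting of dissimilar auctions motivated in the introduction.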

4.4.2 Choice of Distance Metric
As pointed out earlier, online auction data comprise three types of information: static information, which does not change during the course of the auction; time-varying information, which changes during the auction; and auction dynamics, which are captured and represented by functional data. Table


4.1 summarizes the three types and the specific variables for each data type. We now discuss distance metrics for both data types.

Table 4.1: Summary of information sources characterizing online auctions

  Data Type                        Measurement Scale   Example
  Non-functional / Static          Interval            opening price, screen size, processor speed
                                   Binary              buy-it-now, reserve price, condition
                                   Categorical         brand
  Non-functional / Time-Varying    Interval            number of bids, current price
  Functional                       Functional          price-velocity, -acceleration
4.4.2.1 Static and Time-Varying Data
Static and time-varying information includes data measured on different scales (interval, binary and categorical). Following [53], we use separate metrics for each individual scale, and then combine the individual metrics into an overall distance metric for non-functional data. For binary data x_B and x′_B (e.g., an auction with the buy-it-now option vs. an auction without that feature), we define the distance as

  d_B = 1(x_B ≠ x′_B),   (4.4)

where 1 denotes the indicator function; thus d_B equals 1 if and only if x_B ≠ x′_B, and 0 otherwise.

We adopt a similar measure for categorical data. For instance, the categorical variable "brand" can assume 8 different levels (Dell, Fujitsu, Gateway, etc.), which can be coded as a vector of 7 different binary variables. Thus, each categorical variable can be represented as a set of binary variables. Let x_C and x′_C denote two vectors representing categorical data; then we define their distance, similar to equation (4.4), as

  d_C = 1(x_C ≠ x′_C),   (4.5)

which takes the value 1 if and only if x_C ≠ x′_C, and 0 otherwise. For interval-scaled data x_I and x′_I (e.g. two auctions with different opening prices), we use a scaled version of the Minkowski metric [47]:

  d_I = |x̃_I − x̃′_I| / R̃_I,   (4.6)

where x̃ denotes the standardized value of x, and R̃_I denotes the range of x̃. The advantage of the Minkowski metric is that it renders interval-scaled data onto the interval [0, 1]. Note that the maximum and minimum values of d_I are 1 and 0, respectively, which are also the values taken by the binary and categorical distance metrics in equations (4.4) and (4.5). Having metrics of comparable magnitude makes it easier to combine individual distance metrics. We combine individual distance metrics in the following way. Let x = {x1, x2, ..., xp} be a vector of p non-functional features, including binary, categorical and interval

data. We compute the overall distance between x and x′ as

  d(x, x′) = (1/p) Σ_{i=1}^{p} d_{*i},   (4.7)

where d_{*i} denotes the appropriate individual distance metric from equations (4.4)–(4.6). As an example, let x and x′ be two three-feature vectors. Specifically, x = {w/ buy-it-now, Dell, 1G memory} and x′ = {w/o buy-it-now, IBM, 1G memory}. The first, second and third features are binary, categorical, and interval scaled, respectively. Using equation (4.7), d(x, x′) = (1/3)(d1 + d2 + d3), where d1 = 1 based on equation (4.4), d2 = 1 based on equation (4.5), and d3 = 0 based on equation (4.6). The overall distance between x and x′ is therefore 2/3. Note that the definition of d in (4.7) is flexible in the sense that one can use only subsets of the available information. For instance, d^Static would refer to the distance metric using only static information, while d^Time-Varying would refer to the metric with only time-varying information. One problem with distance metrics of this type is that they may over-weigh different sources of information, depending on how elaborately each source is recorded. For instance, a data set with 100 different static features and only 10 time-varying features puts 10 times more weight on the information from static features. In order to overcome this potential bias, we follow the ideas of [14] and first scale each individual distance metric by its mean root square (MRS). MRS is a statistical measure of the magnitude of a vector. For a vector x = {x1, ..., xp}, MRS is defined as sqrt((1/p) Σ_{i=1}^{p} x_i²) [70]. We apply the same scaling to each individual distance metric and obtain

  d_s^Static = d^Static / MRS(d^Static),   (4.8)
  d_s^Time-Varying = d^Time-Varying / MRS(d^Time-Varying),   (4.9)
  d_s^Static&Time-Varying = d_s^Static + d_s^Time-Varying.   (4.10)

Note that the combined metric d_s^Static&Time-Varying now puts equal weight on both static and time-varying information.
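A minimal sketch of the combined non-functional distance of equation (4.7) together with the MRS of equations (4.8)–(4.10) (function and variable names are ours; interval features are assumed pre-standardized, and the memory range is a made-up value that does not affect this example since the values are equal):

```python
import math

def mrs(values):
    """Mean root square: sqrt((1/p) * Σ x_i²), used for scaling in (4.8)-(4.10)."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def nonfunctional_distance(x, y, scales, ranges):
    """Overall distance (4.7): average of the per-feature metrics (4.4)-(4.6).
    scales[i] is 'binary', 'categorical', or 'interval'; ranges[i] is the
    range of standardized feature i (None for non-interval features)."""
    total = 0.0
    for i, scale in enumerate(scales):
        if scale in ('binary', 'categorical'):      # indicators (4.4), (4.5)
            total += 1.0 if x[i] != y[i] else 0.0
        else:                                       # scaled Minkowski (4.6)
            total += abs(x[i] - y[i]) / ranges[i]
    return total / len(scales)

# the worked example from the text
x = (1, 'dell', 1.0)   # w/ buy-it-now, Dell, 1G memory
y = (0, 'ibm', 1.0)    # w/o buy-it-now, IBM, 1G memory
d = nonfunctional_distance(x, y, ('binary', 'categorical', 'interval'),
                           (None, None, 2.0))
print(d)  # ≈ 0.667 (= 2/3)
```

In practice one would compute a vector of such distances over the whole training set and divide it by its MRS, as in (4.8) and (4.9), before summing static and time-varying parts as in (4.10).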

4.4.2.2 Dynamics (Functional Data)
As shown in Section 4.3.3, we can measure the distance between two functional observations using the KL distance. Let (α, β) and (α′, β′) denote the Beta parameters for two different auction price paths; then their distance (when x is the focal auction) is defined as

  d_F = |D_KL(x, x′)|,   (4.11)

where D_KL(x, x′) is defined in equation (4.2). Note that d_F ranges within [0, +∞) as the KL distance assumes values on the real line. In order to make d_F comparable with the non-functional distance measures, we again scale it using the MRS transformation. Thus we obtain

  d_s^Dynamics = d_F / MRS(d_F).   (4.12)


4.4.2.3 Optimal Distance Metric
To determine which combination of individual distance metrics leads to the best forecasting model, we investigate a series of different distance metrics. In particular, we investigate performance using five different metrics: d_s^Static, d_s^Time-Varying, d_s^Dynamics, d_s^Static&Time-Varying, or d_s^All, where d_s^All = d_s^Static&Time-Varying + d_s^Dynamics.

We first determine the optimal metric based on a validation set and then investigate the predictive accuracy of the resulting fKNN forecaster on a test set.

4.4.3 Choice of K
The second important component of fKNN is the choice of K, the number of neighbors from which the forecast is calculated. Too small a value will filter out relevant neighbors; too big a value will introduce noise and weaken the prediction. [91] finds that the optimal value of K is data-dependent, and it usually grows with the sample size. In addition, K may also vary as different distance metrics are used. Therefore, we select the optimal value of K separately for each distance metric and each data set. To do so, we again select the best value of K based on a validation set; we then apply the resulting model to the test set.
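The validation-based selection of K described above amounts to a simple grid search. The sketch below (our own names; `predict` stands in for any forecaster that maps an auction and a neighbor count to a price, and accuracy is measured by mean absolute percentage error):

```python
def select_k(predict, validation, k_grid):
    """Return the K (and its error) minimizing the mean absolute
    percentage error on the validation set.

    predict(auction, k): forecasted final price using k nearest neighbors.
    validation: list of (auction, true_final_price) pairs."""
    best_k, best_err = None, float('inf')
    for k in k_grid:
        err = sum(abs(true - predict(a, k)) / true
                  for a, true in validation) / len(validation)
        if err < best_err:
            best_k, best_err = k, err
    return best_k, best_err
```

The chosen K is then fixed, and the resulting model is evaluated once on the held-out test set.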

4.4.4 Forecasting Scheme
Our complete forecasting process includes determining the optimal distance metric and the optimal value of K . We determine both based on a validation set. Then, using the optimal metric and K , we estimate the fKNN model based on


the records in the training set. We investigate the performance of that model by predicting a new focal auction using auctions from a test set.

4.4.5 Comparison With Alternate Methods
We benchmark our fKNN forecaster against two other very popular prediction methods: parametric regression models and nonparametric regression trees (CART). In a linear regression model, the closing price is modeled as a linear function of the observed predictor information. This information can include some or all of the three types of data from Table 4.1. Note that in such models, all auctions from the training set are weighed equally when estimating the model coefficients. CART forecasting takes a hierarchical approach. It recursively partitions the data into smaller sub-groups; the focal auction is then forecasted based on the average of the most relevant sub-group. While CART, like KNN, uses neighboring information from similar auctions, it weighs each auction equally, which is one major aspect in which it differs from KNN. We discuss differences in prediction performance next.

4.5 Results
We now discuss the predictive performance of our functional KNN forecaster when applied to the two datasets of eBay auctions, and compare it with competing approaches. We also investigate the optimal distance metric and the optimal value


of K. The two datasets, Palm PDAs and laptops, differ in their level of heterogeneity. While the Palm PDA dataset is very homogeneous, the laptop data are very heterogeneous. We also investigate different time horizons; that is, we investigate forecasting different distances into the future. We split each of the two datasets into a training set (50% of the auctions), a validation set (25%) and a test set (25%). We split the data according to the temporal nature of our prediction task. That is, auctions in the training set transact prior to those in the validation set, and auctions in the test set transact after those in the validation set. Therefore, our experiments mimic the prediction task that real bidders face. For the competing models (regression and CART), we train the models on the combined training and validation set, and then test their predictive performance on the test set. We evaluate all models using the mean absolute percentage error (MAPE)
$$\mathrm{MAPE} := \frac{1}{N}\sum_{i=1}^{N}\left|\frac{y_i - \hat{y}_i}{y_i}\right|, \qquad (4.13)$$

where $y_i$ and $\hat{y}_i$ denote the true and estimated final price in auction $i$, respectively.
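Equation (4.13) translates directly into code; a minimal sketch:

```python
def mape(actual, predicted):
    """Mean absolute percentage error, as in Equation (4.13)."""
    assert len(actual) == len(predicted) and actual
    return sum(abs(y - yhat) / y for y, yhat in zip(actual, predicted)) / len(actual)

mape([100.0, 200.0], [90.0, 210.0])  # (0.10 + 0.05) / 2 = 0.075
```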

4.5.1 Selecting the Optimal K and the Optimal Distance Metric
We select the optimal value of K in the following way. Recall that we have 5 candidate metrics, $D \in \{d_s^{Static}, d_s^{Time\text{-}Varying}, d_s^{Dynamics}, d_s^{Static\&Time\text{-}Varying}, d_s^{All}\}$. For each metric, we select a value of K from the set $K \in \{1, 2, \ldots, 100\}$. For each

combination of $(D \times K)$, we estimate the corresponding fKNN model on the training set, then measure its predictive accuracy (in terms of MAPE) on the validation set. Figure 4.5 shows the results. The left panel shows the results for the laptop auctions; the right panel shows the corresponding Palm PDA results. The top panel shows an overview; the bottom panel zooms in on the most relevant part. From the left panel in Figure 4.5 (laptop auctions) we can see that $d_s^{Dynamics}$ results in the worst model performance, regardless of the value of K. In other words, using only the dynamic information of the price path is not sufficient for achieving good prediction accuracy. We also see that, of the remaining 4 distance metrics, $d_s^{All}$ yields the uniformly lowest prediction error. This suggests that for laptop auctions, due to their diversity in makes and models, every single piece of auction information is necessary to achieve good prediction accuracy. Moreover, we note that for $d_s^{All}$, the lowest prediction error is achieved at K = 41. We conclude that $D = d_s^{All}$ together with K = 41 results in the best predictions. The story is somewhat different for the Palm PDA data (right panel in Figure 4.5). For those data, $D = d_s^{Time\text{-}Varying}$ results in the uniformly lowest error (across all distances). Moreover, choosing K = 2 optimizes that distance. It is interesting that the two different data sets result in very different choices for K and D. While for the laptop data, we need all auction information (using the distance metric $d_s^{All}$) and a very large neighborhood (via K = 41), the Palm PDA auctions require only the time-varying information of the auction process (using $D = d_s^{Time\text{-}Varying}$) and a very small neighborhood (via K = 2). One possible explanation lies in the difference in heterogeneity between the two data sets. In the homogeneous

Figure 4.5: Optimal values of K and D. The left panel shows the results for the laptop auctions; the right panel shows the corresponding Palm PDA results. The top panel gives an overview; the bottom panel zooms in on the most relevant part.

data (Palm PDA), all products are the same and differences in auction outcome will be mostly due to differences in the current price and the level of competition for that product. The competition level is reflected in the number of bids and bidders, which, together with the price level, are captured in $d_s^{Time\text{-}Varying}$. Moreover, since products are very homogeneous, we only need a very small neighborhood, thus K = 2. This is different for the laptop auctions. In that data set, products are very heterogeneous, thus the forecaster needs all available information (in $d_s^{All}$) to distinguish between more and less relevant samples. Since the products are very different, the method also requires a larger neighborhood, which leads to a larger value of K. This suggests that, as expected, forecasting more heterogeneous auctions is a more difficult task.

4.5.2 Robustness of Optimal D and K to the Time Horizon
In the previous section, we investigated the interplay of K and D for a fixed time horizon of 1 minute. That is, we assumed that we observe the auction until 1 minute before its close. We now investigate the robustness of this choice for different time horizons. Specifically, we investigate the robustness of K and D for different time horizons (ΔT) in the set ΔT ∈ {2h, 1h, 30 min, 15 min, 5 min, 1 min}.

4.5.2.1 Robustness of Optimal K
Figure 4.6 investigates the robustness of K to the choice of ΔT. For a given value of K (K ∈ {20, 40, 60, 80, 100} for the laptop data and K ∈ {2, 5, 10, 50, 100}


for the Palm PDA data), we investigate the prediction accuracy for different values of ΔT. We hold D fixed at $D = d_s^{All}$ for the laptop data and $D = d_s^{Time\text{-}Varying}$ for the Palm data. Figure 4.6 shows the relative prediction error $\mathrm{Rel.MAPE}_K := \mathrm{MAPE}_K / \mathrm{MAPE}_{K^*}$, relative to a benchmark value ($K^* = 40$ for the laptop data, $K^* = 5$ for the Palm PDA data). We see that for the laptop data (top panel in Figure 4.6), lower values of K (K = 20) lead to poor performance. We also see that while K = 40 generally leads to good forecasting accuracy, it is outperformed by higher K-values when forecasting time horizons of 30 or 15 minutes. This suggests that the value of K is not very robust to the time horizon. It is even less robust for the Palm PDA data (bottom panel in Figure 4.6), where K = 5 leads to good forecasting performance only for very long time horizons (ΔT = 2h); in contrast, choosing K = 2 leads to the best performance for very short horizons (ΔT = 1 min). This suggests that the choice of K should be a function of ΔT. Table 4.2 lists the optimal value of K for each combination of ΔT and D.
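The relative error plotted in Figure 4.6 is simply each MAPE divided by the benchmark's; a minimal sketch (the MAPE values below are illustrative, not the dissertation's results):

```python
def relative_mape(mape_by_k, k_star):
    """Rel.MAPE_K := MAPE_K / MAPE_{K*}: the error of each K relative to
    a benchmark K* (K* = 40 for the laptop data, K* = 5 for Palm PDAs)."""
    base = mape_by_k[k_star]
    return {k: m / base for k, m in mape_by_k.items()}

relative_mape({20: 0.050, 40: 0.040, 60: 0.038}, k_star=40)
# K = 40 maps to exactly 1.0; the other values are scaled relative to it
```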

4.5.2.2 Robustness of Optimal D
We now investigate the impact of the time horizon ΔT on the choice of the distance metric D. Figure 4.7 shows the prediction accuracy as a function of the time horizon ΔT for different choices of D. Note that for each combination of D and ΔT, we use the optimal value of K from Table 4.2. The left panel in Figure 4.7 corresponds to the laptop data; the right panel is


Figure 4.6: Relative prediction accuracy for different values of K at different time horizons ΔT. The left panel corresponds to the laptop data; the right panel corresponds to the Palm PDA data.

Figure 4.7: Comparison of different distance metrics. The left panel is for laptop auctions; the right panel is for Palm auctions.

Table 4.2: Optimal choice of K for different distance metrics D and different time horizons ΔT. The top panel corresponds to the laptop data; the bottom panel is for the Palm PDA data.

Laptop Data
Time Horizon           2h    1h    30min  15min  5min  1min
static                 95    94    99     99     81    14
time-varying           31    27    97     100    91    89
dynamics               100   100   100    100    100   100
static&time-varying    40    79    100    96     47    44
all                    33    77    98     100    44    41

Palm PDA Data
Time Horizon           2h    1h    30min  15min  5min  1min
static                 52    69    63     63     61    37
time-varying           3     10    4      1      1     2
dynamics               94    95    95     29     29    68
static&time-varying    7     18    32     52     12    8
all                    6     40    30     61     11    2

for the Palm PDA data. Each line corresponds to a distance metric $D \in \{d_s^{Static}, d_s^{Time\text{-}Varying}, d_s^{Dynamics}, d_s^{Static\&Time\text{-}Varying}, d_s^{All}\}$. We can see that, for each data set, a single distance metric yields the consistently best result across all values of the time horizon. That is, $d_s^{All}$ results in the best prediction accuracy for the laptop data, regardless of the value of ΔT; similarly, $d_s^{Time\text{-}Varying}$ yields the best results for all values of ΔT in the Palm PDA data. This suggests that the choice of the distance metric is very robust to the forecasting horizon, at least for a given data set. We also note that while $d_s^{Time\text{-}Varying}$ significantly outperforms all other distance metrics for the Palm PDA data, for the laptop data most choices of D (except for $d_s^{Dynamics}$) yield very comparable results, at least for short time horizons (ΔT ≤ 30 min).

4.5.3 Comparison With Alternate Prediction Methods
We evaluate the performance of functional KNN by comparing its predictive accuracy to more classical approaches: linear regression models and regression trees (CART; for pruning we used the software defaults of the R package rpart). We study the performance of all methods on the test set. Recall that we partitioned our data into a training set (50%), a validation set (25%) and a test set (25%). While we estimated the fKNN forecaster on the training set and optimized its parameters K and D on the validation set, we now compare its performance (using the optimal parameter values) on the test set. That is, for each time horizon ΔT, we use the optimal combination of K and D from the previous section. In order to make fair comparisons, we apply regression and CART using the same


information as for functional KNN. Figure 4.8 shows the results. We display the relative prediction error between fKNN and the regression model (dotted line) and between fKNN and the tree model (dashed line). We can see that fKNN generally outperforms its two competitors. In particular, for the laptop data (left panel), fKNN outperforms the tree model by as much as 40%. While the gap between the regression model and fKNN is smaller, fKNN still leads to improvements that range between 5% and 10%. The picture is similar for the Palm PDA data (right panel). While for this data set fKNN also leads to general improvements, it is curious to see that only for the longest time horizon (ΔT = 2h) are both alternate approaches competitive.

Figure 4.8: Comparison of different forecasting methods. The left panel corresponds to laptop auctions; the right panel is for the Palm PDA auctions.

Figure 4.9: Optimal values of K for the Palm PDA data at ΔT = 15 min. The left panel corresponds to the validation set; the right panel corresponds to the test set.

It is revealing to compare performance on each of the two data sets. While for the laptop data, both fKNN and regression significantly outperform CART, the gap is not as large in the Palm PDA data; in fact, for the Palm PDA data, CART and regression are comparable for almost all time horizons. The poor performance of CART on the laptop data illustrates the general problem of the method with prediction: while it often fits the training set well, it has a tendency to over-fit and thus perform poorly on the test set, especially in situations like the laptop data where the underlying population is very heterogeneous. On the other hand, functional KNN can handle heterogeneous populations well by selecting only those neighbors that are most relevant for the focal auction; in particular, compared to regression, it performs especially well for forecasting longer time horizons (one hour, two hours), which is very relevant in practical situations. Functional KNN also leads to improvements for less heterogeneous data sets such as the Palm PDA data. While the right panel in Figure 4.8 suggests that

fKNN outperforms both competitors for every time horizon, there is a sharp drop for the competitors at ΔT = 15 min. At this point, both regression and the tree model perform almost as well as fKNN. A closer investigation of this phenomenon reveals that for this time horizon, the optimal value of K (based on the validation set) equals one (see left panel in Figure 4.9); however, that value leads to very poor performance on the test set (right panel in Figure 4.9). This suggests that finding the right value of K is especially difficult for homogeneous data sets (such as the Palm PDA data). While the data-homogeneity suggests very small values of K, slight perturbations of the homogeneity can lead to much weaker results. This was already implied by the lack of robustness seen in Section 4.5.2.1.

4.6 Conclusions
In this chapter, we propose a novel functional KNN forecaster for predicting the final price of an ongoing online auction. Assuming that more similar auctions contain more relevant information for incorporation into forecasting models, we propose a novel dissimilarity measure that takes into account both static and time-varying features as well as the auction's price dynamics. The latter is obtained via a functional representation of the auction's price path. We select both the optimal distance metric and the optimal number of neighbors based on a validation set. We find that weighting information unequally yields better forecasts compared to classical methods such as regression models or trees, and this result holds for auctions of varying levels of heterogeneity. Moreover, the proposed Beta


model has many nice properties as a representation of auction price paths, besides providing distance measures for functional price curves. We explore those properties in further detail in the next chapter. Although we observe improvements of the KNN forecaster over regression and CART for auctions of varying levels of heterogeneity, our study shows that the improvement is bigger for heterogeneous data. This means that selecting the most useful information and making use of only the most relevant neighbors is especially crucial for prediction accuracy in situations where objects are heterogeneous and information is noisy. This holds not only for forecasting online auctions but also in many other forecasting situations (e.g., weather forecasting). Another finding worth noting is the robustness of the optimal distance to the time horizon. The fact that the same distance metric is optimal regardless of the time horizon implies that the most important information for making price predictions is time-invariant. This insight simplifies the process of decision making: to compute forecasts, we only need to find the optimal distance once, and this distance can then be re-used as the forecasting process proceeds. There are several ways to extend this study. While we scale the distance metrics for the different information sources to achieve equal weighting across all metrics, one could alternatively assign individual weights to individual metrics and then optimize those weights. There are also alternative ways to define distances for different data types. For example, for categorical data we can define several levels of category "similarity", such as "US brand". Then, the distance between two items can be set to 0.5 for similar categories (e.g., both US brands) or 1 for categories that are more different.
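The graded category-similarity suggestion can be sketched as follows; the brand names and the single "US brand" group below are purely illustrative:

```python
# Hypothetical similarity group for the "US brand" example from the text.
SIMILAR = {frozenset({"Dell", "HP", "Apple"})}

def categorical_distance(a, b):
    """0 for identical categories, 0.5 for categories in the same
    similarity group, 1 otherwise (the graded scheme suggested above)."""
    if a == b:
        return 0.0
    if any(a in g and b in g for g in SIMILAR):
        return 0.5
    return 1.0

categorical_distance("Dell", "HP")       # 0.5 (same illustrative group)
categorical_distance("Dell", "Toshiba")  # 1.0
```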

Another way to complement this study is by investigating alternatives to classical linear regression and trees, e.g., via weighted regression or weighted tree models, which might lead to forecasting advantages especially for heterogeneous data.


Chapter 5

A Flexible Model for Price Dynamics in Online Auctions

5.1 Introduction
One stream of online auction research has focused on the dynamics of auction price paths, which leads to a deeper understanding of the price formation process. Descriptive studies have shown that price dynamics can be very heterogeneous, even for auctions of the same product (e.g. [51; 82]). Furthermore, statistical approaches, such as functional data analysis, have been developed and employed to provide insight into price dynamics [52; 10; 86]. Price dynamics have also been shown to be instrumental for price forecasting. [96] pioneered real-time forecasting models for ongoing auctions in which price dynamics serve as important predictors, and [57] recently expanded upon this idea. Both studies show that the inclusion of dynamic information adds predictive power to the forecasting model compared to models that do not make use of such information. In summary, the price path and the price dynamics are of special interest in online auctions, and therefore developing models that can capture them effectively is both important and useful. In Chapter 4, we briefly introduced a Beta model for capturing the price path and its dynamics; there, it is employed to help measure distances between auction price paths. The Beta model is parsimonious,

flexible, and computationally efficient. In this chapter we further explore the properties of the Beta model, compare it to alternative existing models, and show its advantages both in terms of fit as well as forecast accuracy. This work has been submitted to the Journal of the Royal Statistical Society C for review. The remainder of this chapter is organized as follows: Section 5.2 presents existing models for capturing price paths and dynamics in online auctions. In Section 5.3, we introduce our Beta model and describe its properties, estimation, and advantages over alternate approaches. We then compare the Beta model to several competitors empirically in Section 5.4, in terms of fit as well as forecast accuracy. We conclude in Section 5.5.

5.2 Models for Auction Price Paths
Dynamics (e.g. velocity or acceleration) are typically computed as the first or second derivative of an underlying smooth function. However, observed bids create a non-decreasing step function with jumps at the times of bids (see e.g. Figure 2.1). Thus, in order to gauge an auction's price dynamics, one needs some smooth representation of the price path. There have been two general approaches to obtaining smooth auction price paths from observed bid data: non-parametric and parametric. Using a functional data analytic (FDA) approach, [54] employed penalized smoothing splines (p-splines) to generate smooth curves. An alternative to p-splines are monotone splines [78], which guarantee the monotonicity of the resulting curves. Finally, [45] proposed a parametric family of four distributions that capture


an auction's most typical price path shapes. Each of these three approaches yields smooth price curves, from which price dynamics are computed by taking derivatives. The first derivative captures price velocity (i.e. how fast the price is moving at any point in time); the second derivative captures price acceleration, and so forth. In the following we describe each of these three approaches and discuss their strengths and weaknesses.

5.2.1 Smoothing Splines
Penalized smoothing splines (p-splines) [88] fit a polynomial of order p. In order to control the smoothness of the fitted curve, a penalty is imposed on the estimating function. Let $\tau_1, \tau_2, \ldots, \tau_L$ be a set of knots; then a polynomial spline of order p is given by

$$f(t) = \beta_0 + \beta_1 t + \beta_2 t^2 + \ldots + \beta_p t^p + \sum_{l=1}^{L} \beta_{pl} (t - \tau_l)_+^p, \qquad (5.1)$$

where $u_+ = u\,I(u \geq 0)$ is the positive part of the function u. Many functions of this type tend to fit the data too closely (and thus model noise in addition to the signal); therefore, a roughness penalty approach is often employed, which takes into account the trade-off between data-fit (i.e., minimizing $\sum_j (y_j - f(t_j))^2$) and function smoothness. A popular measure of roughness, which measures the degree of departure from a straight line, is of the form

$$PEN_m(t) = \int [D^m f(t)]^2 \, dt, \qquad (5.2)$$

where $D^m f$, $m = 1, 2, 3, \ldots$, denotes the $m$th derivative of the function f. A highly variable function will yield a high value of $PEN_m(t)$. If the highest derivative of interest is m, then using m + 2 as the polynomial order assures m continuous derivatives. The penalized smoothing spline f minimizes the penalized squared error

$$PENSSE_{\lambda,m} = \sum (y(t) - f(t))^2 + \lambda\, PEN_m(t). \qquad (5.3)$$

When the roughness parameter is set to λ = 0, the penalty term drops out, and the function fits the data perfectly. Larger values of λ penalize the function for being curvy and, as λ → ∞, the fitted curves approach a linear regression. Smoothing splines are widely used in functional data analysis. They are advantageous in terms of their flexibility, which results in good data-fits, in terms of the ease of obtaining derivatives (i.e. dynamics), and in terms of their computational efficiency. However, there are several challenges when applying penalized smoothing splines in the online auction context. First, although bidding data are non-decreasing over time, smoothing splines do not necessarily result in monotonically non-decreasing curves. Hence, they may not truly reflect the monotonic nature of the auction price process. Second, the fitted curves are often very variable, especially at their ends, even with a heavy smoothness penalty. This is problematic in


the auction context, where the opening and closing prices are of special importance. Moreover, in a forecasting context it is crucial to obtain precise estimates at the final stages of the auction (see Chapter 2 and [96]). Finally, smoothing splines require the specification of many nuisance parameters (such as the smoothing parameter, the number and position of knots, and the polynomial order), which are often determined in an ad-hoc fashion. While one can estimate some of these parameters from the data (e.g., using cross-validation), their optimal choice is not always guaranteed.
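Equation (5.1) can be evaluated directly from its truncated power basis; a minimal sketch (the coefficient values below are illustrative, not estimated from data):

```python
def spline_value(t, betas, knot_coefs, knots, p):
    """Evaluate the order-p truncated power spline of Equation (5.1):
    f(t) = sum_j beta_j t^j + sum_l beta_{pl} (t - tau_l)_+^p."""
    poly = sum(b * t ** j for j, b in enumerate(betas))   # beta_0 .. beta_p
    trunc = sum(c * max(t - tau, 0.0) ** p                # (t - tau_l)_+^p
                for c, tau in zip(knot_coefs, knots))
    return poly + trunc

# A piecewise-quadratic example (p = 2) with one knot at tau = 1.
spline_value(2.0, betas=[0.0, 0.0, 1.0], knot_coefs=[3.0], knots=[1.0], p=2)
# t^2 + 3 (t - 1)_+^2 = 4 + 3 = 7.0
```

In penalized estimation, the coefficients would additionally be shrunk according to the roughness penalty (5.2)-(5.3).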

5.2.2 Monotone Splines
Monotone splines [78] are a natural alternative to smoothing splines in the online auction context, since they guarantee a monotone path of the resulting price process. The idea behind monotone smoothing is that monotonically increasing functions have a positive first derivative. The exponential function has this property and can be described by the differential equation $Df(t) = w(t)f(t)$; that is, the rate of change of the function is proportional to its size. Consider the linear differential equation $D^2 f(t) = w(t)\, Df(t)$. Here,

$$w(t) = \frac{D^2 f(t)}{D f(t)}, \qquad (5.4)$$

which is the ratio between acceleration and velocity. The differential equation has the following solution:

$$f(t) = \beta_0 + \beta_1 \int_{t_0}^{t} \exp\left( \int_{t_0}^{u} w(v)\, dv \right) du, \qquad (5.5)$$

where $t_0$ is the lower bound over which we are smoothing. After some substitutions (see [79]), we can write

$$f(t) = \beta_0 + \beta_1 e^{wt}, \qquad (5.6)$$

and estimate $\beta_0$, $\beta_1$, and $w(t)$ from the data. Since $w(t)$ has no constraints, it may be defined as a linear combination of K known basis functions (i.e., $w(t) = \sum_k c_k \phi_k(t)$). The penalized least squares criterion is thus

$$PENSSE_{\lambda} = \sum_i [y_i - f(t_i)]^2 + \lambda \int_0^T [w^2(t)]^2\, dt. \qquad (5.7)$$

For capturing online auction price paths and dynamics, monotone smoothing indeed solves the excess-variability and non-monotonicity problems of penalized smoothing splines. The resulting curves are better representations of a continuous non-decreasing price path, and dynamics can be computed via curve derivatives. However, some challenges remain and new ones arise. First, monotone smoothing is computationally intensive, as it relies on an iterative fitting process in which several passes have to be made through the data. Therefore, fitting a dataset of even tens or hundreds of auctions can take a long time. Second, like smoothing splines, there are many nuisance parameters to be determined (the number and location of knots and the smoothing parameter). Hence, while, at least conceptually, monotone splines are preferable to smoothing splines, they can be slow to fit even on medium-sized datasets.


5.2.3 Parametric Growth Models
To overcome the disadvantages of smoothing splines and monotone splines, [44] proposed a family of four growth models for representing the price process. They find that the shape of auction price paths can be categorized into four main types: exponential growth, logarithmic growth, logistic growth, and reflected-logistic growth. These four models not only provide a parametric fit of monotone data, but they also have appealing interpretations and are easy to estimate. We describe each of the four models next.

5.2.3.1 The Exponential Model
Exponential growth has been used to describe a variety of natural phenomena, including the dissemination of information, the spread of disease, and the multiplication of cells in a petri dish. In exponential growth, the rate of growth is proportional to the function's current magnitude; that is, growth follows the differential equation

$$Y'(t) = rY(t), \qquad (5.8)$$

or the equivalent equation

$$Y(t) = Ae^{rt}, \qquad (5.9)$$

where t denotes time and r > 0 is the growth constant. Equivalently, exponential decay, when r < 0, can model phenomena such as the half-life of an organic event. In the online auction context, exponential growth describes a price process with gradual


price increases until mid-to-late auction, and a heavy price jump towards the end.

5.2.3.2 The Logarithmic Model
Logarithmic growth is technically the inverse of the exponential function,

$$Y(t) = \frac{1}{r} \ln\left(\frac{t}{A}\right). \qquad (5.10)$$

The resulting curves are reflections of exponential growth over the line y = x. In the online auction context, such behavior occurs when early bidding quickly increases the price during the opening stages of the auction, but, because of market constraints (e.g. a market value or budget constraints), the price flattens out for the remainder of the auction. This type of price behavior tends to be rare, as most bidders do not wish to reveal their valuations early in the auction. However, inexperienced bidders, who may not completely understand eBay's proxy bidding mechanism, may place high early bids.

5.2.3.3 The Logistic Model
Logistic growth is useful for describing processes which reach a limit or a “carrying capacity”. In the context of auction prices, in many cases there are competing online and brick-and-mortar markets for the auctioned item, thereby creating a “market value” for the item.


The logistic model is given by

$$Y(t) = \frac{L}{1 + Ce^{rt}}, \qquad (5.11)$$

and the corresponding differential equation is

$$Y'(t) = rY(t)\left(\frac{Y(t)}{L} - 1\right), \qquad (5.12)$$

where L is the carrying capacity, t is time, r is the growth rate, and C is a constant. Logistic growth can also be explained in the auction context as a stretched-out "S"-shaped curve, where the price increases slowly early on, jumps up during mid-auction, and levels off towards the end of the auction. The resulting closing price is analogous to the carrying capacity L in the logistic growth function.
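As a sanity check, one can verify numerically that the logistic curve (5.11) satisfies the differential equation (5.12); a sketch with arbitrary illustrative parameter values:

```python
import math

L, C, r = 100.0, 5.0, 1.5   # arbitrary carrying capacity, constant, growth rate

def Y(t):
    return L / (1.0 + C * math.exp(r * t))   # Equation (5.11)

t, h = 0.3, 1e-6
numeric = (Y(t + h) - Y(t - h)) / (2 * h)    # central-difference Y'(t)
analytic = r * Y(t) * (Y(t) / L - 1.0)       # right-hand side of (5.12)
assert abs(numeric - analytic) < 1e-4        # the two derivatives agree
```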

5.2.3.4 The Re?ected-Logistic Model
Another common price process in online auctions is a reflected "S"-shaped curve. Such behavior can be captured by the inverse of logistic growth, or reflected-logistic growth, given by the function

$$Y(t) = \frac{\ln\left(\frac{L}{t} - 1\right) - \ln(C)}{r}. \qquad (5.13)$$

In the online auction context, this type of growth occurs when there is some early bidding that results in a price increase, followed by little to no bidding in the middle


of the auction, and then another price increase as the auction approaches its close. In particular, price spikes near the end may be caused by sniping.

5.2.3.5 The 4-Member Growth Model Family
The set of four growth models is used to approximate price paths as follows. For a dataset of auctions, each of the four models is fitted to each auction. Then, for each auction, the four estimated models are compared in terms of fit, and the best fitting model is chosen (for more on the fitting process, see [45]). Hence, the fitting process is a two-stage process. Since the family is entirely parametric, no nuisance parameters require determination. Moreover, since the family is monotone, it is well-suited for capturing auction price processes. The 4-member family of growth models is also computationally efficient compared to monotone splines, and ordinary least squares can be used for estimation. The main disadvantage of the four-model family is that it is limited to only four basic shapes (exponential, logarithmic, logistic, and reflected-logistic), which may be overly simplistic for some auction scenarios. Moreover, because the four models are not nested within a single model, comparing fit (for choosing the best model) is nontrivial. Finally, when fitting the exponential and logistic models via least squares, the models minimize error in the bid amount space. In contrast, when fitting the two reflected models (logarithmic and reflected-logistic growth) using least squares, the error minimization is done in the bid time space. A comparison


is therefore more complicated.
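The two-stage idea (fit each candidate model, then keep the best-fitting one) can be sketched for the exponential member, which reduces to a simple linear regression on the log scale; this is an illustration, not the exact fitting procedure of [45]:

```python
import math

def fit_exponential(ts, ys):
    """Least-squares fit of Y(t) = A e^{rt} on the log scale:
    ln Y = ln A + r t is an ordinary linear regression."""
    n = len(ts)
    logs = [math.log(y) for y in ys]
    tbar = sum(ts) / n
    lbar = sum(logs) / n
    r = sum((t - tbar) * (l - lbar) for t, l in zip(ts, logs)) / \
        sum((t - tbar) ** 2 for t in ts)
    A = math.exp(lbar - r * tbar)
    return A, r

def sse(model, ts, ys):
    """Squared error of a fitted curve in the bid-amount space."""
    return sum((y - model(t)) ** 2 for t, y in zip(ts, ys))

# Price observations generated from Y(t) = 10 e^{0.5 t} are recovered.
ts = [0.0, 1.0, 2.0, 3.0]
ys = [10.0 * math.exp(0.5 * t) for t in ts]
A, r = fit_exponential(ts, ys)   # A ≈ 10, r ≈ 0.5
```

In the full procedure, each of the four members would be fitted analogously and the member with the smallest error retained; as noted above, the two reflected members minimize error in the bid-time rather than the bid-amount space, which complicates the comparison.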

5.2.4 Comparison
To illustrate the differences between penalized splines, monotone splines, and the 4-member family of growth models, consider Figure 5.1, which displays the fit of the three methods to two sample auction price paths. The solid lines represent p-splines, the dashed lines represent monotone splines, and the dotted lines represent the best fit of the 4-member growth model family. We see that while p-splines can fit the data very well, they are very variable and do not capture the monotonic nature of the price path. While the 4-member growth family results in a monotone price path, it does not fit the data well. Monotone splines appear to provide the best fit in this example; however, it takes on average almost 7 seconds to fit a single monotone spline (compared to 0.02 seconds for one p-spline and 0.04 seconds for one 4-member growth model).

5.3 A New Model for Auction Price Paths: The Beta Model
In light of the shortcomings of existing models for online auction price paths and dynamics, we introduce a single parametric model that is flexible yet parsimonious for approximating price paths and their dynamics. We briefly introduced this model in Chapter 4, where the Kullback-Leibler distance based on it is used to measure the distance between two auction price paths. We now discuss model estimation and its properties in detail.


Figure 5.1: Illustration of the three existing smoothing methods. The solid lines represent p-splines, the dashed lines are for monotone splines, and the dotted lines are for growth models.

Our proposed model is based on the Beta cumulative distribution function (CDF). The Beta distribution is a continuous probability distribution defined on the interval [0, 1] with two shape parameters (α and β) that fully determine the distribution. Its CDF can be written as
$$F(x; \alpha, \beta) = \frac{\int_0^x u^{\alpha-1}(1-u)^{\beta-1}\, du}{B(\alpha, \beta)}, \qquad (5.14)$$

where B (?, ? ) is the beta function1 , which serves as a normalization constant in the CDF to ensure that F (1, ?, ? ) = 1. We use the Beta model to capture auction price paths in the following way. Let p denote the sequence of observed bids with associated time-stamps t. Since auctions can be of varying durations, we thus normalize the time sequence to a 0-1
1

B (?, ? ) =

1 0

u??1 (1 ? u)? ?1 du

123

scale by using the transformation tₙ = t/Duration; tₙ are time-stamps between 0 and 1. Similarly, because auctions close at different prices, we normalize the observed bids to a 0-1 scale by using the transformation pₙ = p/ClosingPrice; pₙ are bid values between 0 and 1. The goal is then to find the values of α and β that satisfy

pₙ = ∫₀^tₙ u^(α−1) (1 − u)^(β−1) du / B(α, β)

for every element of pₙ and tₙ. An algorithm for achieving this goal efficiently is described in Section 5.3.1.

The Beta model is very flexible in the types of curves that it can produce. It includes as special cases the four shapes of the 4-member growth model family of [45]. The top panel in Figure 5.2 shows Beta model curves for different values of α and β. The solid line represents the case where price grows rapidly at the beginning and at the end of the auction, but not in the middle, corresponding to reflected-logistic growth. The long-dashed line represents the situation where rapid growth occurs only at the end, corresponding to exponential growth. The short-dashed line shows early rapid growth, corresponding to logarithmic growth. And finally, the dotted-dashed line captures a rapid increase in price somewhere in the middle of the auction, corresponding to the logistic growth pattern.
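The CDF in (5.14) and the 0-1 normalization above can be sketched numerically. This is a minimal illustration, not the dissertation's code: the bid amounts and time-stamps below are made up, and the midpoint-rule integrator stands in for a library routine such as R's pbeta.

```python
import math

def beta_fn(a, b):
    # B(a, b) computed via log-gamma for numerical stability
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))

def beta_cdf(x, a, b, n=2000):
    # F(x; a, b): midpoint-rule integral of u^(a-1) (1-u)^(b-1) / B(a, b) over [0, x]
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    h = x / n
    total = sum(((i + 0.5) * h) ** (a - 1) * (1 - (i + 0.5) * h) ** (b - 1)
                for i in range(n))
    return total * h / beta_fn(a, b)

# Hypothetical 7-day auction: bids (in $) and their time-stamps (in days)
times = [0.5, 2.0, 5.5, 6.9, 7.0]
bids = [10.0, 45.0, 120.0, 180.0, 200.0]
t_n = [t / 7.0 for t in times]        # normalized time-stamps in [0, 1]
p_n = [p / bids[-1] for p in bids]    # bids normalized by the closing price

# The normalization constant guarantees F(1; a, b) = 1 for any shape parameters
print(beta_cdf(1.0, 2.0, 3.0))
```

Because the integrand is evaluated only at cell midpoints, the integrator also tolerates the integrable endpoint singularities that arise when α or β is below 1.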

5.3.1 Fitting the Beta Model
Fitting the Beta model to bid data can be done in a way that results in curves that fit well in two dimensions: bid time and bid amount. In the auction context both dimensions are important. In particular, a good fit in terms of the bid timing is necessary in order to accurately capture points of different bidding activity.


Figure 5.2: Beta CDF (top panel) and corresponding PDF (bottom panel) with different shape parameters (α, β); the curves shown correspond to (α, β) = (0.5, 0.5), (5, 1), (1, 3), and (2, 2).


Periods of vastly different bidding activity, e.g. early or last-minute bidding, have been documented well in the auction literature (e.g. [86]), and they are important to capture adequately. In terms of bid amounts, a model that adequately captures the bid amounts (i.e., the price at that point of the auction) is necessary for generating accurate forecasts of an auction's final price. Auction price forecasting is of practical interest, and different forecasting models have been suggested in the literature [32; 33; 96; 57; 53; 21]. The only inputs required for fitting the Beta CDF are the observed bid amounts and their associated time stamps. The resulting price path representation is characterized by only two parameters. This simplicity and parsimony distinguish the Beta model from alternative approaches. Our algorithm for fitting the Beta CDF minimizes residuals in both the bid amount and bid time dimensions simultaneously.

5.3.1.1 Beta-Fitting Algorithm
For a given auction, we estimate α and β from the observed bids as follows:

Step 1: Standardize bid amounts and bid times.
Since the range (y) as well as the domain (x) of the Beta CDF is [0, 1], we first standardize the bid amounts and bid times by the following two transformations:

y = (bid − min(bid)) / (max(bid) − min(bid))   and   x = (time − min(time)) / (max(time) − min(time)).

x and y are now bid times and bid amounts standardized within [0, 1].

Step 2: Compute α₀ and β₀, the initial values of α̂ and β̂.
Since we treat x as a Beta-distributed random variable, it is reasonable to assume that the empirical average and variance of x are close to their theoretical mean and variance. That is, mean(x) ≈ α/(α + β) and var(x) ≈ αβ/((α + β)²(α + β + 1)). Therefore, the initial values of α and β are found by solving the minimization problem

(α₀, β₀) = (α*, β*) such that DIST_A(α*, β*) = min DIST_A(α, β),

where DIST_A(α, β) = (mean(x) − α/(α + β))² + (var(x) − αβ/((α + β)²(α + β + 1)))².

Step 3: Compute α̂ and β̂.
In order to capture both the bid levels as well as the bid times, our model minimizes error in the y and x directions simultaneously. Specifically, we choose to minimize the sum of the squared residuals in the y and x directions. With the initial values α₀ and β₀ from Step 2, we solve for α̂ and β̂ through the following minimization problem:

(α̂, β̂) = (α*, β*) such that DIST_B(α*, β*) = min DIST_B(α, β),

where DIST_B(α, β) = Σ (y − pbeta(x, α, β))² + Σ (x − qbeta(y, α, β))², and pbeta and qbeta represent the cumulative distribution function and the inverse of the cumulative distribution function of the beta distribution, respectively.
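The three steps can be sketched in Python. This is a hedged illustration, not the dissertation's implementation: the midpoint-rule pbeta, the bisection qbeta, and the shrinking pattern search are stand-ins (the chapter does not specify the optimizer), and the Step 2 start values use the closed-form method-of-moments solution, which drives DIST_A to zero whenever it is feasible.

```python
import math

def pbeta(x, a, b, n=150):
    # Beta CDF by midpoint-rule integration (stand-in for R's pbeta)
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    logB = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    h = x / n
    s = sum(((i + 0.5) * h) ** (a - 1) * (1 - (i + 0.5) * h) ** (b - 1)
            for i in range(n))
    return s * h / math.exp(logB)

def qbeta(p, a, b):
    # Inverse Beta CDF by bisection (stand-in for R's qbeta)
    lo, hi = 0.0, 1.0
    for _ in range(25):
        mid = 0.5 * (lo + hi)
        if pbeta(mid, a, b) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def fit_beta(times, bids):
    # Step 1: standardize bid times (x) and bid amounts (y) to [0, 1]
    x = [(t - min(times)) / (max(times) - min(times)) for t in times]
    y = [(p - min(bids)) / (max(bids) - min(bids)) for p in bids]
    # Step 2: moment-based start values; this closed form makes DIST_A = 0
    m = sum(x) / len(x)
    v = sum((xi - m) ** 2 for xi in x) / len(x)
    c = m * (1.0 - m) / v - 1.0
    a, b = max(m * c, 0.05), max((1.0 - m) * c, 0.05)

    # Step 3: DIST_B sums squared residuals in both the y and x directions
    def dist_b(a, b):
        return (sum((yi - pbeta(xi, a, b)) ** 2 for xi, yi in zip(x, y)) +
                sum((xi - qbeta(yi, a, b)) ** 2 for xi, yi in zip(x, y)))

    # Crude shrinking pattern search started at (a0, b0); an assumed optimizer
    step = 0.5
    for _ in range(24):
        neighbors = [(a + da * step, b + db * step)
                     for da in (-1, 0, 1) for db in (-1, 0, 1)
                     if a + da * step > 0 and b + db * step > 0]
        best = min(neighbors, key=lambda ab: dist_b(*ab))
        if best == (a, b):
            step *= 0.5   # no neighbor improves: refine the grid
        a, b = best
    return a, b
```

Fitting data generated from an exact Beta CDF recovers shape parameters close to the true values, which is a convenient sanity check for any implementation of the three steps.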

The above algorithm is computationally very efficient. It takes, on average, 0.0489 seconds to fit the Beta model to one auction, which is comparable to the 4-member growth family (0.0362 seconds). Unsurprisingly, penalized splines fare better (0.0190 seconds) since they do not require any iterations. Conversely, fitting monotone splines, which does require iterative passes through the data, results in roughly 150 times larger computing times (6.9726 seconds per auction, on average).

5.3.2 Properties of the Beta Model
The Beta model shares the main properties of competing methods (p-splines, monotone splines and the 4-member growth family), but it also has several additional properties that set it apart. Like all competing methods, the derivatives of the continuous Beta curves can be used to capture price dynamics. The Beta model produces monotonically non-decreasing curves, yet it is computationally fast (using the algorithm from Section 5.3.1). Unlike non-parametric approaches, fitting the Beta model does not involve any nuisance parameters. Like the 4-member parametric growth model, the two-parameter Beta model can be used to characterize auction types (e.g. exponential, logarithmic, logistic or reflected logistic) in terms of price dynamics. The Beta model has two additional unique properties, which make it especially advantageous in the online auction context: (1) because both of its dimensions (bid time and bid amount) are derived from a probability function, the Beta summary


statistics can be used to learn about the bid timing distribution, and (2) there is an easy and straightforward way to measure pairwise distances between price paths. The latter is especially useful in the context of pairwise comparisons and dynamic forecasting as shown in Chapter 4. We discuss those model properties in detail next.
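Property (2) can be illustrated with the Kullback-Leibler distance used in Chapter 4. The sketch below, with illustrative shape parameters rather than values estimated from data, evaluates KL(p‖q) = ∫ p(u) ln(p(u)/q(u)) du between two Beta price-path models by midpoint-rule integration; whether one direction or a symmetrized version is used is a modeling choice the chunk at hand does not settle.

```python
import math

def beta_pdf(u, a, b):
    # Beta density, normalized via log-gamma
    logB = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return u ** (a - 1) * (1 - u) ** (b - 1) / math.exp(logB)

def kl_beta(a1, b1, a2, b2, n=20000):
    # KL(p || q) = integral over (0, 1) of p(u) * ln(p(u) / q(u)), midpoint rule
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        p = beta_pdf(u, a1, b1)
        q = beta_pdf(u, a2, b2)
        total += p * math.log(p / q)
    return total * h

# Distance between two illustrative price paths: Beta(2, 3) vs. its mirror Beta(3, 2)
print(round(kl_beta(2, 3, 3, 2), 4))   # -> 0.5, matching the closed form psi(3) - psi(2)
```

The distance is zero only when the two fitted price paths coincide, and it grows as the paths' shapes diverge, which is what makes it usable for the pairwise comparisons and nearest-neighbor forecasting of Chapter 4.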

5.3.2.1 Representing Price Dynamics
The Beta CDF representation of the price paths means that the price velocity, which is the first derivative of the price curve, is given by the Beta probability density function (PDF). In particular, at any given time T, the price velocity of an auction with shape parameters α and β can be computed as

Vel(t, α, β) = t^(α−1) (1 − t)^(β−1) / B(α, β),    (5.15)

where t is the normalized T on a scale of [0, 1] (t = T/duration) and B(α, β) is the beta function.

The bottom panel in Figure 5.2 plots the price velocities corresponding to the price paths in the top panel. The solid black line shows rapid dynamics at the beginning and end, but not much price activity during the middle; in contrast, the dashed gray line signals heightened price dynamics during mid-auction. Similarly, the solid gray line captures increased price-velocity towards the end while the dotted dashed line captures early price spurts. Higher order price dynamics can also be readily obtained by taking higher


order derivatives. For example, price acceleration can be computed as

Acc(t, α, β) = [t^(α−1) (1 − t)^(β−1) / B(α, β)] · ((α − 1)/t − (β − 1)/(1 − t)).

Price dynamics carry important information about the auction process [10]. Therefore, accurate approximations of price dynamics are beneficial across multiple applications. In Section 5.4 we show that the price dynamics generated via the Beta model lead to more accurate price forecasts compared to competing approaches.
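The velocity in (5.15) and the acceleration formula above are straightforward to evaluate, and the acceleration can be sanity-checked against a finite difference of the velocity. A small sketch with illustrative shape parameters:

```python
import math

def beta_fn(a, b):
    # B(a, b) via log-gamma for numerical stability
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))

def velocity(t, a, b):
    # Vel(t; a, b): the Beta PDF, i.e. the first derivative of the price curve (5.15)
    return t ** (a - 1) * (1 - t) ** (b - 1) / beta_fn(a, b)

def acceleration(t, a, b):
    # Acc(t; a, b) = Vel(t; a, b) * ((a - 1)/t - (b - 1)/(1 - t))
    return velocity(t, a, b) * ((a - 1) / t - (b - 1) / (1 - t))

# Illustrative check at normalized time t = 0.4 for (a, b) = (2, 3):
t = 0.4
print(velocity(t, 2, 3))       # the Beta(2, 3) density is 12 t (1 - t)^2, i.e. 1.728 here
h = 1e-6                        # finite-difference step for the sanity check
fd = (velocity(t + h, 2, 3) - velocity(t - h, 2, 3)) / (2 * h)
print(acceleration(t, 2, 3), fd)   # the two agree to several decimal places
```

Negative acceleration at t = 0.4 simply says that, for this shape pair, the price is still rising but its pace is already slowing past the density's mode.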

5.3.2.2 Characterizing Growth Patterns
Similar to the 4-member growth family of [45], the Beta model provides a tool for characterizing price process types. In fact, there is a one-to-one mapping between the Beta model and the four growth models via the shape parameters α and β. For example, if both α and β are smaller than 1, then the price curve is similar to the reflected-logistic model. Table 5.1 lists the relationship between the Beta model and the 4-member growth family. The implication of this relationship is that it allows us to easily characterize auctions in terms of their type of price dynamics, without the need for more specialized techniques such as functional clustering (e.g. [54]) or laborious visual examination (e.g. [44]).

5.3.2.3 Characterizing Bid Timing
The estimated Beta parameters α and β can be used to compute summary statistics which capture bid timing information. Table 5.2 gives the formulas for the

Table 5.1: Correspondence between the Beta model and the four growth models

Growth Model         Beta Model
Exponential          α > 1, β = 1
Logarithmic          α = 1, β > 1
Logistic             α > 1, β > 1
Reflected-logistic   α < 1, β < 1
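The mapping in Table 5.1 can be encoded directly. Note the hedge: only the α, β < 1 row is stated explicitly in the text; the other rows are reconstructed from the curve shapes described for Figure 5.2, and the function name and boundary tolerance below are illustrative choices.

```python
def growth_type(a, b, tol=1e-9):
    # Map fitted Beta shape parameters to the 4-member growth family of [45],
    # assuming the correspondence reconstructed in Table 5.1
    if a > 1 and abs(b - 1) < tol:
        return "exponential"         # rapid price growth only at the end
    if abs(a - 1) < tol and b > 1:
        return "logarithmic"         # rapid price growth only at the beginning
    if a > 1 and b > 1:
        return "logistic"            # rapid growth mid-auction (S-shaped path)
    if a < 1 and b < 1:
        return "reflected-logistic"  # rapid growth at the beginning and the end
    return "other"

# The four parameter pairs plotted in Figure 5.2:
for ab in [(0.5, 0.5), (5, 1), (1, 3), (2, 2)]:
    print(ab, growth_type(*ab))
```

Applied to the Figure 5.2 legend values, the classifier labels (5, 1) exponential, (1, 3) logarithmic, (2, 2) logistic, and (0.5, 0.5) reflected-logistic.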
 
