Hotel Booking Demand Datasets


Data in Brief 22 (2019) 41–49
Journal homepage: www.elsevier.com/locate/dib

Data Article

Nuno Antonio a,b,*, Ana de Almeida a,c,d, Luis Nunes a,b,d
a Instituto Universitário de Lisboa (ISCTE-IUL), Lisbon, Portugal
b Instituto de Telecomunicações, Lisbon, Portugal
c CISUC, Coimbra, Portugal
d ISTAR-IUL, Lisbon, Portugal
* Corresponding author. E-mail address: nuno miguel antonio@iscte-iul.pt

Article info
Article history: Received 5 October 2018; Accepted 26 November 2018; Available online 29 November 2018

Abstract
This data article describes two datasets with hotel demand data. One of the hotels (H1) is a resort hotel and the other (H2) is a city hotel. Both datasets share the same structure, with 31 variables describing the 40,060 observations of H1 and the 79,330 observations of H2. Each observation represents a hotel booking. Both datasets comprise bookings due to arrive between the 1st of July 2015 and the 31st of August 2017, including bookings that actually arrived and bookings that were canceled. Since this is real hotel data, all data elements pertaining to hotel or customer identification were deleted. Due to the scarcity of real business data for scientific and educational purposes, these datasets can play an important role in research and education in revenue management, machine learning, or data mining, as well as in other fields.

© 2018 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY license.

Specifications table
Subject area: Hospitality Management
More specific subject area: Revenue Management
Type of data: Text files and R objects
How data was acquired: Extraction from hotels' Property Management System (PMS) SQL databases

Data format: Mixed (raw and preprocessed)
Experimental factors: Some of the variables were engineered from other variables from different database tables. The data point time for each observation was defined as the day prior to each booking's arrival
Experimental features: Data was extracted via TSQL queries executed directly in the hotels' PMS databases, and R was employed to perform data analysis
Data source location: Both hotels are located in Portugal: H1 in the resort region of Algarve and H2 in the city of Lisbon
Data accessibility: Data is supplied with the paper

Value of the data
- Descriptive analytics can be employed to further understand patterns, trends, and anomalies in the data;
- The datasets can be used for research on problems such as booking cancellation prediction, customer segmentation, customer satiation, and seasonality, among others;
- Researchers can use the datasets to benchmark booking cancellation prediction models against results already known (e.g. [1]);
- Machine learning researchers can use the datasets for benchmarking the performance of different algorithms for solving the same type of problem (classification, segmentation, or other);
- Educators can use the datasets for machine learning classification or segmentation problems;
- Educators can use the datasets for training in either statistics or data mining.

1. Data

In tourism and travel related industries, most of the research on Revenue Management demand forecasting and prediction problems employs data from the aviation industry, in the format known as the Passenger Name Record (PNR). This is a format developed by the aviation industry [2]. However, the remaining tourism and travel industries, like hospitality, cruising, and theme parks, have different requirements and particularities that cannot be fully explored without industry-specific data. Hence, two hotel datasets with demand data are shared to help overcome this limitation.

The datasets now made available were collected with the aim of developing prediction models to classify a hotel booking's likelihood of being canceled. Nevertheless, due to the characteristics of the variables included in these datasets, their use goes beyond this cancellation prediction problem.

One of the most important properties of data for prediction models is that it must not leak future information [3]. To prevent this from happening, the timestamp of the target variable must occur after the input variables' timestamp. Thus, instead of directly extracting variables from the bookings database table, the variables' values were extracted, when available, from the bookings change log, with a timestamp relative to the day prior to the arrival date (for all the bookings created before their arrival date).

Fig. 1. Diagram of PMS database tables from which variables were extracted.
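The snapshot rule described above (take each variable's value as it stood on the day prior to arrival) can be illustrated with a minimal Python sketch. This is not the authors' TSQL query; the function name, the tuple layout of the change-log rows, and the fallback to the bookings-table value are all assumptions made for illustration.

```python
from datetime import date, timedelta


def value_day_before_arrival(bo_value, bl_entries, arrival_date):
    """Pick the value of one attribute as it stood the day before arrival.

    bo_value     -- value stored in the bookings (BO) table
    bl_entries   -- hypothetical change-log (BL) rows as (change_date, value)
    arrival_date -- the booking's arrival date
    """
    cutoff = arrival_date - timedelta(days=1)
    # Keep only alterations recorded on or before the cutoff day.
    known = [(changed, value) for changed, value in bl_entries if changed <= cutoff]
    if not known:
        # Assumption: with no earlier alteration, use the bookings-table value.
        return bo_value
    # The most recent alteration on or before the cutoff wins.
    return max(known, key=lambda row: row[0])[1]


log = [
    (date(2016, 7, 1), 2),   # booked for 2 adults
    (date(2016, 7, 20), 3),  # amended to 3 adults before arrival
    (date(2016, 8, 1), 4),   # changed at check-in: excluded from the snapshot
]
print(value_day_before_arrival(4, log, arrival_date=date(2016, 8, 1)))  # 3
```

The change made at check-in is ignored because it is dated after the cutoff, which is how a snapshot of this kind keeps information from the arrival day out of the input variables.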

Table 1. Variables.
Variable | Type | Description | Source/Engineering
ADR | Numeric | Average Daily Rate as defined by [5] | BO, BL and TR / Calculated by dividing the sum of all lodging transactions by the total number of staying nights
Adults | Integer | Number of adults | BO and BL
Agent | Categorical | ID of the travel agency that made the booking (a) | BO and BL
[…] | Integer | Day of the month of the arrival date | BO and BL
[…] | Categorical | Month of arrival date with 12 categories: "January" to "December" | BO and BL
[…] | Integer | Week number of the arrival date | BO and BL
[…] | Integer | Year of arrival date | BO and BL
AssignedRoomType | Categorical | Code for the type of room assigned to the booking. Sometimes the assigned room type differs from the reserved room type due to hotel operation reasons (e.g. overbooking) or by customer request. Code is presented instead of designation for anonymity reasons | BO and BL
[…] | Integer | Number of babies | BO and BL
[…] | Integer | Number of changes/amendments made to the booking from the moment the booking was entered on the PMS until the moment of check-in or cancellation | BO and BL / Calculated by adding the number of unique iterations that change some of the booking attributes, namely: persons, arrival date, nights, reserved room type or meal
Children | Integer | Number of children | BO and BL / Sum of both payable and non-payable children
Company | Categorical | ID of the company/entity that made the booking or is responsible for paying the booking. ID is presented instead of designation for anonymity reasons | BO and BL
Country | Categorical | Country of origin. Categories are represented in the ISO 3166-3:2013 format [6] | BO, BL and NT
[…] | Categorical | Type of booking, assuming one of four categories: Contract – when the booking has an allotment or other type of contract associated to it; Group – when the booking is associated to a group; Transient – when the booking is not part of a group or contract, and is not associated to another transient booking; Transient-party – when the booking is transient, but is associated to at least one other transient booking | BO and BL
…nWaitingList | Integer | Number of days the booking was in the waiting list before it was confirmed to the customer | BO / Calculated by subtracting the date the booking was confirmed to the customer from the date the booking entered on the PMS
DepositType | Categorical | Indication of whether the customer made a deposit to guarantee the booking. This variable can assume three categories: No Deposit – no deposit was made; Non Refund – a deposit was made in the value of the total stay cost; Refundable – a deposit was made with a value under the total cost of stay | BO and TR / Value calculated based on the payments identified for the booking in the transaction (TR) table before the booking's arrival or cancellation date. In case no payments were found the value is "No Deposit". If the payment equaled or exceeded the total cost of stay, the value is set as "Non Refund". Otherwise the value is set as "Refundable"
DistributionChannel | Categorical | Booking distribution channel. The term "TA" means "Travel Agents" and "TO" means "Tour Operators" | BO, BL and DC
[…] | Categorical | Value indicating if the booking was canceled (1) or not (0) | BO
[…] | Categorical | Value indicating if the booking name was from a repeated guest (1) or not (0) | BO, BL and C / Variable created by verifying if a profile was associated with the booking customer. If so, and if the customer profile creation date was prior to the creation date for the booking on the PMS database, it was assumed the booking was from a repeated guest
[…] | Integer | Number of days that elapsed between the entering date of the booking into the PMS and the arrival date | BO and BL / Subtraction of the entering date from the arrival date
[…] | Categorical | Market segment designation. In categories, the term "TA" means "Travel Agents" and "TO" means "Tour Operators" | BO, BL and MS
Meal | Categorical | Type of meal booked. Categories are presented in standard hospitality meal packages: Undefined/SC – no meal package; BB – Bed & Breakfast; HB – Half board (breakfast and one other meal, usually dinner); FB – Full board (breakfast, lunch and dinner) | BO, BL and ML
PreviousBookingsNotCanceled | Integer | Number of previous bookings not canceled by the customer prior to the current booking | BO and BL / In case there was no customer profile associated with the booking, the value is set to 0. Otherwise, the value is the number of bookings with the same customer profile created before the current booking and not canceled
[…] | Integer | Number of previous bookings that were canceled by the customer prior to the current booking | BO and BL / In case there was no customer profile associated with the booking, the value is set to 0. Otherwise, the value is the number of bookings with the same customer profile created before the current booking and canceled
…rkingSpaces | Integer | Number of car parking spaces required by the customer | BO and BL
ReservationStatus | Categorical | Reservation last status, assuming one of three categories: Canceled – booking was canceled by the customer; Check-Out – customer has checked in but already departed; No-Show – customer did not check in and did inform the hotel of the reason why | BO

Table 1 (continued)
Variable | Type | Description | Source/Engineering
…tusDate | Date | Date at which the last status was set. This variable can be used in conjunction with ReservationStatus to understand when the booking was canceled or when the customer checked out of the hotel | BO
ReservedRoomType | Categorical | Code of room type reserved. Code is presented instead of designation for anonymity reasons | BO and BL
StaysInWeekendNights | Integer | Number of weekend nights (Saturday or Sunday) the guest stayed or booked to stay at the hotel | BO and BL / Calculated by counting the number of weekend nights from the total number of nights
StaysInWeekNights | Integer | Number of week nights (Monday to Friday) the guest stayed or booked to stay at the hotel | BO and BL / Calculated by counting the number of week nights from the total number of nights
…ecialRequests | Integer | Number of special requests made by the customer (e.g. twin bed or high floor) | BO and BL / Sum of all special requests
(a) ID is presented instead of designation for anonymity reasons.

Not all variables in these datasets come from the bookings or change log database tables. Some come from other tables, and some are engineered from different variables from different tables. A diagram of the PMS database tables from which variables were extracted is presented in Fig. 1. A detailed description of each variable is offered in the following section.

2. Experimental design, materials and methods

Data was obtained directly from the hotels' PMS database servers by executing a TSQL query in SQL Server Management Studio, the integrated environment tool for managing Microsoft SQL databases [4]. This query first collected the value or ID (in the case of foreign keys) of each variable in the BO table. The BL table was then checked for any alteration with respect to the day prior to the arrival. If an alteration was found, the value used was the one present in the BL table. For all the variables holding values in related tables (like meals, distribution channels, nationalities or market segments), their related values were retrieved. A detailed description of the extracted variables, their origin, and the engineering procedures employed in their creation is shown in Table 1.

The PMS assured that no missing data exists in its database tables. However, in some categorical variables like Agent or Company, "NULL" is presented as one of the categories. This should not be considered a missing value, but rather "not applicable". For example, if a booking's Agent is defined as "NULL", it means that the booking did not come from a travel agent.

Summary statistics for both hotels' datasets are presented in Tables 2–7. These statistics were obtained using the 'skimr' R package [7].

A word of caution is due for those not so familiar with hotel operations. In the hotel industry it is quite common for customers to change their booking's attributes, like the number of persons, staying duration, or room type preferences, either at the time of their check-in or during their stay. It is also common for hotels not to know the correct nationality of the customer until the moment of check-in. Therefore, even though the data was captured as of a point in time prior to the arrival date, it is understandable that the distribution of some variables differs between non-canceled and canceled bookings. Consequently, the use of these datasets may require this difference in distribution to be taken into account. This difference can be seen in the table plots of Fig. 2 and Fig. 3. Table plots are a powerful visualization method for exploring and analyzing large multivariate datasets; they were produced with the tabplot R package [8]. In table plots each column represents a variable and each row a bin with a pre-defined number of observations. In these two figures, each bin contains 100 observations. The bars in each variable show the mean value for numeric variables or the frequency of each level for categorical variables. Analyzing these figures, it is possible to verify that, for both hotels, the distribution of variables like Adults, Children, StaysInWeekendNights, StaysInWeekNights, Meal, Country and AssignedRoomType is clearly different between non-canceled and canceled bookings.

Table 2. H1 dataset summary statistics – Date variables.
Variable | Min | Max | Median | Unique
…tusDate | 2014-11-18 | 2017-09-14 | 2016-07-31 | 913

Table 3. H1 dataset summary statistics – Categorical variables.
Variable | Top counts
Agent | […]: 13 095, NULL: 8 209, 250: 2 869, 241: 1 721
[…] | Aug: 4 894, Jul: 4 573, Apr: 3 609, May: 3 559
AssignedRoomType | A: 17 046, D: 10 339, E: 5 638, C: 2 214
Company | NULL: 36 952, 223: 784, 281: 138, 154: 133
Country | PRT: 17 630, GBR: 6 814, ESP: 3 957, IRL: 2 166
[…] | Tra.: 30 209, Tra.-Party: 7 791, Con.: 1 776, Gro.: 284
DepositType | No Dep.: 38 199, Non-Refund.: 1 719, Ref.: 142
DistributionChannel | TA/TO: 28 295, Dir.: 7 865, Cor.: 3 269, Und.: 1
[…] | 0: 28 938, 1: 11 122
[…] | 0: 38 282, 1: 1 778
[…] | Onl.: 17 729, Off.: 7 472, Dir.: 6 513, Gro.: 5 836
Meal | BB: 30 005, HB: 8 046, Und.: 1 169, FB: 754
ReservationStatus | C.Out: 28 938, Can.: 10 831, No-Show: 291
ReservedRoomType | A: 23 399, D: 7 433, E: 4 892, G: 1 610

Table 4. H1 dataset summary statistics – Integer and numeric variables.
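Two of the engineering rules stated in Table 1, the DepositType classification and the split of a stay into weekend and week nights, can be sketched in Python as follows. This is an illustrative reading of the table's wording, not the authors' code: the function names and inputs are hypothetical, and attributing each night to the calendar date on which it starts is an assumption.

```python
from datetime import date, timedelta


def deposit_type(payments_before_arrival, total_stay_cost):
    """Classify a booking's deposit from payments found before arrival/cancellation."""
    if not payments_before_arrival:
        return "No Deposit"
    if sum(payments_before_arrival) >= total_stay_cost:
        return "Non Refund"
    return "Refundable"


def split_nights(arrival, nights):
    """Return (weekend_nights, week_nights) for a stay of `nights` nights."""
    weekend = 0
    for offset in range(nights):
        night = arrival + timedelta(days=offset)
        # Assumption: a night counts as weekend if it starts on Saturday or Sunday.
        if night.weekday() >= 5:  # Monday is 0, so 5 = Saturday and 6 = Sunday
            weekend += 1
    return weekend, nights - weekend


print(deposit_type([], 300.0))            # No Deposit
print(deposit_type([300.0], 300.0))       # Non Refund
print(deposit_type([100.0], 300.0))       # Refundable
print(split_nights(date(2016, 7, 1), 4))  # Friday arrival, 4 nights: (2, 2)
```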

Table 5. H2 dataset summary statistics – Date variables.
Variable | Min | Max | Median | Unique
…tusDate | 2014-10-17 | 2017-09-07 | 2016-08-10 | 864

Table 6. H2 dataset summary statistics – Categorical variables.
Variable | Unique | Top counts
Agent | 224 | 9: 31 955, NULL: 8 131, 1: 7 137, 14: 3 640
[…] | 12 | Aug: 8 983, May: 8 232, Jul: 8 088, Jun: 7 894
AssignedRoomType | 9 | A: 57 007, D: 14 983, E: 2 168, F: 2 018
Company | 208 | NULL: 75 641, 40: 924, 67: 267, 45: 250
Country | 166 | PRT: 30 960, FRA: 8 804, DEU: 6 084, GBR: 5 315
[…] | 4 | Tra.: 59 404, Tra.-P.: 17 333, Con.: 2 300, Gro.: 293
DepositType | 3 | No Dep.: 66 442, Non-Refund.: 12 868, Ref.: 20
DistributionChannel | 5 | TA/TO: 68 945, Dir.: 6 780, Cor.: 3 408, GDS: 193
[…] | 2 | 0: 46 228, 1: 33 102
[…] | 2 | 0: 77 298, 1: 2 032
[…] | 8 | Onl.: 38 748, Off.: 16 747, Gro.: 13 975, Dir.: 6 093
Meal | 4 | BB: 62 305, SC: 10 564, HB: 6 417, FB: 44
ReservationStatus | 3 | C.Out: 46 228, Can.: 32 186, No-Show: 916
ReservedRoomType | 8 | A: 62 595, D: 11 768, F: 1 791, E: 1 553

Table 7. H2 dataset summary statistics – Integer and numeric variables.
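As a quick consistency check on the categorical summaries for H1 and H2, the counts of the canceled (1) and not canceled (0) flag can be summed and compared with the dataset sizes given in the abstract, and the share of canceled bookings derived from them. The short Python sketch below only re-uses figures printed in this article; the dictionary layout and function name are illustrative.

```python
# Counts of the cancellation flag as reported in the categorical summaries
# (0 = not canceled, 1 = canceled).
counts = {
    "H1": {0: 28_938, 1: 11_122},
    "H2": {0: 46_228, 1: 33_102},
}
# Number of observations per hotel, as stated in the abstract.
expected_totals = {"H1": 40_060, "H2": 79_330}


def cancellation_share(by_flag):
    """Fraction of bookings flagged as canceled."""
    return by_flag[1] / sum(by_flag.values())


for hotel, by_flag in counts.items():
    assert sum(by_flag.values()) == expected_totals[hotel]
    print(hotel, f"{cancellation_share(by_flag):.1%}")
# H1 27.8%
# H2 41.7%
```

In both hotels the canceled count also equals the sum of the "Can." and "No-Show" levels of ReservationStatus (10 831 + 291 for H1, 32 186 + 916 for H2).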

Fig. 2. H1 dataset partial visualization of all observations.

Fig. 3. H2 dataset partial visualization of all observations.
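The binning behind a table plot, as described in the text (rows sorted, grouped into bins of a fixed number of observations, then summarized by the mean for numeric variables and by level frequency for categorical ones), can be sketched in plain Python. This is not the tabplot package's implementation; the function name, the sort direction, and the toy records are assumptions, and `IsCanceled` is used here as a hypothetical column name for the cancellation flag.

```python
from collections import Counter
from statistics import mean


def tableplot_bins(rows, sort_key, numeric, categorical, bin_size=100):
    """Summarize rows in fixed-size bins after sorting on one variable."""
    ordered = sorted(rows, key=lambda r: r[sort_key], reverse=True)
    bins = []
    for start in range(0, len(ordered), bin_size):
        chunk = ordered[start:start + bin_size]
        summary = {"n": len(chunk)}
        for name in numeric:
            # Numeric variables: one mean value per bin.
            summary[name] = mean(r[name] for r in chunk)
        for name in categorical:
            # Categorical variables: relative frequency of each level per bin.
            levels = Counter(r[name] for r in chunk)
            summary[name] = {lvl: c / len(chunk) for lvl, c in levels.items()}
        bins.append(summary)
    return bins


# Toy records, invented for illustration only.
rows = [
    {"IsCanceled": i % 2, "Adults": 1 + (i % 2), "Meal": "BB" if i % 4 else "HB"}
    for i in range(250)
]
bins = tableplot_bins(rows, "IsCanceled", numeric=["Adults"], categorical=["Meal"])
print([b["n"] for b in bins])  # [100, 100, 50]
```

Sorting on the cancellation flag places canceled bookings in the top bins, so differences between the two groups show up as a change in the bars partway down each column.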

Acknowledgements

The authors would like to thank the hotels' administration for allowing their data to be shared publicly.

Transparency document. Supplementary material

The transparency document associated with this article can be found in the online version at https://doi.org/10.1016/j.dib.2018.11.126.

Appendix A. Supporting information

Supplementary data associated with this article can be found in the online version.

References

[1] N. Antonio, A. Almeida, L. Nunes, Predicting hotel bookings cancellation with a machine learning classification model, in: Proceedings of the 16th IEEE International Conference Machine Learning Application, IEEE, Cancun, Mexico, 2017, pp. 1049–1054. doi:10.1109/ICMLA.2017.00-11.
[2] International Civil Aviation Organization, Guidelines on Passenger Name Record (PNR) data, 2010. 〈…t/assets/doc PNR.pdf〉 (accessed 17 February 2016).
[3] D. Abbott, Applied Predictive Analytics: Principles and Techniques for the Professional Data Analyst, Wiley, Indianapolis, IN, USA, 2014.
[4] Microsoft, SQL Server Management Studio (SSMS), 2017. 〈…erver-management-studio-ssms〉 (accessed 24 March 2018).
[5] American Hotel & Lodging Association, Uniform System of Accounts for the Lodging Industry, 11th Revised edition, Educational Institute, New York, 2014.
[6] International Standards Organization, ISO country codes 3166-3:2013, 2013. 〈…:ed-2:v1:en,fr〉 (accessed 24 March 2018).
[7] A. McNamara, E.A. de la Rubia, H. Zhu, S. Ellis, M. Quinn, skimr: Compact and flexible summaries of data, R package, 2018. 〈…skimr〉
[8] M. Tennekes, E. de Jonge, tabplot: Tableplot, a visualization of large datasets, 2017.
