Info
This is a project for EECS 349: Machine Learning, a Computer Science course at Northwestern University
The students working on this project are Daniel Stein, Devon D'Apuzzo, Jackson Middleton, and Maryssa Sklaroff
Contact us at [email protected]
Synopsis
Prices for the exact same ticket on Stubhub can vary greatly depending on when you buy. For example, a 3-day General Admission ticket to Governors Ball 2016, a New York music festival, ranged from a low of $234 to a high of $850 on Stubhub over the course of ticket sales. That’s a potential savings of $616 if you can predict when the best price is available. Our goal is to identify the attributes that most affect ticket prices for music festivals on Stubhub in order to determine the best time to buy a ticket, which allows people to save money.
The main features we looked at were: when the ticket was bought (in relation to when the festival was taking place), what the ticket-price trends were around the time the ticket was bought, and the number of tickets available on the site at that time. We had two possible outcomes: “good time to buy” and “bad time to buy.” The machine learning algorithms we used on this data were Random Forest, BFTree and J48 in Weka.
In total we had 1,597 data points. We split this data into a training set of 1,314 data points and a validation set of 283 data points. We trained the aforementioned classifiers on the training set, then applied the trained models to the validation set to see how accurately they predicted the outcome for examples they had not seen. We measured our success by the accuracy of these predictions.
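The split and the accuracy measure described above can be sketched in a few lines. This is only an illustration in plain Python, not the Weka workflow we actually ran: the helper names `split_dataset` and `accuracy` are hypothetical, the rows are placeholders, and the random shuffle with a fixed seed is an assumption, since the exact splitting procedure is not described here.

```python
import random


def split_dataset(rows, n_train, seed=0):
    """Shuffle a copy of `rows` and split it into training and validation lists.

    The shuffle and the seed are assumptions made for this sketch.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


def accuracy(predicted, actual):
    """Return the fraction of predictions that match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Placeholder rows standing in for the 1,597 labeled observations.
rows = list(range(1597))
train, validation = split_dataset(rows, n_train=1314)
print(len(train), len(validation))  # 1314 283
```

Under this definition, an accuracy of about 0.9 on the 283 validation examples would correspond to roughly 255 correct predictions.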
We found that we could correctly predict whether or not it was a good time to buy a ticket about 90% of the time. We also found that the most important attribute in making that determination was the difference between the observed ticket price and the price 60 hours earlier. One might expect the time until the festival to play a greater role. However, as we hypothesized from the start, ticket prices do not increase or decrease linearly over time. Instead, they fluctuate greatly, and can be predicted only through algorithms that take into account the fluctuations themselves as well as a few other key attributes.
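To make the 60-hour price-difference attribute concrete, here is a minimal sketch of how such a feature could be computed. It rests on several assumptions: prices are keyed by the hour they were observed, the sign convention is current price minus earlier price, and the function name `price_change` and the sample prices are hypothetical rather than taken from our data.

```python
def price_change(prices_by_hour, hour, lag_hours=60):
    """Return the price at `hour` minus the price `lag_hours` earlier.

    Returns None when either observation is missing. The sign convention
    (current minus earlier) is an assumption of this sketch.
    """
    current = prices_by_hour.get(hour)
    earlier = prices_by_hour.get(hour - lag_hours)
    if current is None or earlier is None:
        return None
    return current - earlier


# Hypothetical observations: hours since tracking began -> listed price in dollars.
prices_by_hour = {0: 300.0, 60: 280.0, 120: 310.0}

print(price_change(prices_by_hour, 60))   # -20.0 (price fell over the last 60 hours)
print(price_change(prices_by_hour, 120))  # 30.0 (price rose over the last 60 hours)
print(price_change(prices_by_hour, 0))    # None (no observation 60 hours earlier)
```

A feature like this captures the recent direction of the price rather than its position on the calendar, which fits the observation that prices fluctuate instead of moving linearly toward the festival date.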
Below is a simplified Best First Decision Tree (BFTree) representation of our training data.
Final Report
View our final report here: Final Report