Facebook Prophet

Time series forecasting is an important part of decision-making in many industries, including banking, retail, and healthcare. Accurate forecasts of future trends and patterns help firms optimize inventory management, anticipate demand, allocate resources effectively, and make informed strategic decisions. Traditional forecasting approaches, however, frequently require domain expertise, manual parameter tuning, and complicated modeling procedures, which puts them out of reach for non-experts. To address these issues, Facebook created Prophet, an open-source forecasting tool that offers a straightforward yet powerful framework for analysts, data scientists, and domain experts alike.

Prophet is intended to ease time series forecasting by automating many of the manual tasks that standard forecasting methods require. It is built around flexibility, simplicity, and scalability, so users can concentrate on analyzing results rather than on complicated modeling methodology. To get a better understanding of Prophet, we will use it to forecast hourly energy use.

Importing Libraries

Data

We will be using hourly electricity usage data from PJM. Energy use has some distinctive characteristics, and it will be interesting to see how Prophet picks them up. The data comes from PJM East, which covers the whole eastern area, and spans 2002 to 2018.

EDA (Exploratory Data Analysis)

We'll create several time series features to examine how trends change depending on the day of the week, hour, time of year, and so on.

Plotting the Features
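A minimal sketch of the feature step described above, assuming the load is held in a pandas DataFrame with an hourly DatetimeIndex. The column name PJME_MW, the helper name create_features, and the synthetic series are illustrative assumptions rather than the article's actual code or data. The hourly profile computed at the end is the kind of summary the feature plots visualize.

```python
import numpy as np
import pandas as pd


def create_features(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with calendar features derived from its DatetimeIndex."""
    out = df.copy()
    idx = out.index
    out["hour"] = idx.hour
    out["dayofweek"] = idx.dayofweek  # Monday=0 ... Sunday=6
    out["weekday"] = idx.day_name()
    out["quarter"] = idx.quarter
    out["month"] = idx.month
    out["year"] = idx.year
    out["dayofyear"] = idx.dayofyear
    return out


# Synthetic stand-in for the hourly load (not real PJM data):
# a daily sine cycle around 30,000 MW plus noise.
index = pd.date_range("2017-01-01", periods=24 * 14, freq="h")
rng = np.random.default_rng(0)
hours = index.hour.to_numpy()
load = 30_000 + 5_000 * np.sin(2 * np.pi * hours / 24) + rng.normal(0, 500, len(index))
pjme = pd.DataFrame({"PJME_MW": load}, index=index)

featured = create_features(pjme)

# Average load per hour of day and per weekday: the numbers a box plot
# or line plot of these features would summarize.
hourly_profile = featured.groupby("hour")["PJME_MW"].mean()
weekday_profile = featured.groupby("weekday")["PJME_MW"].mean()
print(hourly_profile.round(0).head())
```

Grouping by the new columns keeps the example free of plotting dependencies; the same grouped frames can be passed to any plotting library.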
Splitting the Dataset

To use it as a validation set, we hold out the data after 2015 and train on the earlier data.

Facebook's Prophet Model

The Prophet model expects the dataset columns to be named in a specific way: the timestamp column must be called ds and the target column y. We will rename our dataframe columns before feeding them into the model.

Comparing

Look at the first month of predictions

Single Week of Predictions

Error Metrics

Our MSE is 43761675, which corresponds to an RMSE of roughly 6,615. Our MAE is 5181.78. Our MAPE is 16.5 percent. By comparison, the XGBoost model had substantially lower error (8.9% MAPE).

Adding Holidays

Next, we'll examine whether adding holiday markers improves the model's accuracy. Prophet models holiday effects through a holidays parameter that is passed to the model before training. We will use the built-in pandas USFederalHolidayCalendar to retrieve the list of holidays.

Predict With Holidays

Plot Holiday Effect

Error Metrics with Holidays Added

Surprisingly, the error got worse after adding holidays.

Compare Models Just for Holiday Dates

Let us plot the forecasts with and without holidays for the Fourth of July. The model with holidays appears to be more realistic for this particular holiday.

Compare Error for the 4th of July

The error is lower for this day.

Error of all Holidays

Across all holidays, the error has increased! This is surprising.

Identify Error by Holiday

This breakdown shows that different holidays behave differently. The model might perform better if we identified each holiday individually instead of lumping them all together as "USFederalHolidays".

Plot Error of Each Forecast

We can observe that both of our models generalize well but struggle at peak demand periods. They appear to underestimate demand on many days.

Data Cleaning

Data cleansing is a critical aspect of the forecasting process.
If the incoming data contains bad values, the model will use them to make predictions, potentially causing major problems. In the training data there are some bad measurements with much lower values than the rest. Could these be causing the underestimation? Let's try removing this faulty data. The plot highlights the bad data in red.

One likely cause of such a dip is Hurricane Sandy, which hit the eastern United States on October 29-30, 2012, causing severe winds and coastal flooding that left an estimated 8 million people without power. The storm, which made landfall as a Category 1 hurricane near Atlantic City, New Jersey, left homes and businesses without power in New Jersey (2.7 million), New York (2.2 million), Pennsylvania (1.2 million), Connecticut (620,000), Massachusetts (400,000), Maryland (290,000), West Virginia (268,000), Ohio (250,000), and New Hampshire (210,000). Power outages were also recorded in many additional states, including Virginia, Maine, Rhode Island, Vermont, and the District of Columbia.

After cleaning, we see a modest improvement in the score compared to the initial model. More data cleaning, combined with holidays, might lead to even better results. Give it a try!
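As a starting point for that experiment, here is a hedged sketch that combines the two ideas: blanking out implausibly low readings and giving each federal holiday its own name. The 19,000 MW cut-off, the helper names, and the synthetic series are assumptions made for illustration, not values or code taken from the article. The Prophet call is wrapped in a function so the data preparation can be run and inspected without the prophet package installed.

```python
import numpy as np
import pandas as pd
from pandas.tseries.holiday import USFederalHolidayCalendar

LOW_LOAD_MW = 19_000  # hypothetical cut-off; choose one by inspecting your own data


def to_prophet_frame(load: pd.Series, floor: float = LOW_LOAD_MW) -> pd.DataFrame:
    """Build the ds/y frame Prophet expects, blanking readings below `floor`."""
    frame = pd.DataFrame({"ds": load.index, "y": load.to_numpy(dtype=float)})
    # Prophet tolerates missing y values, so bad readings become NaN
    # instead of the rows being dropped.
    frame.loc[frame["y"] < floor, "y"] = np.nan
    return frame


def named_federal_holidays(start, end) -> pd.DataFrame:
    """One row per US federal holiday, keeping each holiday's own name."""
    days = USFederalHolidayCalendar().holidays(start=start, end=end, return_name=True)
    return pd.DataFrame({"holiday": days.to_numpy(), "ds": days.index})


def fit_prophet(train: pd.DataFrame, holidays: pd.DataFrame):
    """Fit Prophet with holiday effects (requires the `prophet` package)."""
    from prophet import Prophet  # imported lazily so the helpers above work without it

    model = Prophet(holidays=holidays)
    model.fit(train)
    return model


# Synthetic hourly load standing in for the PJM series (not real data).
index = pd.date_range("2016-06-27", periods=24 * 14, freq="h")
rng = np.random.default_rng(0)
load = pd.Series(30_000 + rng.normal(0, 1_000, len(index)), index=index)
load.iloc[[10, 50, 51]] = 15_000  # inject three implausibly low readings

train = to_prophet_frame(load)
holidays = named_federal_holidays(index.min(), index.max())
# model = fit_prophet(train, holidays)  # uncomment once prophet is installed
print(train["y"].isna().sum(), "readings blanked;", len(holidays), "holiday rows")
```

Setting bad readings to NaN rather than deleting the rows keeps the timestamps intact, and naming holidays individually lets Prophet estimate a separate effect for each one instead of a single shared effect.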