Artificial intelligence (AI) and machine learning (ML) have the potential to drastically improve estimated times of arrival (ETAs), the single most important data point for rail shippers. It doesn’t matter what you’re shipping, where it’s going, or how it’s getting there if you don’t know when it will arrive. But ETA is a data point that, historically, everyone has struggled to get right.
A few years ago, Railinc developed PETA, or predicted ETA. PETA created estimates based on historical central tendencies, that is, the average time it takes for a rail shipment to move from location A to location B.
This approach is simple and easy to interpret. But we quickly realized it was not reliable enough for shippers’ needs.
Many things can affect a rail ETA: service days, differences in train types, weather and more. Because PETA is based on averages, it cannot account for the variety of possible railroad operating conditions without dozens of rules for every origin-destination (OD) pair. With thousands of OD pairs in the North American rail network, this approach is unmanageable. Additionally, basing estimates only on average travel times easily misses less obvious patterns, such as occasional but consistent delays.
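To make the limitation concrete, here is a minimal sketch of an average-based estimate. It is not Railinc's PETA code; the function name `average_eta_by_od` and all transit times are hypothetical, invented only to illustrate how a single average can hide an occasional but consistent delay.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical historical transit times in hours, keyed by origin and
# destination. All values are invented for illustration only.
history = [
    ("A", "B", 48.0),
    ("A", "B", 50.0),
    ("A", "B", 49.0),
    ("A", "B", 96.0),  # an occasional but consistent delay
    ("A", "B", 47.0),
    ("A", "C", 30.0),
    ("A", "C", 32.0),
]


def average_eta_by_od(records):
    """Return the mean transit time (hours) for each origin-destination pair."""
    grouped = defaultdict(list)
    for origin, destination, hours in records:
        grouped[(origin, destination)].append(hours)
    return {od: mean(times) for od, times in grouped.items()}


averages = average_eta_by_od(history)
print(averages[("A", "B")])  # 58.0
```

In this toy data the A-to-B average is 58 hours, which matches neither the typical trip of 47 to 50 hours nor the delayed 96-hour trip. Capturing the delay would require an extra rule for that OD pair, which is the scaling problem described above.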
So, we went back to the drawing board to solve those challenges.
In 2018, Railinc introduced a first iteration of advanced ETAs. This model produced estimates based on a mixture of the averages used in PETA and models over a predicted route. For example, if we know a shipment is going from origin A to ultimate destination E, we can model how long a shipment might take if it went A to B to E versus A to B to C to D, and then to E.
This model is able to account for more operating scenarios than PETA but requires dozens of models for each origin-destination pair. The models are also overly reliant on accurate route predictions, which still leaves shippers with less-than-reliable ETAs.
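One way to picture a route-based estimate is to sum leg-level averages along each candidate route and weight the totals by how likely each route is. The sketch below is an assumption about how such a combination could work, not a description of the 2018 model; the leg times, route probabilities, and function names are all hypothetical.

```python
# Hypothetical average leg times in hours; all values are invented.
leg_hours = {
    ("A", "B"): 10.0,
    ("B", "E"): 30.0,
    ("B", "C"): 8.0,
    ("C", "D"): 12.0,
    ("D", "E"): 6.0,
}

# Candidate routes with assumed probabilities from some route predictor.
candidate_routes = [
    (["A", "B", "E"], 0.75),
    (["A", "B", "C", "D", "E"], 0.25),
]


def route_hours(route, legs):
    """Sum the average leg times along one route."""
    return sum(legs[(start, end)] for start, end in zip(route, route[1:]))


def expected_eta(routes, legs):
    """Probability-weighted transit time across all candidate routes."""
    total_weight = sum(prob for _, prob in routes)
    weighted = sum(prob * route_hours(route, legs) for route, prob in routes)
    return weighted / total_weight


print(route_hours(candidate_routes[0][0], leg_hours))  # 40.0
print(route_hours(candidate_routes[1][0], leg_hours))  # 36.0
print(expected_eta(candidate_routes, leg_hours))       # 39.0
```

The sketch also shows the weakness noted above: the estimate is only as good as the route probabilities that feed it.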
For the next phase of ETAs, Railinc’s data science team deployed artificial intelligence and machine learning to create sequence models trained per origin-destination pair, at scale.
“Sequence models are used to predict future events in a time sequence,” says Railinc data scientist David Dodsworth, who has a doctorate in physics. “In the context of Advanced ETA, the model is first trained on historical, time-ordered events for completed trips, then deployed on trips happening in real time to iteratively predict future events in the sequence, up until arrival at the final destination. The model then returns a combined ETA prediction from the most recent observed event to the predicted final arrival.”
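The train-then-roll-forward loop described in the quote can be sketched in a few lines. This is only an illustration: it swaps in a simple first-order frequency model (most common next location plus mean elapsed time) in place of a learned neural sequence model, and the trip data, `train`, and `predict_remaining_hours` are hypothetical.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical completed trips: time-ordered (location, hours since departure).
completed_trips = [
    [("A", 0.0), ("B", 10.0), ("E", 40.0)],
    [("A", 0.0), ("B", 12.0), ("E", 44.0)],
    [("A", 0.0), ("B", 11.0), ("C", 20.0), ("E", 38.0)],
]


def train(trips):
    """Learn the most common next location and mean hours to reach it."""
    next_counts = defaultdict(Counter)
    durations = defaultdict(list)
    for trip in trips:
        for (loc, t), (nxt, t_next) in zip(trip, trip[1:]):
            next_counts[loc][nxt] += 1
            durations[(loc, nxt)].append(t_next - t)
    model = {}
    for loc, counts in next_counts.items():
        nxt = counts.most_common(1)[0][0]
        model[loc] = (nxt, mean(durations[(loc, nxt)]))
    return model


def predict_remaining_hours(model, last_location, destination, max_steps=50):
    """Roll forward from the last observed event until the destination."""
    hours, location = 0.0, last_location
    for _ in range(max_steps):
        if location == destination:
            return hours
        if location not in model:
            return None  # no history for this location
        location, step_hours = model[location]
        hours += step_hours
    return None  # destination not reached within max_steps


model = train(completed_trips)
print(predict_remaining_hours(model, "A", "E"))  # 42.0
print(predict_remaining_hours(model, "C", "E"))  # 18.0
```

Each pass through the loop predicts one future event, and the sum of the predicted steps is the combined estimate from the most recent observed event to final arrival. A real sequence model would condition on the full event history and richer features rather than only the current location.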
These advanced ETAs are capable of modeling complex operating practices such as train types, delay trends and more.
The sequence modeling used in this iteration of Advanced ETA learns rich feature representations through a series of non-linearities, meaning it can learn from all of the origin-destination data, identify hierarchies, and recognize and retain important information from historical performance. Furthermore, the model ignores uninformative data, so a data point is factored into the prediction only when it should be. As it trains over time, the model for an OD pair will improve.
Advanced ETA better accounts for scenarios such as: What if a car takes a different route from origin to destination? Or what if it moves through, say, San Antonio three times faster than usual?
These new, more accurate ETAs show a significant improvement for both freight and intermodal lanes.
ETA is a complex problem to solve, but leveraging advanced technology such as artificial intelligence and machine learning continues to bring us closer. The data science team at Railinc is constantly improving our ETA offering, and our Advanced ETA Phase II model is our most accurate yet.
Advanced ETA is an optional module in the TransmetriQ Platform, a one-stop shop for complete rail management that enables smarter rail shipping.