Auction Price Prediction

Auction price prediction from NIR spectral data is potentially the highest-value application in the tea quality stack: telling a factory, before the auction, what its lot is likely to sell for. Early experiments were promising, but auction pricing is complex enough to require a substantially larger and more temporally diverse dataset than is currently available.

Tags: ML, Tea, Auction, Price Prediction, Gradient Boosting

Findings

What We've Established

Spectral Quality Proxy Correlation

NIR-predicted quality parameters (TPP, moisture, colour score) show statistically significant correlation with Colombo auction lot prices in our initial dataset. Lots with high TPP and optimal moisture show a consistent price premium. However, the predictive accuracy degrades significantly when controlling for non-spectral variables such as estate reputation, seasonal supply, and buyer mandate preferences.

Non-Spectral Confounders

A critical finding is that auction price is influenced by at least as many market factors as quality factors. Grade classification, estate origin, seasonal scarcity, and individual buyer mandates can account for large price swings that no quality-based spectral model can predict. This limits the application to price-band estimation rather than precise point prediction.
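As a rough illustration of what price-band estimation could look like, the sketch below turns realised prices into equal-count bands that a classifier could then target. The band count (three), the function names, and the sample prices are all assumptions made for illustration; none of them come from the project itself. In practice the cut points would probably need to be computed per grade and season, since those factors move prices independently of quality.

```python
from bisect import bisect_right
from statistics import quantiles


def band_edges(prices, n_bands=3):
    """Return the n_bands - 1 cut points that split prices into equal-count bands."""
    return quantiles(prices, n=n_bands, method="inclusive")


def to_band(price, edges):
    """Map a price to a band index, where 0 is the lowest band."""
    return bisect_right(edges, price)


# Hypothetical realised prices (currency units per kg), for illustration only.
train_prices = [820, 860, 905, 940, 980, 1010, 1075, 1120, 1190]

edges = band_edges(train_prices)                      # two cut points for three bands
labels = [to_band(p, edges) for p in train_prices]    # one band label per lot
```

The cut points are fitted on training prices only and then reused for new lots, so a lot priced outside the training range simply falls into the lowest or highest band.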

Methodology

Technical Approach

The initial approach used gradient boosting regression on a combined feature set: NIR-predicted quality parameters alongside categorical features (grade, estate region, season). Cross-validation showed acceptable performance within a single estate over a short time horizon but poor generalisation across estates and auction sessions. Direct end-to-end regression from raw spectral features to price is also being explored.
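The gap between within-estate and cross-estate performance can be measured by holding out one estate at a time. The sketch below shows that evaluation loop using only the standard library. It is not the project's pipeline: the model is a mean-price baseline standing in for the gradient boosting regressor, and the estate labels and prices are invented. A library splitter such as scikit-learn's LeaveOneGroupOut performs the same split.

```python
from collections import defaultdict
from statistics import mean


def leave_one_group_out(groups):
    """Yield (held_out_group, train_idx, test_idx), holding out one group at a time."""
    by_group = defaultdict(list)
    for i, g in enumerate(groups):
        by_group[g].append(i)
    for g, test_idx in by_group.items():
        train_idx = [i for i in range(len(groups)) if groups[i] != g]
        yield g, train_idx, test_idx


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between realised and predicted prices."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def fit_mean_baseline(train_prices):
    """Stand-in model (assumption): always predicts the mean training price."""
    mu = mean(train_prices)
    return lambda rows: [mu for _ in rows]


# Hypothetical lots: one estate label and one realised price per lot.
estates = ["A", "A", "B", "B", "C", "C"]
prices = [900.0, 920.0, 1100.0, 1080.0, 700.0, 720.0]

scores = {}
for estate, train_idx, test_idx in leave_one_group_out(estates):
    predict = fit_mean_baseline([prices[i] for i in train_idx])
    preds = predict(test_idx)
    scores[estate] = mean_absolute_error([prices[i] for i in test_idx], preds)
```

Here scores maps each held-out estate to its error on lots the model never saw. Grouping by auction session instead of estate would test generalisation over time in the same way.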

Status

Where We Stand

Paused

Development is paused. The core technical problem is dataset size and diversity, not a modelling ceiling. We require a minimum of 1,500–2,000 paired auction records (spectral scan + validated reference data + realised auction price) spanning at least two full seasons before a generalisable model can be trained. Collecting this dataset is a medium-term data partnership objective.
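One way to make the dataset threshold checkable is to encode what counts as a paired record and test the collection against it. The sketch below does that under stated assumptions: the AuctionRecord fields, the season label format, and the function names are hypothetical, and the check counts distinct season labels without verifying that each season is fully covered.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AuctionRecord:
    """Hypothetical record layout; field names are illustrative assumptions."""
    lot_id: str
    season: str                       # season label, format assumed
    has_spectral_scan: bool
    has_reference_data: bool          # validated reference data available
    realised_price: Optional[float]   # None until the lot has sold

    def is_paired(self) -> bool:
        """True only when scan, reference data and realised price all exist."""
        return (
            self.has_spectral_scan
            and self.has_reference_data
            and self.realised_price is not None
        )


def dataset_ready(records, min_records=1500, min_seasons=2):
    """Check the lower bound of the stated threshold on paired records and seasons."""
    paired = [r for r in records if r.is_paired()]
    seasons = {r.season for r in paired}
    return len(paired) >= min_records and len(seasons) >= min_seasons


# Usage with a handful of made-up records: far below the threshold.
sample = [
    AuctionRecord("lot-001", "season-1", True, True, 950.0),
    AuctionRecord("lot-002", "season-2", True, False, 1010.0),  # no reference data
    AuctionRecord("lot-003", "season-2", True, True, None),     # not yet sold
]
ready = dataset_ready(sample)
```

Only the first sample record counts as paired, so ready is False here.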

Roadmap

Next Steps

Define data collection protocol for systematic spectral-auction price pairing

Engage auction-house stakeholders for structured data-sharing discussions

Revisit feasibility once dataset threshold is met

In the meantime, explore price-band classification as a lower-precision but more immediately viable output


R&D Partnerships

Interested in This Research?

If you have relevant data, domain expertise, or a measurement problem in this area, we're open to research collaboration and data-sharing agreements.
