Auction Price Prediction
Auction price prediction from NIR spectral data represents the highest-value potential application in the tea quality stack - the ability to tell a factory, before the auction, what their lot is likely to sell for. Early experiments were promising, but the complexity of auction pricing requires a substantially larger and more temporally diverse dataset than is currently available.
Findings
What We've Established
Spectral Quality Proxy Correlation
NIR-predicted quality parameters (TPP, moisture, colour score) show a statistically significant correlation with Colombo auction lot prices in our initial dataset. Lots with high TPP and optimal moisture show a consistent price premium. However, the predictive value of these parameters drops sharply once non-spectral variables such as estate reputation, seasonal supply, and buyer mandate preferences are controlled for.
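One way to see how a raw correlation can shrink once a non-spectral variable is controlled for is to compare the plain Pearson coefficient with the coefficient computed after subtracting each estate's mean from both variables. The sketch below is illustrative only: the six lots are synthetic toy values, not project data, and the helper names `pearson` and `demean_by_group` are assumptions introduced for this example.

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def demean_by_group(values, groups):
    """Subtract each group's mean, removing between-group differences."""
    buckets = defaultdict(list)
    for v, g in zip(values, groups):
        buckets[g].append(v)
    means = {g: sum(vs) / len(vs) for g, vs in buckets.items()}
    return [v - means[g] for v, g in zip(values, groups)]


# Synthetic toy lots (NOT project data): estate B sells at a premium
# that has little to do with lot-to-lot TPP differences.
estate = ["A", "A", "A", "B", "B", "B"]
tpp = [10.0, 11.0, 12.0, 14.0, 15.0, 16.0]
price = [500.0, 505.0, 498.0, 900.0, 897.0, 905.0]

raw_r = pearson(tpp, price)
within_r = pearson(demean_by_group(tpp, estate), demean_by_group(price, estate))
print(f"raw r = {raw_r:.2f}, within-estate r = {within_r:.2f}")
```

In this toy case most of the raw correlation comes from the difference between estates, so the within-estate coefficient is much weaker. Demeaning by a single categorical variable is only one simple way of controlling for a confounder; it is not necessarily the analysis used in the project.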
Non-Spectral Confounders
A critical finding is that auction price is influenced by at least as many market factors as quality factors. Grade classification, estate origin, seasonal scarcity, and individual buyer mandates can account for large price swings that no quality-based spectral model can predict. This limits the application to price-band estimation rather than precise point prediction.
Methodology
Technical Approach
The initial approach used gradient boosting regression on a combined feature set: NIR-predicted quality parameters alongside categorical features (grade, estate region, season). Cross-validation showed acceptable performance within a single estate over a short time horizon but poor generalisation across estates and auction sessions. Direct end-to-end regression from raw spectral features to price is also being explored.
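The gap between within-estate and cross-estate performance can be measured with a grouped split, where every lot from one estate is held out together. The sketch below shows that evaluation pattern in plain Python. It is not the project's pipeline: the stand-in model `fit_grade_mean` (predicting the mean training price for a lot's grade) replaces gradient boosting to keep the example dependency-free, and the field names, grade labels, and prices are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def leave_one_group_out(groups):
    """Yield (held_out_group, train_indices, test_indices), one group at a time."""
    for held_out in sorted(set(groups)):
        train = [i for i, g in enumerate(groups) if g != held_out]
        test = [i for i, g in enumerate(groups) if g == held_out]
        yield held_out, train, test


def fit_grade_mean(rows, prices):
    """Stand-in model (an assumption): mean training price per grade."""
    by_grade = defaultdict(list)
    for row, p in zip(rows, prices):
        by_grade[row["grade"]].append(p)
    overall = mean(prices)
    table = {g: mean(ps) for g, ps in by_grade.items()}
    # Fall back to the overall mean for grades unseen during training.
    return lambda row: table.get(row["grade"], overall)


def grouped_mae(rows, prices, groups, fit):
    """Mean absolute error on each held-out group."""
    scores = {}
    for held_out, train, test in leave_one_group_out(groups):
        predict = fit([rows[i] for i in train], [prices[i] for i in train])
        scores[held_out] = mean(abs(predict(rows[i]) - prices[i]) for i in test)
    return scores


# Synthetic toy lots (NOT project data).
rows = [{"grade": "BOP"}, {"grade": "BOPF"}, {"grade": "BOP"}, {"grade": "BOPF"}]
prices = [600.0, 700.0, 900.0, 1000.0]
estates = ["E1", "E1", "E2", "E2"]

scores = grouped_mae(rows, prices, estates, fit_grade_mean)
print(scores)
```

Any model exposing the same `fit(rows, prices) -> predict(row)` shape could be dropped in. The same splitter also works for generalisation across auction sessions if session identifiers are passed as `groups`.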
Status
Where We Stand
Development is paused. The bottleneck is dataset size and diversity, not a modelling ceiling. Training a generalisable model requires at least 1,500–2,000 paired auction records (each combining a spectral scan, validated reference data, and the realised auction price) spanning at least two full seasons. Collecting this dataset is a medium-term data partnership objective.
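The dataset threshold can be expressed as a simple readiness check. The sketch below is a minimal illustration: the `LotRecord` fields and the season labelling are assumptions, and it takes the lower bound of the stated range as the cut-off. Counting distinct season labels does not verify that each season is fully covered, so a real check would need a definition of a "full" season.

```python
from dataclasses import dataclass
from typing import Optional

MIN_PAIRED_RECORDS = 1500  # lower bound of the 1,500-2,000 range stated above
MIN_SEASONS = 2


@dataclass
class LotRecord:
    """Hypothetical record layout; field names are illustrative."""
    lot_id: str
    season: str                     # season label, scheme is an assumption
    has_spectral_scan: bool
    has_reference_data: bool
    auction_price: Optional[float]  # None until a realised price is known

    def is_paired(self) -> bool:
        """A record counts only if all three components are present."""
        return (
            self.has_spectral_scan
            and self.has_reference_data
            and self.auction_price is not None
        )


def dataset_ready(records) -> bool:
    """True when enough paired records exist across enough seasons."""
    paired = [r for r in records if r.is_paired()]
    seasons = {r.season for r in paired}
    return len(paired) >= MIN_PAIRED_RECORDS and len(seasons) >= MIN_SEASONS
```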
Roadmap
Next Steps
Define data collection protocol for systematic spectral-auction price pairing
Engage auction-house stakeholders for structured data-sharing discussions
Revisit feasibility once dataset threshold is met
In the meantime, explore price-band classification as a lower-precision but more immediately viable output
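For the price-band option, one possible framing is to derive band edges from historical realised prices and convert each price into a band label, which then becomes the classification target. The sketch below is an assumption-laden illustration: the three band labels, the use of quantile cut points, and the toy prices are hypothetical choices, not a decided design.

```python
from bisect import bisect_right
from statistics import quantiles

BAND_LABELS = ["low", "mid", "high"]  # hypothetical labels


def band_edges(historical_prices, n_bands=3):
    """Return n_bands - 1 quantile cut points from historical prices."""
    return quantiles(historical_prices, n=n_bands)


def to_band(price, edges, labels=BAND_LABELS):
    """Map a price to its band label using the sorted cut points."""
    return labels[bisect_right(edges, price)]


# Synthetic toy prices (NOT project data).
history = [100.0, 200.0, 300.0, 400.0, 500.0, 600.0, 700.0, 800.0, 900.0]
edges = band_edges(history)
bands = [to_band(p, edges) for p in (200.0, 500.0, 800.0)]
print(edges, bands)
```

Quantile-based edges give roughly balanced classes. Fixed edges agreed with factories or buyers would be an alternative if the bands need a stable commercial meaning.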
Explore More
Related Projects
Total Polyphenols (TPP)
Total polyphenol profiling for black tea - the leading proxy for antioxidant quality and a primary auction pricing signal.
Moisture Content - Green Tea (Model Ready)
Real-time MC measurement for green tea. Model trained on TRI-standard data and independently validated in December 2025.
Sugar in Tea (Model Ready)
Total sugar quantification across black tea grades - a key differentiator for premium blend specification and export documentation.
R&D Partnerships
Interested in This Research?
If you have relevant data, domain expertise, or a measurement problem in this area, we're open to research collaboration and data-sharing agreements.
