Amazon AI Introduces ‘PAVE’: A Novel Reinforcement Learning Model That Uses Lazy-MDP Formalism To Improve Recall of Product Attribute Extraction Models

E-commerce catalogs list millions of products, and a significant portion of them come from independent vendors. The listed information often contains errors caused by vendors' limited language proficiency, an incomplete understanding of international customers, and discrepancies between how vendors and the e-commerce platform interpret attributes. The result is inaccurate product attribute information, which frustrates customers.

Studies show that this issue can be addressed when the correct values can be extracted from other parts of the product profile, such as titles, bullets, or images. Many approaches frame attribute extraction (AE) from product profiles as a natural language task and solve it with conventional methods such as information extraction or text classification. However, such models typically achieve high precision but low recall, especially on attributes with an open vocabulary, due to:

  1. Incorrect or missing information in product profiles.
  2. Generalization errors.
  3. Confidence thresholding applied to keep precision high.
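
The effect of confidence thresholding on recall can be illustrated with a small sketch. The confidences and outcomes below are made up for illustration and do not come from the paper; `precision_recall` and `TOY_PREDICTIONS` are hypothetical names.

```python
def precision_recall(predictions, threshold):
    """Return (precision, recall) when only predictions at or above `threshold` are emitted.

    `predictions` holds one (confidence, is_correct) pair per product that has a true value.
    """
    emitted = [is_correct for confidence, is_correct in predictions if confidence >= threshold]
    correct = sum(emitted)
    # With nothing emitted there are no wrong outputs; report precision as 1.0 by convention.
    precision = correct / len(emitted) if emitted else 1.0
    recall = correct / len(predictions)
    return precision, recall


# Made-up confidences and outcomes, for illustration only.
TOY_PREDICTIONS = [
    (0.95, True), (0.90, True), (0.80, True), (0.70, False),
    (0.60, True), (0.40, False), (0.30, True), (0.20, False),
]

for threshold in (0.5, 0.75):
    precision, recall = precision_recall(TOY_PREDICTIONS, threshold)
    print(f"threshold={threshold}: precision={precision:.3f}, recall={recall:.3f}")
```

On this toy data, raising the threshold from 0.5 to 0.75 lifts precision from 0.800 to 1.000 while recall drops from 0.500 to 0.375, which is the trade-off the list above describes.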

Recent research from Amazon addresses this problem with PAVE (Product Attribute Value Ensemble as Lazy-MDP), a novel ensemble model that uses reinforcement learning (RL) to increase the recall of conventional AE models without sacrificing precision.

The trained RL agent scans lazily through the target product's neighbors, decides for each one whether its attribute value is useful, and emits the best value. The team claims that this method scales well and generalizes effectively to unseen attributes. It also handles noise to maintain precision and supports neighbor sequences of varying length.
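
As a rough illustration of this scanning behavior, here is a minimal sketch of one plausible reading: the agent walks the neighbor sequence, either selects the current neighbor's value or stays lazy and moves on, and emits nothing if it never selects. This is not the paper's implementation. The `Neighbor` representation, `scan_neighbors`, and the similarity-threshold rule standing in for the trained policy network are all assumptions.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical neighbor representation: (attribute value, similarity to the target product).
Neighbor = Tuple[str, float]


def scan_neighbors(
    neighbors: Sequence[Neighbor],
    select: Callable[[Neighbor], bool],
) -> Optional[str]:
    """Walk the neighbor sequence; return the first value the policy selects, else None."""
    for neighbor in neighbors:
        if select(neighbor):
            return neighbor[0]
        # Otherwise the agent stays lazy on this step and moves to the next neighbor.
    return None  # No neighbor was selected, so no value is emitted.


def threshold_policy(neighbor: Neighbor) -> bool:
    """Stand-in for a trained policy network: accept only very similar neighbors."""
    _, similarity = neighbor
    return similarity >= 0.8


# Made-up neighbors; the sequence can have any length, including zero.
neighbors = [("cotton", 0.55), ("polyester", 0.86), ("cotton", 0.91)]
print(scan_neighbors(neighbors, threshold_policy))      # polyester
print(scan_neighbors(neighbors[:1], threshold_policy))  # None
```

Returning `None` when no neighbor is convincing is what lets such a scheme add recall without emitting low-confidence values.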

In their paper, “PAVE: Lazy-MDP based Ensemble to Improve Recall of Product Attribute Extraction Models,” the researchers describe their work as the first application of RL to product attribute extraction. They then propose several techniques to address the problems with conventional approaches listed above.

The researchers use proximal policy optimization (PPO) to train a policy network that learns to select the right value from the neighbor sequence. They tested the model on real-world e-commerce datasets against strong baselines, including BERT-based AE models and various ensemble approaches. The results show that this strategy outperforms AE models on closed attributes as well as simple aggregation methods such as nearest neighbor, majority vote, and binary classifier ensembles. Compared with traditional AE models, they observe a consistent gain in recall across all open attributes, with an average lift of 10.3% and no loss in precision.
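
For context, two of the simple aggregation baselines named above can be sketched in a few lines. This is an illustrative sketch, not the authors' code. The (value, similarity) neighbor format and the function names are assumptions, and the data is made up.

```python
from collections import Counter
from typing import Optional, Sequence, Tuple

# Hypothetical neighbor representation: (attribute value, similarity to the target product).
Neighbor = Tuple[str, float]


def majority_vote(neighbors: Sequence[Neighbor]) -> Optional[str]:
    """Return the value that appears most often among the neighbors."""
    counts = Counter(value for value, _ in neighbors)
    return counts.most_common(1)[0][0] if counts else None


def nearest_neighbor(neighbors: Sequence[Neighbor]) -> Optional[str]:
    """Return the value of the single most similar neighbor."""
    return max(neighbors, key=lambda n: n[1])[0] if neighbors else None


# Made-up neighbors on which the two baselines disagree.
toy_neighbors = [("cotton", 0.62), ("polyester", 0.91), ("cotton", 0.58)]
print(majority_vote(toy_neighbors))     # cotton
print(nearest_neighbor(toy_neighbors))  # polyester
```

Both baselines always emit a value whenever at least one neighbor exists, so neither can abstain when the neighbors are noisy.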

This article is written as a research summary by Marktechpost Staff based on the research paper 'PAVE: Lazy-MDP based ensemble to improve recall of product attribute extraction models'. All credit for this research goes to the researchers on this project.
