# Semantically Aligned Universal Tree Structured Solver for Math Word Problems

EMNLP 2020, pp. 3780-3789, 2020.

Abstract:

A practical automatic textual math word problems (MWPs) solver should be able to solve various textual MWPs while most existing works only focused on one-unknown linear MWPs. Herein, we propose a simple but efficient method called Universal Expression Tree (UET) to make the first attempt to represent the equations of various MWPs uniforml...More

Introduction

- Math word problems (MWPs) solving aims to automatically answer a math word problem by understanding the textual description of the problem and reasoning out the underlying answer.
- A typical MWP is a short story that describes a partial state of the world and poses a question about an unknown quantity.
- [Figure: expression trees for an arithmetic word problem and an equation set problem]

Highlights

- To address the above issues, we propose a simple yet efficient method called Universal Expression Tree (UET), which makes the first attempt to represent the equations of various MWPs uniformly, in the same form as the expression tree of one-unknown linear word problems but with unknowns taken into account
- We propose a semantically-aligned universal tree-structured solver (SAU-Solver), which is based on our UET representation and an Encoder-Decoder framework, to solve multiple types of MWPs in a unified manner with a single model
- We introduce a new high-quality MWPs dataset, called Hybrid Math Word Problems dataset (HMWP), in which each sample is extracted from a Chinese K12 math word problem bank, to validate the universality of math word problem solvers and push the research boundary of MWPs to match real-world scenes better
- Experimental results on several MWPs datasets show that our model can solve universal types of MWPs and outperforms several state-of-the-art models
- We propose SAU-Solver, which is able to solve multiple types of MWPs by generating the universal expression tree explicitly in a semantically-aligned manner
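
As an illustration of the idea behind UET, the sketch below joins the expression trees of several equations under one root, so that an equation set becomes a single tree just like a one-unknown problem. The `Node` class, the `";"` joining operator, the treatment of `=` as an ordinary binary operator, and the helper names are assumptions made for this example, not the paper's exact definitions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """A binary expression-tree node; leaves have no children."""
    token: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def equation(lhs: Node, rhs: Node) -> Node:
    # Assumption: "=" is treated as an ordinary binary operator.
    return Node("=", lhs, rhs)


def join_equations(equations: List[Node], sep: str = ";") -> Node:
    """Fold several equation trees into one tree under a joining operator."""
    root = equations[0]
    for eq in equations[1:]:
        root = Node(sep, root, eq)
    return root


def prefix(node: Optional[Node]) -> List[str]:
    """Pre-order (root, left, right) traversal of the tree."""
    if node is None:
        return []
    return [node.token] + prefix(node.left) + prefix(node.right)


# Hypothetical two-unknown problem: x + y = n0 and x - y = n1.
eq1 = equation(Node("+", Node("x"), Node("y")), Node("n0"))
eq2 = equation(Node("-", Node("x"), Node("y")), Node("n1"))
uet = join_equations([eq1, eq2])
print(" ".join(prefix(uet)))  # ; = + x y n0 = - x y n1
```

With this layout, a one-unknown problem is simply the special case in which `join_equations` receives a single equation and returns it unchanged.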

Methods

- Baselines and evaluation metric: the main state-of-the-art learning-based methods compared include Seq2Seq-attn w/ SNI (Wang et al., 2017), a universal solver based on the seq2seq model with significant number identification (SNI).
- The authors use answer accuracy as the evaluation metric: if the value calculated from the predicted expression tree equals the true answer, the prediction is counted as correct, since the predicted expression is then equivalent to the target expression.
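
The answer-accuracy criterion can be sketched as follows for the single-expression case: evaluate the predicted prefix expression and compare its value with the gold answer under a small tolerance. The function names, the four-operator set, and the tolerance are assumptions for illustration; handling unknowns and equation sets would additionally need an equation solver.

```python
import math
from typing import List, Sequence

OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b,
}


def eval_prefix(tokens: Sequence[str]) -> float:
    """Evaluate a prefix (pre-order) arithmetic expression."""
    stack: List[float] = []
    # Scan right to left so both operands are on the stack
    # by the time their operator is reached.
    for tok in reversed(tokens):
        if tok in OPS:
            a = stack.pop()
            b = stack.pop()
            stack.append(OPS[tok](a, b))
        else:
            stack.append(float(tok))
    if len(stack) != 1:
        raise ValueError("malformed expression")
    return stack[0]


def is_correct(pred_tokens: Sequence[str], gold_answer: float, tol: float = 1e-4) -> bool:
    """A prediction counts as correct if its value matches the gold answer."""
    try:
        value = eval_prefix(pred_tokens)
    except (ValueError, IndexError, ZeroDivisionError):
        return False  # unparsable or undefined expressions are wrong
    return math.isclose(value, gold_answer, abs_tol=tol)


def answer_accuracy(predictions, gold_answers) -> float:
    hits = sum(is_correct(p, g) for p, g in zip(predictions, gold_answers))
    return hits / len(gold_answers)


preds = [["+", "3", "2"], ["/", "6", "3"]]
print(answer_accuracy(preds, [5.0, 3.0]))  # 0.5
```

Note that `["+", "3", "2"]` is accepted for a gold answer of 5 even if the annotated expression was `2 + 3`, which is the point of comparing values rather than token sequences.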

Results

- Several observations can be made from the results in Table 2. First, SAU-Solver performs significantly better than the baselines on all four datasets.
- This indicates that the model is feasible for solving multiple types of MWPs.
- Example problem: an unknown number of rabbits and chickens were locked in a cage; counting from the top, there were NUM(n0 [20]) heads, and counting from the bottom, there were NUM(n1 [50]) feet.
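
To make the rabbits-and-chickens example concrete, the sketch below solves it under assumptions that are ours rather than the source's: each rabbit has 4 feet and each chicken has 2 feet, which gives x + y = n0 and 4x + 2y = n1 for x rabbits and y chickens. The helper `solve_2x2` is a hypothetical name; it applies Cramer's rule with exact fractions.

```python
from fractions import Fraction


def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 by Cramer's rule."""
    det = Fraction(a1 * b2 - a2 * b1)
    if det == 0:
        raise ValueError("no unique solution")
    x = Fraction(c1 * b2 - c2 * b1) / det
    y = Fraction(a1 * c2 - a2 * c1) / det
    return x, y


# Number mapping taken from the problem text: n0 = 20 heads, n1 = 50 feet.
n0, n1 = 20, 50
rabbits, chickens = solve_2x2(1, 1, n0, 4, 2, n1)
print(rabbits, chickens)  # 5 15
```

Substituting back confirms the result: 5 + 15 = 20 heads and 4·5 + 2·15 = 50 feet.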

Conclusion

- The methods most relevant to SAU-Solver are GTS (Xie and Sun, 2019) and StackDecoder (Chiang and Chen, 2019).
- An exception is the Math23K dataset, which contains 23161 problems that are well labeled with structured equations and answers.
- However, it only contains one-unknown linear MWPs, which is not sufficient to validate a solver's ability to handle multiple types of MWPs. The authors therefore introduce a new high-quality MWP dataset, called HMWP, in which each sample is extracted from a Chinese K12 math word problem bank, to validate the universality of math word problem solvers and to push the research boundary of MWPs to better match real-world scenes.

- Table 1: Statistics of our dataset and several publicly available datasets. Avg EL, Avg SNI, Avg Constants, and Avg Ops denote the average equation length, the average number of quantities that occur in both problems and their corresponding equations, the average number of constants that occur only in equations, and the average number of operators in equations, respectively. The higher these values, the more difficult the dataset, as shown by Xie and Sun (2019).
- Table 2: Model comparison on answer accuracy via 5-fold cross-validation. “-” means that either the code is not released or the model is not suitable for those datasets.
- Table 3: Data statistics and performance on different subsets of HMWP.
- Table 4: Typical cases. Note that the results are represented as infix traversals of expression trees, which are more readable than prefix traversals.
- Table 5: Accuracy for different expression tree sizes.
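
The infix rendering mentioned for Table 4 can be sketched as a small recursive conversion from a prefix token sequence. The operator set, the special handling of `=`, and the fully parenthesized output are assumptions chosen for simplicity, not the paper's formatting rules.

```python
from typing import List, Tuple

BINARY_OPS = {"+", "-", "*", "/", "^", "="}


def prefix_to_infix(tokens: List[str]) -> str:
    """Convert a prefix token sequence into a parenthesized infix string."""

    def parse(pos: int) -> Tuple[str, int]:
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        tok = tokens[pos]
        if tok not in BINARY_OPS:
            return tok, pos + 1  # leaf: number, constant, or unknown
        left, pos = parse(pos + 1)
        right, pos = parse(pos)
        if tok == "=":
            return f"{left} = {right}", pos
        return f"({left} {tok} {right})", pos

    text, end = parse(0)
    if end != len(tokens):
        raise ValueError("trailing tokens after a complete expression")
    return text


print(prefix_to_infix(["=", "+", "*", "4", "x", "*", "2", "y", "n1"]))
# ((4 * x) + (2 * y)) = n1
```

Always emitting parentheses avoids having to encode operator precedence, at the cost of a slightly noisier string.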

Related work

- Numerous methods have been proposed to tackle the MWP task, ranging from rule-based methods (Bakman, 2007; Yuhui et al., 2010), statistical machine learning methods (Kushman et al., 2014; Zhou et al., 2015; Mitra and Baral, 2016; Huang et al., 2016; Roy and Roth, 2018), and semantic parsing methods (Shi et al., 2015; Koncel-Kedziorski et al., 2015; Huang et al., 2017) to deep learning methods (Ling et al., 2017; Wang et al., 2017, 2018b; Huang et al., 2018; Wang et al., 2018a; Xie and Sun, 2019; Wang et al., 2019). Due to space limitations, we only review some recent advances in deep learning-based methods. Wang et al. (2017) made the first attempt to generate expression templates using a Seq2Seq model. The Seq2Seq method has achieved promising results, but it suffers from generating spurious numbers, predicting numbers at wrong positions, and equation duplication (Huang et al., 2018; Wang et al., 2018a). To address these problems, Huang et al. (2018) proposed adding a copy-and-alignment mechanism to the standard Seq2Seq model, and Wang et al. (2018a) proposed equation normalization, which normalizes duplicated equations by considering the uniqueness of an expression tree.

Different from Seq2Seq-based works, Xie and Sun (2019) proposed a tree-structured decoder that generates an expression tree, inspired by the goal-driven problem-solving mechanism. Wang et al. (2019) proposed a two-stage template-based solution based on a recursive neural network for math expression construction. However, these methods do not model the unknowns underlying MWPs, so they can only handle one-unknown linear word problems. They also lack an efficient mechanism to handle MWPs with multiple unknowns and multiple equations, such as equation set problems. Therefore, they cannot solve other types of MWPs that are more challenging due to a larger search space, such as equation set problems and non-linear equation problems. Chiang and Chen (2019) proposed a general equation generator that generates expressions via a stack, but it does not consider the semantic transformation between equations in a problem, resulting in poor performance on multiple-unknown MWPs such as equation set problems.

Funding

- This work was supported in part by the National Key R&D Program of China under Grant No. 2018AAA0100300, the National Natural Science Foundation of China (NSFC) under Grant No. U19A2073 and No. 61976233, the Guangdong Province Basic and Applied Basic Research (Regional Joint Fund-Key) Grant No. 2019B1515120039, the Nature Science Foundation of Shenzhen under Grant No. 2019191361, Zhijiang Lab’s Open Fund (No. 2020AA3AB14), and the Sichuan Science and Technology Program (No. 2019YJ0190).

Study subjects and analysis

5.1 Experimental Setup and Training Details

Datasets, Baselines, and Evaluation metric. We conduct experiments on four datasets: HMWP, Alg514 (Kushman et al., 2014), Math23K (Wang et al., 2017), and Dolphin18K-Manual (Huang et al., 2016). The data statistics of the four datasets are shown in Table 1.

Answer Accuracy. We conduct 5-fold cross-validation to evaluate the performance of the baselines and our models on all four datasets. The results are shown in Table 2.
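
The 5-fold protocol can be sketched as below. The source only states that 5-fold cross-validation is used; the shuffling, the fixed seed, and the `train_and_eval` callback (which is assumed to train on one split and return a score such as answer accuracy on the other) are illustrative assumptions.

```python
import random
from typing import Callable, List, Sequence


def k_fold_indices(n: int, k: int = 5, seed: int = 0) -> List[List[int]]:
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validate(data: Sequence, train_and_eval: Callable, k: int = 5) -> float:
    """Average the score returned by train_and_eval over k train/test splits."""
    folds = k_fold_indices(len(data), k)
    scores = []
    for i, test_idx in enumerate(folds):
        # Every fold except the i-th one forms the training set.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        train = [data[j] for j in train_idx]
        test = [data[j] for j in test_idx]
        scores.append(train_and_eval(train, test))
    return sum(scores) / k
```

Each problem appears in exactly one test fold, so the averaged score uses every sample once for evaluation.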

5.4 Case Study. We further conduct a case analysis and provide four cases in Table 4, which show the effectiveness of our approach.

Reference

- Yefim Bakman. 2007. Robust understanding of word problems with extraneous information. Computing Research Repository, arXiv:math/0701393.
- Ting-Rui Chiang and Yun-Nung Chen. 2019. Semantically-aligned equation generation for solving and reasoning math word problems. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 2656–2668. Association for Computational Linguistics.
- Kyunghyun Cho, Bart van Merrienboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. 2014. Learning phrase representations using RNN encoder–decoder for statistical machine translation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1724–1734. Association for Computational Linguistics.
- Danqing Huang, Jing Liu, Chin-Yew Lin, and Jian Yin. 2018. Neural math word problem solver with reinforcement learning. In Proceedings of the 27th International Conference on Computational Linguistics, pages 213–223. Association for Computational Linguistics.
- Danqing Huang, Shuming Shi, Chin-Yew Lin, and Jian Yin. 2017. Learning fine-grained expressions to solve math word problems. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 805–814. Association for Computational Linguistics.
- Danqing Huang, Shuming Shi, Chin-Yew Lin, Jian Yin, and Wei-Ying Ma. 2016. How well do computers solve math word problems? Large-scale dataset construction and evaluation. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 887–896. Association for Computational Linguistics.
- Diederik P Kingma and Jimmy Ba. 2015. Adam: A method for stochastic optimization. In International Conference on Learning Representations.
- Rik Koncel-Kedziorski, Subhro Roy, Aida Amini, Nate Kushman, and Hannaneh Hajishirzi. 2016. MAWPS: A math word problem repository. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 1152–1157.
- Rik Koncel-Kedziorski, Hannaneh Hajishirzi, Ashish Sabharwal, Oren Etzioni, and Siena Dumas Ang. 2015. Parsing algebraic word problems into equations. Transactions of the Association for Computational Linguistics, 3:585–597.
- Nate Kushman, Yoav Artzi, Luke Zettlemoyer, and Regina Barzilay. 2014. Learning to automatically solve algebra word problems. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, volume 1, pages 271–281.
- Wang Ling, Dani Yogatama, Chris Dyer, and Phil Blunsom. 2017. Program induction by rationale generation: Learning to solve and explain algebraic word problems. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 158–167. Association for Computational Linguistics.
- Arindam Mitra and Chitta Baral. 2016. Learning to use formulas to solve simple arithmetic problems. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 2144–2153. Association for Computational Linguistics.
- Subhro Roy and Dan Roth. 2018. Mapping to declarative knowledge for word problem solving. Transactions of the Association for Computational Linguistics, 6:159–172.
- Lei Wang, Dongxiang Zhang, Zhang Jipeng, Xing Xu, Lianli Gao, Bing Tian Dai, and Heng Tao Shen. 2019. Template-based math word problem solvers with recursive neural networks. In Thirty-Third AAAI Conference on Artificial Intelligence, pages 7144–7151.
- Yan Wang, Xiaojiang Liu, and Shuming Shi. 2017. Deep neural solver for math word problems. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 845–854. Association for Computational Linguistics.
- Zhipeng Xie and Shichao Sun. 2019. A goal-driven tree-structured neural model for math word problems. In Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, IJCAI-19, pages 5299–5305. International Joint Conferences on Artificial Intelligence Organization.
- Ma Yuhui, Zhou Ying, Cui Guangzuo, Ren Yun, and Huang Ronghuai. 2010. Frame-based calculus of solving arithmetic multi-step addition and subtraction word problems. In International Workshop on Education Technology and Computer Science, volume 2, pages 476–479.
- Lipu Zhou, Shuaixiang Dai, and Liwei Chen. 2015. Learn to solve algebra word problems using quadratic programming. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pages 817–822. Association for Computational Linguistics.
- Shuming Shi, Yuehui Wang, Chin-Yew Lin, Xiaojiang Liu, and Yong Rui. 2015. Automatically solving number word problems by semantic parsing and reasoning. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pages 1132–1142. Association for Computational Linguistics.
- Shyam Upadhyay and Mingwei Chang. 2017. Annotating derivations: A new evaluation strategy and dataset for algebra word problems. In 15th Conference of the European Chapter of the Association for Computational Linguistics, pages 494–504.
- Lei Wang, Yan Wang, Deng Cai, Dongxiang Zhang, and Xiaojiang Liu. 2018a. Translating a math word problem to a expression tree. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 1064–1069. Association for Computational Linguistics.
- Lei Wang, Dongxiang Zhang, Lianli Gao, Jingkuan Song, Long Guo, and Heng Tao Shen. 2018b. MathDQN: Solving arithmetic word problems via deep reinforcement learning. In Thirty-Second AAAI Conference on Artificial Intelligence, pages 5545–5552.
