MMSR: Symbolic Regression is a Multimodal Task
CoRR (2024)
Abstract
Mathematical formulas are the crystallization of human wisdom accumulated over
thousands of years of exploring the laws of nature. Describing complex natural
laws with a concise mathematical formula is a constant pursuit of scientists
and a great challenge for artificial intelligence; this field is called
symbolic regression (SR). Symbolic regression was originally formulated as a
combinatorial optimization problem, and genetic programming (GP) and
reinforcement learning algorithms were used to solve it. However, GP is
sensitive to hyperparameters, and both types of algorithms are inefficient. To
address this, researchers treated the mapping from data to expressions as a
translation problem and introduced corresponding large-scale pre-trained
models. However, data and expression skeletons lack the clear word-level
correspondence that exists between two natural languages; they are more like
two modalities (e.g., image and text). Therefore, in this paper, we propose
MMSR, which solves the SR problem as a purely multimodal task. Contrastive
learning is introduced into the training process for modal alignment,
facilitating subsequent modal feature fusion. Notably, to better promote this
fusion, we train the contrastive learning loss jointly with the other losses
in a single step, rather than training the contrastive loss first and the
other losses afterwards, because our experiments show that joint training lets
the feature extraction and feature fusion modules adapt to each other better.
Experimental results show that, compared with multiple large-scale pre-trained
baselines, MMSR achieves state-of-the-art results on multiple mainstream
datasets, including SRBench.
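The joint-training strategy described above can be sketched as a single
combined objective: a symmetric InfoNCE-style contrastive loss aligning the
data modality with the expression-skeleton modality, summed with the decoder's
own loss and optimized in one step. This is a minimal NumPy illustration of
that idea, not the authors' implementation; the function names, the temperature
value, and the loss weight are assumptions for illustration.

```python
import numpy as np

def info_nce(data_emb, expr_emb, temperature=0.1):
    """Symmetric contrastive (InfoNCE-style) loss between two modalities.

    Row i of `data_emb` and row i of `expr_emb` are a matched pair
    (a data set and its expression skeleton); all other rows are negatives.
    """
    # L2-normalize embeddings from each modality.
    d = data_emb / np.linalg.norm(data_emb, axis=1, keepdims=True)
    e = expr_emb / np.linalg.norm(expr_emb, axis=1, keepdims=True)
    logits = d @ e.T / temperature            # pairwise similarity matrix
    idx = np.arange(len(d))                   # matched pairs on the diagonal

    # Cross-entropy in both directions (data->expr and expr->data).
    log_p_d2e = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    log_p_e2d = logits.T - np.log(np.exp(logits.T).sum(axis=1, keepdims=True))
    return -(log_p_d2e[idx, idx].mean() + log_p_e2d[idx, idx].mean()) / 2

def joint_loss(contrastive_loss, decoder_loss, weight=1.0):
    """One-step training objective: optimize both losses simultaneously,
    instead of a two-stage schedule (contrastive first, decoder second)."""
    return decoder_loss + weight * contrastive_loss
```

In a real training loop both terms would be computed on the same minibatch and
backpropagated together, which is what lets the feature extraction and feature
fusion modules adapt to each other during training.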