SNIP: Bridging Mathematical Symbolic and Numeric Realms with Unified Pre-training

Meidani, Kazem; Shojaee, Parshin; Reddy, Chandan K.; Farimani, Amir Barati

arXiv:2310.02227 (cs)

[Submitted on 3 Oct 2023 (v1), last revised 15 Mar 2024 (this version, v3)]

Abstract: In an era where symbolic mathematical equations are indispensable for modeling complex natural phenomena, scientific inquiry often involves collecting observations and translating them into mathematical expressions. Recently, deep learning has emerged as a powerful tool for extracting insights from data. However, existing models typically specialize in either numeric or symbolic domains, and are usually trained in a supervised manner tailored to specific tasks. This approach neglects the substantial benefits that could arise from a task-agnostic multi-modal understanding between symbolic equations and their numeric counterparts. To bridge the gap, we introduce SNIP, a Symbolic-Numeric Integrated Pre-training model, which employs contrastive learning between symbolic and numeric domains, enhancing their mutual similarities in the embeddings. By performing latent space analysis, we observe that SNIP provides cross-domain insights into the representations, revealing that symbolic supervision enhances the embeddings of numeric data and vice versa. We evaluate SNIP across diverse tasks, including symbolic-to-numeric mathematical property prediction and numeric-to-symbolic equation discovery, commonly known as symbolic regression. Results show that SNIP effectively transfers to various tasks, consistently outperforming fully supervised baselines and competing strongly with established task-specific methods, especially in the low data regime scenarios where available data is limited. Code and model are available at: this https URL
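
Illustrative sketch (not from the paper): the abstract says SNIP uses contrastive learning to increase the similarity between symbolic and numeric embeddings of the same equation, but it does not state the loss. The block below assumes a symmetric InfoNCE-style objective over a batch of paired embeddings, which is one common way to set up such training. The function names, the `temperature` value, and the toy vectors are all hypothetical.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (assumes v is not all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def log_softmax(row):
    """Numerically stable log-softmax over a list of floats."""
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]


def symmetric_contrastive_loss(symbolic, numeric, temperature=0.07):
    """Assumed loss: symbolic[i] and numeric[i] embed the same equation.

    Matching pairs are pulled together and all other pairs in the
    batch are treated as negatives, in both directions.
    """
    s = [l2_normalize(v) for v in symbolic]
    n = [l2_normalize(v) for v in numeric]
    size = len(s)
    # logits[i][j]: scaled cosine similarity of symbolic i and numeric j.
    logits = [
        [sum(a * b for a, b in zip(s[i], n[j])) / temperature
         for j in range(size)]
        for i in range(size)
    ]
    # Symbolic-to-numeric direction: row i should select column i.
    s2n = -sum(log_softmax(logits[i])[i] for i in range(size)) / size
    # Numeric-to-symbolic direction: column j should select row j.
    cols = [[logits[i][j] for i in range(size)] for j in range(size)]
    n2s = -sum(log_softmax(cols[j])[j] for j in range(size)) / size
    return 0.5 * (s2n + n2s)
```

Under these assumptions, a batch whose symbolic and numeric embeddings are already aligned yields a loss near zero, while a batch with the pairs shuffled yields a large loss, so minimizing the loss pushes the two encoders toward a shared embedding space.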

Comments:	ICLR 2024 Spotlight Paper
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2310.02227 [cs.LG]
	(or arXiv:2310.02227v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2310.02227

Submission history

From: Parshin Shojaee
[v1] Tue, 3 Oct 2023 17:32:44 UTC (15,776 KB)
[v2] Thu, 19 Oct 2023 13:53:04 UTC (15,776 KB)
[v3] Fri, 15 Mar 2024 06:00:29 UTC (16,657 KB)
