A visual analysis approach for data imputation via multi-party tabular data correlation strategies

  • Research Article
Frontiers of Information Technology & Electronic Engineering

An Erratum to this article was published on 31 January 2024

This article has been updated

Abstract

Data imputation is an essential pre-processing task for data governance, aimed at filling in incomplete data. However, conventional data imputation methods can only partly alleviate data incompleteness using isolated tabular data, and they fail to achieve the best balance between accuracy and efficiency. In this paper, we present a novel visual analysis approach for data imputation. We develop a multi-party tabular data association strategy that uses intelligent algorithms to identify similar columns and establish column correlations across multiple tables. Then, we perform the initial imputation of incomplete data using correlated data entries from other tables. Additionally, we develop a visual analysis system to refine data imputation candidates. Our interactive system combines the multi-party data imputation approach with expert knowledge, allowing for a better understanding of the relational structure of the data. This significantly enhances the accuracy and efficiency of data imputation, thereby improving the quality of data governance and the intrinsic value of data assets. Experimental validation and user surveys demonstrate that this method supports users in verifying and judging the associated columns and similar rows using their domain knowledge.
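
The two steps named in the abstract (finding similar columns across tables, then filling gaps from correlated entries) can be illustrated with a minimal sketch. This is not the authors' algorithm, which is not specified in this preview. It assumes Jaccard similarity over column value sets as a stand-in for the "intelligent algorithms", and an exact match on a shared key column to find the correlated row. All names (`match_columns`, `impute`, the sample tables, the 0.5 threshold) are hypothetical.

```python
from typing import Dict, List, Optional

# Assumption: a table is a list of rows; None marks a missing cell.
Table = List[Dict[str, Optional[str]]]


def column_values(table: Table, column: str) -> set:
    """Collect the non-missing values of one column."""
    return {row[column] for row in table if row.get(column) is not None}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two value sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_columns(left: Table, right: Table, threshold: float = 0.5) -> Dict[str, str]:
    """Map each column of `left` to its most similar column of `right`.

    Columns whose best score falls below `threshold` are left unmapped.
    """
    mapping: Dict[str, str] = {}
    for lc in left[0].keys():
        scored = [
            (jaccard(column_values(left, lc), column_values(right, rc)), rc)
            for rc in right[0].keys()
        ]
        best_score, best_col = max(scored)
        if best_score >= threshold:
            mapping[lc] = best_col
    return mapping


def impute(left: Table, right: Table, key: str, mapping: Dict[str, str]) -> Table:
    """Fill missing cells of `left` from the `right` row with the same key value."""
    right_key = mapping[key]
    lookup = {row[right_key]: row for row in right if row.get(right_key) is not None}
    filled: Table = []
    for row in left:
        new_row = dict(row)  # copy so the input table is not mutated
        donor = lookup.get(row.get(key))
        if donor is not None:
            for col, value in row.items():
                if value is None and col in mapping:
                    new_row[col] = donor.get(mapping[col])
        filled.append(new_row)
    return filled


# Hypothetical sample data: two parties hold overlapping records under different column names.
left = [
    {"id": "1", "city": "Hangzhou"},
    {"id": "2", "city": None},
    {"id": "3", "city": "Ningbo"},
]
right = [
    {"uid": "1", "town": "Hangzhou"},
    {"uid": "2", "town": "Wenzhou"},
    {"uid": "3", "town": "Ningbo"},
]

mapping = match_columns(left, right)
filled = impute(left, right, key="id", mapping=mapping)
print(mapping)
print(filled[1])
```

In this sketch the filled values are only candidates. The abstract describes a further step in which a visual analysis system lets domain experts verify the associated columns and similar rows before a candidate is accepted, which the code above does not attempt to model.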

Data availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Author information

Contributions

Haiyang ZHU conceptualized the main idea and led the research. Haiyang ZHU and Wei CHEN surveyed the relevant materials. All the authors had in-depth discussions; they drafted, revised, and finalized the paper.

Corresponding author

Correspondence to Wei CHEN.

Ethics declarations

Haiyang ZHU, Dongmin HAN, Jiacheng PAN, Yating WEI, Yingchaojie FENG, Luoxuan WENG, Ketian MAO, Yuankai XING, Jianshu LV, Qiucheng WAN, and Wei CHEN declare that they have no conflict of interest.

Additional information

Project supported by the Key R&D “Pioneer” Tackling Plan Program of Zhejiang Province, China (No. 2023C01119), the “Ten Thousand Talents Plan” Science and Technology Innovation Leading Talent Program of Zhejiang Province, China (No. 2022R52044), the Major Standardization Pilot Projects for the Digital Economy (Digital Trade Sector) of Zhejiang Province, China (No. SJ-BZ/2023053), and the National Natural Science Foundation of China (No. 62132017).

About this article

Cite this article

Zhu, H., Han, D., Pan, J. et al. A visual analysis approach for data imputation via multi-party tabular data correlation strategies. Front Inform Technol Electron Eng 25, 398–414 (2024). https://doi.org/10.1631/FITEE.2300480

  • DOI: https://doi.org/10.1631/FITEE.2300480
