Institute of Computing Technology, Chinese Academy IR
Multiset Feature Learning for Highly Imbalanced Data Classification | |
Jing, Xiao-Yuan1,2,3; Zhang, Xinyu1; Zhu, Xiaoke4; Wu, Fei3; You, Xinge5; Gao, Yang6; Shan, Shiguang7; Yang, Jing-Yu8 | |
2021 | |
Journal | IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE |
ISSN | 0162-8828 |
Volume | 43, Issue: 1, Pages: 139-156 |
Abstract | As data volumes grow, imbalanced data is increasingly common. When the imbalance ratio (IR) of the data is high, the classification performance of most existing imbalanced learning methods declines seriously. In this paper, we systematically investigate the highly imbalanced data classification problem and propose an uncorrelated cost-sensitive multiset learning (UCML) approach for it. Specifically, UCML first constructs multiple balanced subsets through random partition, and then employs multiset feature learning (MFL) to learn discriminant features from the constructed multiset. To enhance the usability of each subset and to deal with the nonlinearity present in each subset, we further propose a deep metric based UCML (DM-UCML) approach. DM-UCML introduces the generative adversarial network technique into the multiset construction process, so that each subset has a distribution similar to that of the original dataset. To cope with the nonlinearity issue, DM-UCML integrates deep metric learning with MFL, so that more favorable performance can be achieved. In addition, DM-UCML designs a new discriminant term to enhance the discriminability of the learned metrics. Experiments on eight traditional highly class-imbalanced datasets and two large-scale datasets indicate that the proposed approaches outperform state-of-the-art highly imbalanced learning methods and are more robust to high IR. |
Keywords | Highly imbalanced data classification; multiset feature learning; deep metric learning; generative adversarial network; cost-sensitive factor; weighted uncorrelated constraint |
DOI | 10.1109/TPAMI.2019.2929166 |
Indexed By | SCI |
Language | English |
Funding | NSFC-Key Project of General Technology Fundamental Research United Fund[U1736211] ; National Natural Science Foundation of China[61672208] ; National Natural Science Foundation of China[61702280] ; National Natural Science Foundation of China[61772220] ; National Natural Science Foundation of China[61432008] ; Key Research and Development Program of China[2016YFE0121200] ; Key Science and Technology Innovation Program of Hubei Province[2017AAA017] ; Key Science and Technology Innovation Program of Hubei Province[2018ACA135] ; Natural Science Foundation Key Project for Innovation Group of Hubei Province[2018CFA024] ; Natural Science Foundation of Jiangsu Province[BK20170900] ; National Postdoctoral Program for Innovative Talents[BX20180146] ; Higher Education Institution Key Research Projects of Henan Province[19A520001] |
WOS Research Area | Computer Science ; Engineering |
WOS Category | Computer Science, Artificial Intelligence ; Engineering, Electrical & Electronic |
WOS Accession Number | WOS:000597206900010 |
Publisher | IEEE COMPUTER SOC |
Document Type | Journal article |
Item Identifier | http://119.78.100.204/handle/2XEOYT63/16518 |
Collection | Institute of Computing Technology, Chinese Academy of Sciences: Journal Articles (English) |
Corresponding Author | Jing, Xiao-Yuan; Zhu, Xiaoke; Wu, Fei |
Author Affiliations | 1.Wuhan Univ, Sch Comp Sci, Wuhan 430072, Peoples R China 2.Guangdong Univ Petrochem Technol, Sch Comp, Maoming 525000, Peoples R China 3.Nanjing Univ Posts & Telecommun, Coll Automat, Nanjing 210003, Peoples R China 4.Henan Univ, Henan Key Lab Big Data Anal & Proc, Kaifeng 475004, Peoples R China 5.Huazhong Univ Sci & Technol, Dept Elect & Informat Engn, Wuhan 430074, Peoples R China 6.Nanjing Univ, State Key Lab Novel Software Technol, Nanjing 210094, Peoples R China 7.Chinese Acad Sci, Inst Comp Technol, CAS, Key Lab Intelligent Informat Proc, Beijing 100190, Peoples R China 8.Nanjing Univ Sci & Technol, Coll Comp Sci, Nanjing 210094, Peoples R China |
Recommended Citation (GB/T 7714) | Jing, Xiao-Yuan,Zhang, Xinyu,Zhu, Xiaoke,et al. Multiset Feature Learning for Highly Imbalanced Data Classification[J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE,2021,43(1):139-156. |
APA | Jing, Xiao-Yuan.,Zhang, Xinyu.,Zhu, Xiaoke.,Wu, Fei.,You, Xinge.,...&Yang, Jing-Yu.(2021).Multiset Feature Learning for Highly Imbalanced Data Classification.IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE,43(1),139-156. |
MLA | Jing, Xiao-Yuan,et al."Multiset Feature Learning for Highly Imbalanced Data Classification".IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 43.1(2021):139-156. |
Files in This Item | There are no files associated with this item. |
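
The abstract says that UCML first constructs multiple balanced subsets through random partition, but this record does not give the details of that step. The following is only a minimal sketch of one plausible reading, assuming a binary problem in which label 1 marks the minority class: the majority samples are shuffled and dealt into disjoint chunks of roughly minority-class size, and each chunk is paired with all minority samples. The function names `imbalance_ratio` and `balanced_subsets` are hypothetical, and the sketch does not cover multiset feature learning, the cost-sensitive factor, or the GAN-based construction used in DM-UCML.

```python
import random
from typing import List, Sequence


def imbalance_ratio(labels: Sequence[int], minority_label: int = 1) -> float:
    """Return (number of majority samples) / (number of minority samples)."""
    n_min = sum(1 for y in labels if y == minority_label)
    if n_min == 0:
        raise ValueError("labels contain no minority samples")
    return (len(labels) - n_min) / n_min


def balanced_subsets(
    labels: Sequence[int], minority_label: int = 1, seed: int = 0
) -> List[List[int]]:
    """Build index lists of roughly balanced subsets (illustrative assumption).

    Majority indices are shuffled and split into disjoint chunks whose size
    is about the minority-class size; every chunk is then combined with all
    minority indices.
    """
    minority = [i for i, y in enumerate(labels) if y == minority_label]
    majority = [i for i, y in enumerate(labels) if y != minority_label]
    if not minority:
        raise ValueError("labels contain no minority samples")

    rng = random.Random(seed)
    rng.shuffle(majority)

    # One subset per "minority-sized" share of the majority class.
    n_subsets = max(1, len(majority) // len(minority))
    # Strided slicing deals indices round-robin, so chunk sizes differ by at most one.
    chunks = [majority[k::n_subsets] for k in range(n_subsets)]
    return [sorted(chunk + minority) for chunk in chunks]


if __name__ == "__main__":
    toy_labels = [1] * 5 + [0] * 95          # 5 minority, 95 majority samples
    print("IR:", imbalance_ratio(toy_labels))
    subsets = balanced_subsets(toy_labels)
    print("subsets:", len(subsets), "size of first:", len(subsets[0]))
```

Each returned list holds sample indices, so a subset of a feature matrix can be selected by indexing with it. Reusing every minority sample in every subset is a design choice of this sketch, not something the abstract states.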