A Lightweight Hybrid Model with Location-Preserving ViT for Efficient Food Recognition
Sheng, Guorui1; Min, Weiqing2,3; Zhu, Xiangyi1; Xu, Liang1; Sun, Qingshuo1; Yang, Yancun1; Wang, Lili1; Jiang, Shuqiang2,3
2024
Journal: NUTRIENTS
Volume: 16, Issue: 2, Pages: 16
Abstract: Food-image recognition plays a pivotal role in intelligent nutrition management, and lightweight recognition methods based on deep learning are crucial for mobile deployment, enabling individuals to manage their daily diet and nutrition on devices such as smartphones. In this study, we propose the Efficient Hybrid Food Recognition Net (EHFR-Net), a novel neural network that integrates Convolutional Neural Networks (CNNs) and the Vision Transformer (ViT). We find that in food-image recognition tasks, while ViT excels at extracting global information, its disregard for the initial spatial information hampers its efficacy. We therefore design a ViT method termed the Location-Preserving Vision Transformer (LP-ViT), which retains positional information during global information extraction. To keep the model lightweight, we employ an inverted residual block on the CNN side to extract local features. Global and local features are integrated by directly summing and concatenating the outputs of the convolutional and ViT structures, yielding a unified Hybrid Block (HBlock). Moreover, we optimize the hierarchical layout of EHFR-Net to accommodate the characteristics of HBlock, effectively reducing the model size. Extensive experiments on three well-known food image-recognition datasets demonstrate the superiority of our approach. For instance, on the ETHZ Food-101 dataset, our method achieves a recognition accuracy of 90.7%, which is 3.5% higher than the state-of-the-art ViT-based lightweight network MobileViTv2 (87.2%) with an equivalent number of parameters and computations.
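The abstract describes fusing local features from an inverted residual block with global features from a location-preserving ViT by summing and concatenating the two branch outputs into a Hybrid Block (HBlock). The following is a minimal PyTorch-style sketch of that fusion pattern under stated assumptions: the class names (InvertedResidual, LPViTBlock, HBlock), the attention layout, and all hyperparameters are illustrative guesses and do not reproduce the authors' EHFR-Net implementation.

# Illustrative sketch of a hybrid block that fuses local CNN features
# (inverted residual) with global attention features that keep their 2D layout,
# then sums and concatenates the two branches as the abstract describes.
# All names and hyperparameters are assumptions, not the authors' code.
import torch
import torch.nn as nn


class InvertedResidual(nn.Module):
    """MobileNetV2-style inverted residual block for local feature extraction."""

    def __init__(self, channels: int, expansion: int = 2):
        super().__init__()
        hidden = channels * expansion
        self.block = nn.Sequential(
            nn.Conv2d(channels, hidden, 1, bias=False),  # pointwise expand
            nn.BatchNorm2d(hidden),
            nn.SiLU(),
            nn.Conv2d(hidden, hidden, 3, padding=1, groups=hidden, bias=False),  # depthwise
            nn.BatchNorm2d(hidden),
            nn.SiLU(),
            nn.Conv2d(hidden, channels, 1, bias=False),  # pointwise project
            nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        return x + self.block(x)  # residual connection


class LPViTBlock(nn.Module):
    """Global self-attention that preserves the spatial arrangement.

    The feature map is flattened to a token sequence for attention and reshaped
    back to (B, C, H, W) afterwards, so downstream layers still see the original
    spatial layout (a stand-in for the paper's LP-ViT idea, not its exact design).
    """

    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        self.norm = nn.LayerNorm(channels)
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)

    def forward(self, x):
        b, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)      # (B, H*W, C)
        t = self.norm(tokens)
        attended, _ = self.attn(t, t, t)
        out = (tokens + attended).transpose(1, 2).reshape(b, c, h, w)
        return out                                  # spatial layout preserved


class HBlock(nn.Module):
    """Fuse the local (CNN) and global (ViT) branches by summation + concatenation."""

    def __init__(self, channels: int):
        super().__init__()
        self.local = InvertedResidual(channels)
        self.global_ = LPViTBlock(channels)
        self.fuse = nn.Conv2d(2 * channels, channels, 1)  # project concat back

    def forward(self, x):
        local_feat = self.local(x)
        global_feat = self.global_(local_feat)
        summed = local_feat + global_feat                 # element-wise sum
        fused = torch.cat([summed, global_feat], dim=1)   # channel concatenation
        return self.fuse(fused)


if __name__ == "__main__":
    x = torch.randn(1, 64, 32, 32)
    print(HBlock(64)(x).shape)  # torch.Size([1, 64, 32, 32])

The sketch only illustrates the sum-then-concatenate fusion and the idea of reshaping attention output back to a feature map; the paper's actual channel widths, stage layout, and LP-ViT attention mechanism are not specified here.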
Keywords: food recognition; lightweight; global feature; ViT; nutrition management
DOI: 10.3390/nu16020200
Indexed by: SCI
Language: English
WOS Research Area: Nutrition & Dietetics
WOS Category: Nutrition & Dietetics
WOS Accession Number: WOS:001151224800001
Publisher: MDPI
Citation Statistics
Times Cited: 1 (WOS)
Document Type: Journal article
Identifier: http://119.78.100.204/handle/2XEOYT63/38396
Collection: Institute of Computing Technology, Chinese Academy of Sciences
Corresponding Author: Yang, Yancun
Affiliations:
1. Ludong Univ, Sch Informat & Elect Engn, Yantai 264025, Peoples R China
2. Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing 100190, Peoples R China
3. Univ Chinese Acad Sci, Sch Comp Sci & Technol, Beijing 100190, Peoples R China
Recommended Citation:
GB/T 7714: Sheng, Guorui, Min, Weiqing, Zhu, Xiangyi, et al. A Lightweight Hybrid Model with Location-Preserving ViT for Efficient Food Recognition[J]. NUTRIENTS, 2024, 16(2): 16.
APA: Sheng, Guorui, Min, Weiqing, Zhu, Xiangyi, Xu, Liang, Sun, Qingshuo, ... & Jiang, Shuqiang. (2024). A Lightweight Hybrid Model with Location-Preserving ViT for Efficient Food Recognition. NUTRIENTS, 16(2), 16.
MLA: Sheng, Guorui, et al. "A Lightweight Hybrid Model with Location-Preserving ViT for Efficient Food Recognition." NUTRIENTS 16.2 (2024): 16.
Files in This Item:
No files associated with this item.