More ConvNets in the 2020s : Scaling up Kernels Beyond 51x51 using Sparsity
Liu, S., Chen, T., Chen, X., Chen, X., Xiao, Q., Wu, B., Kärkkäinen, T., Pechenizkiy, M., Mocanu, D. C., & Wang, Z. (2023). More ConvNets in the 2020s : Scaling up Kernels Beyond 51x51 using Sparsity. In ICLR 2023 : The Eleventh International Conference on Learning Representations. OpenReview.net. https://openreview.net/forum?id=bXNl-myZkJl
Date
2023
Copyright
© 2023 OpenReview.net
Transformers have quickly risen to prominence in computer vision since the emergence of Vision Transformers (ViTs). The dominant role of convolutional neural networks (CNNs) seems to be challenged by increasingly effective transformer-based models. Very recently, a couple of advanced convolutional models have struck back with large kernels motivated by the local-window attention mechanism, showing appealing performance and efficiency. While one of them, RepLKNet, impressively manages to scale the kernel size to 31x31 with improved performance, its performance starts to saturate as the kernel size continues to grow, in contrast to the scaling trend of advanced ViTs such as Swin Transformer. In this paper, we explore the possibility of training extreme convolutions larger than 31x31 and test whether the performance gap can be eliminated by strategically enlarging convolutions. This study ends up with a recipe for applying extremely large kernels from the perspective of sparsity, which can smoothly scale up kernels to 61x61 with better performance. Built on this recipe, we propose Sparse Large Kernel Network (SLaK), a pure CNN architecture equipped with sparse factorized 51x51 kernels that can perform on par with or better than state-of-the-art hierarchical Transformers and modern ConvNet architectures like ConvNeXt and RepLKNet, on ImageNet classification as well as a wide range of downstream tasks including semantic segmentation on ADE20K, object detection on PASCAL VOC 2007, and object detection/segmentation on MS COCO. Code is available at https://github.com/VITA-Group/SLaK.
Publisher
OpenReview.net
Conference
International Conference on Learning Representations
Part of publication
ICLR 2023 : The Eleventh International Conference on Learning Representations
Original source
https://openreview.net/forum?id=bXNl-myZkJl
Publication in research information system
https://converis.jyu.fi/converis/portal/detail/Publication/207656356
Funder(s)
Academy of Finland
Funding program(s)
Other, Academy of Finland
Similar items
Showing items with a similar title or keywords.
-
Benchmark Database for Fine-Grained Image Classification of Benthic Macroinvertebrates
Raitoharju, Jenni; Riabchenko, Ekaterina; Ahmad, Iftikhar; Iosifidis, Alexandros; Gabbouj, Moncef; Kiranyaz, Serkan; Tirronen, Ville; Ärje, Johanna; Kärkkäinen, Salme; Meissner, Kristian (Elsevier BV, 2018)
Managing the water quality of freshwaters is a crucial task worldwide. One of the most used methods to biomonitor water quality is to sample benthic macroinvertebrate communities, in particular to examine the presence and ...
-
Mäkihypyn ponnistusvaiheen biomekaniikka hahmon asennon tunnistamiseen perustuvalla liikeanalyysillä
Virtanen, Lauri (2021)
Ski jumping is a sport with a long tradition in Finland, where international success at major championships has come to be expected. Ski jumping has been studied since the early 1900s, and the most active research period probably falls on either side of the year 2000 ...
-
The potential of convolutional neural network in the evaluation of tumor-stroma ratio from colorectal cancer histopathological images
Petäinen, Liisa (2022)
This master's thesis studies the potential of convolutional neural networks for estimating the tumor-stroma ratio from histopathological images. The aim is to determine the effect of transfer learning, ...
-
Causality-Aware Convolutional Neural Networks for Advanced Image Classification and Generation
Terziyan, Vagan; Vitko, Oleksandra (Elsevier, 2023)
Smart manufacturing uses emerging deep learning models, and particularly Convolutional Neural Networks (CNNs) and Generative Adversarial Networks (GANs), for different industrial diagnostics tasks, e.g., classification, ...
-
Generative adversarial networks with bio-inspired primary visual cortex for Industry 4.0
Branytskyi, Vladyslav; Golovianko, Mariia; Malyk, Diana; Terziyan, Vagan (Elsevier, 2022)
Biologicalization (biological transformation) is an emerging trend in Industry 4.0 affecting digitization of manufacturing and related processes. It brings up the next generation of manufacturing technology and systems ...
Unless otherwise stated, publicly available JYX metadata (excluding abstracts) may be freely reused under the CC0 license.