dc.contributor.author | Terziyan, Vagan | |
dc.contributor.author | Malyk, Diana | |
dc.contributor.author | Golovianko, Mariia | |
dc.contributor.author | Branytskyi, Vladyslav | |
dc.date.accessioned | 2022-09-05T05:04:56Z | |
dc.date.available | 2022-09-05T05:04:56Z | |
dc.date.issued | 2022 | |
dc.identifier.citation | Terziyan, V., Malyk, D., Golovianko, M., & Branytskyi, V. (2022). Hyper-flexible Convolutional Neural Networks based on Generalized Lehmer and Power Means. <i>Neural Networks</i>, <i>155</i>, 177-203. <a href="https://doi.org/10.1016/j.neunet.2022.08.017" target="_blank">https://doi.org/10.1016/j.neunet.2022.08.017</a> | |
dc.identifier.other | CONVID_151802598 | |
dc.identifier.uri | https://jyx.jyu.fi/handle/123456789/82937 | |
dc.description.abstract | The Convolutional Neural Network is one of the best-known members of the deep learning family of neural network architectures and is used for many purposes, including image classification. Despite their wide adoption, such networks are known to be highly tuned to their training data (samples representing a particular problem) and to reuse poorly on new problems. One way to change this would be to train, in addition to the weights, the parameters of the mathematical functions that simulate various neural computations within such networks. In this way, we may distinguish between narrowly focused task-specific parameters (weights) and more generic capability-specific parameters. In this paper, we suggest two flexible mathematical functions (the Generalized Lehmer Mean and the Generalized Power Mean) with trainable parameters to replace some fixed operations (such as the ordinary arithmetic mean or simple weighted aggregation) that are traditionally used within various components of a convolutional neural network architecture. We call the resulting architecture a hyper-flexible convolutional neural network. We provide mathematical justification for the components of this architecture and show experimentally that it performs better than the traditional one, including better robustness to adversarial perturbations of the test data. | en |
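To illustrate the idea behind the abstract, the sketch below implements the *classical* Lehmer and power means with an explicit parameter `p` (a minimal NumPy sketch; the paper's *Generalized* variants add further trainable parameters, which are not reproduced here). Making `p` trainable lets a single pooling operator interpolate between average-like and max-like aggregation:

```python
import numpy as np

def lehmer_mean(x, p):
    """Classical Lehmer mean: L_p(x) = sum(x**p) / sum(x**(p-1)).

    For positive inputs: p = 1 gives the arithmetic mean, p = 0 the
    harmonic mean, and large p approaches the maximum -- so a
    trainable p can smoothly interpolate between average pooling
    and max pooling.
    """
    x = np.asarray(x, dtype=float)
    return np.sum(x ** p) / np.sum(x ** (p - 1))

def power_mean(x, p):
    """Classical power mean: M_p(x) = (mean(x**p))**(1/p), p != 0.

    p = 1 is the arithmetic mean, p = -1 the harmonic mean, and
    p -> +inf approaches the maximum.
    """
    x = np.asarray(x, dtype=float)
    return np.mean(x ** p) ** (1.0 / p)
```

For example, `lehmer_mean([1, 2, 3, 4], 1)` equals the arithmetic mean 2.5, while `lehmer_mean([1, 2, 3, 4], 10)` is already close to the maximum 4. The paper's generalized versions and their use inside convolution, pooling, and activation components are specified in the article and in the linked repository.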
dc.format.mimetype | application/pdf | |
dc.language.iso | eng | |
dc.publisher | Elsevier | |
dc.relation.ispartofseries | Neural Networks | |
dc.rights | CC BY 4.0 | |
dc.subject.other | convolutional | |
dc.subject.other | neural network | |
dc.subject.other | generalization | |
dc.subject.other | flexibility | |
dc.subject.other | adversarial robustness | |
dc.subject.other | pooling | |
dc.subject.other | convolution | |
dc.subject.other | activation function | |
dc.subject.other | Lehmer mean | |
dc.subject.other | Power mean | |
dc.title | Hyper-flexible Convolutional Neural Networks based on Generalized Lehmer and Power Means | |
dc.type | article | |
dc.identifier.urn | URN:NBN:fi:jyu-202209054469 | |
dc.contributor.laitos | Informaatioteknologian tiedekunta | fi |
dc.contributor.laitos | Faculty of Information Technology | en |
dc.contributor.oppiaine | Collective Intelligence | fi |
dc.contributor.oppiaine | Tekniikka | fi |
dc.contributor.oppiaine | Collective Intelligence | en |
dc.contributor.oppiaine | Engineering | en |
dc.type.uri | http://purl.org/eprint/type/JournalArticle | |
dc.type.coar | http://purl.org/coar/resource_type/c_2df8fbb1 | |
dc.description.reviewstatus | peerReviewed | |
dc.format.pagerange | 177-203 | |
dc.relation.issn | 0893-6080 | |
dc.relation.volume | 155 | |
dc.type.version | publishedVersion | |
dc.rights.copyright | © 2022 The Author(s). Published by Elsevier Ltd. | |
dc.rights.accesslevel | openAccess | fi |
dc.subject.yso | syväoppiminen | |
dc.subject.yso | koneoppiminen | |
dc.subject.yso | neuroverkot | |
dc.format.content | fulltext | |
jyx.subject.uri | http://www.yso.fi/onto/yso/p39324 | |
jyx.subject.uri | http://www.yso.fi/onto/yso/p21846 | |
jyx.subject.uri | http://www.yso.fi/onto/yso/p7292 | |
dc.rights.url | https://creativecommons.org/licenses/by/4.0/ | |
dc.relation.dataset | https://github.com/Adversarial-Intelligence-Group/flexnets | |
dc.relation.doi | 10.1016/j.neunet.2022.08.017 | |
dc.type.okm | A1 | |