The Key Concepts of Ethics of Artificial Intelligence - A Keyword based Systematic Mapping Study

The growing influence and decision-making capacities of autonomous systems and Artificial Intelligence in our lives force us to consider the values embedded in these systems. But how should ethics be implemented in these systems? In this study, philosophical conceptualization is proposed as a framework for forming a practical implementation model for the ethics of AI. As a first step towards conceptualization, the main concepts used in the field need to be identified. A keyword-based Systematic Mapping Study (SMS) of the keywords used in AI and ethics was conducted to help identify, define and compare the main concepts used in the current AI ethics discourse. Out of 1062 papers retrieved, the SMS discovered 37 re-occurring keywords in 83 academic papers. We suggest that the focus on finding keywords is the first step in guiding and providing direction for future research in the AI ethics field.


INTRODUCTION
With recent accomplishments and increasing deployment, autonomous systems (AS) and Artificial Intelligence (AI) systems have become more influential in our lives. With this growing influence, the ethical questions related to these systems have become increasingly obvious and topical.
For example, considering biased algorithms in social media [1], the decision-making systems of autonomous cars [2], or even the social effects of automation in whole transportation ecosystems such as the autonomous maritime ecosystem [3], it is clear that system development is no longer only a technological or engineering question. AI and AS are already present in the world around us, and the need to implement ethics and our values in these systems is urgent.
Ethics as a part of system design has also gained attention at the governmental and standardization level, for example from the Federal Ministry of Transport and Digital Infrastructure in Germany [4] and IEEE [5]. The academic discussion on the relation of AI and ethics has been ongoing for decades, but system development and ethical research have intersected only slightly [6]. Ethical research has mainly focused on the potential of AI at a theoretical level [7]. So the question remains open at the application level: How should ethics be implemented in practice in these systems?
Because of the multidisciplinary nature of AI ethics development, there can be little ethical implementation unless developers understand the consequences of their own actions, engage in open dialogue and consider ethical aspects in AI and autonomous system development [8]. As a solution for understanding the field of ethics of AI, we propose philosophical conceptualization. This method makes it possible to discuss and form cross-disciplinary definitions for key concepts, and to initiate a productive dialogue that merges philosophical and technological views into a common framework for implementing ethics in AI.
The goal of this paper is to identify and categorize the keywords used in academic papers in the current AI ethics discourse, and thereby to take the first steps towards identifying, defining and comparing the main concepts and terms used in the discourse. To find the relevant papers and keywords, a preliminary Systematic Mapping Study was conducted with the following focus:
o Recognize keywords used in the field
o Extract potential keywords for future research
o Compare keywords to proposed concepts in academic literature
The keyword-based Systematic Mapping Study reveals 37 re-occurring author keywords in 83 academic papers, selected from an initial set of 1062 papers in the field of AI and ethics. Because of its preliminary nature, this Systematic Mapping Study does not provide a fully comprehensive picture of the primary studies in the area, but it provides an important standpoint and relevant tools for future research on AI and ethics. By understanding the used concepts, research can shift from discussing concepts to defining
This is the author's version of the work. The definite version was published in Vakkuri, V., Abrahamsson, P. 2018, June.
This paper is organized as follows: Section 2 describes the background and related work; Section 3 describes the research methodology and the conducted keyword-based search; Section 4 presents the findings; Section 5 concludes the paper by discussing the findings in relation to general presentations of AI ethics and summarizes the answers to the research questions to set guidelines for future work.

BACKGROUND
The ethical discussion of Artificial Intelligence has been present since the start of AI research, but instead of real use cases, the focus has mainly been on theoretical work discussing the possibilities and future impacts of AI. In recent years there has been a major change in the discussion of AI-related ethics, as a new level of AI capabilities has become reality and more influential in our lives due to recent breakthroughs in AI development. The availability of low-cost computing power and innovations such as Big Data technologies have made AI more usable for solving complicated problems [9]. One milestone of AI development can be seen in 2012, when Google's large-scale deep learning experiment on brain simulation using 16000 CPU cores was conducted [10]. The experiment significantly improved the state of the art on a standard image classification test. This year also serves as the starting point for the current AI ethics discussion in the context of this study.
Even though the academic discussion on the relation of AI and ethics has been ongoing for decades, there is no commonly shared definition of what AI ethics is or even how it should be named. As the defining concept, machine ethics has arisen out of the discussion, but it has also been criticized. There has been a heated discussion on whether the concept of machine ethics also covers the new branches of AI-related ethics [7,11,12]. Only a handful of books give comprehensive presentations of the ethical issues of AI: Towards a Code of Ethics for Artificial Intelligence, which mainly focuses on professional ethics [9]; Robot Ethics 2.0, which covers ethics related to embodied AI [13]; and Machine Ethics, which predates the current discussion [14]. To define the field of AI-related ethics, the so-called "six hot topics" have been proposed [15]. The problem with these categorically wide topics is that they are not necessarily comprehensive or clear enough, and they are not in balance with the overall discussion. Importantly, they are also not necessarily scientifically founded. As an example of the wide scope of the AI ethics discourse, the first AAAI/ACM Conference on AI, Ethics, and Society, held in 2018, had a broad set of 12 different topics ranging from technical to social sciences [16].
Besides defining relevant concepts, a crucial problem in the practical implementation of AI ethics is the limited co-operation and communication between the developers of AI systems and ethics researchers [6,11]. To reach the practical implementation of AI ethics, a multidisciplinary research approach is needed where AI developers can also see the use of ethics and the results of philosophical research on a practical level.

RESEARCH METHODOLOGY AND MAPPING
As a multidisciplinary research area, AI ethics covers a wide range of topics, and the discussion of definitions continues. To gain a better understanding of the research area, a Systematic Mapping Study was chosen as the research method due to its capability to deal with wide and loosely defined areas of study. An SMS aims at producing an overview of the field and reveals concretely which topics have been covered and to what extent. The present study is a keyword-based systematic mapping study. Two main guidelines for systematic mapping studies were combined, aiming at recognizing primary studies and the keywords used therein. We consider this study, however, to be a first step, since the mapping process is not executed to its full length. We first needed to gain a better understanding of relevant keywords for the PICO (Population, Intervention, Comparison and Outcomes) process [17,18].

A. Definition of Research Questions
The main research question for the present study is: What are the main author keywords used in academic papers in the current AI ethics discussion? To answer this question, four subquestions (Q1 to Q4) were formed. The purpose of Q1 is to produce a preliminary picture of the keywords used in the identified papers and to gather this information together. Q2 aims at recognizing the main keywords by means of a quantitative analysis of their variance and appearance in the identified papers, whereas Q3 aims at providing a qualitative classification of the used keywords. With Q4 the intention is to understand how the keywords fit into proposed concepts, how comprehensive they are and what type of new concepts they can potentially offer.

B. Conducted Search
Keywords were identified by conducting a keyword search in selected scientific databases. The search string was formed from the main research question by combining its two key concepts: artificial intelligence/AI + ethics. The suggested PICO process was not used to identify search string keywords because of the lack of shared concepts in AI ethics, as argued earlier.
The selected scientific databases on which the search was performed are shown in Table II, along with the number of publications retrieved from each database (on 11 March 2018). The selection of databases was guided by the need to gain wide coverage of the multidisciplinary nature of AI research and by the databases' ability to handle advanced queries. The keyword search strings were customized, as shown in Table I, to adapt to the syntax of each database. The Web of Science and ProQuest databases do not have a specific search term for author keywords, so the topic and subject fields, which include keywords, were used in the search queries. Pre-exclusion by document type, source type and article language was done automatically in the databases, see Table III. Five result lists were exported from the databases and combined in the reference management tool RefWorks, resulting in a list of 588 papers. For duplicate exclusion, each paper's metadata and title were reviewed with the aid of the reference management tool. In the manual metadata analysis, papers published before 2012 were excluded. In addition, non-scholarly journal articles, for example popular articles, that were not detected in the pre-exclusion phase were excluded in the manual screening process. In the in-depth review of the remaining papers, abstracts were analyzed to determine whether the paper is related to ethics and artificial intelligence or related technologies. In the last iteration of exclusion, papers were excluded if the full text and author keywords were not available. The resulting 83 papers were included. The screening process and its steps can be seen in Table III and the distribution by year in Fig. 1.
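The exclusion steps described above can be sketched as a small filtering pipeline. This is a minimal illustration only, not the procedure actually run in RefWorks: the `Paper` record, its field names and the title-normalization rule for duplicate detection are assumptions introduced here, and the `scholarly` and `relevant` flags stand in for the manual judgments made during screening.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Paper:
    """Hypothetical record for one exported search result."""
    title: str
    year: int
    scholarly: bool = True   # assumed outcome of the manual source check
    relevant: bool = True    # assumed outcome of the manual abstract review
    keywords: tuple = ()     # author keywords, empty if unavailable


def normalize_title(title):
    """Lowercase the title and drop punctuation so near-identical titles match."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in title)
    return " ".join(cleaned.split())


def screen(papers, min_year=2012):
    """Apply the exclusion steps in order and return the included papers."""
    seen_titles = set()
    included = []
    for paper in papers:
        key = normalize_title(paper.title)
        if key in seen_titles:      # duplicate exclusion
            continue
        seen_titles.add(key)
        if paper.year < min_year:   # published before 2012
            continue
        if not paper.scholarly:     # non-scholarly articles
            continue
        if not paper.relevant:      # not related to ethics and AI
            continue
        if not paper.keywords:      # author keywords not available
            continue
        included.append(paper)
    return included
```

Keeping each exclusion criterion as a separate check mirrors the step-by-step screening reported in Table III and makes it easy to count how many papers each step removes.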

FINDINGS
For this study, the listing of the included keywords served as the data extraction, and no further keywording was conducted. To answer research questions Q1 and Q2, the keywords were listed and counted, resulting in a total of 324 different keywords in 83 papers. Of these 324 keywords, 37 re-occurred in two or more papers. The most frequently used keywords were Artificial intelligence/AI and ethics. This is a natural result of the terms used in the search strings and therefore provides no new information, so these keywords were excluded from the listing. The re-occurring keywords and the papers in which they were used can be seen in Table IV. The usage of keywords shows considerable variance in incidence and spelling, such as "Roboethics" and "Robot ethics", which may hinder search results. The variance in the keywords used for one topic, such as "Autonomous vehicle", "Driverless cars" and "Self-driving cars", can also be seen as an example of the immaturity of shared terms and of the unsettled discussion on which terms should be used in a specific context.
Table IV (excerpt): Virtue ethics, 2, [46,52]
The 37 re-occurring keywords were classified into 9 categories, as shown in Table V. The classification followed a four-step process:
1) Linguistic similarity of keywords, for example similarity in spelling.
2) Ontological similarity of keywords, as assumed reference to the same concept.
3) Family resemblance of keywords.
4) Similarity in usage, from abstract to specific.
After classification, descriptive names were given to the formed categories [80]. The idea of the classification was to outline re-occurring topics from the vast variance of keywords. This classification produced
Fig. 1. Publication distribution by year.
Comparing the keywords and the formed categories to the concepts and topics proposed in the academic literature shows similarities in recurring terms, but the keyword listing is also partial in some respects. For example, technology-based keywords and topics, such as bias issues, fairness, transparency and controlling AI, are underrepresented. Socio-ethical topics such as the impact on society or the workforce are also lacking [9,13]. The comparison of the keyword classification reveals topics that are quite commonly shared in the literature. The found keywords can be classified under the known topics, even though the specific formulation of keywords varies considerably in places.
This study provided a set of AI ethics related keywords and a listing of 37 re-occurring author keywords found in 83 academic papers. The re-occurring keywords were classified into 9 categories based on the conceptual similarity of keywords to more general topics relevant to AI ethics. The keywords and formed categories were compared to concepts provided in academic literature to evaluate the coverage of the systematic mapping study and listing. Three main differences were discovered: different branches of AI are lacking from the keywords, technology-based keywords have only a minor role, and there is great variance in the formulation of keywords even though the keywords can be classified under the known topics. As a recommendation for future research and systematic mapping studies, different AI branches and different formulations of keywords extracted from known topics should be included in the keyword extraction process.
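The keyword counting behind Q1 and Q2, and the first classification step (linguistic similarity), can be sketched as follows. This is an illustrative sketch under stated assumptions, not the procedure used in the study: the function names, the lowercasing of keywords, the `EXCLUDED` set and the 0.85 similarity threshold are all choices made here for illustration, and the similarity measure is Python's standard `difflib.SequenceMatcher` rather than any method named in the paper.

```python
from collections import Counter
from difflib import SequenceMatcher
from itertools import combinations

# Search-string terms left out of the listing (assumed normalized forms).
EXCLUDED = {"artificial intelligence", "ai", "ethics"}


def reoccurring_keywords(papers_keywords, min_papers=2):
    """Count in how many papers each keyword appears; keep the re-occurring ones."""
    counts = Counter()
    for keywords in papers_keywords:
        # A keyword is counted once per paper, ignoring case and stray spaces.
        distinct = {kw.strip().lower() for kw in keywords}
        counts.update(distinct - EXCLUDED)
    return {kw: n for kw, n in counts.items() if n >= min_papers}


def spelling_variants(keywords, threshold=0.85):
    """Return keyword pairs whose spelling similarity reaches the threshold."""
    unique = sorted({kw.strip().lower() for kw in keywords})
    return [
        (a, b)
        for a, b in combinations(unique, 2)
        if SequenceMatcher(None, a, b).ratio() >= threshold
    ]
```

Pairs flagged by `spelling_variants`, such as "roboethics" and "robot ethics", are only candidates for merging. Synonyms with unrelated spelling, such as "Driverless cars" and "Self-driving cars", are not caught by string similarity and still require the later, conceptual classification steps.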
The keyword-based systematic mapping study method used in this study has several weaknesses. Due to the focus on keywords only, no primary studies in the field of AI ethics were recognized. The relevance of papers was evaluated in the exclusion process and through the prevalence of keywords in the papers. Nor were the definitions of the concepts that the keywords represent analyzed. Despite these weaknesses, the keyword-based approach made it possible to cover the wide and loosely defined field of AI ethics and to produce an understanding of relevant keywords where no prior listing was available. This preliminary work also helps future systematic mapping studies by providing relevant keywords on AI ethics.
With its wide variety of papers and keywords from different areas concerning AI ethics, this study revealed that defining the field of AI ethics is still a challenging task. The comprehensive presentations have done valuable work in setting definitions for the expanding field of AI ethics, but a substantial amount of work remains. These presentations are not all-inclusive, and more comprehensive works are needed on the topic discussed in this paper. For example, judging by the occurrence of different keywords, papers emphasize different topics than the comprehensive presentations do. Overall, more research is needed on the concepts of AI ethics as such, to see where the AI ethics discourse is developing and how concepts can support the practical implementation of ethics in AI systems.