Artificial Intelligence, Values and Alignment
Abstract

This paper looks at philosophical questions that arise in the context of AI alignment. It defends three propositions. First, normative and technical aspects of the AI alignment problem are interrelated, creating space for productive engagement between people working in both domains. Second, it is important to be clear about the goal of alignment. There are significant differences between AI that aligns with instructions, intentions, revealed preferences, ideal preferences, interests and values. A principle-based approach to AI alignment, which combines these elements in a systematic way, has considerable advantages in this context. Third, the central challenge for theorists is not to identify ‘true’ moral principles for AI; rather, it is to identify fair principles for alignment, that receive reflective endorsement despite widespread variation in people’s moral beliefs. The final part of the paper explores three ways in which fair principles for AI alignment could potentially be identified.

Authors’ notes

The question of ‘value alignment’ centres upon how to ensure that AI systems are properly aligned with human values. It can be broken down into two parts. The first part is technical and focuses on how to encode values or principles in artificial agents, so that they reliably do what they ought to do. The second part is normative, and focuses on what values or principles it would be right to encode in AI.

This paper focuses on the second question, paying particular attention to the fact that we live in a pluralistic world where people have a variety of different beliefs about value. Ultimately, I suggest that we need to devise principles for alignment that treat people fairly and command widespread support despite this difference of opinion.

Moral considerations

Any new technology generates moral considerations. Yet the task of imbuing artificial agents with moral values becomes particularly important as computer systems operate with greater autonomy and at a speed that ‘increasingly prohibits humans from evaluating whether each action is performed in a responsible or ethical manner’.

The first part of the paper notes that while technologists have an important role to play in building systems that respect and embody human values, the task of selecting appropriate values is not one that can be settled by technical work alone. This becomes clear when we look at the different ways in which value alignment could be achieved, at least within the reinforcement learning paradigm.

One set of approaches tries to specify a reward function for an agent that would lead it to promote the right kind of outcome and act in ways that are broadly thought to be ethical. For this approach to succeed, we need to specify appropriate goals for artificial agents and encode them in AI systems – which is far from straightforward. A second family of approaches proceeds differently. Instead of trying to specify the correct reward function for the agent upfront, it looks at ways in which an agent could learn the correct reward from examples of human behaviour or human feedback. However, the question then becomes what data or feedback to train the agent on – and how this decision can be justified.

Either way, important normative questions remain.
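To make the second family of approaches concrete, here is a minimal sketch of learning a reward from human feedback. It assumes that feedback arrives as pairwise comparisons and that a Bradley-Terry model is a reasonable way to fit them; the paper does not prescribe this method. The outcome names, the comparison data and the function learn_rewards are all hypothetical and chosen only for illustration.

```python
import math


def learn_rewards(outcomes, comparisons, steps=2000, lr=0.1):
    """Fit one scalar reward per outcome from pairwise human judgements.

    comparisons is a list of (preferred, rejected) pairs. Under the
    Bradley-Terry model, P(a preferred to b) = sigmoid(reward[a] - reward[b]).
    The rewards are fitted by gradient ascent on the log-likelihood.
    """
    reward = {o: 0.0 for o in outcomes}
    for _ in range(steps):
        grad = {o: 0.0 for o in outcomes}
        for winner, loser in comparisons:
            # Probability the current rewards assign to the observed choice.
            p_win = 1.0 / (1.0 + math.exp(reward[loser] - reward[winner]))
            grad[winner] += 1.0 - p_win
            grad[loser] -= 1.0 - p_win
        for o in outcomes:
            reward[o] += lr * grad[o] / len(comparisons)
    return reward


# Hypothetical feedback: each pair records which outcome a labeller preferred.
outcomes = ["help", "ignore", "harm"]
comparisons = [
    ("help", "ignore"),
    ("help", "harm"),
    ("ignore", "harm"),
    ("help", "harm"),
]

rewards = learn_rewards(outcomes, comparisons)
ranking = sorted(outcomes, key=rewards.get, reverse=True)
print(ranking)
```

The learned reward depends entirely on the comparisons list. Changing who supplies the feedback, or which pairs they are shown, changes what the agent is trained to value, and that is where the normative question enters.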

Alignment with what?

A key concern among AI researchers is that the systems they build are properly responsive to human direction and control. Indeed, as Stuart Russell notes, it is important that artificial agents understand the real meaning of the instructions they are given, and that they do not interpret them in an excessively literal way – with the story of King Midas serving as a cautionary tale.

At the same time, there is growing recognition that AI systems may need to go beyond this – and be designed in a way that leads them to do the right thing by default, even in the absence of direct instructions from a human operator.

One promising approach holds that AI should be designed to align with human preferences. In this way, AI systems would learn to avoid outcomes that very few people wanted or desired. However, this approach also has certain weaknesses. Revealed preferences can be irrational or based on false information. They may also be malicious. Furthermore, preferences are sometimes ‘adaptive’: people who lead lives affected by poverty or discrimination may revise their hopes and expectations downwards in order to avoid disappointment. By aligning itself with existing human preferences, AI could therefore come to act on data that is heavily compromised.

To address this weakness, I suggest that AI systems need to be properly responsive to underlying human interests and values. A principle-based approach to AI alignment, which takes into account both of these factors, would yield agents that are less likely to do harm and more likely to promote human well-being. A principle-based approach to alignment could also be sensitive to other considerations, such as the welfare of future generations, non-human animals and the environment.

Three approaches

The final part of the paper looks at the ways in which principles for AI alignment might be identified. In this context, I suggest that the main challenge is not to identify ‘true’ moral principles and encode them in AI – for even if we came to have great confidence in the truth of a single moral theory there would still be people with different beliefs and opinions who disagreed with us. Instead, we should try to identify principles for alignment that are acceptable to people who subscribe to a wide range of reasonable points of view. Principles of this kind could be arrived at in at least three different ways.

One approach looks at the possibility that there is an overlapping consensus between the moral beliefs held by people around the world. If such a consensus exists, then AI could be aligned with it – and potentially command widespread support – without encountering the problem of value imposition. In this regard, human rights are particularly promising. For while the idea of universal human rights is not wholly uncontested, the principles they embody command significant international support in practice. They also find justification in African, Islamic, Western, and Confucian philosophical traditions.

A second approach to pluralistic value alignment seeks to model fair principles for AI using the idea of a ‘veil of ignorance’. The veil of ignorance is a device proposed by the philosopher John Rawls, to help people with different values and perspectives agree upon principles of justice for a society. The central claim is that when choosing principles of this kind, people should do so from an imaginary position where they do not know who they will be in that society, or what specific moral view they will hold. As a result, they will deliberate impartially and choose principles that do not unduly favour themselves. A similar approach could be used to model principles for AI.

Although it is difficult to say what people would choose in this situation without knowing more about the specific form of AI in question, it seems plausible that they would want to ensure that this technology is safe, amenable to human control, and that its benefits are distributed widely.

The final approach looks at ways in which social choice theory can be used to combine different viewpoints and inform the direction AI should take. One school of thought focuses on mathematical integration of individual preferences into a single ranking – which could be used to guide AI. More promising still are democratic methods such as voting and broad-based deliberation. When used successfully, these approaches reflect the value of equality and have the potential to ensure that principles for AI alignment enjoy widespread legitimacy.
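As an illustration of integrating individual preferences into a single ranking, the sketch below uses the Borda count, one well-known rule from social choice theory. The paper does not single out this rule. The ballots, the option labels and the function borda_ranking are hypothetical.

```python
from collections import defaultdict


def borda_ranking(ballots):
    """Combine ranked ballots into one ranking using the Borda count.

    Each ballot lists the options from most to least preferred. An option
    in position i of a ballot with n options earns n - 1 - i points.
    Ties in total score are broken alphabetically.
    """
    scores = defaultdict(int)
    for ballot in ballots:
        n = len(ballot)
        for position, option in enumerate(ballot):
            scores[option] += n - 1 - position
    ordered = sorted(scores, key=lambda option: (-scores[option], option))
    return ordered, dict(scores)


# Hypothetical ballots from three people ranking candidate priorities for AI.
ballots = [
    ["safety", "human control", "wide benefit"],
    ["wide benefit", "safety", "human control"],
    ["safety", "wide benefit", "human control"],
]

ordered, scores = borda_ranking(ballots)
print(ordered)  # ['safety', 'wide benefit', 'human control']
print(scores)   # {'safety': 5, 'human control': 1, 'wide benefit': 3}
```

Other aggregation rules can return a different ranking from the same ballots. The choice of rule is therefore itself a normative decision, which is one reason the text treats voting and deliberation as more promising.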

Further research

The proposals discussed here are tentative, and they can be developed and combined in many different ways. This paper has benefited from feedback provided by over fifty people, including audiences at workshops convened at Stanford University, Princeton University, PAI, the University of Warwick, and the University of California, Berkeley. Moving forward, my hope is that this paper can contribute to the growing conversation about AI systems and their alignment with human values.

References

1. Colin Allen, Iva Smit and Wendell Wallach, ‘Artificial Morality: Top-Down, Bottom-Up and Hybrid Approaches’, Ethics and Information Technology 7 (2005), p. 149.

2. Joshua Cohen, The Arc of the Moral Universe and Other Essays (Harvard, 2010); Jack Donnelly, ‘The Relative Universality of Human Rights’, Human Rights Quarterly 29 (2007).

Original post: https://deepmind.com/research/publications/Artificial-Intelligence-Values-and-Alignment
