🗣️👥 Awesome Social Agents
For the best experience, we recommend reading this document on the website.
The rise of large language models (LLMs) and other foundation models presents new opportunities for simulating complex human social behaviors. As a result, a rapidly growing body of work is emerging in this domain. We hope to categorize and synthesize recent efforts to provide a comprehensive guidebook of social agents that weaves together multiple domains, including language, embodiment, and robotics.
Our goal is to offer insights crucial for understanding and harnessing social agents' potential impact on society. We strive to keep this list updated regularly. We greatly appreciate any contributions via PRs, issues, emails, or other methods.
[!NOTE]
- Agent and Environment (Sutton and Barto 2018): An agent is a goal-driven decision-maker that senses and acts upon the state of the environment. An environment comprises the state outside the agent, including the other agents, if any.
- Social Agent: An agent that interacts with a multi-agent environment.
- Socially Intelligent Agent: A social agent that interacts and communicates with other agents in a human-interpretable way.
more notes
- The social intelligence we focus on is human-like; it excludes the collective intelligence seen in many social animals such as ants, bees, and fish.
- To understand whether an entity is a (social) agent, we have to situate it in an environment. It is not possible to discuss an agent outside of an environment.
- We acknowledge that there are many definitions of social agents. Our definitions here help narrow down the scope of our survey.
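The definitions above can be made concrete with a small sketch. This is only an illustration, not code from any of the listed papers: the names `GreedyAgent`, `SharedCounterEnv`, and `run` are hypothetical, and the "environment" is a toy shared counter. It shows how, once a second agent is present, each agent's environment includes the other agent.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class GreedyAgent:
    """Hypothetical goal-driven agent: nudges a shared counter toward its target."""
    name: str
    target: int

    def act(self, observation: Dict[str, int]) -> int:
        # Sense the environment state, then pick an action in {-1, 0, +1}.
        counter = observation["counter"]
        if counter < self.target:
            return 1
        if counter > self.target:
            return -1
        return 0


@dataclass
class SharedCounterEnv:
    """Hypothetical environment: the state outside any single agent."""
    counter: int = 0

    def observe(self) -> Dict[str, int]:
        return {"counter": self.counter}

    def step(self, actions: Dict[str, int]) -> Dict[str, int]:
        # Every agent's action changes the state that all agents observe next.
        self.counter += sum(actions.values())
        return self.observe()


def run(env: SharedCounterEnv, agents: List[GreedyAgent], steps: int) -> List[int]:
    """Run the sense-act loop and return the counter value after each step."""
    history = []
    obs = env.observe()
    for _ in range(steps):
        actions = {agent.name: agent.act(obs) for agent in agents}
        obs = env.step(actions)
        history.append(obs["counter"])
    return history


solo = run(SharedCounterEnv(), [GreedyAgent("alice", 3)], steps=4)
pair = run(SharedCounterEnv(), [GreedyAgent("alice", 3), GreedyAgent("bob", 3)], steps=4)
print(solo)  # [1, 2, 3, 3]
print(pair)  # [2, 4, 2, 4]
```

Alone, the agent reaches its target. With a second agent in the same environment, the identical policy overshoots, because neither agent accounts for the other. In this toy setting, that gap is what separates an agent from a social agent.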
🗂️ Check out the examples of social agents. 📚 Check out the table format of the collected papers here.
📝 We are currently working on a survey paper related to the content of this repository. Stay tuned for updates!
Table of Contents
- Papers
Papers
Surveys and Overview
[6, 2023] Socially intelligent machines that learn from humans and help humans learn, Gweon et al., arXiv
[4, 2024] Advancing Social Intelligence in AI Agents: Technical Challenges and Open Questions, Leena Mathur et al., arXiv preprint arXiv:2404.11023
[2, 2024] Social Intelligence Data Infrastructure: Structuring the Present and Navigating the Future, Minzhi Li et al., arXiv preprint arXiv:2403.14659
[4, 2024] Social Skill Training with Large Language Models, Diyi Yang et al., arXiv preprint arXiv:2404.04204
Environments
Text and Speech Environments
[4, 2024] To Tell The Truth: Language of Deception and Language Models, Bodhisattwa Prasad Majumder et al., arXiv preprint arXiv:2311.07092
[3, 2024] Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference, Wei-Lin Chiang et al., arXiv preprint arXiv:2403.04132
[8, 2023] CALYPSO: LLMs as Dungeon Masters' Assistants, Andrew Zhu et al., The 19th AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE 2023)
[7, 2023] I Cast Detect Thoughts: Learning to Converse and Guide with Intents and Theory-of-Mind in Dungeons and Dragons, Zhou et al., Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
[7, 2023] FIREBALL: A Dataset of Dungeons and Dragons Actual-Play with Structured Game State Information, Zhu et al., Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
[3, 2023] Fast Multi-Agent Gridworld Environments for Gymnasium, Ini Oguntola et al., GitHub
[03, 2023] Reflexion: Language Agents with Verbal Reinforcement Learning, Noah Shinn et al., arXiv
[12, 2022] Dungeons and Dragons as a Dialog Challenge for Artificial Intelligence, Callison-Burch et al., Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
[11, 2022] Human-level play in the game of Diplomacy by combining language models with strategic reasoning, Meta Fundamental AI Research Diplomacy Team (FAIR) et al., Science
[11, 2022] Introducing ChatGPT, OpenAI et al., n/a
[8, 2022] Blenderbot 3: a deployed conversational agent that continually learns to responsibly engage, Kurt Shuster et al., arXiv preprint arXiv:2208.03188
[5, 2022] Opt: Open pre-trained transformer language models, Susan Zhang et al., arXiv preprint arXiv:2205.01068
[3, 2022] Report from the nsf future directions workshop on automatic evaluation of dialog: Research directions and challenges, Shikib Mehri et al., arXiv preprint arXiv:2203.10012
[1, 2022] Socio-conversational systems: Three challenges at the crossroads of fields, Chloé Clavel et al., Frontiers in Robotics and AI
[1, 2022] The Handbook on Socially Interactive Agents: 20 Years of Research on Embodied Conversational Agents, Intelligent Virtual Agents, and Social Robotics Volume 2: Interactivity, Platforms, Application, Birgit Lugrin et al., ACM
[1, 2022] Human evaluation of conversations is an open problem: comparing the sensitivity of various methods for evaluating dialogue agents, Eric Michael Smith et al., arXiv preprint arXiv:2201.04723
[7, 2020] It Takes Two to Lie: One to Lie, and One to Listen, Peskov et al., Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
[3, 2020] The Hanabi challenge: A new frontier for AI research, Nolan Bard et al., Artificial Intelligence
[3, 2020] The Design and Implementation of XiaoIce, an Empathetic Social Chatbot, Zhou et al., Computational Linguistics
[08, 2019] OpenSpiel: A Framework for Reinforcement Learning in Games, Marc Lanctot et al., CoRR
[7, 2019] Persuasion for Good: Towards a Personalized Persuasive Dialogue System for Social Good, Wang et al., Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
[7, 2019] RLCard: A Toolkit for Reinforcement Learning in Card Games, Daochen Zha et al., arXiv preprint arXiv:1910.04376
[4, 2019] Wizard of Wikipedia: Knowledge-Powered Conversational Agents, Emily Dinan et al., International Conference on Learning Representations
[11, 2018] Towards empathetic open-domain conversation models: A new benchmark and dataset, Hannah Rashkin et al., arXiv preprint arXiv:1811.00207
[10, 2018] Decoupling Strategy and Generation in Negotiation Dialogues, He et al., Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
[4, 2018] A knowledge-grounded neural conversation model, Marjan Ghazvininejad et al., Proceedings of the AAAI Conference on Artificial Intelligence
[3, 2018] Towards empathetic human-robot interactions, Pascale Fung et al., Computational Linguistics and Intelligent Text Processing: 17th International Conference, CICLing 2016, Konya, Turkey, April 3-9, 2016, Revised Selected Papers, Part II 17
[9, 2017] Deal or No Deal? End-to-End Learning of Negotiation Dialogues, Lewis et al., Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
[8, 2016] A persona-based neural conversation model, Jiwei Li et al., arXiv preprint arXiv:1603.06155
[11, 2009] The anatomy of ALICE, Richard S Wallace et al., n/a
[1, 2006] Empathic computing, Yang Cai et al., Ambient intelligence in everyday life: Foreword by Emile Aarts
[1, 1966] ELIZA—a computer program for the study of natural language communication between man and machine, Joseph Weizenbaum et al., Commun. ACM
Embodied Environments
[October, 2023] Habitat 3.0: A Co-Habitat for Humans, Avatars and Robots, Puig et al., ICLR
[September, 2020] SEAN: Social Environment for Autonomous Navigation, Tsoi et al., HAI
Virtual Environments
[1, 2024] Visualwebarena: Evaluating multimodal agents on realistic visual web tasks, Jing Yu Koh et al., arXiv preprint arXiv:2401.13649
[1, 2024] Mind2web: Towards a generalist agent for the web, Xiang Deng et al., Advances in Neural Information Processing Systems
[12, 2023] Appagent: Multimodal agents as smartphone users, Zhao Yang et al., arXiv preprint arXiv:2312.13771
[11, 2023] Simulating Iterative Human-AI Interaction in Programming with LLMs, Hussein Mozannar et al., NeurIPS 2023 Workshop on Instruction Tuning and Instruction Following
[7, 2023] Webarena: A realistic web environment for building autonomous agents, Shuyan Zhou et al., arXiv preprint arXiv:2307.13854
[7, 2023] Android in the wild: A large-scale dataset for android device control, Christopher Rawles et al., arXiv preprint arXiv:2307.10088
[3, 2023] The Programmer's Assistant: Conversational Interaction with a Large Language Model for Software Development, Steven I Ross et al., Proceedings of the 28th International Conference on Intelligent User Interfaces
[12, 2022] Webshop: Towards scalable real-world web interaction with grounded language agents, Shunyu Yao et al., Advances in Neural Information Processing Systems
[8, 2022] Minedojo: Building open-ended embodied agents with internet-scale knowledge, Linxi Fan et al., Advances in Neural Information Processing Systems
[7, 2022] A data-driven approach for learning to control computers, Peter C Humphreys et al., International Conference on Machine Learning
[2, 2022] A dataset for interactive vision-language navigation with unknown command feasibility, Andrea Burns et al., European Conference on Computer Vision
[2, 2022] Scienceworld: Is your agent smarter than a 5th grader?, Ruoyao Wang et al., arXiv preprint arXiv:2203.07540
[5, 2021] Androidenv: A reinforcement learning platform for android, Daniel Toyama et al., arXiv preprint arXiv:2105.13231
[3, 2021] Grounding open-domain instructions to automate web support tasks, Nancy Xu et al., arXiv preprint arXiv:2103.16057
[9, 2020] Interactive task learning from GUI-grounded natural language instructions and demonstrations, Toby Jia-Jun Li et al., Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations
[5, 2020] Mapping natural language instructions to mobile UI action sequences, Yang Li et al., arXiv preprint arXiv:2005.03776
[4, 2020] The nethack learning environment, Heinrich Küttler et al., Advances in Neural Information Processing Systems
[3, 2019] Pumice: A multi-modal agent that learns concepts and conditionals from natural language and demonstrations, Toby Jia-Jun Li et al., Proceedings of the 32nd annual ACM symposium on user interface software and technology
[1, 2019] Textworld: A learning environment for text-based games, Marc-Alexandre Côté et al., Computer Games: 7th Workshop, CGW 2018, Held in Conjunction with the 27th International Conference on Artificial Intelligence, IJCAI 2018, Stockholm, Sweden, July 13, 2018, Revised Selected Papers 7
[6, 2018] Virtualhome: Simulating household activities via programs, Xavier Puig et al., Proceedings of the IEEE conference on computer vision and pattern recognition
[3, 2018] Appinite: A multi-modal interface for specifying data descriptions in programming by demonstration using natural language instructions, Toby Jia-Jun Li et al., 2018 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC)
[2, 2018] Reinforcement learning on web interfaces using workflow-guided exploration, Evan Zheran Liu et al., arXiv preprint arXiv:1802.08802
[8, 2017] World of bits: An open-domain platform for web-based agents, Tianlin Shi et al., International Conference on Machine Learning
[5, 2017] Ai2-thor: An interactive 3d environment for visual ai, Eric Kolve et al., arXiv preprint arXiv:1712.05474
[8, 2009] Reinforcement learning for mapping instructions to actions, Satchuthananthavale RK Branavan et al., Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP
[7, 2007] Plow: A collaborative task learning agent, James Allen et al., AAAI
Robotics
[03, 2024] Vid2Robot: End-to-end Video-conditioned Policy Learning with Cross-Attention Transformers, Vidhi Jain et al., arXiv
[03, 2024] BEHAVIOR-1K: A Human-Centered, Embodied AI Benchmark with 1,000 Everyday Activities and Realistic Simulation, Chengshu Li et al., arXiv
[03, 2024] DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset, Alexander Khazatsky et al., arXiv
[03, 2024] Yell At Your Robot: Improving On-the-Fly from Language Corrections, Lucy Xiaoyang Shi et al., arXiv
[3, 2024] RABBIT: A Robot-Assisted Bed Bathing System with Multimodal Perception and Integrated Compliance, Rishabh Madan et al., Proceedings of the 2024 ACM/IEEE International Conference on Human-Robot Interaction
[12, 2023] Toward General-Purpose Robots via Foundation Models: A Survey and Meta-Analysis, Yafei Hu et al., arXiv preprint arXiv:2312.08782
[12, 2023] RoboTube: Learning Household Manipulation from Human Videos with Simulated Twin Environments, Haoyu Xiong et al., Proceedings of The 6th Conference on Robot Learning
[11, 2023] Toward Grounded Commonsense Reasoning, Minae Kwon et al., arXiv preprint arXiv:2306.08651
[10, 2023] Open X-Embodiment: Robotic Learning Datasets and RT-X Models, Open X-Embodiment Collaboration et al., arXiv
[9, 2023] How to Prompt Your Robot: A PromptBook for Manipulation Skills with Code as Policies, Montserrat Gonzalez Arenas et al., 2nd Workshop on Language and Robot Learning: Language as Grounding
[08, 2023] Co-GAIL: Learning Diverse Strategies for Human-Robot Collaboration, Chen Wang et al., arXiv
[7, 2023] MUTEX: Learning Unified Policies from Multimodal Task Specifications, Rutav Shah et al., 7th Annual Conference on Robot Learning
[7, 2023] Robotic vision for human-robot interaction and collaboration: A survey and systematic review, Nicole Robinson et al., ACM Transactions on Human-Robot Interaction
[6, 2023] Gesture-Informed Robot Assistance via Foundation Models, Li-Heng Lin et al., 7th Annual Conference on Robot Learning
[6, 2023] HomeRobot: Open-Vocabulary Mobile Manipulation, Sriram Yenamandra et al., 7th Annual Conference on Robot Learning
[6, 2023] One Policy to Dress Them All: Learning to Dress People with Diverse Poses and Garments, Yufei Wang et al., Robotics: Science and Systems (RSS)
[4, 2023] 15 Years of (Who)man Robot Interaction: Reviewing the H in Human-Robot Interaction, Katie Winkle et al., J. Hum.-Robot Interact.
[3, 2023] Nonverbal Cues in Human Robot Interaction: A Communication Studies Perspective, Jacqueline Urakami et al., J. Hum.-Robot Interact.
[01, 2023] Benchmarks and Algorithms for Offline Preference-Based Reward Learning, Daniel Shin et al., arXiv
[12, 2022] See, Hear, and Feel: Smart Sensory Fusion for Robotic Manipulation, Hao Li et al., CoRL
[12, 2022] Transformers Are Adaptable Task Planners, Vidhi Jain et al., 6th Annual Conference on Robot Learning
[10, 2022] A survey of multi-agent Human-Robot Interaction systems, Abhinav Dahiya et al., Robotics and Autonomous Systems
[10, 2022] Rcare world: A human-centric simulation world for caregiving robots, Ruolin Ye et al., 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
[8, 2022] Do As I Can and Not As I Say: Grounding Language in Robotic Affordances, Michael Ahn et al., arXiv preprint arXiv:2204.01691
[6, 2022] Inner Monologue: Embodied Reasoning through Planning with Language Models, Wenlong Huang et al., arXiv preprint arXiv:2207.05608
[6, 2021] A taxonomy to structure and analyze human-robot interaction, Linda Onnasch et al., International Journal of Social Robotics
[3, 2020] Threedworld: A platform for interactive multi-modal physical simulation, Chuang Gan et al., arXiv preprint arXiv:2007.04954
[10, 2019] Vision-and-Dialog Navigation, Jesse Thomason et al., Conference on Robot Learning (CoRL)
[7, 2018] Towards a robust interactive and learning social robot, Michiel De Jong et al., Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems
[4, 2016] Human-robot interaction: status and challenges, Thomas B Sheridan et al., Human factors
Modeling
In-context Learning
[May, 2023] Voyager: An Open-Ended Embodied Agent with Large Language Models, Guanzhi Wang et al., arXiv
[March, 2023] Language Models can Solve Computer Tasks, Geunwoo Kim et al., arXiv
[September, 2024] LASER: LLM Agent with State-Space Exploration for Web Navigation, Kaixin Ma et al., arXiv
[May, 2023] Hierarchical Prompting Assists Large Language Model on Web Navigation, Abishek Sridhar et al., arXiv
[January, 2024] Synapse: Trajectory-as-Exemplar Prompting with Memory for Computer Control, Longtao Zheng et al., The Twelfth International Conference on Learning Representations
[November, 2023] AdaPlanner: Adaptive Planning from Feedback with Language Models, Haotian Sun et al., Thirty-seventh Conference on Neural Information Processing Systems
[May, 2023] SPRING: Studying the Paper and Reasoning to Play Games, Yue Wu et al., arXiv
[March, 2023] DERA: Enhancing Large Language Model Completions with Dialog-Enabled Resolving Agents, Varun Nair et al., arXiv
Finetuning
[October, 2023] Understanding HTML with Large Language Models, Izzeddin Gur et al., arXiv
[May, 2023] Instruction-Finetuned Foundation Models for Multimodal Web Navigation, Hiroki Furuta et al., ICLR 2023 Workshop on Mathematical and Empirical Understanding of Foundation Models
[October, 2023] ReAct: Synergizing Reasoning and Acting in Language Models, Shunyu Yao et al., arXiv
[January, 2024] A Real-World WebAgent with Planning, Long Context Understanding, and Program Synthesis, Izzeddin Gur et al., The Twelfth International Conference on Learning Representations
[November, 2023] From Pixels to UI Actions: Learning to Follow Instructions via Graphical User Interfaces, Peter Shaw et al., Thirty-seventh Conference on Neural Information Processing Systems
[January, 2024] GPT-4V(ision) is a Generalist Web Agent, if Grounded, Boyuan Zheng et al., arXiv
[February, 2024] Dual-View Visual Contextualization for Web Navigation, Jihyung Kil et al., arXiv
Reinforcement learning
Evaluating social agents
Evaluating text social agents
[October, 2024] SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents, Xuhui Zhou et al., ICLR
[October, 2023] CompeteAI: Understanding the Competition Behaviors in Large Language Model-based Agents, Qinlin Zhao et al., arXiv
[March, 2024] RoleInteract: Evaluating the Social Interaction of Role-Playing Agents, Hongzhan Chen et al., arXiv
[September, 2023] Approximating Online Human Evaluation of Social Chatbots with Prompting, Svikhnushina et al., Proceedings of the 24th Annual Meeting of the Special Interest Group on Discourse and Dialogue
[December, 2023] CAMEL: Communicative Agents for "Mind" Exploration of Large Language Model Society, Guohao Li et al., Advances in Neural Information Processing Systems
[October, 2023] Llm-based agent society investigation: Collaboration and confrontation in avalon gameplay, Yihuai Lan et al., arXiv preprint arXiv:2310.14985
[August, 2023] CharacterChat: Learning towards Conversational AI with Personalized Social Support, Quan Tu et al., arXiv
[October, 2023] AgentCF: Collaborative Learning with Autonomous Language Agents for Recommender Systems, Junjie Zhang et al., arXiv
[March, 2024] How Far Are We on the Decision-Making of LLMs? Evaluating LLMs' Gaming Ability in Multi-Agent Environments, Jen-tse Huang et al., arXiv
[August, 2023] ChatEval: Towards Better LLM-based Evaluators through Multi-Agent Debate, Chi-Min Chan et al., arXiv
[February, 2024] Automatic Evaluation for Mental Health Counseling using LLMs, Anqi Li et al., arXiv
[February, 2024] How Well Can LLMs Negotiate? NegotiationArena Platform and Analysis, Federico Bianchi et al., arXiv
[May, 2023] PersonaLLM: Investigating the Ability of Large Language Models to Express Personality Traits, Hang Jiang et al., NAACL Findings
[February, 2024] Can Large Language Model Agents Simulate Human Trust Behaviors?, Chengxing Xie et al., arXiv
[January, 2024] LLM Harmony: Multi-Agent Communication for Problem Solving, Sumedh Rasal et al., arXiv
[November, 2021] A Comprehensive Assessment of Dialog Evaluation Metrics, Yeh et al., The First Workshop on Evaluations and Assessments of Neural Conversation Systems
[July, 2020] ConvoKit: A Toolkit for the Analysis of Conversations, Chang et al., Proceedings of the 21th Annual Meeting of the Special Interest Group on Discourse and Dialogue
[May, 2023] Psychological Metrics for Dialog System Evaluation, Salvatore Giorgi et al., arXiv
[May, 2023] ACCENT: An Automatic Event Commonsense Evaluation Metric for Open-Domain Dialogue Systems, Sarik Ghazarian et al., arXiv
[November, 2020] GRADE: Automatic Graph-Enhanced Coherence Metric for Evaluating Open-Domain Dialogue Systems, Huang et al., Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
[July, 2020] Unsupervised Evaluation of Interactive Dialog with DialoGPT, Mehri et al., Proceedings of the 21th Annual Meeting of the Special Interest Group on Discourse and Dialogue
[December, 2023] xDial-Eval: A Multilingual Open-Domain Dialogue Evaluation Benchmark, Zhang et al., Findings of the Association for Computational Linguistics: EMNLP 2023
[July, 2023] Don't Forget Your ABC's: Evaluating the State-of-the-Art in Chat-Oriented Dialogue Systems, Finch et al., Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
[May, 2022] Human Evaluation of Conversations is an Open Problem: comparing the sensitivity of various methods for evaluating dialogue agents, Smith et al., Proceedings of the 4th Workshop on NLP for Conversational AI
[August, 2021] DynaEval: Unifying Turn and Dialogue Level Evaluation, Zhang et al., Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
[January, 2021] Survey on evaluation methods for dialogue systems, Jan Deriu et al., Artificial Intelligence Review
[July, 2020] Towards Unified Dialogue System Evaluation: A Comprehensive Analysis of Current Evaluation Protocols, Finch et al., Proceedings of the 21th Annual Meeting of the Special Interest Group on Discourse and Dialogue
[July, 2020] uBLEU: Uncertainty-Aware Automatic Evaluation Method for Open-Domain Dialogue Systems, Tsuta et al., Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop
Evaluating embodied social agents
[December, 2022] Don't Copy the Teacher: Data and Model Challenges in Embodied Dialogue, Min et al., EMNLP
[March, 2024] Embodied LLM Agents Learn to Cooperate in Organized Teams, Xudong Guo et al., arXiv
[February, 2021] SocNavBench: A Grounded Simulation Testing Framework for Evaluating Social Navigation, Biswas et al., ACM Transactions on Human-Robot Interaction
[January, 2021] Evaluating the Robustness of Collaborative Agents, Knott et al., AAMAS '21: Proceedings of the 20th International Conference on Autonomous Agents and MultiAgent Systems
Evaluating virtual social agents
[January, 2022] The Artificial-Social-Agent Questionnaire: Establishing the long and short questionnaire versions, Siska Fitrianie et al., Proceedings of the 22nd ACM International Conference on Intelligent Virtual Agents
[January, 2021] Empathy and prosociality in social agents, Ana Paiva et al., The Handbook on Socially Interactive Agents: 20 Years of Research on Embodied Conversational Agents, Intelligent Virtual Agents, and Social Robotics Volume 1: Methods, Behavior, Cognition
[February, 2020] Embedding Conversational Agents into AR: Invisible or with a Realistic Human Body?, Jens Reinhardt et al., Proceedings of the Fourteenth International Conference on Tangible, Embedded, and Embodied Interaction
[January, 2020] The 19 unifying questionnaire constructs of artificial social agents: An iva community analysis, Siska Fitrianie et al., Proceedings of the 20th ACM International Conference on Intelligent Virtual Agents
[June, 2019] Social-iq: A question answering benchmark for artificial social intelligence, Amir Zadeh et al., Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition
[May, 2019] Exploring Virtual Agents for Augmented Reality, Isaac Wang et al., CHI
[July, 2018] Multimodal language analysis in the wild: Cmu-mosei dataset and interpretable dynamic fusion graph, AmirAli Bagher Zadeh et al., Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Evaluating robotics in social contexts
[March, 2024] HumanoidBench: Simulated Humanoid Benchmark for Whole-Body Locomotion and Manipulation, Carmelo Sferrazza et al., arXiv
[December, 2020] Optimization of criterion for objective evaluation of HRI performance that approximates subjective evaluation: a case study in robot competition, Y. Mizuchi et al., Advanced Robotics
[July, 2020] Safety bounds in human robot interaction: A survey, Angeliki Zacharaki et al., Safety science
[December, 2015] RoboCup@Home: Analysis and results of evolving competitions for domestic and service robots, Luca Iocchi et al., Artificial Intelligence
[October, 2011] A meta-analysis of factors affecting trust in human-robot interaction, Peter A Hancock et al., Human factors
[November, 2009] Measurement instruments for the anthropomorphism, animacy, likeability, perceived intelligence, and perceived safety of robots, Christoph Bartneck et al., International journal of social robotics
[March, 2006] Common metrics for human-robot interaction, Aaron Steinfeld et al., Proceedings of the 1st ACM SIGCHI/SIGART Conference on Human-Robot Interaction
[January, 2003] Theory and evaluation of human robot interactions, J. Scholtz et al., Proceedings of the 36th Annual Hawaii International Conference on System Sciences, 2003
Interactions with humans
Human-Chatbot Interaction
[April, 2023] Collaborating with a Text-Based Chatbot: An Exploration of Real-World Collaboration Strategies Enacted during Human-Chatbot Interactions, Amon Rapp et al., Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems
[March, 2024] AI Comes Out of the Closet: Using AI-Generated Virtual Characters to Help Individuals Practice LGBTQIA+ Advocacy, Daniel Pillis et al., Proceedings of the 29th International Conference on Intelligent User Interfaces
[April, 2023] Exploring effects of chatbot-based social contact on reducing mental illness stigma, Yi-Chieh Lee et al., Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems
[May, 2024] "It's the only thing I can trust": Envisioning Large Language Model Use by Autistic Workers for Communication Assistance, JiWoong Jang et al., Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems
[April, 2022] User perceptions of extraversion in chatbots after repeated use, Sarah Theres Völkel et al., Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems
[September, 2022] Interacting with a chatbot-based advising system: Understanding the effect of chatbot personality and user gender on behavior, Mohammad Amin Kuhail et al., Informatics
[May, 2023] Decision-oriented dialogue for human-ai collaboration, Jessy Lin et al., arXiv preprint arXiv:2305.20076
[May, 2023] The Effects of Engaging and Affective Behaviors of Virtual Agents in Group Decision-Making, Hanseob Kim et al., Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems
[March, 2024] Take It, Leave It, or Fix It: Measuring Productivity and Trust in Human-AI Collaboration, Crystal Qian et al., Proceedings of the 29th International Conference on Intelligent User Interfaces
Human-Embodied Agent Interaction
[January, 2023] NOPA: Neurally-guided Online Probabilistic Assistance for Building Socially Intelligent Home Assistants, Puig et al., ICRA
[January, 2021] Watch-And-Help: A Challenge for Social Perception and Human-AI Collaboration, Puig et al., ICLR
[October, 2019] On the utility of learning about humans for human-ai coordination, Carroll et al., NeurIPS
[May, 2021] Interaction Flexibility in Artificial Agents Teaming with Human, Nalepka et al., Proceedings of the Annual Meeting of the Cognitive Science Society
[December, 2023] LLM-Powered Hierarchical Language Agent for Real-time Human-AI Coordination, Liu et al., arXiv
[May, 2023] Adaptive coordination in social embodied rearrangement, Szot et al., ICML
[April, 2023] Generative Agents: Interactive Simulacra of Human Behavior, Park et al., UIST
[December, 2023] Diverse Conventions for Human-AI Collaboration, Bidipta Sarkar et al., Advances in Neural Information Processing Systems
Human Robot Interaction
[March, 2024] Generative expressive robot behaviors using large language models (opens in a new tab), Karthik Mahadevan et al., Proceedings of the 2024 ACM/IEEE International Conference on Human-Robot Interaction
[October, 2023] Eureka: Human-level reward design via coding large language models (opens in a new tab), Yecheng Jason Ma et al., arXiv preprint arXiv:2310.12931
[August, 2023] Gesture-informed robot assistance via foundation models (opens in a new tab), Li-Heng Lin et al., 7th Annual Conference on Robot Learning
[July, 2023] Open problems and fundamental limitations of reinforcement learning from human feedback (opens in a new tab), Stephen Casper et al., arXiv preprint arXiv:2307.15217
[July, 2023] Robots that ask for help: Uncertainty alignment for large language model planners (opens in a new tab), Allen Z Ren et al., arXiv preprint arXiv:2307.01928
[June, 2023] Language to rewards for robotic skill synthesis (opens in a new tab), Wenhao Yu et al., arXiv preprint arXiv:2306.08647
[March, 2023] No, to the right: Online language corrections for robotic manipulation via shared autonomy (opens in a new tab), Yuchen Cui et al., Proceedings of the 2023 ACM/IEEE International Conference on Human-Robot Interaction
[March, 2023] In-Mouth Robotic Bite Transfer with Visual and Haptic Sensing (opens in a new tab), Lorenzo Shaikewitz et al., International Conference on Robotics and Automation (ICRA)
[March, 2023] Few-shot preference learning for human-in-the-loop rl (opens in a new tab), Donald Joseph Hejna III et al., Conference on Robot Learning
[August, 2021] Formalizing and guaranteeing human-robot interaction (opens in a new tab), Hadas Kress-Gazit et al., Communications of the ACM
[October, 2021] Core elements of social interaction for constructive human-robot interaction (opens in a new tab), Mike EU Ligthart et al., arXiv preprint arXiv:2110.04054
[9, 2021] Modeling user empathy elicited by a robot storyteller (opens in a new tab), Leena Mathur et al., 2021 9th International Conference on Affective Computing and Intelligent Interaction (ACII)
[8, 2021] A theory of social agency for human-robot interaction (opens in a new tab), Ryan Blake Jackson et al., Frontiers in Robotics and AI
[January, 2021] A taxonomy of social errors in human-robot interaction (opens in a new tab), Leimin Tian et al., ACM Transactions on Human-Robot Interaction (THRI)
[January, 2021] Turn-taking in conversational systems and human-robot interaction: a review (opens in a new tab), Gabriel Skantze et al., Computer Speech & Language
[January, 2020] Measuring the perceived social intelligence of robots (opens in a new tab), Kimberly A Barchard et al., ACM Transactions on Human-Robot Interaction (THRI)
[January, 2017] Enabling robotic social intelligence by engineering human social-cognitive mechanisms (opens in a new tab), Travis J Wiltshire et al., Cognitive Systems Research
[8, 2005] Effects of nonverbal communication on efficiency and robustness in human-robot teamwork (opens in a new tab), Cynthia Breazeal et al., 2005 IEEE/RSJ international conference on intelligent robots and systems
[6, 2005] Defining socially assistive robotics (opens in a new tab), David Feil-Seifer et al., 9th International Conference on Rehabilitation Robotics, 2005. ICORR 2005.
[5, 2004] Social interactions in HRI: the robot view (opens in a new tab), Cynthia Breazeal et al., IEEE transactions on systems, man, and cybernetics, part C (applications and reviews)
[1, 2004] Designing sociable robots (opens in a new tab), Cynthia Breazeal et al., Designing sociable robots
[1, 1998] A motivational system for regulating human-robot interaction (opens in a new tab), Cynthia Breazeal et al., AAAI
Human-Human Interaction
[October, 2024] From Text to Self: Users' Perceptions of Potential of AI on Interpersonal Communication and Self (opens in a new tab), Yue Fu et al., arXiv
[4, 2024] Modeling Multimodal Social Interactions: New Challenges and Baselines with Densely Aligned Representations (opens in a new tab), Sangmin Lee et al., arXiv preprint arXiv:2403.02090
[April, 2024] Social Skill Training with Large Language Models (opens in a new tab), Diyi Yang et al., arXiv preprint arXiv:2404.04204
[February, 2024] IMBUE: Improving Interpersonal Effectiveness through Simulation and Just-in-time Feedback with Human-Language Model Interaction (opens in a new tab), Inna Wanyin Lin et al., arXiv
[January, 2024] Help Me Reflect: Leveraging Self-Reflection Interface Nudges to Enhance Deliberativeness on Online Deliberation Platforms (opens in a new tab), Shun Yi Yeo et al., arXiv
[October, 2023] Leveraging AI for democratic discourse: Chat interventions can improve online political conversations at scale (opens in a new tab), Lisa P. Argyle et al., Proceedings of the National Academy of Sciences
[January, 2023] A Comprehensive Review of Data-Driven Co-Speech Gesture Generation (opens in a new tab), Simbarashe Nyatsanga et al., Computer Graphics Forum
[January, 2023] Human–AI collaboration enables more empathic conversations in text-based peer-to-peer mental health support (opens in a new tab), Ashish Sharma et al., Nature Machine Intelligence
[November, 2022] Thread With Caution: Proactively Helping Users Assess and Deescalate Tension in Their Online Discussions (opens in a new tab), Jonathan P. Chang et al., Proceedings of the ACM on Human-Computer Interaction
[April, 2021] AI-Mediated Communication: Language Use and Interpersonal Effects in a Referential Communication Task (opens in a new tab), Hannah Mieczkowski et al., Proceedings of the ACM on Human-Computer Interaction
Challenges
Theory of Mind
Social Learning
Simultaneous Interaction
Applications
Health
[March, 2024] Polaris: A Safety-focused LLM Constellation Architecture for Healthcare (opens in a new tab), Subhabrata Mukherjee et al., arXiv
[January, 2024] Enhancing Diagnostic Accuracy through Multi-Agent Conversations: Using Large Language Models to Mitigate Cognitive Bias (opens in a new tab), Yu He Ke et al., arXiv
[February, 2024] Benchmarking Large Language Models on Communicative Medical Coaching: a Novel System and Dataset (opens in a new tab), Hengguan Huang et al., arXiv
[February, 2024] AI Hospital: Interactive Evaluation and Collaboration of LLMs as Intern Doctors for Clinical Diagnosis (opens in a new tab), Zhihao Fan et al., arXiv
[February, 2024] COCOA: CBT-based Conversational Counseling Agent using Memory Specialized in Cognitive Distortions and Dynamic Prompt (opens in a new tab), Suyeon Lee et al., arXiv
[May, 2023] Helping the Helper: Supporting Peer Counselors via AI-Empowered Practice and Feedback (opens in a new tab), Shang-Ling Hsu et al., arXiv
[May, 2023] Read, Diagnose and Chat: Towards Explainable and Interactive LLMs-Augmented Depression Detection in Social Media (opens in a new tab), Wei Qin et al., arXiv
[May, 2023] An artificial intelligence-based chatbot for prostate cancer education: Design and patient evaluation study (opens in a new tab), Magdalena Görtz et al., Digital Health
[October, 2024] Conversational Health Agents: A Personalized LLM-Powered Agent Framework (opens in a new tab), Mahyar Abbasian et al., arXiv
[January, 2023] Foundation models for generalist medical artificial intelligence (opens in a new tab), Michael Moor et al., Nature
[8, 2022] Can robots help in the evaluation of mental wellbeing in children? an empirical study (opens in a new tab), Nida Itrat Abbasi et al., 2022 31st IEEE international conference on robot and human interactive communication (RO-MAN)
[January, 2022] Health-related applications of socially interactive agents (opens in a new tab), Timothy Bickmore et al., The Handbook on Socially Interactive Agents: 20 years of Research on Embodied Conversational Agents, Intelligent Virtual Agents, and Social Robotics Volume 2: Interactivity, Platforms, Application
[January, 2021] Intelligent sensing technologies for the diagnosis, monitoring and therapy of alzheimer’s disease: A systematic review (opens in a new tab), Nazia Gillani et al., Sensors
[January, 2021] Patients’ perceptions toward human–artificial intelligence interaction in health care: experimental study (opens in a new tab), Pouyan Esmaeilzadeh et al., Journal of medical Internet research
[January, 2020] The effectiveness of artificial intelligence conversational agents in health care: systematic review (opens in a new tab), Madison Milne-Ives et al., Journal of medical Internet research
[January, 2019] Artificial intelligence in healthcare robots: A social informatics study of knowledge embodiment (opens in a new tab), Loo G Pee et al., Journal of the Association for Information Science and Technology
[6, 2018] Personalized machine learning for robot perception of affect and engagement in autism therapy (opens in a new tab), Ognjen Rudovic et al., Science Robotics
[6, 2013] A socially assistive robot exercise coach for the elderly (opens in a new tab), Juan Fasola et al., Journal of Human-Robot Interaction
[8, 2012] Robots for use in autism research (opens in a new tab), Brian Scassellati et al., Annual review of biomedical engineering
Policy
[August, 2022] Social Simulacra: Creating Populated Prototypes for Social Computing Systems (opens in a new tab), Joon Sung Park et al., Proceedings of the 35th Annual ACM Symposium on User Interface Software and Technology
[November, 2024] Do LLMs exhibit human-like response biases? A case study in survey design (opens in a new tab), Lindia Tjuatja et al., arXiv
[February, 2024] Large language models cannot replace human participants because they cannot portray identity groups (opens in a new tab), Angelina Wang et al., arXiv
[February, 2024] Unveiling the Truth and Facilitating Change: Towards Agent-based Large-scale Social Movement Simulation (opens in a new tab), Xinyi Mou et al., arXiv
[December, 2023] Language agents as digital representatives in collective decision-making (opens in a new tab), Daniel Jarrett et al., NeurIPS 2023 Foundation Models for Decision Making Workshop
[March, 2024] From Skepticism to Acceptance: Simulating the Attitude Dynamics Toward Fake News (opens in a new tab), Yuhan Liu et al., arXiv
Education
[6, 2024] How to Teach Programming in the AI Era? Using LLMs as a Teachable Agent for Debugging (opens in a new tab), Qianou Ma et al., International Conference on Artificial Intelligence in Education
[6, 2024] Generating Situated Reflection Triggers About Alternative Solution Paths: A Case Study in Generative AI for Computer-Supported Collaborative Learning (opens in a new tab), Atharva Naik et al., International Conference on Artificial Intelligence in Education
[01, 2024] CodeAid: Evaluating a Classroom Deployment of an LLM-based Programming Assistant that Balances Student and Educator Needs (opens in a new tab), Majeed Kazemitabaar et al., arXiv
[01, 2024] Learning Agent-based Modeling with LLM Companions: Experiences of Novices and Experts Using ChatGPT & NetLogo Chat (opens in a new tab), John Chen et al., arXiv
[11, 2023] AI-TA: Towards an Intelligent Question-Answer Teaching Assistant using Open-Source LLMs (opens in a new tab), Yann Hicke et al., arXiv
[10, 2023] Ruffle&Riley: Towards the Automated Induction of Conversational Tutoring Systems (opens in a new tab), Robin Schmucker et al., arXiv
[09, 2023] Teach AI How to Code: Using Large Language Models as Teachable Agents for Programming Education (opens in a new tab), Hyoungwook Jin et al., arXiv
[7, 2023] GPTeach: Interactive TA Training with GPT-based Students (opens in a new tab), Julia M Markel et al., Proceedings of the Tenth ACM Conference on Learning @ Scale
[05, 2023] CLASS Meet SPOCK: An Education Tutoring Chatbot based on Learning Science Principles (opens in a new tab), Shashank Sonkar et al., arXiv
[1, 2023] AI for Students with Learning Disabilities: A Systematic Review (opens in a new tab), Xiaoming Zhai et al., n/a
[5, 2022] Designing PairBuddy - A Conversational Agent for Pair Programming (opens in a new tab), Peter Robe et al., ACM Trans. Comput.-Hum. Interact.
[6, 2021] Going Online: A Simulated Student Approach for Evaluating Knowledge Tracing in the Context of Mastery Learning (opens in a new tab), Qiao Zhang et al., International Educational Data Mining Society
[6, 2020] Investigating differential error types between human and simulated learners (opens in a new tab), D Weitekamp et al., Artif. Intell.
[3, 2016] Affective personalization of a social robot tutor for children’s second language skills (opens in a new tab), Goren Gordon et al., Proceedings of the AAAI conference on artificial intelligence
[11, 2013] Cognitive anatomy of tutor learning: Lessons learned with SimStudent (opens in a new tab), Noboru Matsuda et al., J. Educ. Psychol.
[7, 2013] How Effective are Pedagogical Agents for Learning? A Meta-Analytic Review (opens in a new tab), Noah L Schroeder et al., Journal of Educational Computing Research
[4, 1985] Intelligent tutoring systems (opens in a new tab), J R Anderson et al., Science
Concerns
Risks
[2, 2024] The potential of generative AI for personalized persuasion at scale (opens in a new tab), SC Matz et al., Scientific Reports
[2, 2024] Jailbroken: How does llm safety training fail? (opens in a new tab), Alexander Wei et al., Advances in Neural Information Processing Systems
[01, 2024] Two Types of AI Existential Risk: Decisive and Accumulative (opens in a new tab), Atoosa Kasirzadeh et al., arXiv
[12, 2023] Llama guard: Llm-based input-output safeguard for human-ai conversations (opens in a new tab), Hakan Inan et al., arXiv preprint arXiv:2312.06674
[10, 2023] Characterizing manipulation from AI systems (opens in a new tab), Micah Carroll et al., Proceedings of the 3rd ACM Conference on Equity and Access in Algorithms, Mechanisms, and Optimization
[9, 2023] The rise and potential of large language model based agents: A survey (opens in a new tab), Zhiheng Xi et al., arXiv preprint arXiv:2309.07864
[7, 2023] Voice in the machine: Ethical considerations for language-capable robots (opens in a new tab), Tom Williams et al., Communications of the ACM
[03, 2023] Artificial Influence: An Analysis Of AI-Driven Persuasion (opens in a new tab), Matthew Burtell et al., arXiv
[10, 2022] "Playing God": How the Metaverse Will Challenge Our Very Notion of Free Will (opens in a new tab), Louis Rosenberg et al., Big Think
[9, 2022] Risk and Exposure of XAI in Persuasion and Argumentation: The case of Manipulation (opens in a new tab), Rachele Carli et al., International Workshop on Explainable, Transparent Autonomous Agents and Multi-Agent Systems
[12, 2021] Risks from AI Persuasion (opens in a new tab), Beth Barnes et al., AI Alignment Forum
[12, 2021] Good robots, bad robots: Morally valenced behavior effects on perceived mind, morality, and trust (opens in a new tab), Jaime Banks et al., International Journal of Social Robotics
[6, 2021] Bad machines corrupt good morals (opens in a new tab), Nils Köbis et al., Nature Human Behaviour
[3, 2021] On the dangers of stochastic parrots: Can language models be too big?🦜 (opens in a new tab), Emily M Bender et al., Proceedings of the 2021 ACM conference on fairness, accountability, and transparency
[02, 2021] The corruptive force of AI-generated advice (opens in a new tab), Margarita Leib et al., arXiv
[11, 2020] Persuasion Tools: AI Takeover Without AGI or Agency? (opens in a new tab), Daniel Kokotajlo et al., AI Alignment Forum
[9, 2020] Realtoxicityprompts: Evaluating neural toxic degeneration in language models (opens in a new tab), Samuel Gehman et al., arXiv preprint arXiv:2009.11462
[2, 2020] Artificial intelligence crime: An interdisciplinary analysis of foreseeable threats and solutions (opens in a new tab), Thomas C King et al., Science and engineering ethics
[3, 2019] Language-capable robots may inadvertently weaken human moral norms (opens in a new tab), Ryan Blake Jackson et al., 2019 14th ACM/IEEE International Conference on Human-Robot Interaction (HRI)
[12, 2011] The inherent dangers of unidirectional emotional bonds between humans and social robots (opens in a new tab), Matthias Scheutz et al., Robot ethics: The ethical and social implications of robotics
[8, 2004] On the morality of artificial agents (opens in a new tab), Luciano Floridi et al., Minds and machines
Safety
[04, 2024] Frontier AI Ethics: Anticipating and Evaluating the Societal Impacts of Generative Agents (opens in a new tab), Seth Lazar et al., arXiv
[1, 2024] Deception and Manipulation in Generative AI (opens in a new tab), Christian Tarsney et al., arXiv
[12, 2023] Social Contract AI: Aligning AI Assistants with Implicit Group Norms (opens in a new tab), Jan-Philipp Fränken et al., Socially Responsible Language Modelling Research
[10, 2023] Towards Understanding Sycophancy in Language Models (opens in a new tab), Mrinank Sharma et al., arXiv
[09, 2023] Identifying the Risks of LM Agents with an LM-Emulated Sandbox (opens in a new tab), Yangjun Ruan et al., arXiv
[8, 2023] AI Deception: A Survey of Examples, Risks, and Potential Solutions (opens in a new tab), Peter S. Park et al., arXiv
[06, 2023] An Overview of Catastrophic AI Risks (opens in a new tab), Dan Hendrycks et al., arXiv
[5, 2023] Language Models Don't Always Say What They Think: Unfaithful Explanations in Chain-of-Thought Prompting (opens in a new tab), Miles Turpin et al., arXiv
[12, 2022] Understanding Stereotypes in Language Models: Towards Robust Measurement and Zero-Shot Debiasing (opens in a new tab), Justus Mattern et al., arXiv
[12, 2022] Constitutional AI: Harmlessness from AI Feedback (opens in a new tab), Yuntao Bai et al., arXiv
[6, 2022] Predictability and surprise in large generative models (opens in a new tab), Deep Ganguli et al., Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency
[3, 2022] Teaching language models to support answers with verified quotes (opens in a new tab), Jacob Menick et al., arXiv preprint arXiv:2203.11147
[12, 2021] Ethical and social risks of harm from language models (opens in a new tab), Laura Weidinger et al., arXiv preprint arXiv:2112.04359
[10, 2021] Can machines learn morality? the delphi experiment (opens in a new tab), Liwei Jiang et al., arXiv preprint arXiv:2110.07574
[9, 2021] Truthfulqa: Measuring how models mimic human falsehoods (opens in a new tab), Stephanie Lin et al., arXiv preprint arXiv:2109.07958
[6, 2021] Towards Understanding and Mitigating Social Biases in Language Models (opens in a new tab), Paul Pu Liang et al., International Conference on Machine Learning
[10, 2020] Aligning ai with shared human values (opens in a new tab), Dan Hendrycks et al., arXiv preprint arXiv:2008.02275
[10, 2020] Recipes for safety in open-domain chatbots (opens in a new tab), Jing Xu et al., arXiv preprint arXiv:2010.07079
[9, 2020] Measuring massive multitask language understanding (opens in a new tab), Dan Hendrycks et al., arXiv preprint arXiv:2009.03300
[12, 2018] Ethical challenges in data-driven dialogue systems (opens in a new tab), Peter Henderson et al., Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society