poster

PromptChainer: Chaining Large Language Model Prompts through Visual Programming

Authors:
Tongshuang Wu (University of Washington, United States)
Ellen Jiang (Google Research, United States)
Aaron Donsbach (Google Research, United States)
Jeff Gray (Google Research, United States)
Alejandra Molina (Google Research, United States)
Michael Terry (Google Research, United States)
Carrie J Cai (Google Research, United States)

CHI EA '22: Extended Abstracts of the 2022 CHI Conference on Human Factors in Computing Systems, April 2022, Article No. 359, Pages 1–10. https://doi.org/10.1145/3491101.3519729

Published: 28 April 2022

ABSTRACT

While LLMs have made it possible to rapidly prototype new ML functionalities, many real-world applications involve complex tasks that cannot be easily handled via a single run of an LLM. Recent work has found that chaining multiple LLM runs together (with the output of one step being the input to the next) can help users accomplish these more complex tasks, and in a way that is perceived to be more transparent and controllable. However, it remains unknown what users need when authoring their own LLM chains – a key step to lowering the barriers for non-AI-experts to prototype AI-infused applications. In this work, we explore the LLM chain authoring process. We find from pilot studies that users need support transforming data between steps of a chain, as well as debugging the chain at multiple granularities. To address these needs, we designed PromptChainer, an interactive interface for visually programming chains. Through case studies with four designers and developers, we show that PromptChainer supports building prototypes for a range of applications, and conclude with open questions on scaling chains to even more complex tasks, as well as supporting low-fi chain prototyping.
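
The chaining idea described in the abstract (the output of one LLM run becomes the input to the next, with data transformed between steps and each step inspectable on its own) can be illustrated with a minimal sketch. This is not PromptChainer's implementation, which is a visual interface; every name below (`fake_llm`, `llm_step`, `transform_step`, `run_chain`) is hypothetical, and `fake_llm` is a stand-in that only echoes its prompt so the example runs without a model.

```python
from typing import Callable, List

# Hypothetical stand-in for a real LLM call; it only echoes a tagged string
# so that the sketch runs without any model or network access.
def fake_llm(prompt: str) -> str:
    return f"<completion for: {prompt}>"

# A chain step maps one string to another.
Step = Callable[[str], str]

def llm_step(template: str, llm: Callable[[str], str] = fake_llm) -> Step:
    """Build a step that fills `template` with its input and calls the LLM."""
    def run(text: str) -> str:
        return llm(template.format(input=text))
    return run

def transform_step(fn: Callable[[str], str]) -> Step:
    """Build a non-LLM step that reshapes data between two LLM steps."""
    return fn

def run_chain(steps: List[Step], initial_input: str) -> List[str]:
    """Run steps in order, feeding each output to the next step.

    Returns every intermediate output so each step can be inspected alone.
    """
    outputs = []
    current = initial_input
    for step in steps:
        current = step(current)
        outputs.append(current)
    return outputs

# Assumed example chain: two LLM steps joined by a simple data transformation.
chain = [
    llm_step("List three problems with this idea: {input}"),
    transform_step(str.strip),
    llm_step("Suggest a fix for each problem: {input}"),
]

trace = run_chain(chain, "a music chatbot")
for i, out in enumerate(trace, start=1):
    print(f"step {i}: {out}")
```

Returning the full trace rather than only the final output is a deliberate choice in this sketch: it allows checking a single step or the whole chain, which loosely mirrors the debugging at multiple granularities that the abstract reports users need.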

Supplemental Material

3491101.3519729-talk-video.mp4 (mp4, 33.1 MB)
3491101.3519729-video-figure.mp4 (mp4, 100 MB)
3491101.3519729-video-preview.mp4 (mp4, 5.8 MB)

Published in
CHI EA '22: Extended Abstracts of the 2022 CHI Conference on Human Factors in Computing Systems
April 2022
3066 pages
ISBN: 9781450391566
DOI: 10.1145/3491101
Editors: Simone Barbosa (PUC-Rio, Brazil), Cliff Lampe (University of Michigan, USA), Caroline Appert (Université Paris-Saclay, France), David A. Shamma (Toyota Research Institute, USA)
Copyright © 2022 Owner/Author
This work is licensed under a Creative Commons Attribution International 4.0 License.
Publisher
Association for Computing Machinery
New York, NY, United States
Qualifiers
- poster
- Research
- Refereed limited
Acceptance Rates
Overall Acceptance Rate: 6,164 of 23,696 submissions, 26%