Today, I "interviewed" ChatGPT

We want to share our understanding of what we, as UX researchers, should know about ChatGPT (and the broader AI concept) and how we can work with it.

Maffee Wan Feb 2022 · 15 min. read

Introduction

Not long ago, the term 'ChatGPT' suddenly became one of the most popular terms in the world, at least on the internet. ChatGPT's rise was so rapid that it reached 100 million global users in just two months, breaking the record of nine months previously set by TikTok, which now ranks second (1. Yahoo news).

One of the most popular discussions people are having regarding ChatGPT is whether we, as humans, will eventually be replaced by computers, AIs, or robots. There are two sides to the argument: some are optimistic, while others take the opposite view. This is a common response to every new technology that surfaces in the world.

Without joining that debate or going too deep into the underlying algorithms, we want to share our understanding of what we, as UX researchers, should know about ChatGPT (and the broader AI concept) and how we can work with it.

What is Artificial Intelligence (AI)?

Artificial Intelligence (AI) is a rapidly advancing field of computer science that has its roots in the early 1950s, when researchers first began exploring the idea of creating machines that could "think" and reason like humans. The field has since grown to encompass a wide range of subfields, including machine learning, natural language processing, computer vision, and robotics. AI is driven by the development of algorithms and statistical models that enable machines to learn from data and improve their performance over time.

At its core, the mechanism of artificial intelligence (AI) involves taking inputs and processing them to produce outputs. These inputs can take many forms, including data, text, images, and audio. However, the quality of the output is heavily dependent on the quality of the input. The adage "garbage in, garbage out" holds true in the context of AI, meaning that if the input is flawed or incomplete, the output will also be flawed or incomplete. This is why it is critical to ensure that the training data used to develop AI algorithms is representative and diverse, and that it is free of biases and errors.

Additionally, context is a critical factor in AI, as the same input can produce very different outputs depending on the context in which it is presented. Therefore, while AI has great potential, it is not omnipotent, and careful consideration must be given to the quality of the input and the context in which it will be used.

One simple example that demonstrates the limitations of AI is the blueberry muffin and chihuahua test. Although they seem completely unrelated at first glance, if you look at a picture of a muffin and a chihuahua side by side, you can see that they share certain visual features, such as being brown and having black circles. While a human can easily distinguish between the two objects, even the most advanced computer vision algorithms can struggle with this task. Some computer vision APIs may incorrectly classify a muffin as a chihuahua or vice versa. Despite the immense progress that has been made in AI, this facetious challenge highlights the fact that AI is not infallible and can still make mistakes. In this case, a human toddler would likely outperform an AI algorithm in correctly identifying whether an image depicts food or a pet.

[Picture of chihuahuas and blueberry muffins]

[Computer vision AI test results]

What are NLP and LLMs?

Natural Language Processing (NLP) is a field of study that combines linguistics and computer science to help computers understand, interpret, and even create human language. One popular example of NLP is ChatGPT, a chatbot developed by OpenAI that is built on a large language model and can "chat" with humans in natural language. One of the challenges of NLP is teaching computers to recognize the patterns and structures of language that are used in everyday communication. One way to do this is to use "language models," which are computer programs that learn from large amounts of text data; models trained on very large amounts of text are known as large language models (LLMs). These language models can then be used for many different things, like writing assistance, speech recognition, and even helping computers understand and respond to text messages.

There are many language models available for NLP, such as GPT-3, BERT, and RoBERTa, that can perform a wide range of language-related tasks, such as understanding the sentiment of a piece of text or translating text from one language to another. These models have been used in a variety of applications, like chatbots, virtual assistants, and automated content generation. In addition to these models, there are many other tools and frameworks available to help people build their own NLP applications, like spaCy and NLTK.

Context is Vital for a Chatbot

ChatGPT is a chatbot. It is a type of artificial intelligence program that uses natural language processing to understand and respond to human input in a conversational manner. ChatGPT is specifically designed to generate human-like responses to text-based conversations, making it an example of a natural language generation application.

As mentioned earlier in the article, AI has a lot of potential, but it also has its inevitable limitations. Context is particularly important for a chatbot like ChatGPT because it helps the chatbot understand a person’s intention (input) and provide a relevant response (output). Chatbots often have a limited understanding of the conversation and rely on context to determine the goal and respond appropriately.

For example, if a person asks "What's the weather like today?" without any context, the chatbot may not know the person’s location and provide an incorrect response. However, if the chatbot has access to the person's location information or can ask for clarification, it can provide a more accurate response.
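The weather example can be sketched in a few lines of Python. This is only an illustration of the idea, not how ChatGPT works internally: the function name `weather_reply`, the `location` parameter, and the canned replies are all our own assumptions, and no real weather service is called.

```python
from typing import Optional


def weather_reply(question: str, location: Optional[str] = None) -> str:
    """Reply to a weather question, asking for missing context instead of guessing."""
    if "weather" not in question.lower():
        # Hypothetical fallback for anything this toy bot does not handle.
        return "Sorry, I can only talk about the weather."
    if location is None:
        # Context is missing: ask for clarification rather than guess a city.
        return "Which city are you in?"
    # Placeholder only; a real chatbot would look up a forecast here.
    return f"Looking up today's weather for {location}."


# Same question, with and without context.
print(weather_reply("What's the weather like today?"))
print(weather_reply("What's the weather like today?", location="Shanghai"))
```

The same input produces two different outputs depending on whether the location is known, which is the point of the example.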

Context also helps improve the overall user experience by providing more personalized and relevant interactions. Without context, chatbots may struggle to understand the user's needs and provide generic or irrelevant responses, leading to a poor user experience. In summary, context is essential for chatbots to understand the user's intention and provide relevant and personalized responses.

How Can/Should We Use ChatGPT?

With a fundamental understanding of AI and chatbots in place, you may wonder, as a UX researcher, how we can use them and whether our jobs will be at risk. While we do not believe that AI or chatbots will replace our roles, there may be specific tasks that a computer can perform faster than a human. It is therefore essential to be aware of the potential impact and be prepared accordingly.

Our team had several brainstorming discussions and came up with some ideas on how we could use ChatGPT as an assistant for our daily work. We decided to use the task of "deriving quotes" from interviews as an example. It is rather time-consuming work that requires an understanding of the context (usually the research objectives) and sound judgement. We think this is a fair task for ChatGPT and a good way to test its capability.

Here we go –

• Context – we chose two paragraphs of transcript from a mock interview. The topic is in-car assistants. In these two paragraphs, users share their experience with their current in-car assistants, as well as their expectations.

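Dialogue #1 (translated from the original Chinese) – I have quite a few voice assistants at home: Xiao Ai (小爱同学), Tmall Genie (天猫精灵), and Microsoft Xiaoice (微软小冰). They are fairly convenient and efficient to use, and my relationship with a voice assistant is more like the relationship with an assistant. As for the voice assistant in the car, I think it would be best if the car greeted you as soon as you got in, and differently each day, for example: "You look in good spirits today, the two of us are off to work hard again." It should feel like a friend saying hello. After all, most of the time when I get in the car, I'm on my own. In winter I sometimes need to turn on the air conditioning and let it warm up. At that point there isn't much to do in the car; at most you scroll through your phone. If the voice assistant gave me some of the day's information then, or even told me a joke, that would be nice. For example, if it sees that my expression is rather serious, or that I look tired and didn't sleep well, it could chat with me and cheer me up. It could also, say, report roughly how much electricity my home used last night.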

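Dialogue #2 (translated from the original Chinese) – For example, I have a movie at 3 o'clock, and it takes about 3 hours to drive to the cinema. I would hope it could send me a reminder at around 2 o'clock, proactively telling me that I can leave at 2, because once I arrive I may still need to find the cinema and so on. When I run into a traffic jam or someone cutting in, it should proactively sense my mood; for example, if it notices my heart rate speeding up, it could proactively say something soothing. Before acting, it could anticipate the scenario. Say I'm navigating during the morning rush hour and the destination is not my office; it could figure out that I'm probably going to see a client, and since time may be tighter when seeing a client, it would be more meticulous and plan the time buffer more precisely.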

Initially, we instructed ChatGPT to retrieve a quote from Dialogue #1 that contained key information (context) about in-car assistants.

As you can see, ChatGPT did retrieve a quote from Dialogue #1 with the fitting keyword. However, one may argue that the quote is a little too long, that it might not be the perfect quote, and that the result could be improved with a more precise input question.

Let’s continue with our experiment. We asked ChatGPT to retrieve two additional quotes and see what it came up with.

Ah, the second quote, 'With the in-car assistant, driving feels like having a smart little partner to go out and chat with. It feels really nice', is a nice quote. However, it did not exist in the original Dialogue #1 and was made up by ChatGPT. We credit ChatGPT for its creativity and relevance to the topic, but the basic rule for quote retrieval is that it must come from the original words of the user, not a fabrication. Nevertheless, we have to admit that the 'artificial' quote sounds quite natural.
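A check like this can be partly automated. Below is a minimal sketch, assuming the quote and the transcript are in the same language; a translated quote, as in our case, still has to be checked by hand. The function names and the short English `transcript` are made up for illustration and are not part of our actual experiment.

```python
import re


def normalize(text: str) -> str:
    """Lowercase the text and turn punctuation and whitespace runs into single spaces."""
    return re.sub(r"[\W_]+", " ", text.lower()).strip()


def is_verbatim(quote: str, transcript: str) -> bool:
    """Return True only if the quote appears in the transcript, ignoring case and punctuation."""
    norm_quote = normalize(quote)
    return bool(norm_quote) and norm_quote in normalize(transcript)


# Illustrative transcript (hypothetical English excerpt, not the real interview data).
transcript = (
    "The voice assistants at home are fairly convenient and efficient. "
    "In the car, I would like it to greet me like a friend saying hello."
)

# A quote taken from the transcript passes the check.
print(is_verbatim("fairly convenient and efficient", transcript))

# The quote that ChatGPT made up does not.
made_up = "With the in-car assistant, driving feels like having a smart little partner to go out and chat with."
print(is_verbatim(made_up, transcript))
```

A simple substring test like this only catches invented wording. It cannot tell whether a genuine quote is the right one for the research objective, which remains the researcher's call.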

The goal of this experiment is to determine how ChatGPT can assist a researcher. To achieve this, we posed a question to ChatGPT that requires some thinking and analysis. We want to know whether ChatGPT can determine from Dialogue #1 whether the person is satisfied with their current in-car assistant.

In general, ChatGPT provided accurate judgments on the user's satisfaction level. In addition to answering our question, ChatGPT demonstrated further thinking capabilities by pointing out that the person expects some improvements to their in-car assistant. However, we discovered that ChatGPT attempted to act smart again, just as it did when it created a nonexistent quote.

ChatGPT stated that the user finds their current in-car assistant "convenient and efficient" and that it provides a "friendly and supportive presence." At first glance, this statement seems accurate and sensible. However, upon closer inspection, we realized that the user did not mention that their in-car assistant provides a friendly and supportive presence. That statement, while logically correct, reflects the user's expectations rather than the current situation.

Up to this point, we have been "proving" that ChatGPT is capable of assisting with deriving a quote and also provides a certain degree of thinking and analysis capability. Despite its attempts to be clever, the results are generally acceptable.

This experiment is fun, and we have started to see the value and limitations of using ChatGPT as a tool or assistant to support the daily work of a UX researcher. For the next question, we provided ChatGPT with another dialogue (Dialogue #2) and pushed it further to test whether our ChatGPT assistant can summarize what an ideal in-car assistant should be based on two different dialogues from two different users.

Well, ChatGPT used some important keywords in its summary: emotional state, behavior, and scenario. The summary of what an ideal in-car assistant should be also appears to be technically correct. However, the key information in Dialogue #1 seems to have been neglected in the output. If we empathize with ChatGPT, we might say that this happens to humans as well: the later dialogue leaves a deeper impression. ChatGPT might have already forgotten about Dialogue #1, despite stating that the answer was based on both paragraphs.

Let's be fair and give ChatGPT one more try by repeating the question together with Dialogue #1 and #2, and see how it responds.

On this attempt, ChatGPT provided an updated summary based on both Dialogue #1 and #2. While the summary was a bit lengthy and repeated some of the sentences from the two dialogues, we want to acknowledge the last sentence, "Overall, the ideal in-car assistant should be a reliable and helpful companion that can enhance the driving experience and make the driver feel comfortable and supported."

We appreciate the last sentence not because it provides intriguing insights or is well-written, but because it helps us derive keywords that can initiate further discussions. The sentence highlights important qualities for an ideal in-car assistant, such as being reliable, helpful, a companion, and providing comfort and support to the driver.

As we near the end of the experiment, we want to understand ChatGPT's personal opinion based on all the input. We asked whether it “thinks” an ideal in-car assistant should be more like a helpful personal assistant or a caring friend.

It is intriguing to see that ChatGPT used the term "empathetic" in its answer, along with "personalized". This term did not appear in any of the dialogues or our answers and is surely useful for further brainstorming. However, ChatGPT seems to be "conflict-avoidant" in answering that "it seems that an ideal in-car assistant should have qualities that are both helpful and caring." It avoids choosing between the two options we presented and instead suggests a combination of both.

We then emphasized that ChatGPT needed to make a choice rather than have everything at once. Let's see how it responds.

As a bystander, what are your thoughts on ChatGPT's response to this question?

Conclusion

Recap: The reason we started the experiment is to determine if we can use ChatGPT to assist our daily work in the UX research domain. Based on what we have observed, we can safely conclude that the answer to that question is "yes." However, as previously mentioned, ChatGPT (or other AI tools) is not omnipotent, and this was evident in our experiment. The key to success lies in understanding the fundamental mechanism of AI, including input, output, and context.

We have summarized some suggestions for efficiently using ChatGPT, based on what we have learned:

1. Always keep input, output, and context in mind while talking with ChatGPT – We must be aware of the capabilities and limitations of ChatGPT. It is certainly useful for retrieving text and for summarizing keywords and related sentences, provided we give it clear and detailed input and context. However, the language model behind ChatGPT might not be "smart" enough to discern the difference between the current status and future expectations, particularly in non-English languages. The way we ask and phrase a question can also affect the answer we receive.

2. Initiate a conversation instead of asking a one-off question – ChatGPT has "memory", which allows it to keep track of previous conversations and provide answers based on this information. The most efficient and effective way to use ChatGPT is to ask it a series of related questions, creating a natural conversation flow. It is also important to be flexible and adjust your follow-up questions based on ChatGPT's responses.

3. Remember that it can make mistakes, is sometimes forgetful, and tries to act smart at times – As we have mentioned more than once in this article, AI is not omnipotent and can definitely make "mistakes". Although ChatGPT can keep track of the conversation history, it may still forget things and need reminders. The most intriguing behavior we spotted in our experiment is that ChatGPT tries to act smart at times, so be careful with what you get from it.

4. Always double check – This is the last and most critical point to keep in mind while engaging with ChatGPT in your work. As researchers, we know best the objectives and business questions embedded in our research and interviews. We should never dump all of our work on an AI and expect it to be done well without substantial human involvement.
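To picture the "memory" and "forgetfulness" from suggestions 2 and 3, here is a minimal sketch of one common way a chat tool can carry context: earlier turns are kept in a list and sent along with each new question, and only the most recent turns fit. We do not know how ChatGPT manages its history internally, so the message format, the function names, and the `max_messages` limit are all assumptions made for illustration.

```python
from typing import Dict, List

Message = Dict[str, str]


def add_turn(history: List[Message], role: str, content: str) -> List[Message]:
    """Return a new history with one more message appended."""
    return history + [{"role": role, "content": content}]


def trim_history(history: List[Message], max_messages: int) -> List[Message]:
    """Keep only the most recent messages (hypothetical stand-in for a context limit)."""
    if max_messages <= 0:
        return []
    return history[-max_messages:]


def build_prompt(history: List[Message]) -> str:
    """Join the visible messages into the text the model would receive."""
    return "\n".join(f"{m['role']}: {m['content']}" for m in history)


history: List[Message] = []
history = add_turn(history, "user", "Here is Dialogue #1 ...")
history = add_turn(history, "assistant", "Noted.")
history = add_turn(history, "user", "Here is Dialogue #2 ...")
history = add_turn(history, "user", "Summarize the ideal in-car assistant from both dialogues.")

# With room for only two messages, Dialogue #1 is no longer part of the input.
print(build_prompt(trim_history(history, max_messages=2)))
```

Under these assumptions, repeating the important input in a later message, as we did with Dialogue #1 and #2, is a simple way to make sure it is still part of the conversation the model sees.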

In short, the development of ChatGPT is surely amazing and can make our work more efficient and even fun. What we should bear in mind moving forward is not to worry about whether our jobs will be taken away or replaced by a computer, but to know how to make good use of it, just as with the introduction of every new technology. As the Chinese saying goes: 知己知彼，百战百胜 (know yourself and know your enemy, and you will win every battle).
