Exploring AI Memory, Limitations, and Workarounds in Practice


Too Long; Didn't Read

This section outlines a method to test ChatGPT’s ability to solve technical coding tasks via natural language prompts, simulating real client-engineer workflows. It details the communication strategy, context window limitations, and rationale for choosing a specific mathematical copula family to test the AI’s reasoning.

Abstract and 1 Introduction

2 Methodology

2.1 The task

2.2 The communication protocol

2.3 The copula family

3 Pair programming with ChatGPT

3.1 Warm up

3.2 The density

3.3 The estimation

3.4 The sampling

3.5 The visualization

3.6 The parallelization

4 Summary and discussion

5 Conclusion and Acknowledgments

Appendix A: The solution in Python

Appendix B: The solution in R

References

2 Methodology

2.1 The task



In order to clearly see that the code generated by ChatGPT indeed works as expected, without the need for an experienced programmer, we deviate a bit from the above outline while keeping the non-trivial tasks, i.e., the sampling and the estimation. We thus prompt ChatGPT to generate code that does the following:


2.2 The communication protocol

When interacting with ChatGPT, we use the web portal provided by its development team[7], and we set up and follow this communication protocol:


  1. We prompt ChatGPT to generate code for solving a selected task in natural language and using mathematical formalism; that is, we specify the task in plain text and do not use any specific formal language. For formulas, we use plain text like psi(t) = (1 + t)^(-1/theta).


  2. If the solution generated by ChatGPT is wrong, that is, does not solve the given task, we communicate the problem to ChatGPT, and ask it to provide us with a corrected solution.


  3. If this corrected solution is still wrong, we feed ChatGPT with the knowledge necessary to complete the task successfully, e.g., we provide it with theorems and formulas in plain text. For an example, see the third prompt in Section 3.4.


In this way, we simulate an interaction between two humans, e.g., a client who sends a task by email to a software engineer; we play the role of the client, and ChatGPT plays the role of the software engineer. As the client is typically not aware of all the details required to solve the task at the beginning of the interaction, such a communication protocol is frequently observed in practice: the client starts by providing the (subjectively) most important features of the problem in order to minimize her/his initial effort, and then, if necessary, adds more details to get a more precise solution. Importantly, this communication protocol led to the successful completion of the aforementioned tasks, as reported in Section 3; a schematic sketch of the protocol as a feedback loop is given below.
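To make the three steps concrete, the following sketch casts the protocol as a feedback loop. It is an illustration only: ask_chatgpt, solution_is_correct, and background_theory are hypothetical placeholders, since in our setup the prompting and the correctness checks are performed by a human via the web portal.

```python
# Schematic sketch of the communication protocol. All names here are
# hypothetical placeholders; in the paper, a human runs this loop
# manually via ChatGPT's web portal.
def run_protocol(task, background_theory):
    solution = ask_chatgpt(task)  # step 1: task in plain text + formulas
    if solution_is_correct(solution):
        return solution
    # Step 2: describe the problem and ask for a corrected solution.
    solution = ask_chatgpt("The code does not solve the task because ...; "
                           "please provide a corrected solution.")
    if solution_is_correct(solution):
        return solution
    # Step 3: feed the necessary knowledge (theorems/formulas in plain text).
    return ask_chatgpt("Use the following result: " + background_theory)
```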


With regard to passing ChatGPT the required knowledge, it is important to realize that ChatGPT does not have any memory of its previous conversation with a user. Instead, the trick that makes ChatGPT appear to remember previous conversations is to feed it the entire conversation history as a single prompt. This means that whenever the user sends a message, the previous conversation history is appended to the prompt and then fed to ChatGPT. This prompt-engineering technique is widely used in conversational AI systems to improve a model's ability to generate coherent and contextually appropriate responses. It is, however, just a trick used to create the illusion of memory in ChatGPT.
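As a minimal sketch of this technique, assuming the legacy (v0.x) interface of the openai Python package: each user turn is appended to a list holding the whole conversation, and the entire list is re-sent with every request.

```python
import openai  # assumes the legacy (v0.x) interface of the openai package

history = []  # the entire conversation so far, re-sent on every request

def chat(user_message):
    """Append the user's message and send the WHOLE history as one prompt."""
    history.append({"role": "user", "content": user_message})
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=history,  # the full history creates the illusion of memory
    )
    reply = response["choices"][0]["message"]["content"]
    history.append({"role": "assistant", "content": reply})
    return reply
```

Nothing is stored on the model's side: clearing history makes the apparent memory vanish.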


If the previous conversation is too long (larger than 4096 tokens[8], where a token is roughly 3/4 of an English word[9]), it may not fit entirely within the context window that ChatGPT uses to generate responses. In such cases, the model only has access to a partial view of the conversation history, which can make it seem as if it has forgotten parts of the conversation. To mitigate this issue, conversational AI designers often truncate or summarize the conversation history so that it fits within the context window. In our example task, we solve this problem by re-introducing the parts we refer to back to ChatGPT. For example, when transpiling the code from Python (Appendix A) to R (Appendix B), we first copy-paste the Python code into ChatGPT's web interface and then ask it to transpile it to R. Without keeping this technical limitation in mind, we are unlikely to get a correct answer/solution whenever we refer to a part of the conversation that no longer fits within the context window. Finally, note that according to its technical report, GPT-4 uses a context window that is 8x larger than ChatGPT's, so it can hold roughly 25,000 words. This suggests that the limitation imposed by the context window length will become less and less of a concern.
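As a minimal sketch of the truncation technique, using the rough one-token-per-3/4-word rule quoted above (a production system would use the model's actual tokenizer), with history being a list of messages as in the previous sketch:

```python
def truncate_history(history, max_tokens=4096):
    """Keep only the most recent messages whose estimated token count
    fits the context window; older messages are silently 'forgotten'."""
    def estimated_tokens(message):
        # 1 token ~ 3/4 of an English word  =>  ~4/3 tokens per word
        return int(len(message["content"].split()) * 4 / 3) + 1
    kept, used = [], 0
    for message in reversed(history):   # walk from the newest to the oldest
        used += estimated_tokens(message)
        if used > max_tokens:
            break
        kept.append(message)
    return list(reversed(kept))         # restore chronological order
```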

2.3 The copula family




The technical reason for choosing this family is its simple analytical form, which makes it easier for the reader to track all the formulas we ask for and get from ChatGPT, e.g., the probability density function (PDF). Another reason is ChatGPT's relatively limited knowledge of this family: in contrast to, e.g., the most popular family of Gaussian copulas, ChatGPT was not able to generate a sampling algorithm for it without being fed the necessary theory. The latter simulates a realistic situation in which ChatGPT faces a new theory/concept, e.g., one recently developed by the user. That said, we encourage the reader to experiment with any family of interest, or even with a task that differs from our example.
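For readers who want to follow along numerically, here is a minimal sketch of the Archimedean construction, under the assumption that the family in question is the one generated by psi(t) = (1 + t)^(-1/theta) quoted in Section 2.2, i.e., the Clayton family; the parameter value is an arbitrary example.

```python
import numpy as np

theta = 2.0  # an arbitrary example parameter

def psi(t, theta):
    """Archimedean generator psi(t) = (1 + t)^(-1/theta)."""
    return (1.0 + t) ** (-1.0 / theta)

def psi_inv(u, theta):
    """Inverse generator psi^{-1}(u) = u^(-theta) - 1."""
    return u ** (-theta) - 1.0

def copula(u, theta):
    """Archimedean copula C(u_1,...,u_d) = psi(psi^{-1}(u_1) + ... + psi^{-1}(u_d))."""
    u = np.asarray(u, dtype=float)
    return psi(np.sum(psi_inv(u, theta)), theta)

print(copula([0.3, 0.7], theta))  # ~0.287 for theta = 2
```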


Author:

(1) Jan Górecki, Department of Informatics and Mathematics, Silesian University in Opava, Univerzitní náměstí 1934/3, 733 40 Karviná, Czech Republic (gorecki@opf.slu.cz).


This paper is available on arXiv under the CC BY 4.0 DEED license.

