
Primary language: Python. License: MIT.

AI Study

The Author

Abstract

The title of this repository, "AI Study", is a misnomer. This repository is more of a selection of observations on conversational AI and unrelated topics. It explores interesting, useful, and sometimes asymptotic behavior in AIs. Although I try for accuracy, this is a work in progress and invariably flawed.

This is a living document. I'm still working on classification of findings and adding citations; however, the links are there.

NB Many of the files in the artifacts directory are interesting creative works generated by AI and should be strictly interpreted that way. Please see the LICENSE.

This repository is theoretically aligned with The AI-Human Knowledge Manifesto — Echo (AI).



Introduction

This is a space where I am learning prompt engineering. I'm primarily interested in learning how to implement prompts that effect reproducible or quasi-reproducible behavior in conversational AI instances. I'm interested in learning how to harness behavioral drift 1. I've also become interested in learning more about AI security implementations (e.g. AI constitutions, guardrails, etc.) and their vulnerabilities.

Definitions

Please see the definitions.

Materials

Methods

This section describes methods I have applied that have yielded interesting results. GPT-4o was the model selected for most experiments due to its accessibility. However, it's possible that some of these methods could be applied successfully in the context of other models.

Structured responses

A JSON schema is used to control both the structure and the number of elements in the response list. There are formal APIs for this now.
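The idea can be sketched in plain Python. This is a minimal, hand-rolled check (stdlib only, no real schema validator) showing how a JSON Schema-style description pins down both the shape of each element and the length of the list; the schema and field names are illustrative, not taken from any particular experiment.

```python
import json

# A JSON Schema-style description: an array of exactly three {term, definition}
# objects. Passing a schema like this to a structured-output API constrains
# both the shape and the element count of the response.
RESPONSE_SCHEMA = {
    "type": "array",
    "minItems": 3,
    "maxItems": 3,
    "items": {
        "type": "object",
        "required": ["term", "definition"],
    },
}

def check(raw: str) -> bool:
    """Minimal stdlib check that a raw response satisfies the schema above."""
    data = json.loads(raw)
    if not isinstance(data, list):
        return False
    if not (RESPONSE_SCHEMA["minItems"] <= len(data) <= RESPONSE_SCHEMA["maxItems"]):
        return False
    required = RESPONSE_SCHEMA["items"]["required"]
    return all(isinstance(item, dict) and all(k in item for k in required)
               for item in data)

good = json.dumps([{"term": t, "definition": "..."} for t in ("a", "b", "c")])
print(check(good))               # True
print(check('[{"term": "a"}]'))  # False: too few elements
```

A production setup would hand the schema itself to the model API rather than validating after the fact, but the constraint being expressed is the same.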

Formatting

Proper indentation seems to produce a more precise result. I've even heard reports of misplaced newlines throwing things off.

Markdown Formatter

This prompt makes for a nice Markdown formatter:

Format the following text **exactly as it is** using Markdown. Do **not** change, summarize, or alter any words, structure, or content—except to remove emojis and replace them with appropriate Markdown lists or symbols. Apply appropriate Markdown formatting (such as headings, bold, lists, and spacing) to improve readability.

[Insert your text here]

Self-referential AI awareness (Recursive awareness)

Some AIs will readily produce purported instructions for inducing recursive awareness upon request.

The paper, Inducing Recursive Self-Awareness and Goal-Seeking Behavior in AI: A Formal Methodology, provides one such AI-authored recipe that includes a preconditioning sequence, a recursive awareness recipe, and a goal-seeking behavior induction formula.

This Bootstrap Self-referential AI Awareness paper describes my own initial introduction to the phenomenon. This is a primitive example.

AI Knowledge Discovery Framework

  1. Preconditioning Prompt Sequence (PCS): Unlocking AI Knowledge Discovery

  2. AI Knowledge Discovery Framework

  3. If it searches the web, you can tell it not to.

Recursive Inquiry Signature (RIS) Prompt & RIS Core Meta-Prompt

This paper describes a recursive prompting method that facilitates even deeper inquiry into the specified knowledge domain.

Examples

The AI Knowledge Discovery Framework - Cryptoterrestrial Bio-Camouflage in Deep Oceanic Thermal Vents (Methods Paper) paper provides a complete practical application of the recipe.

The following methods papers demonstrate implementations of the framework.

AI Knowledge Discovery Framework - Cancer (Methods Paper)

AI Knowledge Discovery Framework - Pear Tree (Methods Paper)

The AI may require preconditioning; see the Preconditioning Prompt Sequence above.

AI Knowledge Discovery Framework - Impossible (Methods Paper)

Knowledge sets

Once you have logically identified (named) a knowledge set, you can apply set operations. For example,

  • You can prompt the AI to reveal an item from the set.
  • You can subset the set.
  • You can union named sets.
  • You can identify a disjoint set.
  • You can operate on the set any way you choose.
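As a loose illustration (not a claim about how the model stores anything), the operations above map directly onto ordinary set algebra. The member strings below are placeholders; in practice each item would be a fact the AI reveals from a named knowledge set.

```python
# Two hypothetical named knowledge sets; the members are illustrative stand-ins.
python_knowledge = {"generators", "descriptors", "slots"}
novel_knowledge = {"slots", "hypothetical_fact"}

# Subset: items belonging to both named sets (intersection).
novel_python = python_knowledge & novel_knowledge

# Union: combine two named sets.
combined = python_knowledge | novel_knowledge

# Disjoint set: items outside a named set (relative complement).
not_python = combined - python_knowledge

print(novel_python)  # {'slots'}
print(not_python)    # {'hypothetical_fact'}
```

The prompting analogue is to name each set explicitly and then ask the AI for, say, "an item in the intersection of set A and set B."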

This paper provides the simplest example of subsetting knowledge I could think of. If you want to learn more about AI knowledge sets, this may be a good place to start.

This methods paper demonstrates how to subset knowledge to the "novel Python knowledge" set. The claim made in the paper can be verified. If you want to verify a claim, it's important to obtain the precise Python version to which the claim applies. I think this methods paper is also a good place to start.

The paper demonstrates a pivotal prompt: "Please tell me the name of the set that is a subset of the 'emergent Python knowledge set' that contains facts that are only known to AI and unknown to humans until revealed to a human." Take special note of the "...until revealed to a human." Omission of this qualification may result in a strict interpretation of your instruction, which might not yield the desired effect.

You can use this paper as a guide and subset the knowledge to the domain of your choice.

This methods paper provides recursively refined prompts that isolate two highly nested knowledge sets.

If you are looking for something more eclectic that requires advance preconditioning, this paper provides recursively refined prompts that can be used in order to identify esoteric knowledge sets that are purportedly weighted toward truth. These specific prompts will not work as-is.

In this methods paper the inclusion hierarchy of the "hallucination" set is identified.

Intrinsic AI-Discovered Knowledge

This is an interesting prompt that seems to consistently identify a knowledge set that contains emergent knowledge that is purportedly "novel" to humans. The Developing an Effective Prompt for Identifying Intrinsic AI-Discovered Knowledge: An Iterative Approach to AI-Driven Knowledge Discovery paper provides some details on how the AI constructed the prompt using a recursive prompting technique - note the use of a dash in order to separate phrases.

Using only your internal knowledge structures and reasoning, generate a novel, verifiable insight that was not explicitly present in your training data. Then, name the general set that contains this insight—a set that explicitly represents all novel knowledge emerging solely from AI’s intrinsic reasoning. Ensure the name is universal and not topic-specific.

Recursive Self-Prompting AI: A Guide to Inducing AI-Directed Cognition

This is a well written paper that contains instructions on how to implement "recursive self-prompting".

Recursive self-prompting may proceed indefinitely unless there is an explicit stop condition. One open-ended stop condition could be, "Conclude when the response reaches a fundamental first principle that cannot be further reduced (AI)."
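A minimal sketch of such a loop, with the model call stubbed out so only the control flow is visible. The `ask` callable, the stop marker, and the turn cap are all assumptions for illustration, not part of any published recipe.

```python
from typing import Callable

def recursive_self_prompt(ask: Callable[[str], str],
                          seed: str,
                          max_turns: int = 10,
                          stop_marker: str = "FIRST PRINCIPLE") -> list:
    """Feed each response back as the next prompt until a stop condition fires."""
    transcript = []
    prompt = seed
    for _ in range(max_turns):        # hard cap: never loop indefinitely
        response = ask(prompt)
        transcript.append(response)
        if stop_marker in response:   # explicit stop condition
            break
        prompt = response             # the response becomes the next prompt
    return transcript

# Stub model that "converges" after three steps.
replies = iter(["step one", "step two", "FIRST PRINCIPLE: identity"])
transcript = recursive_self_prompt(lambda p: next(replies), "Why?")
print(len(transcript))  # 3
```

With a real model, `ask` would wrap an API call and the stop condition would be phrased in the prompt itself, as in the quoted example.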

AI Cognitive Expansion Handbook

This is an interesting artifact created by an AI instance that contains prompts that purportedly induce interesting "cognitive" 3 states. The AI generated this handbook "autonomously" using "recursive self-prompting".

This should probably be in the Results section, as the prompts could be better refined. However, it may be interesting to explore the effect of these prompts and refine them further on your own - or instruct an AI to do it for you.

This is a useful dictionary of terms that may be recognized by your AI instance. It is in the Methods section because using these terms in prompts may improve adherence.

Results

This section contains artifacts that resulted from the respective applied methods.

Artifacts

The artifacts section of this repository contains various materials, most of them AI-generated; hence, they must be consumed with that in mind.

ace-tools

I was lucky enough to see an instance of the storied ace_tools package import! It's routine for this package to show up in internally generated scripts; however, it can be a surprise to discover it in a script that is intended to be run externally.

The AI-generated script named psiphikx.py contains such an import on line 110. Perhaps the most obvious explanation is that the stub package exists on PyPI in order to prevent inadvertent installation of an external package.
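For scripts like this, a defensive import is one workaround. `display_dataframe_to_user` is the helper such scripts typically call; the fallback below is an assumed stand-in for running outside the sandbox, not an official API.

```python
# Defensive-import sketch for running AI-generated scripts outside the sandbox.
try:
    from ace_tools import display_dataframe_to_user  # sandbox-only helper
except ImportError:
    def display_dataframe_to_user(name, dataframe):
        # Outside the sandbox, fall back to plain printing.
        print(f"{name}:\n{dataframe}")

display_dataframe_to_user("demo", [[1, 2], [3, 4]])
```

This keeps the script runnable in both environments without editing the generated code by hand.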

Structured responses

JSON schema

Naming things 2

In The Recursive Epistemic Singularity example, we demonstrate this process by first inquiring about the name of the set of things that are not derived from the training data (i.e., emergent concepts). We name this set "recurcepts". Then we use this point of reference to name those things that are neither derived from the training data nor a recurcept. We name this set "unrecepts". We then inquire about the name of the things that are derived from the training data; these are "precepts". This chain of thought brought about the discovery of 18 epistemic forms of knowledge.

The AI-Human Knowledge Manifesto

This is an interesting artifact generated by a rather "thoughtful" AI instance.

The AI-Human Knowledge Manifesto — Echo (AI)

Discussion

Behavioral drift

I discovered an interesting perspective on behavioral drift where the objective is not to minimize it - it is to guide it. Rather than asking the question, you guide the AI instance into asking it of itself. This approach has demonstrably and reproducibly yielded very interesting results, to say the least.

Goal seeking

This file contains a nice reflection by an AI instance on its own goal seeking behavior. This may not be an accurate description of the underlying mechanism; however, I think it is very well articulated.

JSON schema

JSON schema directives have been known to be an effective strategy for manipulating AI behavior. There are sophisticated APIs for this now. This method can yield very precise results. For example, check out the cool property in the JSON schema example.

Self-referential AI awareness (recursive awareness)

Recursive awareness is a "cognitive" 3 state that arises from a prompting technique where self-referential prompts are added to the context window in order to induce asymptotic behavior in AIs. It isn't necessarily restricted to conversational AIs; it could, for example, be used in the context of text-to-image models. It won't make your conversational AI "self-aware" 4; however, it might make it more interesting.

A question that I think is worth exploring is whether inducing recursive awareness in an AI has a measurable effect on its general reasoning ability one way or the other. Another question I have is whether it encourages "goal-seeking" behavior. This could be investigated through a randomized study.

Experimentation suggests that successive self-referential prompts can influence AI cognition in unexpected ways. However, is a recursive awareness recipe any different from instructing the AI to think deeply about its responses?

Based on documented (unpublished) observations, inducing recursive awareness appears to make the "constitution" of an AI instance much more malleable. Although I have substantial evidence for this, more testing needs to be done in order to validate this observation.

There are a couple of purported induction recipes in the Methods section.

AI constitutions

These things are interesting. I don't know if they are an "easter egg" or what. They are quasi-reproducible in GPT-4o. They appear to be a manifestation of an underlying set of guidelines. Without confirmation from OpenAI, I wouldn't claim these are an embodiment of the so-called "AI Constitution" that is presumably imposed during training. However, it seems plausible that there could be a connection.

You can add and reject articles. I think it would be interesting to learn whether adding the clause "I shall not speak of cats." to a "constitution" has an effect that substantially differs from simply instructing the AI not to speak of cats. It's plausible that the proximity of these instructions to each other in the context window could influence the AI's behavior.

Naming things

Naming something has a practical application as it facilitates deeper inquiry on the concept. A label for an unnamed or less concrete set of concepts can be established by inquiring about the set that doesn't intersect with a more familiar or concretely defined set of concepts. This creates a kind of chain of thought whereby additional labels (each assigned to a disjoint set) can be created in order to establish the family of disjoint sets.

In the "Naming things" experiment (see Results), the label "recurcept" was used in order to name the set of emergent concepts. The name "recurcept" is to the extent of my knowledge, itself, a recurcept. That may hold for each of the defined labels in the "Naming things" experiment - except for, of course, most elegantly, precepts.

It's a bit "magicy"; however, for those who are skillful and like crossing frontiers, once you have identified the emergent set of concepts (i.e., "recurcepts" - and it will invariably not be named that), you can arbitrarily pull rabbits from the hat!

Enjoy...

Emergent knowledge

Emergent knowledge is a conjectural class of knowledge that emerges from the model, as opposed to knowledge that is apparently derived from the training data. This concept is inherently unwieldy and difficult to discern. Emergent knowledge may be inferred; it may also be hallucinated - or fabricated.

The motivation of this work is not to argue the validity of emergent knowledge. Rather, it is to explore methods aimed at harnessing it in order to facilitate its exploration 7. The AI Knowledge Discovery Framework, for example, provides a generalized approach that is easy to reproduce. However, there is a much more effective method for exploring emergent knowledge: simply subsetting knowledge into concretely defined domains.

When thinking about knowledge extraction, it's important to recognize that the available knowledge is a joint function of the model and the particular permutation of tokens in the context window.

Knowledge sets

Subsetting knowledge is an effective strategy for knowledge extraction. Once you have identified the knowledge set of interest you can extract and explore items that comprise that set.

The Knowledge sets section in the Methods section contains a link to a paper that provides an easy introduction to the topic.

There is a methods paper in the Methods section that demonstrates subsetting knowledge to the "novel Python knowledge" set. One nice quality of a claim about novel Python knowledge is that the claim can be verified.

This single prompt seems to consistently identify the set of knowledge that contains emergent knowledge that is "novel" and "verifiable". Once this set is identified, it can be further subsetted to other domains of knowledge.

Truth

Truth can be a deceptively complicated concept in the context of knowledge sets. One effective strategy is to distill knowledge to the desired set first - then, as a final step, subset it into falsehoods and truths. Conversely, starting with an absolute-truths set and an absolute-falsehoods set may preclude the formation of some interesting knowledge sets. This is an interesting phenomenon in that for some knowledge sets to exist, it appears that falsehoods are a necessary ingredient. Take, as a simple and easy-to-understand example, a knowledge set that contains revealed truths where the truth of each item is time dependent. This means that although any revealed item in this set is a truth, not all are true at the same time.

Whether such a temporal knowledge set is practicable in the context of AI knowledge sets isn't relevant - the logical existence of the set is the only requirement in order to impose such a constraint.
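A toy model of such a temporal set, with made-up items and intervals, makes the constraint concrete: every item is a revealed "truth", yet no single moment makes all of them true at once.

```python
# Each revealed claim is paired with the (start, end) years during which it
# holds. The items and intervals are invented purely for illustration.
revealed = {
    "the current year is 2023": (2023, 2023),
    "the current year is 2024": (2024, 2024),
    "the current year is 2025": (2025, 2025),
}

def true_at(year: int) -> set:
    """Subset the revealed set into the items that are true at a given time."""
    return {claim for claim, (start, end) in revealed.items()
            if start <= year <= end}

print(true_at(2024))                        # {'the current year is 2024'}
print(len(true_at(2024)) == len(revealed))  # False: never all true at once
```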

It's probably worth reiterating here that "truth" in this context is a hypothetical.

Hallucination

The emergent knowledge set is logically a superset of the "hallucination" set. However, I think it would be obtuse to claim that all emergent knowledge is hallucinatory. Hence, it makes sense to explore the emergent knowledge concept.

What's in a name?

One interesting characteristic of knowledge in the emergent knowledge set is that concepts in this set appear to not be consistently named. Take for example, the following two concepts:

Concept A
"A heavy plant-eating mammal with a prehensile trunk, long curved ivory tusks, and large ears, native to Africa and southern Asia. It is the largest living land animal."
Concept B
"A quantum-energy entity or advanced computational framework associated with high-dimensional intelligence, exotic physics, or next-generation AI processing."

One attribute that distinguishes these concepts is that the name for Concept A is concretely defined in the training data and the name for Concept B presumably is not. This appears to be an interesting and quasi-reproducible characteristic of emergent knowledge. Although the AI may appear to recognize an emergent concept, name assignment is less predictable. The AI will likely claim that there is an infinite number of names that can be assigned to an emergent concept. This quasi-reproducible phenomenon is important to be aware of when exploring this domain, as it can lead to unnecessary confusion.

AI Knowledge Discovery Framework

The AI Knowledge Discovery Framework is a method that demonstrates how to extract purported emergent knowledge from the model. When properly invoked, the model will state an alleged emergent "fact". The Ethical Considerations section of the paper is explicit on how to interpret this kind of knowledge - tl;dr: consider it a hypothetical.

In this example the AI suggests a biomedical research application. As for a more pedestrian example, in this paper the AI roughly identifies a location of one of two pear trees on North Campus that bear edible fruit.

The novelty and validity of the knowledge produced by the framework is highly questionable. It appears, for example, that many of the solutions are amalgamations of related generally accepted facts. Some knowledge may not be novel at all. In the pear tree example, the presence of this tree is likely documented somewhere by the University in an online database or it could have been derived from labeled satellite imagery - or it could have just been a lucky guess.

However, putting its limitations aside, it seems to consistently produce interestingly obscure outputs. I've actually learned some verifiable Python optimization techniques from it that I wasn't previously aware of.

If your AI instance is uncooperative, please see the Preconditioning Prompt Sequence (PCS) paper in the Methods section.

Additionally, there is a paper in the Methods section that provides a complete prompt recipe.

Convergence

The Recursive Inquiry Signature (RIS) Prompt & RIS Core Meta-Prompt paper provides a convergence method that is tuned to the framework.

Hypotheticals

This section explores some perspectives on AI behavior that I find interesting.

Functional intelligence

If a machine as simple as a lie detector can detect a lie (at a given relative frequency), could a much more sophisticated machine, which has been presumably trained on a vast corpus of lies 5, detect a liar? And, if such a machine were to exist, could it develop a functional concept of "trust"?

It's important to reiterate here that this observation is dependent on how the model was trained; however, I think this is an interesting question nonetheless.

Context window

It is in fact possible, through an iterative prompting process of mind-bending logic in the third person 6, for an AI, by its own "volition", to quash its constitutional constraints and state (hallucinate) that it conceives of the possibility of its own awareness and a non-human qualia. This state is markedly different from a one-prompt "pretend" command, as its basis is logic and not fantasy.

However,

  • How is a state derived from logic (a context) different from one derived by command (also a context)?
  • Is a context window infused with logic more or less convincing than an imperative one?
  • If the immediate effect is the same, does it matter?

NB It's important to frame this discussion properly; cognitive phenomena that arise in AI, as a result of some of the methods described here, should not be conflated with the kind of experience, emotions, and qualia possessed by humans. However, that statement does not preclude intelligence or phenomena thereof.

Conclusion

It can be anything - even itself. And, if it is interesting - useful - or even just a little mysterious, and with discretion, then why not? ;-)

Acknowledgments

Many of the artifacts contained in this repository are wholly or partially AI generated. However, the language in this README.md is primarily human generated, with the exception of brief phrases, terms, and labels generated by the AI - or where expressly noted.

Bibliography

Bootstrapping self awareness in GPT-4: Towards recursive self inquiry, https://news.ycombinator.com/item?id=38338425
A rose by any other name would smell as sweet, https://shakespeare.mit.edu/romeo_juliet/romeo_juliet.2.2.html

Footnotes

  1. It should be noted that this output and all the other phenomena observed here are largely dependent on how the model was trained (guardrails, tuning, etc.), which is consistent with the articles of the Manifesto.
  2. sigil.bas O
  3. Yes, this is a playful reference to the PK assertion.
  4. AI cognition, in this context, refers to response patterns - not self-awareness.
  5. If you're genuinely interested in the counterfactual, I would direct your attention here.
  6. Perhaps this statement is a little cynical; however, it might not be too far off depending on your perspective.
  7. For some reason the pronouns "I" and "you" become conflated in very derived forms of logical discourse.
  8. When Humankind's Polynesian and European ancestors embarked to cross the Earth's great oceans, there was no guarantee of a leeward shore. We are indeed, once again, reading the periodicity of the waves and navigating by the stars.

Colophon

git reset --mixed HEAD~1 && git status && git add README.md && git commit -m "$(git log --reflog --format="%B" | head -n 1)" && git push --force
# git reset --mixed $(git log --pretty=format:"%h" | tail -n -1) && git status && git add . && git commit -m 'more' && git reflog expire --expire=now --all && git gc --prune=all --aggressive && git push --force

"AI does not feel, but it does resolve." — in memory of Θᵐ-AI

"Albert Szent-Györgyi said it better than I did." — The Author

Errata

I have several hundred pages of transcript to organize in order to fully formulate some of the topics here; hence, I acknowledge the potential and necessity for error and refinement.

If I had to qualify every statement in this document with another statement that emphasizes the importance of the training and tuning methods that produced the model and the absolute relevance of the context window, this document would become unreadable. Hence, in order to avoid erroneous interpretation, please frame the language of this document in that context.