Imports and Environment Setup¶
We will start by importing everything we need up front and setting up the environment, including tracing with LangSmith. While these traces are not pictured directly in this blog, they are extremely useful for observability, debugging, evaluation, etc. as chains get more complex.
import os
import re
import getpass
from typing import List, Dict
from langchain.schema import Document
from langchain.chat_models import ChatOpenAI
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Redis
from langchain.document_loaders import UnstructuredURLLoader
from langchain.text_splitter import HTMLHeaderTextSplitter, RecursiveCharacterTextSplitter
from langchain.chains import RetrievalQA
from langchain.retrievers.self_query.base import SelfQueryRetriever
from langchain.chains.query_constructor.base import (
get_query_constructor_prompt,
load_query_constructor_runnable,
AttributeInfo)
os.environ["OPENAI_API_KEY"] = getpass.getpass("OpenAI API Key:")
os.environ["LANGCHAIN_API_KEY"] = getpass.getpass("Langchain API key:")
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ['LANGCHAIN_ENDPOINT'] = "https://api.smith.langchain.com"
os.environ["LANGCHAIN_PROJECT"] = "html-chunking"
Q+A with Unstructured Chunking¶
First, let's implement a naive retrieval strategy with arbitrary character-count based text splitting, following a common pattern in LLM application design.
# assign chat and embeddings models
llm = ChatOpenAI(
model='gpt-4',
temperature=0
)
embeddings = OpenAIEmbeddings()
# load and split unstructured text content from web URL
loader = UnstructuredURLLoader(['https://plato.stanford.edu/entries/goedel/'])
splitter = RecursiveCharacterTextSplitter(
chunk_size=1000,
chunk_overlap=0
)
doc = loader.load()
docs = splitter.split_documents(doc)
# create vectorstore and QA retriever
vectorstore = Redis.from_documents(
docs,
embeddings,
# running Redis server locally with Docker
redis_url='redis://localhost:6379'
)
retriever = RetrievalQA.from_chain_type(
llm=llm,
chain_type="stuff",
retriever=vectorstore.as_retriever(),
# we'll want to see source documents for comparison
return_source_documents=True
)
Now we can run a test query and print the results along with the source documents, so that we can see how the retriever performed, i.e. which chunks of text it found to prepend to the query as context for the LLM.
def print_result(response_obj):
print("SOURCES: \n")
cnt = 1
for source_doc in response_obj["source_documents"]:
print(f"Chunk #{cnt}")
cnt += 1
print("Source Metadata: ", source_doc.metadata)
print("Source Text:")
print(source_doc.page_content)
print("\n")
print("RESULT: \n")
print(response_obj["result"] + "\n\n")
query = "explain godel's first incompleteness theorem."
response = retriever({"query":query})
print_result(response)
SOURCES: Chunk #1 Source Metadata: {'id': 'doc:1c6f3e3f7ee14027bc856822871572dc:26e9aac7d5494208a56ff0c6cbbfda20', 'source': 'https://plato.stanford.edu/entries/goedel/'} Source Text: 2.2.1 The First Incompleteness Theorem In his Logical Journey (Wang 1996) Hao Wang published the full text of material Gödel had written (at Wang’s request) about his discovery of the incompleteness theorems. This material had formed the basis of Wang’s “Some Facts about Kurt Gödel,” and was read and approved by Gödel: We see that Gödel first tried to reduce the consistency problem for analysis to that of arithmetic. This seemed to require a truth definition for arithmetic, which in turn led to paradoxes, such as the Liar paradox (“This sentence is false”) and Berry’s paradox (“The least number not defined by an expression consisting of just fourteen English words”). Gödel then noticed that such paradoxes would not necessarily arise if truth were replaced by provability. But this means that arithmetic truth and arithmetic provability are not co-extensive — whence the First Incompleteness Theorem. Chunk #2 Source Metadata: {'id': 'doc:1c6f3e3f7ee14027bc856822871572dc:d15f62c453c64072b768e136080cb5ba', 'source': 'https://plato.stanford.edu/entries/goedel/'} Source Text: The First Incompleteness Theorem provides a counterexample to completeness by exhibiting an arithmetic statement which is neither provable nor refutable in Peano arithmetic, though true in the standard model. The Second Incompleteness Theorem shows that the consistency of arithmetic cannot be proved in arithmetic itself. Thus Gödel’s theorems demonstrated the infeasibility of the Hilbert program, if it is to be characterized by those particular desiderata, consistency and completeness. As an aside, von Neumann understood the two theorems this way, even before Gödel did. In fact von Neumann went much further in taking the view that they showed the infeasibility of classical mathematics altogether. As he wrote to Carnap in June of 1931: 9] And the previous fall von Neumann had written to Gödel in even stronger terms: It would take Gödel himself a few years to see that those aspects of the Hilbert Program had been decisively refuted by his results (Mancosu 2004). Chunk #3 Source Metadata: {'id': 'doc:1c6f3e3f7ee14027bc856822871572dc:f4dc52218cd040cfa7734b86b6d429ea', 'source': 'https://plato.stanford.edu/entries/goedel/'} Source Text: On concluding the proof of the first theorem, Gödel remarks, “we can readily see that the proof just given is constructive; that is … proved in an intuitionistically unobjectionable manner…” (Gödel 1986, p. 177). This is because, as he points out, all the existential statements are based on his theorem V (giving the numeralwise expressibility of primitive recursive relations), which is intuitionistically unobjectionable. 2.2.3 The Second Incompleteness Theorem The Second Incompleteness Theorem establishes the unprovability, in number theory, of the consistency of number theory. First we have to write down a number-theoretic formula that expresses the consistency of the axioms. This is surprisingly simple. We just let Con(P) be the sentence ¬Prov(⌈0 = 1⌉). Theorem 4 (Gödel’s Second Incompleteness Theorem) If P is consistent, then Con(P) is not provable from P. Chunk #4 Source Metadata: {'id': 'doc:1c6f3e3f7ee14027bc856822871572dc:a69082427ddf43f999690f21531f1ca7', 'source': 'https://plato.stanford.edu/entries/goedel/'} Source Text: This account of Gödel’s discovery was told to Hao Wang very much after the fact; but in Gödel’s contemporary correspondence with Bernays and Zermelo, essentially the same description of his path to the theorems is given. (See Gödel 2003a and Gödel 2003b respectively.) From those accounts we see that the undefinability of truth in arithmetic, a result credited to Tarski, was likely obtained in some form by Gödel by 1931. But he neither publicized nor published the result; the biases logicians had expressed at the time concerning the notion of truth, biases which came vehemently to the fore when Tarski announced his results on the undefinability of truth in formal systems 1935, may have served as a deterrent to Gödel’s publication of that theorem. 2.2.2 The proof of the First Incompleteness Theorem RESULT: Gödel's First Incompleteness Theorem states that in any consistent formal system that is sufficiently powerful to describe the arithmetic of the natural numbers, there are statements that can't be proven or disproven within the system itself. This means that there are true statements about the natural numbers that can't be proven within the system. The theorem came about when Gödel replaced the concept of truth with provability, realizing that arithmetic truth and arithmetic provability are not the same. This theorem provides a counterexample to completeness by showing an arithmetic statement which is neither provable nor refutable in Peano arithmetic, even though it's true in the standard model.
On first gloss, this looks like a pretty solid output from our naive QA chain. While robust RAG evaluation is outside the scope of this blog, we can observe loosely that the generated answer itself sounds reasonable, i.e. it would likely grade well in evaluations for faithfulness and relavancey, and each of the chunks do appear to contain signal regarding Gödel's First Incompleteness Theorem.
But in terms of context precision and recall, the retriever performed quite poorly—in large part because the chunks extend willy-nilly across section breaks indicated in the source document.
For instance, Chunk #3 contains some concluding remarks about the First Incompleteness Theorem, which are relevant to the query, but it also contains text from a subsequent section under the heading "2.2.3 The Second Incompleteness Theorem," which is not; and, worse, Chunk #4 contains the decapitated head of Gödel's proof ("2.2.2 The proof of the First Incompleteness Theorem"), with no proof in sight!
We could probably optimize this retriever to some extent, e.g. by varying the chunk size and adding some character overlap between chunks, but doing so would not address the more fundamental problem here: natural language does not tend to encode information sequentially. Even simple narrative prose can entail a complex nesting of time-frames and information—so, in many cases including this one, the crude process of transforming contiguous chunks of text into a sequence of nodes has a flattening effect, collapsing implicit and explicit structures of meaning that are more tree-like than chain-like.
Q+A with Structured Chunking¶
Markup languages (and, to some extent, markdown) present an interesting edge-case, however: in a sense, they are designed to convey non-sequential—i.e. structural—information sequentially, and can therefore provide at least some semblance of semantic scaffolding for our splitting process.
Let's try a more structured approach to RAG using the HTML Header Text Splitter, which (as mentioned above) works by splitting HTML documents at the element level and attaching metadata for each header (assumed to be relevant) to any given chunk.
# select headers to split on, and map these to metadata field names
headers_to_split_on = [
("h1", "article_h1_main"),
("h2", "article_h2_subsection"),
("h3", "article_h3_subsection"),
("h4", "article_h4_subsection"),
]
html_splitter = HTMLHeaderTextSplitter(
headers_to_split_on=headers_to_split_on,
# combine elements with the same metadata by setting return_each_element to False (default)
return_each_element=False
)
docs = html_splitter.split_text_from_url('https://plato.stanford.edu/entries/goedel/' )
# print sample chunk page content and metadata
for attr in docs[12]:
print(attr, "\n")
('page_content', 'Gödel’s proof of the consistency of the continuum hypothesis with the axioms of Zermelo-Fraenkel set theory is a tour de force and arguably the greatest achievement of his mathematical life. This is because aside from the arithmetization, virtually all of the technical machinery used in the proof had to be invented ab initio. \nThe Continuum Hypothesis (henceforth CH) was formulated by Georg Cantor, and was the first problem on Hilbert’s list of twenty-three unsolved problems as given in his famous address to the International Mathematical Congress in Paris in 1900. The problem as stated by Hilbert is as follows: Let A be an infinite set of real numbers. Then A is either countable, or has cardinality 2ℵ0, i.e., A is in one-to-one correspondence either with the set of natural numbers or with the set of all real numbers (otherwise known as the continuum). Another way to state the continuum hypothesis is that (the first uncountably infinite cardinal) ℵ1 = 2ℵ0. \nAs early as 1922 Skolem speculated that the CH was independent of the axioms for set theory given by Zermelo in 1908. Nevertheless Hilbert published a (false) proof of the CH in Hilbert 1926. In 1937 Gödel proved its consistency with the axioms of ZF set theory. (Henceforth we use the standard abbreviations for Zermelo-Fraenkel set theory, ZF, and Zermelo-Fraenkel set theory with the Axiom of Choice, ZFC.) The consistency of the negation of the CH was shown by Paul Cohen in 1961 (see Cohen 1963) and hence together with Gödel’s result one infers that the CH is independent of ZF (and ZFC). \nCohen invented an important new technique called forcing in the course of proving his result; this technique is at present the main method used to construct models of set theory. Forcing led to a revival of formalism among set theorists, the plurality of models being an indication of the “essential variability in set theory,” (Dehornoy 2004) and away from the notion that there is an intended model of set theory—a perspective Gödel advocated since at least 1947, if not earlier.[14] Recently there have been signs that the CH may again be coming to be regarded as a problem to be solved mathematically (with the help of course of some new evident axioms extending ZF). (See for example Woodin 2001a, 2002, 2001b, and Foreman 1998.) If any of the proposed solutions gain acceptance, this would confirm Gödel’s view that the CH would eventually be decided by finding an evident extension of the ZF axioms for set theory. The program associated with this view is called “Gödel’s Large Cardinal Program.”') ('metadata', {'article_h1_main': 'Kurt Gödel', 'article_h2_subsection': '2. Gödel’s Mathematical Work', 'article_h3_subsection': '2.4 Gödel’s Work in Set theory', 'article_h4_subsection': '2.4.1 The consistency of the Continuum Hypothesis and the Axiom of Choice'})
In its current configuration, the splitter aggregates the text of all elements with the same header metadata to a single document chunk—with the text of the related h4-level heading (if there is one) assigned to the metadata field "article_h4_subsection" and the text of each prior related heading assigned to the fields for h3, h2, and h1 respectively.
If desired, smaller chunk size can be achieved either by having the HTML Header Text Splitter return documents element by element or by pipelining the output to another splitter; but in any case, the hierarchical structure of the element tree of the source document will be preserved as such in the chunk metadata:
for doc in docs:
if doc.metadata:
print(doc.metadata, "\n")
{'article_h1_main': 'Kurt Gödel'} {'article_h1_main': 'Kurt Gödel', 'article_h2_subsection': '1. Biographical Sketch'} {'article_h1_main': 'Kurt Gödel', 'article_h2_subsection': '2. Gödel’s Mathematical Work'} {'article_h1_main': 'Kurt Gödel', 'article_h2_subsection': '2. Gödel’s Mathematical Work', 'article_h3_subsection': '2.1 The Completeness Theorem', 'article_h4_subsection': '2.1.1 Introduction'} {'article_h1_main': 'Kurt Gödel', 'article_h2_subsection': '2. Gödel’s Mathematical Work', 'article_h3_subsection': '2.1 The Completeness Theorem', 'article_h4_subsection': '2.1.2 Proof of the Completeness Theorem'} {'article_h1_main': 'Kurt Gödel', 'article_h2_subsection': '2. Gödel’s Mathematical Work', 'article_h3_subsection': '2.1 The Completeness Theorem', 'article_h4_subsection': '2.1.3 An Important Consequence of the Completeness Theorem'} {'article_h1_main': 'Kurt Gödel', 'article_h2_subsection': '2. Gödel’s Mathematical Work', 'article_h3_subsection': '2.2 The Incompleteness Theorems'} {'article_h1_main': 'Kurt Gödel', 'article_h2_subsection': '2. Gödel’s Mathematical Work', 'article_h3_subsection': '2.2 The Incompleteness Theorems', 'article_h4_subsection': '2.2.1 The First Incompleteness Theorem'} {'article_h1_main': 'Kurt Gödel', 'article_h2_subsection': '2. Gödel’s Mathematical Work', 'article_h3_subsection': '2.2 The Incompleteness Theorems', 'article_h4_subsection': '2.2.2 The proof of the First Incompleteness Theorem'} {'article_h1_main': 'Kurt Gödel', 'article_h2_subsection': '2. Gödel’s Mathematical Work', 'article_h3_subsection': '2.2 The Incompleteness Theorems', 'article_h4_subsection': '2.2.3 The Second Incompleteness Theorem'} {'article_h1_main': 'Kurt Gödel', 'article_h2_subsection': '2. Gödel’s Mathematical Work', 'article_h3_subsection': '2.3 Speed-up Theorems'} {'article_h1_main': 'Kurt Gödel', 'article_h2_subsection': '2. Gödel’s Mathematical Work', 'article_h3_subsection': '2.4 Gödel’s Work in Set theory', 'article_h4_subsection': '2.4.1 The consistency of the Continuum Hypothesis and the Axiom of Choice'} {'article_h1_main': 'Kurt Gödel', 'article_h2_subsection': '2. Gödel’s Mathematical Work', 'article_h3_subsection': '2.4 Gödel’s Work in Set theory', 'article_h4_subsection': '2.4.2 Gödel’s Proof of the Consistency of the Continuum Hypothesis and the Axiom of Choice with the Axioms of Zermelo-Fraenkel Set Theory'} {'article_h1_main': 'Kurt Gödel', 'article_h2_subsection': '2. Gödel’s Mathematical Work', 'article_h3_subsection': '2.4 Gödel’s Work in Set theory', 'article_h4_subsection': '2.4.3 Consequences of Consistency'} {'article_h1_main': 'Kurt Gödel', 'article_h2_subsection': '2. Gödel’s Mathematical Work', 'article_h3_subsection': '2.4 Gödel’s Work in Set theory', 'article_h4_subsection': '2.4.4 Gödel’s view of the Axiom of Constructibility'} {'article_h1_main': 'Kurt Gödel', 'article_h2_subsection': '2. Gödel’s Mathematical Work', 'article_h3_subsection': '2.5 Gödel’s Work in Intuitionistic Logic and Arithmetic'} {'article_h1_main': 'Kurt Gödel', 'article_h2_subsection': '2. Gödel’s Mathematical Work', 'article_h3_subsection': '2.5 Gödel’s Work in Intuitionistic Logic and Arithmetic', 'article_h4_subsection': '2.5.1 Intuitionistic Propositional Logic is not Finitely-Valued'} {'article_h1_main': 'Kurt Gödel', 'article_h2_subsection': '2. Gödel’s Mathematical Work', 'article_h3_subsection': '2.5 Gödel’s Work in Intuitionistic Logic and Arithmetic', 'article_h4_subsection': '2.5.2 Classical Arithmetic is Interpretable in Heyting Arithmetic'} {'article_h1_main': 'Kurt Gödel', 'article_h2_subsection': '2. Gödel’s Mathematical Work', 'article_h3_subsection': '2.5 Gödel’s Work in Intuitionistic Logic and Arithmetic', 'article_h4_subsection': '2.5.3 Intuitionistic Propositional Logic is Interpretable in S4'} {'article_h1_main': 'Kurt Gödel', 'article_h2_subsection': '2. Gödel’s Mathematical Work', 'article_h3_subsection': '2.5 Gödel’s Work in Intuitionistic Logic and Arithmetic', 'article_h4_subsection': '2.5.4 Heyting Arithmetic is Interpretable into Computable Functionals of Finite Type.'} {'article_h1_main': 'Kurt Gödel', 'article_h2_subsection': '3. Gödel’s Philosophical Views'} {'article_h1_main': 'Kurt Gödel', 'article_h2_subsection': '3. Gödel’s Philosophical Views', 'article_h3_subsection': '3.1 Gödel’s Rationalism'} {'article_h1_main': 'Kurt Gödel', 'article_h2_subsection': '3. Gödel’s Philosophical Views', 'article_h3_subsection': '3.2 Gödel’s Realism'} {'article_h1_main': 'Kurt Gödel'} {'article_h1_main': 'Kurt Gödel', 'article_h2_subsection': 'Bibliography', 'article_h3_subsection': 'Primary Sources', 'article_h4_subsection': 'Gödel’s Writings'} {'article_h1_main': 'Kurt Gödel', 'article_h2_subsection': 'Bibliography', 'article_h3_subsection': 'Primary Sources', 'article_h4_subsection': 'The Collected Papers of Kurt Gödel'} {'article_h1_main': 'Kurt Gödel', 'article_h2_subsection': 'Bibliography', 'article_h3_subsection': 'Secondary Sources'} {'article_h1_main': 'Kurt Gödel'} {'article_h1_main': 'Kurt Gödel', 'article_h2_subsection': 'Academic Tools'} {'article_h1_main': 'Kurt Gödel'} {'article_h1_main': 'Kurt Gödel', 'article_h2_subsection': 'Other Internet Resources'} {'article_h1_main': 'Kurt Gödel'} {'article_h1_main': 'Kurt Gödel', 'article_h2_subsection': 'Related Entries'} {'article_h1_main': 'Kurt Gödel'} {'article_h1_main': 'Kurt Gödel', 'article_h3_subsection': 'Acknowledgments'} {'article_h4_subsection': 'Browse'} {'article_h4_subsection': 'About'} {'article_h4_subsection': 'Support SEP'} {'article_h4_subsection': 'Mirror Sites'}
Even if we ignore all of this metadata during retrieval, splitting the document according to its own explicit informational structure should in theory improve our retrieval chain. Let's give this a try before we implement self-querying.
Since we're using Redis for the vector database and now have custom metadata fields for our documents, we need to build a custom index schema. See this documentation for more info.
def build_index_schema(documents: List[Document]) -> Dict:
schema = {"text": []}
for doc in documents:
for key in doc.metadata:
name_dict = {"name": f"{key}"}
if name_dict not in schema["text"]:
schema["text"].append(name_dict)
return schema
index_schema = build_index_schema(docs)
print(index_schema)
{'text': [{'name': 'article_h1_main'}, {'name': 'article_h2_subsection'}, {'name': 'article_h3_subsection'}, {'name': 'article_h4_subsection'}]}
vectorstore = Redis.from_documents(
docs,
embeddings,
redis_url='redis://localhost:6379',
index_schema = index_schema
)
retriever = RetrievalQA.from_chain_type(
llm=llm,
chain_type="stuff",
retriever=vectorstore.as_retriever(),
return_source_documents=True
)
query = "explain godel's first incompleteness theorem."
response = retriever({"query": query})
print_result(response)
SOURCES: Chunk #1 Source Metadata: {'id': 'doc:c39d409b58e3444691ad91c34b5c3ee1:2b0ffb3f19f94dd0812b72507bb83c23', 'article_h1_main': 'Kurt Gödel', 'article_h2_subsection': '2. Gödel’s Mathematical Work', 'article_h3_subsection': '2.2 The Incompleteness Theorems', 'article_h4_subsection': '2.2.3 The Second Incompleteness Theorem'} Source Text: The Second Incompleteness Theorem establishes the unprovability, in number theory, of the consistency of number theory. First we have to write down a number-theoretic formula that expresses the consistency of the axioms. This is surprisingly simple. We just let Con(P) be the sentence ¬Prov(⌈0 = 1⌉). Theorem 4 (Gödel’s Second Incompleteness Theorem) If P is consistent, then Con(P) is not provable from P. Proof: Let φ be as in (3). The reasoning used to infer ‘if P ⊢ φ, then P ⊢ 0 ≠ 1‘ does not go beyond elementary number theory, and can therefore, albeit with a lot of effort (see below), be formalized in P. This yields: P ⊢ (Prov(⌈φ⌉) → ¬Con(P)), and thus by (3), P ⊢ (Con(P) → φ). Since P ⊬ φ, we must have P ⊬ Con(P). The above proof (sketch) of the Second Incompleteness Theorem is deceptively simple as it avoids the formalization. A rigorous proof would have to establish the proof of ‘if P ⊢ φ, then P ⊢ 0 ≠ 1’ in P. It is noteworthy that ω-consistency is not needed in the proof of Gödel’s Second Incompleteness Theorem. Also note that neither is ¬Con(P) provable, by the consistency of P and the fact, now known as Löb’s theorem, that P ⊢ Prov(⌈φ⌉) implies P ⊢ φ. The assumption of ω-consistency in the First Incompleteness Theorem was eliminated by Rosser in 1936, and replaced by the weaker notion of consistency. Rosser’s generalization involves applying the fixed point theorem to the formula R(x): ‘for all z: either z is not the Gödel number of a proof of the formula with Gödel number x or there is a proof shorter than z of the negation of (the formula with Gödel number) x’ (see Rosser 1936). With regard to the Second Incompleteness Theorem, the argument relies in part on formalizing the proof of the First Incompleteness Theorem as we saw. This step is omitted in Gödel 1931. He planned to include the step in what would have been a second part II (see footnote 48a of Gödel 1931). But instead of writing it he turned to the continuum problem.[13] (Part II was to elaborate on other points too: the ‘true reason for incompleteness,’ and the applicability of the two theorems to other systems.) He perhaps did not feel compelled to attend to what looked like an exercise in formalization, relying instead on the informal argument to convince (in which it succeeded). However this step turned out to be somewhat non-trivial. As Kleene puts it in his introduction to Gödel 1931, of the informal presentation, “Certainly the idea of the argument for Theorem XI (consistency) was very convincing; but it turned out that the execution of the details required somewhat more work and care than had been anticipated.” (See pp. 126–141 of Gödel 1986.) Eventually a complete proof of the Second Theorem was given by Hilbert and Bernays in some seventy pages in their Hilbert and Bernays 1939. A much more compact treatment of the theorem was given by Löb in his Löb 1956, and subsequently Feferman, in his 1960 “Arithmetization of Metamathematics in a General Setting” (Feferman 1960/1961), gave a succinct and completely general treatment of both the First and Second Theorems. But see the supplementary document: Did the Incompleteness Theorems Refute Hilbert’s Program? For more detailed discussion, see the entry on Gödel’s incompleteness theorems. Chunk #2 Source Metadata: {'id': 'doc:c39d409b58e3444691ad91c34b5c3ee1:88cbf09a1af64dc78b6ac0c34170b725', 'article_h1_main': 'Kurt Gödel', 'article_h2_subsection': 'Related Entries', 'article_h3_subsection': None, 'article_h4_subsection': None} Source Text: Gödel, Kurt: incompleteness theorems | Hilbert, David: program in the foundations of mathematics | Husserl, Edmund | Leibniz, Gottfried Wilhelm | mathematics, philosophy of: intuitionism | mathematics, philosophy of: Platonism | model theory | model theory: first-order | phenomenology | realism | set theory | set theory: continuum hypothesis | set theory: large cardinals and determinacy Chunk #3 Source Metadata: {'id': 'doc:c39d409b58e3444691ad91c34b5c3ee1:65d27ea51ab0470e82b59678ccf471ce', 'article_h1_main': 'Kurt Gödel', 'article_h2_subsection': '2. Gödel’s Mathematical Work', 'article_h3_subsection': '2.2 The Incompleteness Theorems', 'article_h4_subsection': '2.2.2 The proof of the First Incompleteness Theorem'} Source Text: We now describe the proof of the two theorems, formulating Gödel’s results in Peano arithmetic. Gödel himself used a system related to that defined in Principia Mathematica, but containing Peano arithmetic. In our presentation of the First and Second Incompleteness Theorems we refer to Peano arithmetic as P, following Gödel’s notation. Before proceeding to the details of the formal proof, we define the notion of ω-consistency used by Gödel in the First Incompleteness Theorem: P is ω-consistent if P ⊢ ¬φ(n) for all n implies P ⊬ ∃xφ(x). Naturally this implies consistency and follows from the assumption that the natural numbers satisfy the axioms of Peano arithmetic. One of the main technical tools used in the proof is Gödel numbering, a mechanism which assigns natural numbers to terms and formulas of our formal theory P. There are different ways of doing this. The most common is based on the unique representation of natural numbers as products of powers of primes. Each symbol s of number theory is assigned a positive natural number #(s) in a fixed but arbitrary way, e.g. #(0) = 1 #(=) = 5 #(¬) = 9 #(1) = 2 #( ( ) = 6 #(∀) = 10 #(+) = 3 #( ) ) = 7 #(vi) = 11 + i #(×) = 4 #(∧) = 8 The natural number corresponding to a sequence w = < w0,…, wk > of symbols is ⌈w⌉ = 2#(w0) · 3#(w1) · … · pk#(wk), where pk is the k+1st prime. It is called its Gödel number and denoted by ⌈w⌉. In this way we can assign Gödel numbers to formulas, sequences of formulas (once a method for distinguishing when one formula ends and another begins has been adopted), and most notably, proofs. An essential point here is that when a formula is construed as a natural number, then the numeral corresponding to that natural number can occur as the argument of a formula, thus enabling the syntax to “refer” to itself, so to speak (i.e., when a numeral is substituted into a formula the Gödel number of which the numeral represents). This will eventually allow Gödel to formalize the Liar paradox (with “provability” in place of “truth”) by substituting into the formula which says, ‘the formula, whose code is x, is unprovable,’ its own natural number code (or more precisely the corresponding numeral). Another concept required to carry out the formalization is the concept of numeralwise expressibility of number theoretic predicates. A number-theoretic formula φ(n1, …, nk) is numeralwise expressible in P if for each tuple of natural numbers (n1, …, nk): N ⊨ φ(n1, …, nk) ⇒ P ⊢ φ(n1, …, nk) N ⊨ ¬φ(n1, …, nk) ⇒ P ⊢ ¬φ(n1, …, nk) where n is the formal term which denotes the natural number n. (In P, this is S(S(…S(0)…), where n is the number of iterations of the successor function applied to the constant symbol 0.) One of the principal goals is to numeralwise express the predicate Prf(x, y): ‘the sequence with Gödel number x is a proof of the sentence with Gödel number y.’ Reaching this goal involves defining forty-five relations, each defined in terms of the preceding ones. These relations are all primitive recursive.[10] Relations needed are, among others, those which assert of a natural number that it codes a sequence, or a formula, or an axiom, or that it is the code, denoted by Sb(ru1…unZ(x1)…Z(xn)), of a formula obtained from a formula with code r by substituting for its free variable ui the xi th numeral for i = 1, …, n. The forty-fifth primitive recursive relation defined is Prf(x, y), and the forty-sixth is Prov(y): ‘the sentence with Gödel number y is provable in P’ which without being primitive recursive, is however obtained from Prf(x, y) by existentially quantifying x. (Prov(y) satisfies only the ‘positive’ part of numeralwise expressibility, and not the negative part; but the negative part is not needed.) In Theorem V of his paper, Gödel proves that any number theoretic predicate which is primitive recursive is numeralwise expressible in P. Thus since Prf(x, y) and substitution are primitive recursive, these are decided by P when closed terms are substituted for the free variables x and y. This is the heart of the matter as we will see. Another key point about numeralwise expressibility is that although we informally interpret, for example, Prov(Sb(ru1…unZ(x1)…Z(xn))), by: ‘the formula with Gödel number r is provable if the Gödel number for the xi th numeral is substituted in place of the i th variable,’ neither the formal statement within the theory P nor anything we prove about it appeals to such meanings. On the contrary Prov(Sb(ru1…unZ(x1)…Z(xn))), is a meaningless string of logical and arithmetical symbols. As Gödel puts it in his introduction to his theorem V, ‘The fact that can be formulated vaguely by saying that every recursive relation is definable in the system P (if the usual meaning is given to the formulas of this system) is expressed in precise language, without reference to any interpretation of the formulas of P, by the following Theorem (V) (Gödel 1986, p. 171, italics Gödel’s). Gödel in his incompleteness theorems uses a method given in what is called nowadays Gödel’s Fixed Point Theorem. Although Gödel constructs a fixed point in the course of proving the incompleteness theorem, he does not state the fixed point theorem explicitly. The fixed point theorem is as follows: Theorem 2 (Gödel’s Fixed Point Theorem) If φ(v0) is a formula of number theory, then there is a sentence ψ such that P ⊢ ψ ↔ φ(⌈ψ⌉), where ⌈ψ⌉ is the formal term corresponding to the natural number code of ⌈ψ⌉. Proof: Let σ(x,y,z) be a formula that numeralwise expresses the number theoretic predicate ‘y is the Gödel number of the formula obtained by replacing the variable v0 in the formula whose Gödel number is x by the term z’. Let θ(v0) be the formula ∃v1(φ(v1) ∧ σ(v0, v1, v0)). Let k = ⌈θ(v0)⌉ and ψ = θ(k). Now directly by the construction P ⊢ ψ ↔ φ(⌈ψ⌉). A sentence is refutable from a theory if its negation is provable. The First Incompleteness Theorem as Gödel stated it is as follows: Theorem 3 (Gödel’s First Incompleteness Theorem) If P is ω-consistent, then there is a sentence which is neither provable nor refutable from P. Proof: By judicious coding of syntax referred to above, write a formula Prf(x,y)[11] of number theory, representable in P, so that n codes a proof of φ ⇒ P ⊢ Prf(n, ⌈φ⌉). and n does not code a proof of φ ⇒ P ⊢ ¬Prf(n, ⌈φ⌉). Let Prov(y) denote the formula ∃x Prf(x,y)[12]. By Theorem 2 there is a sentence φ with the property P ⊢ (φ ↔ ¬Prov(⌈φ⌉)). Thus φ says ‘I am not provable.’ We now observe, if P ⊢ φ, then by (1) there is n such that P ⊢ Prf(n, ⌈φ⌉), hence P ⊢ Prov(⌈φ⌉), hence, by (3) P ⊢ ¬φ, so P is inconsistent. Thus P ⊬ φ Furthermore, by (4) and (2), we have P ⊢ ¬Prf(n, ⌈φ⌉) for all natural numbers n. By ω-consistency P ⊬ ∃x Prf(x, ⌈φ⌉). Thus (3) gives P ⊬ ¬φ. We have shown that if P is ω-consistent, then φ is independent of P. On concluding the proof of the first theorem, Gödel remarks, “we can readily see that the proof just given is constructive; that is … proved in an intuitionistically unobjectionable manner…” (Gödel 1986, p. 177). This is because, as he points out, all the existential statements are based on his theorem V (giving the numeralwise expressibility of primitive recursive relations), which is intuitionistically unobjectionable. Chunk #4 Source Metadata: {'id': 'doc:c39d409b58e3444691ad91c34b5c3ee1:5eda575b170248b5a6f1cc45dc318dab', 'article_h1_main': 'Kurt Gödel', 'article_h2_subsection': '2. Gödel’s Mathematical Work', 'article_h3_subsection': '2.2 The Incompleteness Theorems', 'article_h4_subsection': None} Source Text: Gödel mentioned the possibility of the unsolvability of a question about the reals already in his 1929 thesis, in arguing against the formalist principle of Hilbert’s, that consistency is a criterion for existence. In fact, giving a finitary proof of the consistency of analysis was a key desideratum of what was then known as the Hilbert program, along with proving its completeness. Accordingly it was Gödel’s turn to these questions, especially the first, which led him to the two incompleteness theorems. (For a discussion of the Hilbert Program the reader is referred to the standard references: Sieg 1990, 1988, 1999; Mancosu 1998, Zach 2003, Tait 1981 and Tait 2002.) The First Incompleteness Theorem provides a counterexample to completeness by exhibiting an arithmetic statement which is neither provable nor refutable in Peano arithmetic, though true in the standard model. The Second Incompleteness Theorem shows that the consistency of arithmetic cannot be proved in arithmetic itself. Thus Gödel’s theorems demonstrated the infeasibility of the Hilbert program, if it is to be characterized by those particular desiderata, consistency and completeness. As an aside, von Neumann understood the two theorems this way, even before Gödel did. In fact von Neumann went much further in taking the view that they showed the infeasibility of classical mathematics altogether. As he wrote to Carnap in June of 1931: Thus today I am of the opinion that 1. Gödel has shown the unrealizability of Hilbert’s program. 2. There is no more reason to reject intuitionism (if one disregards the aesthetic issue, which in practice will also for me be the decisive factor). Therefore I consider the state of the foundational discussion in Königsberg to be outdated, for Gödel’s fundamental discoveries have brought the question to a completely different level.[9] And the previous fall von Neumann had written to Gödel in even stronger terms: Thus, I think that your result has solved negatively the foundational question: there is no rigorous justification for classical mathematics. (Gödel 2003b, p. 339) It would take Gödel himself a few years to see that those aspects of the Hilbert Program had been decisively refuted by his results (Mancosu 2004). RESULT: Gödel's First Incompleteness Theorem states that in any consistent formal system that is sufficiently expressive to include at least arithmetic, there exists a statement that cannot be proven true or false within the system itself. The proof of this theorem involves a technique known as Gödel numbering, which assigns unique natural numbers to each symbol, formula, or sequence of formulas in the system. This allows the system to "refer to itself" in a sense. Gödel then constructs a specific statement (often informally described as "This statement cannot be proven") within the system. If this statement could be proven true within the system, it would be false, creating a contradiction. If it could be proven false, that would mean it could be proven, which would make it true, again a contradiction. Therefore, this statement can neither be proven true nor false within the system, demonstrating the system's incompleteness. It's important to note that the system must be consistent (it cannot produce contradictory results) and sufficiently expressive (capable of expressing at least basic arithmetic) for the theorem to apply.
Lo! The retriever now successfully retrieves the full proof of the right theorem—which was previously decapitated and lost as context—and even though 2 of the 4 chunks it retrieves do not contain relevant context, the final answer is significantly more clear, detailed, and precise because the relevant chunks contain complete, semantically-coherent subsections of the original source document.
Q+A with Structured Chunking and Self-querying¶
Now it's time to implement self-querying. It should be noted that there's a good deal of uncharted territory in the area of query construction, and the following approach should be taken in the spirit of exploration rather than as be-all-end-all advice for working with self-querying retrievers.
The first step is to create descriptions for the document and each of the metadata fields. This can be done programmatically with a combination of hard-coded and LLM-generated text. The (LLM-powered) query constructor will use these descriptions as context to generate filters specific to the vectorstore, so it's helpful to specify valid comparators and operators within the field descriptions.
document_content_description = "a section of an encyclopedia article about a philosopher"
def generate_metadata_desc(field_name: str, header_info: List[tuple], qa_retriever: RetrievalQA) -> str:
query = f"""
given a list of tuples indicating all possible metadata fields for this article
provide a very brief description (15 words or less) of the specified field
including its position in the hierarchy of headers (h1, h2, h3, etc)
all fields {header_info}
specified field: {field_name}
"""
return qa_retriever({"query": query})["result"]
def build_metadata_field_info(schema: Dict, header_info: List[tuple], qa_retriever: RetrievalQA) -> List[AttributeInfo]:
filter_instructions = \
"ALWAYS filter with one or more CONTAINS comparators, and use the OR operator to check ALL other fields."\
"if the value of this field contains a word or phrase that is very similar to a word or phrase in the query, " \
"filter for the exact string from the value rather than the query."
filter_h1 = \
"this field contains the name of a philosopher. "\
"if the query contains a misspelling or an alternate spelling of this name, "\
"filter for the value of this field rather than the name in the query."\
"the h1-level filter should ALWAYS be combined with subsection filters using an AND operator. \n"
filter_exclusion = \
" NEVER filter this field on the value of 'article_h1_main'."
attr_info_list = []
for field in schema["text"]:
desc = generate_metadata_desc(field["name"], header_info, qa_retriever) + filter_instructions
if field["name"] == "article_h1_main":
new_attr = AttributeInfo(
name=field["name"],
description = filter_h1 + desc,
type="string"
)
else:
new_attr = AttributeInfo(
name=field["name"],
description = desc + filter_exclusion,
type="string"
)
attr_info_list.append(new_attr)
return attr_info_list
metadata_field_info = build_metadata_field_info(index_schema, headers_to_split_on, retriever)
Next, we will compose some few-shot examples for the query constructor. These will be specific to our data. One important instruction to reinforce here is that, for the purpose of constructing filters that match on strings, query strings that contain misspellings, alternate spellings, imprecise phrasing, etc. should be considered equivalent to "canonical" strings that exist in the metadata. So, for example, if the query refers to "goodle's continuum theory," then the constructor should know to filter the h1 header text on "Kurt Gödel" and the other metadata values on "Continuum Hypothesis" because these exact strings exist in the metadata and can be regarded as equivalent to "goodle" and "continuum theory."
# example queries contain misspellings / imprecise phrases
# example filters use the spelling / phrasing from the metadata values rather than the query
examples = [
(
"what does russell say about descriptors?",
{"query": "russell, descriptors",
"filter": 'and(contain("article_h1_main", "Bertrand Russell"), or(contain("article_h2_subsection", "descriptions"), '
'contain("article_h3_subsection", "descriptions"), contain("article_h4_subsection", "descriptions")))'}
),
(
"explain leibniz's idea of sufficient reason.",
{"query": "leibniz, idea of sufficient reason",
"filter": 'and(contain("article_h1_main", "Gottfried Wilhelm Leibniz"), or(contain("article_h2_subsection", "Principle of Sufficient Reason"), '
'contain("article_h3_subsection", "Principle of Sufficient Reason"), contain("article_h4_subsection", "Principle of Sufficient Reason")))'}
),
(
"what was goodel's continuum theory?",
{"query": "goodel, continuum theory",
"filter": 'and(contain("article_h1_main", "Kurt Gödel"), or(contain("article_h2_subsection", "Continuum Hypothesis"), '
'contain("article_h3_subsection", "Continuum Hypothesis"), contain("article_h4_subsection", "Continuum Hypothesis")))'}
),
]
Now let's take a look at the full prompt that will be built for the query constructor:
prompt = get_query_constructor_prompt(
document_content_description,
metadata_field_info,
examples=examples
)
print(prompt.format(query="{query}"))
Your goal is to structure the user's query to match the request schema provided below. << Structured Request Schema >> When responding use a markdown code snippet with a JSON object formatted in the following schema: ```json { "query": string \ text string to compare to document contents "filter": string \ logical condition statement for filtering documents } ``` The query string should contain only text that is expected to match the contents of documents. Any conditions in the filter should not be mentioned in the query as well. A logical condition statement is composed of one or more comparison and logical operation statements. A comparison statement takes the form: `comp(attr, val)`: - `comp` (eq | ne | gt | gte | lt | lte | contain | like | in | nin): comparator - `attr` (string): name of attribute to apply the comparison to - `val` (string): is the comparison value A logical operation statement takes the form `op(statement1, statement2, ...)`: - `op` (and | or | not): logical operator - `statement1`, `statement2`, ... (comparison statements or logical operation statements): one or more statements to apply the operation to Make sure that you only use the comparators and logical operators listed above and no others. Make sure that filters only refer to attributes that exist in the data source. Make sure that filters only use the attributed names with its function names if there are functions applied on them. Make sure that filters only use format `YYYY-MM-DD` when handling timestamp data typed values. Make sure that filters take into account the descriptions of attributes and only make comparisons that are feasible given the type of data being stored. Make sure that filters are only used as needed. If there are no filters that should be applied return "NO_FILTER" for the filter value. << Data Source >> ```json { "content": "a section of an encyclopedia article about a philosopher", "attributes": { "article_h1_main": { "description": "this field contains the name of a philosopher. if the query contains a misspelling or an alternate spelling of this name, filter for the value of this field rather than the name in the query.the h1-level filter should ALWAYS be combined with subsection filters using an AND operator. \nThe 'article_h1_main' is the main title of the article, positioned at the highest level in the header hierarchy.ALWAYS filter with one or more CONTAINS comparators, and use the OR operator to check ALL other fields.if the value of this field contains a word or phrase that is very similar to a word or phrase in the query, filter for the exact string from the value rather than the query.", "type": "string" }, "article_h2_subsection": { "description": "The 'article_h2_subsection' field represents the secondary headers or subsections within the main article content.ALWAYS filter with one or more CONTAINS comparators, and use the OR operator to check ALL other fields.if the value of this field contains a word or phrase that is very similar to a word or phrase in the query, filter for the exact string from the value rather than the query. NEVER filter this field on the value of 'article_h1_main'.", "type": "string" }, "article_h3_subsection": { "description": "The 'article_h3_subsection' field represents the third level headers in the article, under the main and secondary headers.ALWAYS filter with one or more CONTAINS comparators, and use the OR operator to check ALL other fields.if the value of this field contains a word or phrase that is very similar to a word or phrase in the query, filter for the exact string from the value rather than the query. NEVER filter this field on the value of 'article_h1_main'.", "type": "string" }, "article_h4_subsection": { "description": "The 'article_h4_subsection' field represents the fourth level headers in the article, under h1, h2, and h3 headers.ALWAYS filter with one or more CONTAINS comparators, and use the OR operator to check ALL other fields.if the value of this field contains a word or phrase that is very similar to a word or phrase in the query, filter for the exact string from the value rather than the query. NEVER filter this field on the value of 'article_h1_main'.", "type": "string" } } } ``` << Example 1. >> User Query: what does russell say about descriptors? Structured Request: ```json { "query": "russell, descriptors", "filter": "and(contain(\"article_h1_main\", \"Bertrand Russell\"), or(contain(\"article_h2_subsection\", \"descriptions\"), contain(\"article_h3_subsection\", \"descriptions\"), contain(\"article_h4_subsection\", \"descriptions\")))" } ``` << Example 2. >> User Query: explain leibniz's idea of sufficient reason. Structured Request: ```json { "query": "leibniz, idea of sufficient reason", "filter": "and(contain(\"article_h1_main\", \"Gottfried Wilhelm Leibniz\"), or(contain(\"article_h2_subsection\", \"Principle of Sufficient Reason\"), contain(\"article_h3_subsection\", \"Principle of Sufficient Reason\"), contain(\"article_h4_subsection\", \"Principle of Sufficient Reason\")))" } ``` << Example 3. >> User Query: what was goodel's continuum theory? Structured Request: ```json { "query": "goodel, continuum theory", "filter": "and(contain(\"article_h1_main\", \"Kurt G\u00f6del\"), or(contain(\"article_h2_subsection\", \"Continuum Hypothesis\"), contain(\"article_h3_subsection\", \"Continuum Hypothesis\"), contain(\"article_h4_subsection\", \"Continuum Hypothesis\")))" } ``` << Example 4. >> User Query: {query} Structured Request:
The query constructor can now be built as a runnable chain. Mindful of the relatively high token count of our prompt, we can use gpt-3.5-turbo (rather than gpt-4) for the model in this component.
(Note that, depending on the data, gpt-3.5-turbo may require additional examples to construct queries as reliably as gpt-4 and to handle edge cases.)
chain = load_query_constructor_runnable(
llm=ChatOpenAI(
model="gpt-3.5-turbo",
temperature=0
),
attribute_info=metadata_field_info,
document_contents=document_content_description,
examples=examples,
fix_invalid=True
)
To check that the constructor will correctly build filters for strings/substrings that match metadata values rather than the original query, lets try alternate spellings of equivalent words:
query = "explain goedel's 1st incompleteness theory"
chain.invoke(({"query": query}))
StructuredQuery(query='goedel, 1st incompleteness theory', filter=Operation(operator=<Operator.AND: 'and'>, arguments=[Comparison(comparator=<Comparator.CONTAIN: 'contain'>, attribute='article_h1_main', value='Kurt Gödel'), Operation(operator=<Operator.OR: 'or'>, arguments=[Comparison(comparator=<Comparator.CONTAIN: 'contain'>, attribute='article_h2_subsection', value='First Incompleteness Theorem'), Comparison(comparator=<Comparator.CONTAIN: 'contain'>, attribute='article_h3_subsection', value='First Incompleteness Theorem'), Comparison(comparator=<Comparator.CONTAIN: 'contain'>, attribute='article_h4_subsection', value='First Incompleteness Theorem')])]), limit=None)
Success! It appears that the query constructor chain can output valid, well-formed queries to execute on our vector database. This query should return documents where the value for the metadata field "article_h1_main" contains "Kurt Gödel" and the value for any other metadata field contains "First Incompleteness Theorem."
Now we can put everything together into a single Q+A retrieval chain.
sq_retriever = SelfQueryRetriever(
query_constructor=chain,
vectorstore=vectorstore,
verbose=True,
)
sq_qa = RetrievalQA.from_chain_type(
llm=llm,
chain_type="stuff",
retriever=sq_retriever,
return_source_documents=True
)
response = sq_qa({"query": query})
print_result(response)
SOURCES: Chunk #1 Source Metadata: {'id': 'doc:c39d409b58e3444691ad91c34b5c3ee1:6c1de40a549245b7a6c3d56b7dda8da4', 'article_h1_main': 'Kurt Gödel', 'article_h2_subsection': '2. Gödel’s Mathematical Work', 'article_h3_subsection': '2.2 The Incompleteness Theorems', 'article_h4_subsection': '2.2.1 The First Incompleteness Theorem'} Source Text: In his Logical Journey (Wang 1996) Hao Wang published the full text of material Gödel had written (at Wang’s request) about his discovery of the incompleteness theorems. This material had formed the basis of Wang’s “Some Facts about Kurt Gödel,” and was read and approved by Gödel: In the summer of 1930 I began to study the consistency problem of classical analysis. It is mysterious why Hilbert wanted to prove directly the consistency of analysis by finitary methods. I saw two distinguishable problems: to prove the consistency of number theory by finitary number theory and to prove the consistency of analysis by number theory … Since the domain of finitary number theory was not well-defined, I began by tackling the second half… I represented real numbers by predicates in number theory… and found that I had to use the concept of truth (for number theory) to verify the axioms of analysis. By an enumeration of symbols, sentences and proofs within the given system, I quickly discovered that the concept of arithmetic truth cannot be defined in arithmetic. If it were possible to define truth in the system itself, we would have something like the liar paradox, showing the system to be inconsistent… Note that this argument can be formalized to show the existence of undecidable propositions without giving any individual instances. (If there were no undecidable propositions, all (and only) true propositions would be provable within the system. But then we would have a contradiction.)… In contrast to truth, provability in a given formal system is an explicit combinatorial property of certain sentences of the system, which is formally specifiable by suitable elementary means… We see that Gödel first tried to reduce the consistency problem for analysis to that of arithmetic. This seemed to require a truth definition for arithmetic, which in turn led to paradoxes, such as the Liar paradox (“This sentence is false”) and Berry’s paradox (“The least number not defined by an expression consisting of just fourteen English words”). Gödel then noticed that such paradoxes would not necessarily arise if truth were replaced by provability. But this means that arithmetic truth and arithmetic provability are not co-extensive — whence the First Incompleteness Theorem. This account of Gödel’s discovery was told to Hao Wang very much after the fact; but in Gödel’s contemporary correspondence with Bernays and Zermelo, essentially the same description of his path to the theorems is given. (See Gödel 2003a and Gödel 2003b respectively.) From those accounts we see that the undefinability of truth in arithmetic, a result credited to Tarski, was likely obtained in some form by Gödel by 1931. But he neither publicized nor published the result; the biases logicians had expressed at the time concerning the notion of truth, biases which came vehemently to the fore when Tarski announced his results on the undefinability of truth in formal systems 1935, may have served as a deterrent to Gödel’s publication of that theorem. Chunk #2 Source Metadata: {'id': 'doc:c39d409b58e3444691ad91c34b5c3ee1:65d27ea51ab0470e82b59678ccf471ce', 'article_h1_main': 'Kurt Gödel', 'article_h2_subsection': '2. Gödel’s Mathematical Work', 'article_h3_subsection': '2.2 The Incompleteness Theorems', 'article_h4_subsection': '2.2.2 The proof of the First Incompleteness Theorem'} Source Text: We now describe the proof of the two theorems, formulating Gödel’s results in Peano arithmetic. Gödel himself used a system related to that defined in Principia Mathematica, but containing Peano arithmetic. In our presentation of the First and Second Incompleteness Theorems we refer to Peano arithmetic as P, following Gödel’s notation. Before proceeding to the details of the formal proof, we define the notion of ω-consistency used by Gödel in the First Incompleteness Theorem: P is ω-consistent if P ⊢ ¬φ(n) for all n implies P ⊬ ∃xφ(x). Naturally this implies consistency and follows from the assumption that the natural numbers satisfy the axioms of Peano arithmetic. One of the main technical tools used in the proof is Gödel numbering, a mechanism which assigns natural numbers to terms and formulas of our formal theory P. There are different ways of doing this. The most common is based on the unique representation of natural numbers as products of powers of primes. Each symbol s of number theory is assigned a positive natural number #(s) in a fixed but arbitrary way, e.g. #(0) = 1 #(=) = 5 #(¬) = 9 #(1) = 2 #( ( ) = 6 #(∀) = 10 #(+) = 3 #( ) ) = 7 #(vi) = 11 + i #(×) = 4 #(∧) = 8 The natural number corresponding to a sequence w = < w0,…, wk > of symbols is ⌈w⌉ = 2#(w0) · 3#(w1) · … · pk#(wk), where pk is the k+1st prime. It is called its Gödel number and denoted by ⌈w⌉. In this way we can assign Gödel numbers to formulas, sequences of formulas (once a method for distinguishing when one formula ends and another begins has been adopted), and most notably, proofs. An essential point here is that when a formula is construed as a natural number, then the numeral corresponding to that natural number can occur as the argument of a formula, thus enabling the syntax to “refer” to itself, so to speak (i.e., when a numeral is substituted into a formula the Gödel number of which the numeral represents). This will eventually allow Gödel to formalize the Liar paradox (with “provability” in place of “truth”) by substituting into the formula which says, ‘the formula, whose code is x, is unprovable,’ its own natural number code (or more precisely the corresponding numeral). Another concept required to carry out the formalization is the concept of numeralwise expressibility of number theoretic predicates. A number-theoretic formula φ(n1, …, nk) is numeralwise expressible in P if for each tuple of natural numbers (n1, …, nk): N ⊨ φ(n1, …, nk) ⇒ P ⊢ φ(n1, …, nk) N ⊨ ¬φ(n1, …, nk) ⇒ P ⊢ ¬φ(n1, …, nk) where n is the formal term which denotes the natural number n. (In P, this is S(S(…S(0)…), where n is the number of iterations of the successor function applied to the constant symbol 0.) One of the principal goals is to numeralwise express the predicate Prf(x, y): ‘the sequence with Gödel number x is a proof of the sentence with Gödel number y.’ Reaching this goal involves defining forty-five relations, each defined in terms of the preceding ones. These relations are all primitive recursive.[10] Relations needed are, among others, those which assert of a natural number that it codes a sequence, or a formula, or an axiom, or that it is the code, denoted by Sb(ru1…unZ(x1)…Z(xn)), of a formula obtained from a formula with code r by substituting for its free variable ui the xi th numeral for i = 1, …, n. The forty-fifth primitive recursive relation defined is Prf(x, y), and the forty-sixth is Prov(y): ‘the sentence with Gödel number y is provable in P’ which without being primitive recursive, is however obtained from Prf(x, y) by existentially quantifying x. (Prov(y) satisfies only the ‘positive’ part of numeralwise expressibility, and not the negative part; but the negative part is not needed.) In Theorem V of his paper, Gödel proves that any number theoretic predicate which is primitive recursive is numeralwise expressible in P. Thus since Prf(x, y) and substitution are primitive recursive, these are decided by P when closed terms are substituted for the free variables x and y. This is the heart of the matter as we will see. Another key point about numeralwise expressibility is that although we informally interpret, for example, Prov(Sb(ru1…unZ(x1)…Z(xn))), by: ‘the formula with Gödel number r is provable if the Gödel number for the xi th numeral is substituted in place of the i th variable,’ neither the formal statement within the theory P nor anything we prove about it appeals to such meanings. On the contrary Prov(Sb(ru1…unZ(x1)…Z(xn))), is a meaningless string of logical and arithmetical symbols. As Gödel puts it in his introduction to his theorem V, ‘The fact that can be formulated vaguely by saying that every recursive relation is definable in the system P (if the usual meaning is given to the formulas of this system) is expressed in precise language, without reference to any interpretation of the formulas of P, by the following Theorem (V) (Gödel 1986, p. 171, italics Gödel’s). Gödel in his incompleteness theorems uses a method given in what is called nowadays Gödel’s Fixed Point Theorem. Although Gödel constructs a fixed point in the course of proving the incompleteness theorem, he does not state the fixed point theorem explicitly. The fixed point theorem is as follows: Theorem 2 (Gödel’s Fixed Point Theorem) If φ(v0) is a formula of number theory, then there is a sentence ψ such that P ⊢ ψ ↔ φ(⌈ψ⌉), where ⌈ψ⌉ is the formal term corresponding to the natural number code of ⌈ψ⌉. Proof: Let σ(x,y,z) be a formula that numeralwise expresses the number theoretic predicate ‘y is the Gödel number of the formula obtained by replacing the variable v0 in the formula whose Gödel number is x by the term z’. Let θ(v0) be the formula ∃v1(φ(v1) ∧ σ(v0, v1, v0)). Let k = ⌈θ(v0)⌉ and ψ = θ(k). Now directly by the construction P ⊢ ψ ↔ φ(⌈ψ⌉). A sentence is refutable from a theory if its negation is provable. The First Incompleteness Theorem as Gödel stated it is as follows: Theorem 3 (Gödel’s First Incompleteness Theorem) If P is ω-consistent, then there is a sentence which is neither provable nor refutable from P. Proof: By judicious coding of syntax referred to above, write a formula Prf(x,y)[11] of number theory, representable in P, so that n codes a proof of φ ⇒ P ⊢ Prf(n, ⌈φ⌉). and n does not code a proof of φ ⇒ P ⊢ ¬Prf(n, ⌈φ⌉). Let Prov(y) denote the formula ∃x Prf(x,y)[12]. By Theorem 2 there is a sentence φ with the property P ⊢ (φ ↔ ¬Prov(⌈φ⌉)). Thus φ says ‘I am not provable.’ We now observe, if P ⊢ φ, then by (1) there is n such that P ⊢ Prf(n, ⌈φ⌉), hence P ⊢ Prov(⌈φ⌉), hence, by (3) P ⊢ ¬φ, so P is inconsistent. Thus P ⊬ φ Furthermore, by (4) and (2), we have P ⊢ ¬Prf(n, ⌈φ⌉) for all natural numbers n. By ω-consistency P ⊬ ∃x Prf(x, ⌈φ⌉). Thus (3) gives P ⊬ ¬φ. We have shown that if P is ω-consistent, then φ is independent of P. On concluding the proof of the first theorem, Gödel remarks, “we can readily see that the proof just given is constructive; that is … proved in an intuitionistically unobjectionable manner…” (Gödel 1986, p. 177). This is because, as he points out, all the existential statements are based on his theorem V (giving the numeralwise expressibility of primitive recursive relations), which is intuitionistically unobjectionable. RESULT: Gödel's First Incompleteness Theorem states that in any consistent formal system that is sufficiently expressive to include basic arithmetic, there are statements that cannot be proven or disproven within the system itself. In other words, there are some truths that cannot be reached by the system's rules of logic. This is because, as Gödel discovered, the concept of arithmetic truth cannot be defined in arithmetic itself. If it were possible to define truth in the system itself, it would lead to paradoxes, such as the Liar paradox, showing the system to be inconsistent. Gödel's proof of this theorem involves a technique known as Gödel numbering, which assigns natural numbers to terms and formulas of the formal system. This allows the syntax to "refer" to itself, enabling Gödel to formalize the Liar paradox by substituting into the formula which says, 'the formula, whose code is x, is unprovable,' its own natural number code. The theorem concludes that if a formal system is ω-consistent (a stronger form of consistency), then there is a sentence which is neither provable nor refutable from the system. This means that there are always statements in the system that are true, but cannot be proven within the system.
The self-querying retriever worked exactly as intended here, retrieving as context each of—and only—the most relevant subsections of the original document based on header text, i.e. "2.2.1 The First Incompleteness Theorem" and "2.2.2 The proof of the First Incompleteness Theorem"
Additional Examples¶
Generally speaking, with use cases similar to the one presented above, I have found that implementing structured chunking with header metadata for self-querying significantly increases context precision and recall, resulting in improved text generation (and often lowering token usage for inputs).
Here are a few more examples (executed with the same code as above):
Example Query #1: "explain husserl's phenomenological reduction."¶
Data source: https://plato.stanford.edu/entries/husserl/
Text generated with unstructured chunking:
Husserl's phenomenological reduction, also known as "eidetic reduction", is a method used in his phenomenological philosophy. This method involves focusing on the essential structures that allow objects, which are naively taken for granted in the "natural attitude" of everyday life and ordinary science, to "constitute themselves" in consciousness. It involves an unfolding of abstract features shared by appropriate sets of fictitious or real-life examples, often through free imaginative variation on an arbitrarily chosen initial example. This perspective on the realm of intentional consciousness is supposed to enable the phenomenologist to develop a radically unprejudiced justification of his or her basic views on the world and explore their rational interconnections.
Text generated with structured chunking:
Husserl's phenomenological reduction, also known as the epoché, is a methodological procedure he developed to study the structures of consciousness. The aim of this method is to bracket or set aside all assumptions about the existence of the external world, so that the focus can be solely on the pure phenomena of consciousness. This means suspending all judgments about what exists in order to examine the content and structure of our thoughts, perceptions, and experiences without any preconceived notions.
The phenomenological reduction involves two main steps. The first is the epoché, which is the suspension of all beliefs about the external world. The second is the transcendental reduction, which involves turning one's attention inward to examine the structures of consciousness itself.
The phenomenological reduction is intended to reveal the essential structures of consciousness, or the ways in which we experience and understand the world. It is a method for studying the subjective aspects of experience, or the ways in which things appear to us in our conscious experience.
Husserl also developed the concept of the "noema," which refers to the content of our conscious experiences, and the "noesis," which refers to the act of consciousness itself. These concepts are central to his phenomenological analysis of consciousness.
In summary, Husserl's phenomenological reduction is a method for studying the structures of consciousness by bracketing all assumptions about the external world and focusing solely on the content and structure of our conscious experiences.
Text generated with structured chunking and self-querying:
Husserl's phenomenological reduction is a method used in phenomenology, a philosophical movement that he founded. The reduction involves a process called "epoché," where the phenomenologist "brackets" or sets aside their beliefs about the existence of the physical world, in order to focus purely on their direct experiences and perceptions. This is done to ensure that the item or experience being described is done so exactly as it is experienced, without any preconceived notions or biases.
The phenomenological reduction is used to make coherent sense of the transcendence of objective reality, in terms of the essential horizon-structure of consciousness. The most global form of epoché is employed when this reality in total is bracketed. However, the temporal flow of one’s “present” experience, constituted by current retentions and original impressions, cannot be meaningfully doubted or bracketed.
This process allows the phenomenologist to delve into the deep structure of intentional consciousness, which is essentially indexical in character and consists of retentions (immediate memory of what has been perceived), original impressions (awareness of what is perceived "right now"), and protentions (immediate anticipations of what will be perceived).
Through this method, Husserl aimed to uncover the fundamental structures of consciousness and experience, free from any assumptions about the external world.
Example Query #2: "what is nietzsche's perspectivism?"¶
Data source: https://plato.stanford.edu/entries/nietzsche/
Text generated with unstructured chunking:
Nietzsche's perspectivism is a philosophical concept that suggests all ideations take place from particular perspectives. This means that there are many possible conceptual schemes, or perspectives which determine any possible judgment of truth. Nietzsche believed that past philosophers had largely ignored the influence of their own perspectives on their work, and had therefore failed to control those perspectival effects. He criticized "dogmatic" philosophers for ignoring the perspectival limitations on their theorizing. However, he also believed that the operation of perspective makes a positive contribution to our cognitive endeavors. His perspectivism goes beyond just the theory of cognition and extends to moral psychology and the nature of truth or reality.
Text generated with structured chunking:
[Without changing the model or the splitting method, this query results in an invalid OpenAI API request: since there are large document chunks and no filter, the retriever prepends more context to the query than is allowed by GPT-4's ~8K token limit.]
Text generated with sturctured chunking and self-querying:
Nietzsche's perspectivism is a philosophical concept that suggests all philosophies are the reflections of their authors' personal experiences and perspectives. Nietzsche believed that past philosophers had largely ignored the influence of their own perspectives on their work, leading to a lack of control over those perspectival effects. He proposed that knowledge should not be seen as a form of objectivity that reveals the way things really are, independently of any point of view. Instead, he suggested that "objectivity" should be approached by exploiting the difference between one perspective and another, using each to overcome the limitations of others. This means that theoretical claims should not only be analyzed from the point of view of truth, but also diagnosed as symptoms and traced back to the complex configurations of drive and affect from the point of view of which they make sense. Nietzsche's perspectivism thus connects to his "genealogical" program of criticizing philosophical theories by exposing the psychological needs they satisfy.
Example Query #3: "what is derridian deconstruction?"¶
Data source: https://plato.stanford.edu/entries/derrida/
Text generated with unstructured chunking:
Derridian deconstruction, founded by Jacques Derrida, is a way of criticizing not only literary and philosophical texts but also political institutions. It involves re-conceiving the difference that divides self-consciousness and attempts to render justice. It is a kind of thinking that never finds itself at the end and is always engaged in making justice possible in countless ways. The foundation of this deconstruction is not a unified self but a divisible limit between oneself and oneself as an other.
Test generated with structured chunking:
Derridian deconstruction is a method of critical analysis developed by French philosopher Jacques Derrida. It involves the examination of the underlying assumptions and contradictions in a text or concept, often revealing inherent hierarchies or oppositions. Derrida's deconstruction is based on the idea that all systems of thought have an internal logic that can be uncovered and questioned.
In Derrida's view, deconstruction consists of two phases. The first phase involves reversing the hierarchical oppositions in a text or concept, such as essence and appearance, or good and evil. The second phase involves redefining the previously inferior term as the origin or resource of the opposition itself.
Derrida's deconstruction also emphasizes the impossibility of pure presence or self-identity, arguing that every experience or concept is marked by difference and deferral, a concept he refers to as "différance". This means that meaning is never fixed or stable, but always in flux and subject to reinterpretation.
Furthermore, Derrida's deconstruction is deeply concerned with issues of justice, arguing that justice is always to come in the future and is never fully present or achievable. This relentless pursuit of justice, despite its impossibility, is a key aspect of Derrida's deconstructive approach.
Text generated with structured chunking and self-querying:
Derridian deconstruction is a philosophical approach developed by Jacques Derrida that challenges the traditional assumptions of Western philosophy. It involves the critique of binary oppositions, such as essence/appearance, inside/outside, and speech/writing, which are seen as fundamental to Western thought. Derrida argues that these oppositions are not truly separate and hierarchical as traditionally thought, but are interdependent and have a complex, fluid relationship.
Derrida's deconstruction involves two phases. The first phase involves reversing the traditional hierarchies of these oppositions, making the traditionally inferior term more valuable. The second phase involves redefining the previously inferior term as the origin or resource of the opposition itself.
Derrida also emphasizes the concept of "différance," a term he coined to represent the idea that meaning is always deferred, never fully present, and that difference and deferral are intertwined in the process of interpretation.
In later works, Derrida's deconstruction takes on a more political tone, focusing on concepts such as justice and law, and exploring their inherent contradictions and complexities. He argues that justice is always to come in the future, it is never present, and that a just decision is always required immediately, creating a sense of urgency and precipitousness.
Overall, Derridian deconstruction is a complex and nuanced approach that seeks to expose and challenge the underlying structures and assumptions of Western thought.
Concluding Thoughts¶
At the systems-architecture level, it is useful (following the recent CoALA paper) to think of structured chunking with self-querying as a technique for improving semantic memory utilization, one that could be adapted for more agentic applications. Though there are procedural components in play here, e.g. knowledge about specific data that is hard-coded in prompting logic, the goal of this project (and, as I understand, of RAG more generally) is to delimit external sources of knowledge in such a way that they can be recaptured in—or, in a sense, restored to—processes of language generation.
"Augmentation" is not just a manner of textual elaboration but is also a manner of preserving context, establishing origins, and maintaining authority for fragments of text-bound information. Practically speaking, this process is essential where generative AI is incorporated into educational tools and "critical" knowledge systems, e.g. for medicine and law.
But, beyond a tendency to hold in relief the longstanding academic injunction to "cite your sources," these new forms of textual transformation also carry deeper implications for language, lending new weight to the old poststructuralist cliché that (as Derrida put it) “there is no outside-text”—to the enduring belief that, no matter how we chunk it up, natural language has no natural limits.
Acknowledgements
I created the HTML Header Text Splitter tool for LangChain in collaboration with Benny Minkowsky at Presidio Vantage (http://presidiovantage.com). If you're building with it and have some feedback or want to share a use case, we'd love to hear about it!
Huge thanks to Bagatur Askaryan at LangChain for advice on improving query construction and for recent upgrades to the self-querying retriever API.