DEF: 1
Title: The Canonical Debate
Author: @CanonicalDebateLab
Comments-Summary: No comments yet
Status: Active
Type: Paper
Created: 2018-06-11
License: MIT
Replaces: 0
The Canonical Debate.
A proposal to fix the current state of online discourse through the promotion of fact-based reasoning, and the accumulation of human knowledge, brought to you by the Canonical Debate Lab, a community project of the Democracy Earth Foundation.
i. Abstract.
Thomas Jefferson once said, "The cornerstone of democracy rests on the foundation of an educated electorate." We are living in the Internet Era, where information should be more available and abundant than ever, and yet it has also been said that we are now living in an era of "Post-Truth." We at The Canonical Debate Lab believe it is time to take the power away from the hands of elected representatives, and place it finally where it belongs: in the hands of the public. However, in order for this to be successful, we must ensure that the citizens of the world have all the information they need to make the right choices for themselves, and that this information has been validated to the maximum extent possible. Although some may say that the Internet is "broken", we believe that the problem is a structural one, and thus one that can be fixed. In this white paper, we define what we believe to be the missing piece of the Internet: a canonical debate, a place online where individuals with different opinions can come together, rather than apart, to debate and discuss openly and constructively the issues that face them, to propose solutions for these problems, analyze the trade-offs, and choose the one that best suits their needs. This tool reverses many of the natural incentives of other social networks that have led to information bubbles, clickbait headlines, sensationalist journalism, and "fake news." We believe that with this tool, we can finally fulfill the promise that came with the rise of the Internet.
ii. Contents.
This document describes our vision for online debates and decision making in several parts, first from a conceptual and philosophical level, followed by a detailed description of our proposed solution for this vision:
- Problems: Provides an overview of the current state of debate and discourse online, and where we see a need for improvement.
- Principles: The fundamental principles upon which our solution is built.
- Solution: Our proposal for a new platform upon which intelligent decisions can be constructed in collaboration with one another.
1. Problems.
Once upon a time, there was the optimistic belief that the Internet would bring about an era of enlightenment, peace and unprecedented productivity. The technological and logistical advances that came with this new era permit access to information in a capacity never before experienced by humans, and modes of instantaneous and asynchronous communication between people that were science fiction only decades before. These structural changes were supposed to eradicate the problem of human ignorance, and bring people around the world together in higher levels of mutual understanding and cooperation.
So, what happened? More than 20 years after the dawn of the World Wide Web, the Oxford Dictionary declared "post-truth" its word of the year. That year also heralded a major shift in world politics towards nationalistic, isolationist trends, and it became common practice to categorize information one considers displeasing as "fake news".
The state of debate and collective deliberation, both online and off, appears to have taken a major step backwards since the dawn of the Internet. We of the Canonical Debate Lab believe, however, that all is not lost. We believe that the problem is structural in nature, and therefore can be solved with a structural fix.
1.1 The Problem With Debate Online.
The state of debate as it occurs via the available suite of "online" tools is particularly disappointing. None of the technologies in use today (with the exception of email and a few others) were available twenty years ago, and yet with this incredible growth in capability, we have failed to address fundamental issues with deliberation, and created many new ones.
Debates are scattered
As in the pre-Internet era, there is no one place to look for information regarding a single topic of debate. It requires more work than the average person is willing to do to collect all the relevant information. Given the overhead involved in a task as small as verifying the facts behind even a single news article, no single person can ever hope to gain a complete understanding of our most complex and important issues.
Arguments are made in silos
Rather than responding point by point to each argument, online debate is generally executed in long form, where a single person or group offers a complete set of opinions on a single subject, or possibly multiple subjects. This format makes it nearly impossible for one side to completely address the points of the other side, and gives leeway to gloss over them with only a superficial treatment, or even ignore them completely. As such, it is difficult to ever fully resolve the case regarding even a single fact. It also enables many fallacies, such as the Strawman Fallacy and Cherry Picking.
Effort is wasted in repetition
Because debates are held in multiple places, with multiple overlapping, incomplete sets of information at varying levels of specificity, accuracy and truth, most of the effort in debate is wasted in repetition. There is currently no efficient way to "onboard" newcomers to the debate with the accumulated set of information that has been discussed. More knowledgeable participants must inevitably answer the same questions multiple times, and rebut arguments that have been made to them before, with no way to resolve a question once and for all.
Models promote polarization
Trends in the polarization of political opinion have been traced back in large degree to current models of online discussion and information sources, which increase engagement by way of learning user preferences, and giving them more of the same. The result is what's known as information bubbles, or echo chambers, in which people rarely come into contact with information or opinions contrary to the user's own.
This trend to some degree seems to be an unexpected negative consequence of the ability of the individual to choose between multiple sources of information, and so can be traced prior to the Internet to other events, such as the proliferation of opinionated news channels on cable TV. The Internet has certainly exacerbated the situation, given the countless information platforms now available.
Recent studies have shown how such one-sided interactions lead to increasing radicalization of opinions, as people encounter a wide variety of opinions of varying extremes, but all biased towards one side of the debate. Unfortunately, this problem appears to be an essential part of the structure and business model of most information sources online:
- In order to compete, news sites find it advantageous to carve out a niche by adopting a consistent editorial slant.
- Social media sites encourage users to improve their experience by organizing into groups of people with shared interest and opinions.
- Ad-driven business models find success in giving users more of what they want, rather than showing them something that will upset them.
- Superficial, entertaining, and sensationalist reporting generally gets more attention (and gives better financial returns) than balanced, information-rich text.
Many Internet destinations which encourage interaction with the public do so in a format that favors "trolling" or "flame bait": flippant remarks designed to insult or enrage readers, or to get a cheap laugh, without substantially advancing the conversation. Comments sections in news and video sites, as well as most unmoderated forums, are notorious for this problem, resulting in a perverse circus-like competition of comments which often cross the line to hate speech. Many news sites have chosen to turn off public comments entirely, and it is a common joke that "the comments section may be hazardous to your health."
Microblog formats, like Twitter, do better in some respects, but suffer from similar problems, though in less concentrated doses. The brief format works best for sound bites, quick jokes, or off-the-cuff opinions, rather than deep and detailed reasoning. The fully public nature of the platform, coupled with the incentive to gain "followers" and "likes", rewards phrasing and entertainment value at least as much as the ideas that are being represented. Twitter icons like Ann Coulter have mastered the technique of retrying the same comment or joke multiple times until it has reached the right pitch of phrasing, humor and controversy to "go viral". This is a necessary skill that unfortunately not many scientists, field experts or deep thinkers possess.
Research into the psychology of online trolling has found that there is a notably sadistic tendency among those who exhibit this behavior. Their intention is specifically to cause some form of emotional harm to their readers, who have no reasonable recourse other than to try to ignore such comments. A direct response only worsens the situation, giving more opportunity for the troll to insult or enrage the reader. It is unfortunate, then, that most sites have neither the resources nor the proper structure to moderate open conversation. Trolls are thus given a much louder voice than those with more moderate tendencies, leading to the misperception that the majority of people in the world have extreme and insulting opinions.
Debate is tied to reputation
When a controversial issue is discussed online in an official capacity, rather than anonymously or pseudonymously, it means the arguments being made are tied to a person or entity with a reputation to establish and maintain. This creates a bias in the arguments that can be made. In order to expand human knowledge and make the best decisions, it is necessary to confront all possibilities, no matter how controversial they may at first seem. Unfortunately, public entities reasonably fear making unpopular arguments, or alienating their supporters or readers.
Exploratory arguments are especially dangerous, in that they may be misunderstood or taken (intentionally or unintentionally) out of context. Online, it can be very difficult to establish a safe place for "blue sky" thinking, or using the brainstorming practice of gathering all possible ideas free from any evaluation so that the best ideas may then be selected. It would be easy for casual readers to mistake fanciful out-of-the-box ideas as the actual opinions of the person expressing them.
While this may appear to be an insignificant issue, there is an exaggerated danger online of reputational damage that can be very difficult to undo. Many people have lost their jobs for expressing an unpopular opinion online. The cartoonist Scott Adams was labeled a misogynist (correctly or not) for posting thought experiments on his blog, prompting him to preface many of his subsequent posts with a disclaimer. More recently, the author and podcaster Sam Harris dared to discuss with an open mind the controversial research of Charles Murray and found himself placed on a list of racist leaders. There is a very real risk in tackling controversy online, with very little chance of recovering from reputational damage.
Experts are drowned out
One of the great achievements of the internet is the extent to which it has enabled just about anyone to join in on the global conversation, to publish writings and opinions with little to no cost and effort. The internet can be compared to the revolution created by the Gutenberg printing press in terms of its impacts on communications and society.
Unfortunately, quantity is not the same thing as quality. While it is much easier to find some amount of information on just about anything, there are few mechanisms in place for selecting based on quality and depth.
Especially in the case of real-time news, it is much easier to get a sense of general popular opinion (or at least, the popular opinion within your self-selected information bubble) than it is to find the most accurate or relevant information. When experts are interviewed, the content is superficial and more opinion-rich than information-rich.
Real information is commonly buried in academic papers, and primary sources are hard to find. The format and language used by experts is generally too dense and erudite to be useful to the average person.
Debates can be "hacked"
As difficult as it may be, amid the cacophony of superficial opinions, to find a vein of expert and verified information, the problem can easily be exacerbated by those who do not want the truth to be known. Through the use of bots, fake accounts and other guerrilla marketing techniques, the online debate can be swayed by a false sense of popularity for otherwise unfounded positions.
Such were the findings of the U.S. House Intelligence Committee, which found that a Russian agency was created specifically to sow distrust and division in the U.S. via online ads. Networks of fake accounts were also used to promote falsified information that would further increase the internal strife. While politicians may label as "fake news" anything with which they do not agree, there is a real threat of sophisticated attacks on our online information systems which are intended to promote false versions of the truth.
1.2 The Problem With Debate Offline.
Not enough time
When a topic is important enough to debate with another person, there are usually multiple arguments on either side that must be considered. Each argument must be based on one or more facts that on their own merit may require justification, and so on. In-person debates, whether between an informal group of people, or formally presented and moderated in front of an audience, are too short to cover all the relevant points down to the final level of detail.
Live debates can only be an approximation of the whole picture. Take, for example, the formal structure of Oxford-style debates:
- One member of the team in support of the "motion" makes their case, to the best of their ability, in favor of the topic under discussion, within a given time limit.
- A panelist from the side opposing the motion gets their chance to make the case against the statement within the time limit given.
- This continues for each pair of additional panelists.
- There is a limited question-and-response segment in which participants, moderators and/or observers can ask questions. Each side is allowed a chance to respond.
- The two sides then give their closing statements, again within the limited time frame.
This limitation exists in formal debate, but holds for informal discussion as well. In order to not distract from the "main point", we have to pick and choose which points are worth discussing at any given moment.
Not enough memory
Humans cannot possibly hold in their minds a complete picture of all the information related to a specific debate. Even experts on a subject would be expected to have to refer to documents, studies and other primary sources of information in order to hold a comprehensive discussion on a topic of debate. When a debate is held in real time, to compensate, points must often be made on the basis of “feelings”. At best, an aggregated opinion based on previous in-depth studies can be expressed, but it must be taken on faith by the other parties, or disregarded entirely. This is yet one more reason why the act of passing on important information from one person to the next can only be partially successful.
It's usually about "winning"
If discussion were always a cooperative activity in which parties joined forces (memory and intellect) to come to the best conclusion, then it is possible that humankind would be able to overcome the limitations cited above. Unfortunately, debates are often adversarial in nature, and too often focus on which side can “win” the debate, either by making the best show of things to an external audience, or by convincing their "opponents" that they are in the right (or at least overwhelming them with better arguments). Under such conditions, truth is only a secondary concern, and it can be tempting to employ logical fallacies in the name of making a point.
Unfortunately, as common as this posture is, truth may not be the only casualty here. The adversarial approach belies a lack of empathy for other participants, and can lead to alienation between those of differing opinions, driving a wedge between those that would otherwise benefit from cooperation.
Preparation matters
Under ideal circumstances, a debate would have the best representation possible from all perspectives. Unfortunately, in addition to the limitations cited above of time and human memory, there is the problem of unevenness of knowledge and experience between individuals. A debate may be won or lost based on which side has studied the subject in more depth, and is more prepared to relate and defend their arguments. No real-time debate can be said to present a complete picture, and witnesses and participants can therefore be swayed by whichever side is better prepared to represent their half.
Eloquence matters
One final variable in the result of a debate that is immaterial to the actual substance of the subject is the way in which the arguments are made. Unfortunately, the "winner" of a debate may be decided by who is the better speaker, rather than by which side has the strongest case.
1.3 The Problem with Politics.
To be honest, politics has never really been about truth in debates.
Risk aversion
Politics is a strange form of popularity contest, in which careers are based and world-changing decisions made on the basis of getting the most votes (for a person, not an issue). This means that bold or risky opinions are discouraged, as are blue-sky brainstorms and general speculation.
Uncertainty is not allowed
Politicians are discouraged from showing uncertainty. This reduces the level of honesty in the discussion, and closes the door on external discourse.
Changing opinions is a sign of weakness
It is considered a lack of leadership, intelligence and/or backbone to change one’s opinion in public as a politician. This almost completely eliminates the purpose of constructive discourse, which is to arrive at a consensus cooperatively.
Ambiguity is rewarded
In order to appeal to a larger group, and to avoid risking the accusation of being a “waffler”, statements from politicians tend to be as abstract, unspecific and non-committal as possible, while still trying to strike an emotional chord. This kind of discourse avoids the difficult questions rather than facing them head-on.
The problems are too complex
Rational Ignorance: When it takes more effort to understand something well enough (to vote on it) than is worth it, people won’t do it.
Politics of the possible
There is a problem with the way legislation is designed and proposed. It is designed based on what the authors THINK will be passable, then goes through a round of horse-trading, in which items totally unrelated to the purpose of the legislation get slipped into the bill in order to “buy” the votes of specific representatives. In the end, decisions are made at least partially not on whether a bill is a good idea, but rather on whether it is a “passable” idea.
Decisions may be rushed: either presented and rushed to vote before all the delegates have had a chance to read it, or rushed so that there can be a vote before the session ends, before vacation, or before dynamics change (elections). Then there are the “at least we tried” bills: legislation proposed insincerely as a form of showing constituents that an attempt was made, without really trying.
Finally, most bills do not attempt to be holistic about a solution. For example, the recent FOSTA-SESTA bill was written (and passed) as a way to curb child sex trafficking, but also closed a door on some safety mechanisms for adult sex workers. While the problem was acknowledged (and stats show that these mechanisms helped reduce murders of ALL women by nearly 20% in areas where they existed), no effort was made to make regular (but illegal) sex work safer due to social taboos.
Focus on the wrong things
Politics very often focuses on the character of the candidate, rather than on the issues they support.
The real debate is hidden
We know that much of what really happens in the process of policy decisions happens behind closed doors, off the public record. Whether it’s in conversations with lobbyists, political whips, large donors, personal family influence, conversations with a pastor, or something else, much of what goes into making up the mind of a representative happens out of sight, through personal contact. Once a decision has been made, a conscious effort is made to find a way to frame it to the public in such a way that it can be “justified”. While this is actually a natural part of the process (imagine if you were a representative - wouldn’t you want feedback from family and friends?), as a side effect, it actually disempowers the remainder of a representative’s constituents.
Lack of local information
Especially for local elections and referendums, which should be more familiar to voters, there is ironically a severe lack of relevant information.
1.4 The Human Condition.
People don't know how to argue
A large percentage of the world population does not know what makes for a reasonable argument. People need to learn about logical fallacies, cognitive dissonance, etc.
People assume the worst
It appears at least that most people arguing (especially online) assume the worst of people that disagree with them. This is in part due to tribalism. But it seems to have become a social norm to assume bad intentions (or stupidity), rather than an honest difference of opinion, as the source of disagreement.
This problem could be solved if everyone were to adopt the Principle of Charity. Unfortunately, it would be hoping too much to expect this to happen any time soon.
People are lazy
Humans are generally too lazy to do the research necessary to verify if any supposed fact they read online is actually true. There is a natural bias towards believing information that supports a pre-existing opinion, and doubting one that debunks it. This leads to certain lies spreading very quickly online.
1.5 Current State of the Art.
Centralization of control
If the debate is “owned” by a single entity (in terms of who can edit the data), without complete and trustworthy transparency and history, the solution is subject to accusations of favoritism. This therefore can degrade trust in the substance of the debate itself.
Centralization of data
If the content of the debate is owned by a single party, there is a risk that the debate will one day disappear with the end of that entity, and all work will be lost (or sold!). This undermines the perceived value in participation.
Conflicts of interest
Commercial implementations (e.g. Kialo) will need to find a profitable business model. This creates the risk of selling personal data to advertisers (and political parties and beyond!).
Fake accounts
Most implementations do not provide adequate protection against “Sybil” accounts (fake accounts). Any solution that includes voting (upvotes, scoring, etc.) is therefore subject to “attacks” by networks of fake accounts attempting to sway the outcome of the debate.
Not reusable
Most solutions seen so far do not treat the debates as canonical. That is, they treat each debate as a separate, isolated entity, whose result cannot be used in future debates. This prevents knowledge from accumulating to the degree we believe possible.
Not popular
No solution so far has managed to become THE place for debate.
Insufficient curation
Many implementations lack the tools necessary to keep the debate organized, clean and productive.
Bad user experience
Debate is a VERY complex activity. It needs to be trivially easy to read and to use. Kialo is the best attempt we have seen so far. Cf. the comment regarding Argunet here: https://www.lesswrong.com/posts/dJJYgmaYYFmHoQM4L/debate-tools-an-experience-report
Overly simplified model
This is a catch-all for features otherwise not called out in this section. The solution just doesn’t offer all the features we consider necessary. (That is, they don’t solve all the concerns listed in the Problems section).
No context
Solutions generally don’t take a rigorous approach to "context" (specifically as defined in our white paper).
Inadequate scoring
The scoring system offered by the solution is non-existent or oversimplified.
No concept of generic vs. complete
No concept of “generic” claims, tied to concrete claims. That is, no way to discuss generally about a controversial topic (e.g. abortion), and then tie it to a more specific case (e.g. late-term abortions).
No tools for contemplation
Debates are generally treated as a fixed entity, with only the overall popular consensus considered. There is no way to inspect the debate specifically from the perspective of the user, or from the perspective of someone else.
Boring
Deconstructed debates, removed from individuals and avoiding incendiary speech can unfortunately be very dry. This makes it difficult to achieve mass adoption.
2. Principles.
These problems can be solved given the right solution. There have been many good attempts in the past, but so far we have not seen one successful enough to make a significant change. In part, we believe it is because each seems to lack one or more critical elements necessary to avoid the pitfalls of the problems listed above.
2.1 Guiding Principles.
We present here a list of guiding principles that we believe are necessary to follow in order to create an antidote to these problems, and to avoid the pitfalls that have come before.
Knowledge should accumulate
Respect human nature
Focus on what matters
Be trustworthy
Include everyone
It's about learning, not winning
2.2 Beliefs.
Inspire with emotion; Argue with reason
Debating brings people together
Honest disagreements are about beliefs, values and priorities
We can get closer to the truth
2.3 Design Goals.
- Create the canonical debate
- Make debates reusable
- Promote constructive discourse
- Teach people about debates
- Promote the best arguments
- Require context
- Reduce sources of bias
- Make arguments anonymous
- Prevent spam and fake accounts
- Be fully transparent
- Be fully distributed
- Make the platform last
- Eliminate conflicts of interest
- Foster an ecosystem
- Keep it organized
- Make it effortless
- Promote engagement
- Include everyone
- Respect the individual
- Illuminate different perspectives
- Bring people together
3. Solution.
Work in Progress - these are just notes taken from the Slack channel
The solution is to build the missing piece of the Internet. The problem is a structural one, but it is possible to build a fix: one that can change the way people interact. There are several precedents for this. Think of how the world was before and after each of these pieces was added:
- Wikipedia
- Text messaging
3.1 Basic Elements.
There is a general academic consensus regarding the structure of a debate, at least as it relates to argumentation in the form of a dialectic. However, to a large degree this research focuses on debates within the context of actors performing the debate, providing their arguments and counter-arguments in response. While this foundational work profoundly informs the design of our solution, it is not generally adapted for the context we are proposing: that of a canonical debate, independent of specific actors.
What follows is a description of the principal elements of such a debate, and how they relate to one another.
3.1.1 Claim.
A Claim in Gruff is a proposition: a statement which is intended to be taken as fact, or as the truth. In many cases (in fact, in the majority of cases), agreement on the validity of the statement is not universal. These differences of opinion form the foundation for debate, and the purpose of Gruff is to provide the tools for participants to work together to come to the best possible assessment of the validity of a Claim.
3.1.1.1 Examples of and Variations on Claims.
Although deceptively simple in concept, in practice there is a lot to consider (and for Gruff to support). Gruff can work with several types of Claim:
- A proposed fact: The Earth is not flat.
- A statement of subjective opinion: This reporter is a moron.
- A proposed course of action: We should dedicate all our resources towards preventing meteor impacts.
Each type of Claim invites a different binary judgment:
- Fact: true or false
- Opinion: agree or disagree
- Proposal: yes or no
3.1.1.2 Canonicality.
One of the problems stated in the first section of this white paper is that the quality of our debates is weakened by the fractured nature of current discussions. A common fact may be discussed ad nauseam online, and yet reach no conclusion as each new participant must start from a position of ignorance. A mixture of impatience and limited memory leads to each new participant receiving an abbreviated or erroneous version of the debate, upon which they must build their opinions and pass on to the next person.
In the Gruff platform, Claims are created, and their debate can result in a large number of arguments and counter-arguments. Even a simple example, such as the previous "The Earth is not flat" Claim, can generate a large quantity of proofs and counter-proofs. But that's not the end of life for a Claim: the validity of one Claim can affect the debate over another. For example, the statement "We should sail west to reach the Indies" could hang quite critically on whether or not the Earth is a sphere. Likewise, a statement about how long the sun stays up in the summer at the North Pole would depend on this.
All too commonly in a one-on-one debate, we use claims that we take for granted to prove the proposition at hand, only to discover that the other party doesn't agree with our statement. It's one thing if it's a difference of opinion, but it is a problem if the other party doesn't have the same information as you regarding your claim, or has information that you don't have. In such cases, we can either ignore this problem (in which case the counterpart will probably just ignore the claim), or divert attention from the main debate at hand to go into detail rehashing a debate over this new claim.
Debate platforms must also deal with this problem: What happens when a Claim that has already been fully debated is used in a totally separate debate? Many systems ignore this issue, and leave it up to the users to restate their Claims in each new debate, resulting in a repetition of the previous debate, often skipping some arguments, and perhaps adding new ones that were missed the first time around.
We can solve this problem by making Claims canonical. For each Claim, there can and must be only one place to debate its validity. When a Claim is used in a debate with another Claim, it would be a waste to make a new copy. Instead, that debate refers to the Claim (and all its discussions) in its canonical place so that none of the original debate is lost. Furthermore, should new arguments or information ever arise relating to the Claim, it can be added to the canonical debate, and its impact will ripple across all other debates that have relied on this Claim in some form.
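The idea above can be sketched as a tiny data-model experiment. This is purely illustrative (the names `registry` and `canonical` are our invention, not the actual Gruff API): a debate stores claim IDs rather than claim copies, so an argument attached once to the canonical Claim is visible from every debate that references it.

```python
# Illustrative sketch of canonical Claims: one record per Claim, shared by ID.
registry = {}   # claim_id -> the one canonical claim record


def canonical(claim_id, title):
    """Return the single canonical record for this Claim, creating it on first use."""
    return registry.setdefault(claim_id, {"title": title, "arguments": []})


# Two separate debates reference the same Claim by ID rather than copying it:
sailing = canonical("earth-not-flat", "The Earth is not flat")      # "We should sail west..."
polar_sun = canonical("earth-not-flat", "The Earth is not flat")    # "...sun at the North Pole"

# An argument added in one debate is immediately visible in the other,
# because both debates hold the very same canonical record.
sailing["arguments"].append("Ships disappear hull-first over the horizon")
assert polar_sun["arguments"] == ["Ships disappear hull-first over the horizon"]
```

Note the design choice: the second call to `canonical` returns the existing record instead of creating a duplicate, which is exactly the "one place to debate its validity" constraint.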
3.1.1.3 Attributes.
- Unique ID
- Title
- Description
- Links
- Related media
- Context
- Truth Score
- Arguments For/Against the Claim
- Arguments based on this Claim
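The attribute list above can be read as a record type. The sketch below is a minimal rendering of that list, assuming Python-style field names and types; it is not a committed Gruff schema.

```python
# A sketch of the Claim attributes listed above (field names and types are
# our assumptions, not a specified schema).
from dataclasses import dataclass, field


@dataclass
class Claim:
    id: str                                       # Unique ID
    title: str
    description: str = ""
    links: list = field(default_factory=list)
    media: list = field(default_factory=list)     # Related media
    context: list = field(default_factory=list)
    truth_score: float = 0.5                      # aggregate belief, in [0, 1]
    arguments: list = field(default_factory=list) # Arguments for/against the Claim
    used_in: list = field(default_factory=list)   # Arguments based on this Claim


flat = Claim(id="earth-not-flat", title="The Earth is not flat")
```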
3.1.1.4 Confidence/Belief.
Belief in a Claim operates on a sliding scale of confidence. Even though a Claim is a binary proposition, it doesn't mean we have to believe it completely, or reject it completely. And since debates are conducted as a group, the veracity score is really a measure of the overall group's belief in the statement.
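One simple way to turn individual sliding-scale beliefs into a group-level score is a plain average. This formula is our assumption for illustration only; the actual scoring model is left unspecified here.

```python
# Hypothetical aggregation: each participant reports a confidence in [0, 1],
# and the group's truth score is the mean of those confidences.


def truth_score(confidences):
    """Aggregate individual beliefs into a single group-level score."""
    if not confidences:
        return 0.5  # no opinions yet: maximally uncertain
    return sum(confidences) / len(confidences)


# Three participants: one certain it's true, one leaning true, one doubtful.
score = truth_score([1.0, 0.7, 0.2])   # roughly 0.63
```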
3.1.1.5 Complex Claims.
Multiple premises, etc.
3.1.1.6 Changing Perspectives.
Positive vs. Negative vs. Question
3.1.2 Argument.
When a Claim is used in a debate to prove or disprove another Claim, we call this an Argument. Arguments are generally made either in favor of a Claim or against it (although sometimes it can go either way - see below). There are a few different, but synonymous ways to describe which side of the debate an Argument supports:
- Argument In Favor/Argument Against
- Argument For/Argument Against
- Pro Argument/Con Argument
- Supporting Argument/Attacking Argument (or Opposing Argument)
3.1.2.1 Relevance.
While the veracity of the Claim is important on its own, what is important for an Argument is its relevance. That is, it's important to judge not only whether or not the Claim is true, but whether or not using it in the debate is meaningful.
The most common example of an Argument that is by nature irrelevant is the well-known ad hominem fallacy, in which the arguer attacks the nature of their opponent rather than the substance of their arguments. For example, the statement You have something green stuck between your teeth may be absolutely true, but that would have no consequence in a debate against the Claim The Earth is spherical.
A subtle distinction could be made between the relevance of an Argument (is it related to the topic at hand?) and its importance (how much should we care?). After some reflection, it was decided that the distinction between these two characteristics is unimportant for the purpose of this platform, and so the two are grouped together into the single attribute called Relevance.
3.1.2.2 Argument for an Argument.
As we like to say, "everything's debatable". That includes the Relevance of an Argument. One person may find it totally acceptable that the mattress is missing a spring in the corner; another may consider that the most important part of the bed. In this case, it becomes important for each side to make Arguments for or against the Argument, rather than the Claim (or Argument!) the Argument is attacking or supporting.
- Argument: The mattress is missing a spring in the corner
  - For: I need to sit on that corner to put on my shoes
    - Against: We can turn the mattress around so that the corner you sit on has a spring
      - Against: But then it will be where my head always ends up
    - Against: We can put in a new spring
What is only implicit in the example is that each of these Arguments and sub-Arguments are themselves based on a Claim (as with ALL Arguments). A more explicit version of the Arguments, with their Claims, would be as follows:
- Argument: The mattress is missing a spring in the corner
  - Claim: The mattress is missing a spring in the corner
  - Argument For Argument: I need to sit on that corner to put on my shoes
    - Claim: John Doe needs to sit on the corner of the mattress that is missing a spring in order to put on his shoes
    - Argument Against Argument: We can turn the mattress around so that the corner you sit on has a spring
      - Claim: The mattress can be turned so that the missing spring is not in the corner where John Doe needs to put on his shoes
      - Argument Against Argument: But then it will be where my head always ends up
        - Claim: If the mattress is rotated so that the missing spring is away from the place where John Doe puts on his shoes, it will place the missing spring where his head ends up at night
    - Argument Against Argument: We can put in a new spring
      - Claim: A new spring can be placed in the mattress at the corner where one is missing
3.1.2.3 Relevance vs. Truth: Strength.
While an Argument is concerned with its Relevance, its Claim is concerned with its Truth. As described in the Scoring section below, the product of the two yields the Argument's overall Strength.
3.1.2.4 Arguments That Are Pro AND Con.
TODO: What about Arguments that can be considered for OR against something. Ex: "If we don't get a new mattress, my mother can't come stay with us?" Depending on one's feelings about the person's mother, that could be a very strong argument for OR against the proposition. How should Gruff handle this?
- Put scoring on a scale of -100 to 100, rather than 0 to 100 (for example)
- Allow debaters to add the same Claim to the debate twice, once via an Argument FOR the Claim, and once via an Argument AGAINST. It would then be up to each debater to make sure they have balanced their votes on each side… (20 for one side and 80 for the other, for example)
Should we get a new mattress?
- For/Against: If we don't, my mother can't stay with us
  - For: I love my mother (Score: 100 for)
  - Against: You hate my mother (Score: 100 against/-100 for)
Should we get a new mattress?
- For: If we don't, my mother can't stay with us
  - For: I love my mother (Score: 100)
  - Against: You hate my mother (Score: 100)
- Against: If we don't, my mother can't stay with us
  - For: You hate my mother (Score: 100)
  - Against: I love my mother (Score: 100)
3.1.2.5 Attributes.
Arguments have many of the same attributes as Claims, at least in terms of displaying them to debaters. In fact, the average person may not understand the concept of separating an Argument from its Claim; when one writes out the main points in a debate, they generally provide ONLY the Argument (with its Context-relative title), and the Claim itself remains implicit.
Nevertheless, Arguments do have a few differences:
- Unique ID
- Title
- Description
- Relevance Score
- The Claim upon which the Argument is based
- Arguments For/Against the Argument
- The Claim or Argument which this Argument supports or attacks (the Target)
- Relevant links
- Media
- Context
3.1.2.6 The Debate Graph.
TODO: Claims are the nodes; Arguments are the edges
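As a rough illustration of this structure, here is a minimal Python sketch of the debate graph. The class and field names are ours, assembled from the attribute lists above; this is not an actual Gruff schema:

```python
from dataclasses import dataclass, field

# Illustrative only: Claims are the nodes of the debate graph, and Arguments
# are the edges connecting a base Claim to the Claim or Argument they target.
@dataclass
class Claim:
    id: str
    title: str
    truth_score: float = 0.5  # the group's overall belief in the Claim
    arguments: list["Argument"] = field(default_factory=list)  # for/against this Claim

@dataclass
class Argument:
    id: str
    claim: "Claim"            # the canonical Claim this Argument is based on
    target: object            # the Claim or Argument it supports or attacks
    pro: bool                 # True if it argues in favor of its target
    relevance_score: float = 0.5
    arguments: list["Argument"] = field(default_factory=list)  # debate over its relevance

    @property
    def strength(self) -> float:
        # Strength = Relevance x Truth (see the Scoring section)
        return self.relevance_score * self.claim.truth_score
```

Because every Argument holds a reference to its canonical Claim, a change in that Claim's Truth Score automatically ripples into the Strength of every Argument built upon it, across all debates.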
3.1.2.7 Common Argument Types.
Although the platform will probably not provide explicit support for the Argument types below, it is interesting to note them here and discuss how they will be represented in the system.
3.1.2.7.1 Evidence.
Supporting evidence is essentially an objective fact presented as some sort of "proof" for a Claim. Common types of evidence include:
- A photo, video or other documentation of a supposed event
- Formal documentation of a fact (e.g. a birth certificate)
- Scientific studies
- Eye-witness reports
As a side note, the authenticity of digital media can be hard to prove. As technology improves the capacity to falsify speech and even video, this poses a critical threat to honest debate. Fortunately, cryptography and publicly shared ledgers (blockchain) may provide a countermeasure: if authentic recordings can be registered automatically on a blockchain the moment they are recorded, it becomes practically impossible to alter them after the fact. Technology is always part of a cat-and-mouse game, however, so it is important to remain vigilant regarding the presentation of evidence.
3.1.2.7.2 Logical and Informal Fallacies.
The first thing one learns when studying rhetoric is the notion of a logical fallacy. Some fallacies are formal, and so, if an Argument is shown to be an example of one, it can be shown to be completely ineffective (i.e. a Relevance Score of 0). Others are informal, and may in fact maintain some validity even in the face of an accusation of being a fallacy.
Take, for example, an Appeal to Authority:
Warren Buffet, the most successful investor of all time, says that it would be unwise to invest in cryptocurrencies.
An Argument that could be used against the Strength of this Argument would be:
This is an Appeal to Authority, and doesn't help explain why one shouldn't invest.
In Gruff, every Argument must have an underlying Claim, and in this case that Claim would be:
Using an Appeal to Authority as an Argument does not help support a debate logically
Once this Claim has been presented (and debated - and this topic is, in fact, contentious) in the system, it is ready to be reused for any other debate as needed. And so it should be for the entire list of both formal and informal fallacies.
Note that deep consideration has been given to making fallacies a special category or feature within Gruff all its own. However, it became clear that not only is the basic structure of Claims and Arguments sufficient; it also permits active debate on use of the fallacies.
3.1.2.7.3 Basic Values and Principles.
TODO
3.1.3 Request for Information (RFI)
A placeholder argument requesting that someone with expertise in the field provide some concrete information (e.g. "What is the difference in relative salary between men and women for college graduates for the period from 2010 through 2017?").
It is common for people in a debate to realize that information outside the scope of their knowledge could dramatically sway their opinion on a matter, assuming it could be trusted. In a live debate, such points can only be set aside, but in the canonical debate, there is the opportunity to ask experts to fill in the blanks. Much like Quora, it should be possible for users to follow specific topics (Contexts) in order to receive notifications for RFIs in their area of expertise.
3.1.4 Debate.
Although the whole purpose of the project is to facilitate debates, a debate is not really a first-class citizen in Gruff. Debate is more of a verb than a noun. In fact, there is no object or element in the design of Gruff specifically called a "Debate". Rather, we say that everything on the platform (Claims, Arguments, Evidence) is debatable. Debating is the process of working as a group to come to a consensus, if not on the truth of a Claim, at least on the principal points of difference, the beliefs and the set of values that each side must know and balance.
3.2 Supporting Elements.
3.2.1 Context.
Many logical fallacies are made during the course of debates just on the basis of using evidence, statistics or supposed facts from one context and applying them within the very different context of the debate. Another common problem is cherry-picking a study from one very specific place, and obscuring or removing its context in an attempt to make a much more general statement. It is also very common to witness a disconnect between the two sides of a debate as one side focuses on one context and the other focuses on another.
The Gruff platform resolves this problem through the explicit declaration of Context elements for each Claim. These elements to some degree can be thought of as the dictionary definition of each of the words used in the Claim itself. For example, in the argument: John Doe hates my mother, the Context could be declared as follows:
John Doe (Context: Jonathan Randall Joe, SSN: 112-12-0234, USA) hates my mother (Janathan Rosalyn Blow, SSN: 052-33-0101, USA)
This Claim has been declared with two Contexts which, presumably, avoid confusion about which individuals we are discussing. However, this still leaves some wiggle room for an Argument of the type:
No I (Context: Jonathan Randall Joe, SSN: 112-12-0234, USA) do not. I merely despise her (Janathan Rosalyn Blow, SSN: 052-33-0101, USA).
Thus, it would be desirable to provide Context definitions even for verbs, although that may prove difficult to achieve.
3.2.1.1 Knowledge Graphs.
It would be a phenomenal task to attempt to build a database of all the possible Contexts that could be required for public debates. Fortunately, there are some incredible organizations that have already shouldered this burden. These include:
Any of these online databases can be used as the source for a Context element, which includes an external link to the specific definition.
However, if none of these sources provides the required reference, it should be possible to define one, either within the system itself, or via some other external source.
3.2.1.2 Context to Refine Debates.
One very common problem in debates is the tendency to generalize the statistics in one context to a much larger context, without any clear indication that it is reasonable to do so. This is known as the Hasty Generalization fallacy, and is only one example of fallacies due to loose or careless use of Context. One of the benefits of making Context explicit is the ability to identify when this is the case, and correct for it.
Consider the following Claim:
Immigrants are responsible for higher rates of crime.
This is an extremely general statement. There should therefore be a much greater burden to prove such a statement than, for example, what the original author of the Claim intended to say:
Undocumented immigrants in the United States in the decades between 2000 and 2018 have been the principal cause of an increase in violent crime in the cities where they are most present.
This would be a relatively easier Claim to prove, and perhaps a better place to start for a much more complex debate. However, it's easy to envision breakdowns of this one Claim by city, and perhaps even by neighborhood, year, or population, should it interest the debating public.
3.2.1.3 Context and Arguments.
This section began by discussing the relationship between Context and Claims, and yet the first examples were really relating Contexts to Arguments. This was a bit misleading, as Arguments do not themselves possess any Context. Arguments merely provide a link between two separate Claims, each of which has its own set of Contexts. Therefore, the Context of an Argument is actually the Context of its underlying Claim (which is being used within the Context of the target Claim or Argument).
3.2.1.4 Context and Relevance.
Consider then the following Claim:
Study X conducted by the University of Lalaland shows that communities in the U.S. with a high population of Nambian refugees had unusually high rates of violent crime between 2014 and 2017.
Should this Claim be used as an Argument in the prior debate regarding crime rates, a few things should become evident regarding its Context:
- It refers to refugees, rather than undocumented immigrants
- The study covers only a portion of the time period being debated in the Claim
In the current iteration of Gruff, no attempt is being made to automate calculations of relevance based on Context; however, Context may be used to flag to users when there is no match whatsoever, or to recommend Claims that seem to have Contexts that match the target Claim.
3.2.1.5 Autocomplete, Duplicate Argument Reduction, etc.
3.2.2 User.
3.2.2.1 Debater.
3.2.2.2 Curator.
3.2.3 Reference (Link).
- paper_agora Slack channel
3.3 Scoring.
It's very important to collect all the Arguments and information available related to a debate, and make them available to everyone interested. However, some Arguments are better than others, and in keeping with the principle of reducing friction, there should be a way of separating the wheat from the chaff. In fact, many of our Principles can be supported through a process of numerically evaluating and weighing the different Arguments. To this end, Gruff supports several different formulas for rating, or "Scoring", Claims and Arguments.
3.3.1 Purpose.
Scoring serves the following purposes:
- Sort Arguments in terms of their effectiveness
- Reveal popular opinions on debates
- Allow users to view debates through the lens of their own beliefs
- Allow users to view the same debates through the lens of other users (anonymously)
- Reveal how the validity (or lack thereof) of underlying Claims can strengthen or weaken Arguments
- Reveal inconsistencies in the beliefs of a user
- Reveal hidden biases
3.3.2 Criteria.
Before designing a scoring method, it's important to reflect on what characteristics we want in a scoring algorithm. Certainly, the method should help us achieve each of the goals listed in the section above. No one algorithm (at least so far) is perfectly suited to every objective, and so Gruff supports several different approaches, as described below. Multiple equations were tried and discarded, and others were retained.
The following guidelines were used in the selection process. None of the methods satisfies all of them in every case.
The result of just one Argument shouldn’t be better than the Strength of that argument
It would be misleading, for example, to award a score of 100% to a Claim that is supported by only one Argument with a Strength Score of 30% (because it's either possibly untrue, or largely irrelevant).
Adding another Argument on a side should increase the weight on that side
If a Claim is supported by an Argument with nearly 100% Strength, and a second, weaker Argument is provided to prop up the first, it wouldn't make sense to take an average (for example), which would reduce the overall support of the Claim.
As a corollary to this, increasing the score of one of the Arguments should always increase the weight on that side, or at least not be harmful to its case.
A single Argument with a Strength of 100% and no opposition should have a result of 100%
And, correspondingly, a 100% effective counter-argument should result in a score of 0%, assuming there are no supporting Arguments. A good example of this is a piece of irrefutable evidence: just one good piece of evidence should be enough to prove a Claim.
Adding an Argument with a Score of 0% should not affect the result
An Argument with a Score of 0% is either completely false, totally irrelevant, or both. It would definitely be a problem if participants could alter the outcome of a debate by spamming the discussion with ineffective Arguments.
Re-grouping shouldn’t change the outcome significantly
This is the hardest criterion to support mathematically. Intuitively, assuming that all Arguments have been sufficiently vetted, rebutted and voted on, performing a Group Arguments curation to place several weaker Arguments beneath a more general heading should not weaken or exaggerate the outcome for that side. Three partially-relevant Arguments (30% each, let's say) should be just as effective separately as one more relevant (e.g. 65%) Argument that has them as supporting Arguments.
The outcome should match human intuition
This is the necessary catch-all for any criteria we may have missed. If the outcome of a calculation doesn't match what people would expect (and this is hard to define, since each person has a slightly different intuition), then we must pause and consider what we were expecting and why. Then, adjust the algorithm accordingly.
Some of the more difficult questions to consider here are:
- If both sides have 100% convincing (Strong) Arguments, but one side has extra supporting Arguments, should we declare that side to be more convincing? (What does 100% mean in this case?)
- Is one 80% Strength Argument more effective, the same, or less than two 40% Strength Arguments?
- If not, how many 40% Arguments would it take to defeat an 80% Argument?
3.3.3 Scoring Methods.
3.3.3.1 Claim vs. Argument.
The Score of a Claim is a representation of its truthfulness: the confidence people can have that the statement is true. Since establishing this is the whole point of debating a Claim, arriving at a Truth Score can be considered the objective of that debate. However, as will be shown below, the Truth Score of a Claim is an important factor in the Scores of the Arguments that are based upon it. In other words, the Truth Score reflects on any debate in which the Claim is used.
Debates over an Argument affect its Relevance Score. These debates take the form of other Arguments in support of or attacking the relevance or significance of the Argument in the context of this debate (arguments against the accuracy of the argument are actually Arguments applied to the base Claim, rather than the Argument itself).
Depending on the Scoring Method, as described below, the way a Relevance Score is calculated may or may not be mathematically equivalent to the way a Truth Score is derived.
3.3.3.2 Popular Vote.
The Popular Vote is a very straightforward mechanism for scoring. On any Claim or Argument, Debaters can register their opinion on the Score of an element. The Popular Vote, then, is simply the average of all votes.
Popular voting can be considered a "dangerous" form of viewing the results of a debate. There are several reasons for this:
It can be subject to attack by dishonest voters, especially those with many fake accounts
This is a very genuine concern. It is critical that debates not be influenced or dominated by bad actors with fake accounts. As stated in the Principles, the system must guarantee to the best of our ability that there is only one account per person.
Just showing the Popular Vote can distort the outcome, due to popularity bias
There will certainly be some distortion due to this effect - there is no escaping the way our brain works, not completely. However, the main purpose here is not to see which side wins, but rather to provide the tools to help people understand themselves, which is the first tool in the fight against cognitive bias.
There is no guarantee the popular outcome is the right outcome
This is true, but the Popular Score here is not meant to be used as THE measure used for a vote. Rather, it is meant to help each potential voter figure out where they stand on the issue when the time comes to vote. The "tyranny of the majority" is a problem for a democracy, and therefore outside the scope of this system. For more information on how this can be resolved, see the Democracy Earth white paper. With the creation of this system, we certainly hope that the majority will be much more understanding and accommodating, rather than tyrannical.
3.3.3.3 Personal Vote.
The Personal Vote is the simplest scoring mechanism of all. It is simply a view of a Claim or Argument which shows the score the user chose, rather than the Popular Vote. Note, however, that on Claims or Arguments on which a user has not yet voted, the Score shown would instead be the Popular Vote (shown in such a way as to signal to the user that they have yet to register their Score).
3.3.3.4 Roll-up Score.
When one looks at the Popular or Personal Vote of a Claim or an Argument, they are looking at the final overall opinion attributed to that topic. Oddly, that Score is in no way influenced by the Arguments given for or against the topic, except as they indirectly influence the opinions of the people that cast a vote.
It would be very interesting to see how those votes compare to the sum total of the Arguments that were given on each side of that debate. Consider a very simple case:
- Claim A, with a Popular Vote of 65%
- Argument A (in favor), with a Popular Vote of 70%
- Argument B (against), with a Popular Vote of 70%
Since the Scores of the two Arguments, one for and one against, are identical, it is natural to assume that they cancel each other out. It would seem, then, that the correct Score for Claim A from this perspective would be 50%, rather than 65%. Since the 50% is calculated from the bottom (the Arguments) up, we call this the Roll-up Score. A discrepancy between the two Scores, as in this example, could indicate:
- An incomplete debate, with key Arguments missing
- Some source of bias in the voting, which is causing people to internally disregard key Arguments or weigh them differently
- Simple laziness - perhaps many users preferred to vote only on the top Claim without giving the Arguments enough consideration to vote
3.3.3.4.1 Strength Score.
Now, consider the following situation: Argument A has a Relevance Score of 70%, and Argument B (on the opposing side) also has a Relevance Score of 70%; however, the Claim underlying Argument A (let's call this Claim AC) has a Truth Score of only 50%, while the Claim underlying Argument B (Claim BC) has a Truth Score of 100% (everyone magically seems to agree with it). How should that impact Claim A? Should an Argument that stands on shakier grounds be considered just as strong as one that doesn't?
For this, we can borrow a basic principle from software engineering: if you have one system that is up 90% of the time, and it depends on another that is also up only 90% of the time, how reliable is the combination? The answer is to multiply the two reliabilities to get the reliability of the system as a whole:
0.9 x 0.9 = 0.81, or 81%
This makes a lot of intuitive sense for a Strength Score, as well. If an Argument has a Relevance of 100% (absolutely definitive, if true) and its Claim has a Truth of 100% (absolutely true), then the Strength should be 100% (you can't possibly do better than that). If both are 0% (totally irrelevant, and false), then Strength would have to be 0%. But also, something that is 100% convincing, but based on a total lie should be 0% effective, as with an absolute truth that is completely off the subject. And, what about something that is absolutely convincing if true, but we're only 50% sure it's true? A Strength of 50% seems perfectly reasonable.
Therefore, the Strength Score S of an Argument A, with a Relevance Score of RA, which is based on a Claim C, with a Truth Score of TC, is:
S = RA * TC
3.3.3.4.2 Roll-up from Strength Scores.
Given the example above, of an Argument A with a Relevance Score of 70%, and a Claim of Truth 50%, we can now calculate that its Strength Score is:
0.7 * 0.5 = 0.35, or 35%
Meanwhile, the Strength Score of Argument B is:
0.7 * 1.0 = 0.7, or 70%
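This calculation is trivial to express in code. A minimal sketch, reproducing the two Arguments above (the function name is ours, not part of Gruff):

```python
def strength(relevance: float, truth: float) -> float:
    """Strength Score of an Argument: its Relevance times its Claim's Truth."""
    return relevance * truth

strength_a = strength(0.70, 0.50)  # Argument A: 0.35, or 35%
strength_b = strength(0.70, 1.00)  # Argument B: 0.70, or 70%
```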
How should we calculate the Truth Score of Claim A now? We know that when Arguments are completely balanced, it makes sense to give it an even score of 50%. Also, it makes no sense to go below 0%, nor above 100%, and no matter how many Arguments are for or against the Claim, one Argument on the opposite side should move the needle to place at least some doubt on the Truth of the Claim.
Average of Scores
The simplest way to calculate the Roll-up Score would be to take an average of the Strength Scores of the Arguments, and use that to find where the Truth lies on the range of possible scores (we will use the notation SSpro for Strength Scores in favor of a Claim or Argument, and SScon for those against, and npro and ncon for the number of Arguments in favor and against, respectively):
TS = 0.5 + 0.5 * (∑SSpro - ∑SScon) / (npro + ncon)
In the case of our simple example, the Truth Score TA of Claim A would be:
TA = 0.5 + 0.5 * (0.35 - 0.70)/2 = 0.5 + 0.5 * -0.175 = 0.5 - 0.088 = 0.413, or 41.3%
This approach has the advantage of ease of comprehension and implementation. It also works fairly well against our criteria in the case of having Arguments on both sides of the debate. However, it fails to satisfy one of the main criteria: Adding another argument on a side should increase the weight on that side.
Consider the case of a Claim with only a single, strong Argument to support it, with, say, a Strength of 80%. On its own, the final score would be 90%. However, if an Argument with a Score of 40% were added in support, the Truth Score would actually decrease to 80%.
The naive version of this approach also fails another test: Adding an Argument with a Score of 0% should not affect the result. If a participant counters the aforementioned 80% Argument with a complete lie, the Truth Score would nevertheless become:
TA = 0.5 + 0.5 * (0.8 - 0.0)/2 = 0.5 + 0.5 * 0.4 = 0.5 + 0.2 = 0.7, dropping the score from 90% to only 70%
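Both failure modes of the Average of Scores method are easy to reproduce in a short sketch (the function name and the midpoint convention for an undebated Claim are our assumptions):

```python
def rollup_average(pro: list[float], con: list[float]) -> float:
    """Average-of-Scores Roll-up: TS = 0.5 + 0.5 * (sum(pro) - sum(con)) / n."""
    n = len(pro) + len(con)
    if n == 0:
        return 0.5  # assumption: a Claim with no Arguments sits at the midpoint
    return 0.5 + 0.5 * (sum(pro) - sum(con)) / n

rollup_average([0.8], [])       # one strong supporter: 90%
rollup_average([0.8, 0.4], [])  # adding a weaker supporter *lowers* it to 80%
rollup_average([0.8], [0.0])    # a worthless 0% counter still drags it to 70%
```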
Divide by Total
A slight change in the previous formula offers a solution to the problem of averages. If we perform a weighted average by dividing by the sum of the magnitude of all scores, we get:
TS = 0.5 + 0.5 * (∑SSpro - ∑SScon) / (∑SSpro + ∑SScon)
This immediately solves the problem of weakening your side just by adding a poor Argument. If all the Arguments are on one side, then no matter how many there are, the final score will be either 100% or 0%, depending on which side they reside. Any new Argument will always (except in the case of a 0% Argument) add Strength to the side it supports, while lowering the average Effectiveness of the other side. Also, any 0% Arguments are completely ignored, since a Score of 0 affects neither the divisor nor the dividend.
Now let's look at how it performs in more robust situations. In the case of our simple example, the Truth Score TA of Claim A would be:
TA = 0.5 + 0.5 * (0.35 - 0.70)/(0.35 + 0.70) = 0.5 + 0.5 * -0.33 = 0.5 - 0.167 = 0.333, or 33.3%
This seems like a fairly reasonable approach, at least in the case of a stable debate, with no duplicate Arguments. However, the previous sentence already indicates one form of dishonest attack that could be used to change the scores in a debate: it would be possible to spam the Claim with duplicate or similar Arguments. If you have one Argument A in favor with 100% Strength, and then another Argument B against with 100% Strength, the Roll-up Score is a 50/50 outcome. If someone were to create just one copy of Argument A, it would change the Roll-up to 66.7%, with no material change in the debate, from the perspective of reasoning.
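A sketch of the Divide by Total formula, including the duplicate-Argument attack just described (the function name and the 50% convention for a debate with no effective Arguments are our assumptions):

```python
def rollup_divide_by_total(pro: list[float], con: list[float]) -> float:
    """Weighted Roll-up: divide by the sum of all Strength Scores."""
    total = sum(pro) + sum(con)
    if total == 0:
        return 0.5  # assumption: no effective Arguments leaves the Claim undecided
    return 0.5 + 0.5 * (sum(pro) - sum(con)) / total

rollup_divide_by_total([1.0], [1.0])       # balanced debate: 50%
rollup_divide_by_total([1.0, 1.0], [1.0])  # spam one duplicate: jumps to 66.7%
```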
Furthermore, the very useful and common act of curation known as Group Arguments (see Curation, below) can have a serious negative impact on the score for the side that is receiving the reorganization. Consider the following case: a Claim with Arguments A1, A2 and A3 for the Claim, and Arguments B1 through B10 against the Claim, with the following Strength Scores: A1: 70%, A2: 50%, A3: 30%; B1: 80%, B2: 60%, B3 through B10: 30% each.
This yields a Roll-up Score of:
TS = 0.5 + 0.5 * (1.5 - 3.8) / 5.3 = 0.5 - 0.217 = 0.283, or 28.3%
Now, suppose the Arguments B3 through B10 are distinct, but fall under one generally related category that's better expressed as a single Argument. Even in the strongest case, a new Argument with 100% Strength, the result would change:
TS = 0.5 + 0.5 * (1.5 - 2.4) / 3.9 = 0.5 - 0.115 = 0.385, or 38.5%
The simple, and necessary, act of reorganization significantly weakened the case against the Claim.
Aggregate Strength Score
In order to reduce this effect, and the effect of Argument spamming, Gruff therefore takes a slightly different approach to calculating the Roll-up Score, known as the Aggregate Strength Score:
- Find the Aggregate Strength of the Arguments for a Claim
- Find the Aggregate Strength of the Arguments against a Claim
- Calculate the total Roll-up Score as an average of the Aggregate Scores, as above
We define the Aggregate Strength Score AS via the following algorithm:
- Sort the Arguments on the appropriate side (for or against) by the value of their Strength Scores SS
- Set the current AS to the absolute value of the SS of the top Score
- For each remaining Argument, in order of the sorted list, set the AS to be the current AS, plus the SS of the current Argument times the amount that remains towards reaching 100%:
AS = AS + SS * (1.0 - AS)
Thus, for three Arguments A1, A2 and A3 with SS of 30%, 70% and 50%, respectively, we get an AS of:
AS = 0.7 + 0.5 * (1.0 - 0.7) + 0.3 * (1.0 - (0.7 + 0.5 * (1.0 - 0.7))) = 0.7 + 0.15 + 0.045 = 0.895, or 89.5%
The equation for the Roll-up Score then becomes:
TS = 0.5 + 0.5 * (ASpro - AScon)
Going back to our previous example of the Group Arguments action, we see that before the reorganization, the Truth Score would be:
ASpro = 89.5%, AScon = 99.54%, TS = 44.98%
After the reorganization, there is a change, but it's a lot less pronounced (assuming, again, that the newly grouped Argument has a Strength Score of 100%):
ASpro = 89.5%, AScon = 100.00%, TS = 44.75%
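Both the Aggregate Strength fold and the resulting Roll-up can be verified with a short sketch (the function names are ours, not part of Gruff):

```python
def aggregate_strength(scores: list[float]) -> float:
    """Fold sorted Strength Scores: each Argument adds its SS times the gap left to 100%."""
    agg = 0.0
    for ss in sorted(scores, reverse=True):
        agg += ss * (1.0 - agg)
    return agg

def rollup_aggregate(pro: list[float], con: list[float]) -> float:
    """Roll-up Score from the two sides' Aggregate Strengths."""
    return 0.5 + 0.5 * (aggregate_strength(pro) - aggregate_strength(con))

aggregate_strength([0.3, 0.7, 0.5])                        # ~0.895, or 89.5%
rollup_aggregate([0.7, 0.5, 0.3], [0.8, 0.6] + [0.3] * 8)  # ~0.4498, or 44.98%
```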
Again, we see issues with this approach. The more arguments there are, the closer the final score tends towards 50%, which isn't very informative, and may go against human intuition. This solution also violates the principle that Adding another Argument on a side should increase the weight on that side.
Consider the simple case of two 100% strong Arguments, one in support and another against the Claim:
ASpro = 100%, AScon = 100%, TS = 50%
Now imagine that a second Argument is added against the Claim, again with a 100% Strength. In this case, the score remains unchanged:
ASpro = 100%, AScon = 100%, TS = 50%
Paired Ranking
This final approach builds on the concept of the Aggregate Strength Score, but adds one more intuition:
Two arguments of equal Strength, one for and one against, should cancel each other out
Since it would be rare to see two Arguments with exactly the same Strength Score, the solution has to be a bit more flexible. We solve this problem in the following manner:
- Sort all Arguments on each side by their Strength Scores, from strongest to weakest
- Pair up the Arguments across the two sides according to their order
- Count the Score of the lesser Argument of each pair as 0, and the Score of the greater Argument as its Score minus the Score of the lesser
- Calculate the Score using the AS method as above, using the new Scores
Consider a debate with Arguments A1 70%, A2 50% and A3 30% vs. B1 100%, B2 80% and B3 60%:
Score for | Paired Score for | Score against | Paired Score against |
---|---|---|---|
70% | 0% | 100% | 30% |
50% | 0% | 80% | 30% |
30% | 0% | 60% | 30% |
This gives us:
ASpro = 0% AScon = 65.7% TS = 17.2%
If we add 3 more Arguments A4 30%, A5 30% and A6 20%, the Score becomes as follows:
Score for | Paired Score for | Score against | Paired Score against |
---|---|---|---|
70% | 0% | 100% | 30% |
50% | 0% | 80% | 30% |
30% | 0% | 60% | 30% |
30% | 30% | | |
30% | 30% | | |
20% | 20% | | |
ASpro = 60.8% AScon = 65.7% TS = 47.55%
And if the top Argument in favor becomes 90%:
Score for | Paired Score for | Score against | Paired Score against |
---|---|---|---|
90% | 0% | 100% | 10% |
50% | 0% | 80% | 30% |
30% | 0% | 60% | 30% |
30% | 30% | | |
30% | 30% | | |
20% | 20% | | |
ASpro = 60.8% AScon = 55.9% TS = 52.45%
Finally, if we look at the debate that started us down this path, we get:
Score for | Paired Score for | Score against | Paired Score against |
---|---|---|---|
100% | 0% | 100% | 0% |
 | | 100% | 100% |
ASpro = 0% AScon = 100% TS = 0%
That seems a little counterintuitive (the 100% Strength Argument in favor seems like it should have some impact), but it is still better than the result given by pure AS.
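The pairing-and-aggregation procedure above can be sketched as follows. This is our own illustration (the function names are ours), reusing the Aggregate Strength fold from the previous section:

```javascript
// Aggregate Strength, as defined in the previous section.
function aggregateStrength(scores) {
  let as = 0;
  for (const ss of [...scores].sort((a, b) => b - a)) as += ss * (1 - as);
  return as;
}

// Paired Ranking: sort each side, pair Arguments across sides by rank,
// and keep only the winning margin of each pair before aggregating.
function pairedRankingTruthScore(proScores, conScores) {
  const pro = [...proScores].sort((a, b) => b - a);
  const con = [...conScores].sort((a, b) => b - a);
  const pairedPro = pro.map((s, i) => Math.max(0, s - (con[i] ?? 0)));
  const pairedCon = con.map((s, i) => Math.max(0, s - (pro[i] ?? 0)));
  return 0.5 + 0.5 * (aggregateStrength(pairedPro) - aggregateStrength(pairedCon));
}

// The first example above: A1-A3 vs. B1-B3
console.log(pairedRankingTruthScore([0.7, 0.5, 0.3], [1.0, 0.8, 0.6])); // ≈ 0.1715
```

Unpaired Arguments (the `?? 0` case) keep their full Score, which is what produces the 47.55% result once A4-A6 are added.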
3.3.3.5 Roll-up Levels.
The previous examples only considered the Roll-up Scores based on the Arguments immediately for a single Claim. It is important to note, however, that the Arguments may have Arguments of their own, as may their base Claims. What if we want to see the impact of those sub-Arguments on the debate?
You could say that we want to look down another "level", and so this is the official term for the "depth" of a Roll-up calculation: Level. In the previous section, the calculations referred to Roll-up Level 1. In order to see how the Arguments on the Arguments stack up, we can look at the Roll-up Level 2 Score.
Remember that the formula for a Level 1 Roll-up on the Truth Score of a Claim (we'll use Aggregate Strength Score for simplicity) is:
TS = 0.5 + 0.5 * (ASpro - AScon)
We can generalize this to say that the Level 1 Roll-up Score on any one element is:
RuS1 = 0.5 + 0.5 * (ASpro - AScon)
This equation applies equally to the Truth Score of a Claim as it does to the Relevance Score of any Argument.
Taking a simplified version of our Claim A, with pro Argument A1 and con Argument B1, if we know their Strength Scores, we can calculate the Level 1 Roll-up of Claim A. In this case, for a Level 2 Roll-up, we simply replace the Strength Scores with the Roll-up Strength Scores of the Arguments. Roll-up Strength Score can be defined as:
RuSS = RuRS * RuTS
That is, the Roll-up Relevance Score times the Roll-up Truth Score of its base Claim. This means that we basically just replace the voted Score of the Arguments with their Roll-up Scores. If the two are the same (the voting and the roll-ups), the Level 2 Score for Claim A will be the same as its Level 1. If the scores are different, then the result will vary accordingly.
For our Claim A, consider the following:
- Claim A: Popular Score 60%
  - Argument A (for): Popular Score 70%
    - Argument AA (for): Popular Score 80%, Strength Score 70%
    - Argument AB (against): Popular Score 60%, Strength Score 50%
    - Base Claim AC: Popular Score 90%
  - Argument B (against): Popular Score 50%
    - Argument BA (for): Popular Score 55%, Strength Score 40%
    - Base Claim BC: Popular Score 60%
The Level 2 Roll-up Score for Claim A would then be:
TS2 = 0.5 + 0.5 * (AS1A - AS1B)
AS1A = (0.5 + 0.5 * (ASAA - ASAB)) * TSAC
AS1B = (0.5 + 0.5 * (ASBA - ASBB)) * TSBC

Here ASBB, the Aggregate Strength of the Arguments against Argument B, is 0%, since there are none.
For the calculation, we have recursed down one level on both aspects of an Argument: its pro and con Arguments, and its base Claim. But since we have only gone one level deeper, we do not consider the Roll-up Truth Score of the base Claims of its Arguments. For this, we would need to go one level deeper, and calculate the Level 3 Roll-up (and so on).
3.3.3.5.1 Arguments and Claims without Arguments.
What should we do for the Roll-up Score if we encounter an Argument or a Claim that doesn't have any Arguments of its own? In this case, it will not be possible to calculate any deeper than this. Instead, the calculation will have to substitute its Popular Score; that is, its voted Strength Score or Truth Score, without any other weighting.
In the case of a Claim with no Popular Score (no votes), the default Score of 50% is used. For an Argument with no Popular Score, the default Score of 0% is used.
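Putting the pieces together, a Level-N Roll-up can be sketched recursively. This is our own illustration, assuming a hypothetical node shape: `score` holds the voted Popular Score (Truth for a Claim, Strength for an Argument), `pros`/`cons` hold child Arguments, and each Argument points at its `baseClaim`. It simplifies by treating an Argument with no sub-Arguments as a leaf:

```javascript
// Aggregate Strength, as defined above.
function aggregateStrength(scores) {
  let as = 0;
  for (const ss of [...scores].sort((a, b) => b - a)) as += ss * (1 - as);
  return as;
}

// Roll-up Score of a Claim (Truth) or an Argument (Relevance), `level` deep.
// At the depth limit, or at a node with no Arguments of its own, the voted
// Popular Score is used (defaults: 50% for an unvoted Claim, 0% for an
// unvoted Argument).
function rollUp(node, level) {
  if (level === 0 || (node.pros.length === 0 && node.cons.length === 0)) {
    return node.score;
  }
  const asPro = aggregateStrength(node.pros.map((a) => strength(a, level - 1)));
  const asCon = aggregateStrength(node.cons.map((a) => strength(a, level - 1)));
  return 0.5 + 0.5 * (asPro - asCon);
}

// Roll-up Strength Score: RuSS = RuRS * RuTS. Leaf Arguments fall back to
// their voted Strength Score, per the section above.
function strength(arg, level) {
  if (level === 0 || (arg.pros.length === 0 && arg.cons.length === 0)) {
    return arg.score;
  }
  return rollUp(arg, level) * rollUp(arg.baseClaim, level);
}

// The Claim A example above (leaf Arguments carry their Strength Scores):
const AC = { score: 0.9, pros: [], cons: [] };
const BC = { score: 0.6, pros: [], cons: [] };
const AA = { score: 0.7, pros: [], cons: [] };
const AB = { score: 0.5, pros: [], cons: [] };
const BA = { score: 0.4, pros: [], cons: [] };
const argA = { score: 0.7, pros: [AA], cons: [AB], baseClaim: AC };
const argB = { score: 0.5, pros: [BA], cons: [], baseClaim: BC };
const claimA = { score: 0.6, pros: [argA], cons: [argB] };

console.log(rollUp(claimA, 2)); // ≈ 0.56
```

This matches the hand calculation for that example: AS1A = 0.6 * 0.9 = 0.54 and AS1B = 0.7 * 0.6 = 0.42, so TS2 = 0.5 + 0.5 * (0.54 - 0.42) = 0.56.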
3.3.3.6 Belief Score.
Scoring up to this point has been based on using the average of the Popular Score for each element. This is fine when what one wants is a census of popular opinion. However, this is not enough to support two of the main objectives of this platform:

- Provide tools for people to understand themselves
- Provide ways for people to see the world from another's perspective
An individual's vote on the score of a Claim or an Argument is, in essence, a statement of their Belief on that subject. For example, it may be fine to see that 56% of participants believe that a particular person was telling the truth when they made a statement, but if you believe they were lying, you would want to see that reflected in the Truth Score as a 0%, not as 56%.
As we saw with the Roll-up Score calculations above, the score on one item can have a significant impact on the results of another debate that relies on the Claim. When it comes to seeing popular opinion, one should use the Popular Score, but when it comes to making up one's own mind, it would be useful to ignore the opinions of everyone else, and instead use their own scores.
We call this view the Belief Score. It uses the same mathematics as the Popular Vote and Roll-up Score mechanisms, with only one slight change: for any node on which the user has voted, the user's vote shall be used as the Score, rather than the Popular Score. Nodes on which the user has not voted shall continue to use the Popular Score until such time as the user casts a vote.
Given the simple case of two Arguments A1 and A2 in favor of a Claim, and one Argument B1 against:

- A1 (Popular Score: 70%, Belief Score: 40%)
- A2 (Popular Score: 50%, no Belief Score given)
- B1 (Popular Score: 40%, Belief Score: 80%)
We can calculate the Popular Truth Score PTS using the Aggregate Strength Score method, with ASpro = 0.7 + 0.5 * (1 - 0.7) = 0.85 and AScon = 0.40:

PTS = 0.5 + 0.5 * (0.85 - 0.40) = 0.725, or 72.5%
However, for the Truth Score based on Belief BTS, the user's own votes replace the Popular Scores on A1 and B1, while A2 (no Belief Score) keeps its Popular Score. This gives ASpro = 0.5 + 0.4 * (1 - 0.5) = 0.70 and AScon = 0.80:

BTS = 0.5 + 0.5 * (0.70 - 0.80) = 0.45, or 45%
In this case, the Debater's own personal opinion on the subject differs significantly from common opinion, and further Roll-up Scores based on this Claim would reflect this.
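The substitution is simple enough to sketch. This is our own illustration (the vote-lookup shape is hypothetical), with A2 falling back to its Popular Score per the rule above:

```javascript
// Aggregate Strength, as defined above.
function aggregateStrength(scores) {
  let as = 0;
  for (const ss of [...scores].sort((a, b) => b - a)) as += ss * (1 - as);
  return as;
}

// Belief Score substitution: wherever the user has voted, their own vote
// replaces the Popular Score; otherwise the Popular Score is kept.
function beliefScores(args, userVotes) {
  return args.map((a) => userVotes.get(a.id) ?? a.popularScore);
}

// The example above: A1 and A2 in favor, B1 against.
const pro = [
  { id: "A1", popularScore: 0.7 },
  { id: "A2", popularScore: 0.5 },
];
const con = [{ id: "B1", popularScore: 0.4 }];
const votes = new Map([["A1", 0.4], ["B1", 0.8]]); // no Belief Score on A2

const pts = 0.5 + 0.5 * (aggregateStrength(pro.map((a) => a.popularScore))
                       - aggregateStrength(con.map((a) => a.popularScore)));
const bts = 0.5 + 0.5 * (aggregateStrength(beliefScores(pro, votes))
                       - aggregateStrength(beliefScores(con, votes)));
console.log(pts); // ≈ 0.725
console.log(bts); // ≈ 0.45
```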
It would also be interesting for users to be able to "browse" the opinions of other Debaters (while protecting their privacy, of course) in order to gain a deeper understanding of how others think, and what informs their beliefs. Thus, the system should support the possibility of selecting sample votes of other users and "exploring" their votes on related topics.
In the example above, the user might decide that they would like to see the votes of a user that gave a Score of more than 90% on the Claim. This might then reveal a result as follows:
- A1 (Popular Score: 70%, other user's Belief Score: 80%)
- A2 (Popular Score: 50%, other user's Belief Score: 90%)
- B1 (Popular Score: 40%, other user's Belief Score: 0%)
Investigating deeper might reveal that this specific user tends to agree very strongly and unquestioningly with some public figure, perhaps the subject of the debate, or that they tend to find a certain subject (reflected in Argument B1) unimportant.
By allowing users to highlight specific Beliefs in this manner, the platform is able to provide the introspection and deep understanding that is our goal.
3.3.4 Reason Score.
Reason Score is a vote-free scoring algorithm that places an emphasis on the interaction of Claims and counter-Claims, rather than popular opinion. Reason Score is meant to provide an objective account of the debate, rather than a personal or subjective one. With the voting-based scoring methods, you can defend your side of the debate by scoring the Claims according to your Belief. With Reason Score, the only way to impact the final result is by providing a new, reasonable Claim. Instead of casting a vote, you provide a Claim to explain why you would have voted differently than the current score. Reason Score therefore encourages Debaters to think through their objections to a given topic and put them into words, which can result in more complete sets of Claims.
As a side benefit, Reason Score values do not have to be set to zero after Curation actions (see below), unlike with voting scores. On the other hand, Curation can have a significant impact on the results. Assuming Curation is handled correctly, this is not necessarily a negative trait.
Calculation
The Reason Score for any Claim is a number between -1 and 1 (often displayed as a percentage between -100% and 100%). It is a measure of confidence in the validity of the Claim, based on its descendant Claims.
A Reason Score can be interpreted as:
Score | Percentage | Description |
---|---|---|
1 | 100% | Absolutely True |
.5 | 50% | Likely True |
0 | 0% | Undecided |
-.5 | -50% | Not Likely True |
-1 | -100% | Absolutely Not True |
To calculate the Reason Score (RS):

For each child Claim, calculate its Weight. This is the same as its Reason Score, except that we want to eliminate false Claims, so we zero out any negative Reason Score by taking the maximum of the Reason Score and zero:
Weight = Max(0,RS)
For each child Claim, calculate its Strength, which is slightly different between a pro and a con:
Pro Strength = Weight * RS
Con Strength = Weight * (-RS)

To calculate the final Reason Score, divide the sum of the pro and con Strengths by the sum of the Weights:

RS = sum(Strengths) / sum(Weights)
Formulas in a sample spreadsheet
Here is some sample code of the simplest interpretation. This does not include relevance:
```javascript
function calculateReasonScore(childClaims) {
  let strengthTotal = 0;
  let weightTotal = 0;
  for (const child of childClaims) {
    // Weight = Max(0, RS): zero out negative (false) child Claims
    const weight = Math.max(0, child.score);
    weightTotal += weight;
    if (child.pro) {
      strengthTotal += weight * child.score;
    } else {
      strengthTotal += weight * -child.score;
    }
  }
  // A Claim with no positively-scored children is Undecided (0)
  if (weightTotal === 0) return 0;
  return strengthTotal / weightTotal;
}
```

JSFiddle simple example code
JSFiddle example code with relevance
Note that this is a recursive equation, so care must be taken either to prevent cyclical Arguments, or to replace a repeated value with a default Score (e.g. 1).
Averaging the weights is the default calculation. There are rare circumstances where different calculations will be preferred, such as:

- The lowest Claim is used. This is for when the Claim is an absolute, and any counter-proof would dismiss the whole Claim.
- The highest Claim is used. This is used when any positive proof will prove the Claim true.
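These alternative aggregations can be sketched as variations on a single reducer. The function and mode names here are our own, not part of the Reason Score specification:

```javascript
// Hypothetical aggregation variants for a set of child Reason Scores.
function aggregateChildren(scores, mode) {
  if (scores.length === 0) return 0; // no children: Undecided
  switch (mode) {
    case "average": // the default calculation
      return scores.reduce((a, b) => a + b, 0) / scores.length;
    case "lowest": // absolute Claims: any counter-proof dismisses the Claim
      return Math.min(...scores);
    case "highest": // any positive proof proves the Claim true
      return Math.max(...scores);
  }
}

console.log(aggregateChildren([0.9, -0.5, 0.3], "lowest")); // -0.5
```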
3.3.5 Ranking of Arguments.
One of the most important roles for scoring is to enable the ranking of Arguments, from "best" (strongest or most relevant) to worst. This is, most of all, in support of the Objective:
It should be possible to grasp the principal arguments of a debate in 30 seconds or less
By assigning a Score based on popular or personal opinion, or on the preponderance of facts, it becomes possible to design an interface which highlights the most salient information in a given debate.
One type of user interface currently under consideration is the possibility of voting by ranking. The ability to manually sort Arguments in order of "best" to "worst", and use this as a mechanism for voting, is a compelling possibility. It remains to be decided what such a ranking would mean, however. Would the average user be sophisticated enough to sort ONLY by Relevance, or would the ranking have to reflect each Argument's overall Strength, according to how convincing the user considers it to be (and how, then, would one register a Relevance Score for an Argument with a base Claim that is totally false)?
3.4 Viewing.
- Search
- Read debate
- See underlying Claim
- See underlying Arguments
- See popular Score
- See Roll-up Score to N levels
- See my Roll-up
- See Roll-ups by "type" (clustered types of opinion)
3.5 Debating.
3.5.1 Create Claim.
3.5.2 Create Argument from Claim.
3.5.3 Create Argument without Claim.
3.5.4 Vote.
3.6 Curation.
3.6.1 Who Should Curate?
3.6.2 Curation Actions.
What follows is a guide to the types of actions that a Curator may choose to execute in the effort to keep the debates canonical and constructive. For those who are familiar with software development, these are analogous to types of code refactoring, only for arguments in a debate.
3.6.2.1 Fundamental Actions.
The following actions are the simplest, most rudimentary actions that are required to keep a debate organized and canonical. Some are not exclusive to Curators, since the creation of new content is part of debating in general.
3.6.2.1.1 Create Claim or Argument.
This is the most obvious action required of any debate. It only deserves mention here as it is a required step in some of the more complex actions listed below, including Split Claim and Group Arguments.
3.6.2.1.2 Delete Claim.
This is a highly contentious and dangerous action to put in the hands of anyone. The fact that historical forms of the debate can be viewed helps temper the problem, but it remains to be seen whether or not this should be an option at all.
The simplest case, that of a Claim with no Arguments of its own, and no votes, actually makes sense. There's little or no harm done in this case, and it can be considered a simple act of housekeeping (a User can always come along and create the Claim a second time).
If a Claim with parent or sub-Arguments gets deleted, it would necessarily require the deletion of those Arguments as well (and then the deletion of their Arguments, and so on), so it makes sense to treat this option with care.
3.6.2.1.3 Delete Argument.
This carries the same risks as Delete Claim. However, it makes a little more sense than the latter, in the case that an Argument is created that has absolutely no relevance to the debate whatsoever. Also, since an Argument is merely the connection of one Claim to another (or to an Argument), it would be much easier to recreate the Argument, if necessary.
3.6.2.1.4 Attribute Enhancements.
This is the simplest form of curation, involving small changes to the text or attributes of a Claim or an Argument. These changes may include:
- Changing the Title
- Changing the Description
- Changing media elements
- Adding References
- Adding or correcting Context elements
Given its simplicity, one approach under consideration would be to make this type of enhancement available to average Debaters. In a previous experiment, it was possible for common users to propose a new version of a Title or Description. Debaters could then vote on which of all proposed versions they considered to be the best. The version with the most votes was then the version that would appear to users (it would still be possible to click a tab to see all the proposed versions at once). While simple, and subject to attack by bad-intentioned users (or those wanting to make a joke), it would nevertheless be a way to improve Claim and Argument quality without extra overhead for the Curators. An additional feature to counterbalance could be the option to override and lock a Title or Description by a Curator in order to prevent such attacks.
3.6.2.2 Compound Actions.
These are actions that could be reproduced with some extra work by a combination of the actions above, but it is recommended that the system support these as explicit options.
3.6.2.2.1 Merge Duplicates.
Probably the most common problem in attempting to maintain a canonical debate is the occurrence of duplicate Arguments or Claims. Whether Debaters misunderstand what has previously been said, are too lazy to read the points that have already been made, or just want to say it in their own words, duplication is rampant in online debates.
Merging is the act of taking two identical Claims or Arguments and unifying them into one. It sounds simple, but would require the following fundamental actions to reproduce:
- Move each of the child Arguments over to one of the two Arguments or Claims (the "Survivor").
- In the case of merging a Claim, for each Argument that uses the non-Survivor as base, create a new Argument identical to it using the Survivor as a base instead. Repeat also for any Arguments attached to those Arguments, and so on.
- Perform any changes to Title or Description as appropriate (note that the history of Attribute Enhancements on the non-Survivor will be lost).
- Delete the Argument or Claim that was not selected as the Survivor
Unique ID
There are only two reasonable approaches in this case: choose one of the existing IDs, or create a brand new ID. Conceptually, there should be no problem with pre-existing links to one of the entities connecting to the new merged entity. On the other hand, it would be a loss if prior links were broken (although it would be possible to show the old version, and direct the user to the new merged edition). As will become evident in other cases below, it makes sense to choose one of the two entities (the one that has received more "attention") to persist as before, but with the modifications as noted below.
Title and Description
There can be only one of each of these elements. Ideally, the "best" of each should be chosen. If the system was set up to allow the public to vote on multiple proposals of each, then the most-voted option across both versions of the entity should be chosen as the victor. If not, the Curator performing the merge can be asked to select which version of each they prefer. In the simplest case, the Curator or the system may select which of the two entities is the "Survivor" (see below), and the Title and Description of that entity will be the ones that prevail.
Links and Related Media
These elements are additive in nature. There is no inherent problem in having multiple external links or media elements (assuming they are appropriate). In the case of a merge, it would be sufficient to maintain a (unique, if both entities contain the same entry) copy of each. This can then be followed up with an Attribute Enhancement, as described above, should this be desired.
Context
Assuming both merged entities were properly constructed, it should be assumed that their Context elements are identical, or nearly so. If not so, at least in spirit, then the system should not permit the merger at all. Thus, the best solution to this problem is to require that each entity have the same Contexts before a merger can occur. If this requires some Attribute Enhancement Curation prior to the merger, so be it.
Score
The most profound impact upon merger is on the Score of the two entities. Conceptually, no change has been made to either - they are two identical entities that have been expressed in two different ways. However, differences in their attributes (descriptions and links), and differences in the Arguments made for or against them could have an effect on the votes that have been cast so far (see Move Argument below).
There are three possible ways to handle this situation:
- Merge all the votes (maintain all the previous scores and calculate a new average).
- Maintain only the votes of the survivor.
- Zero out all previous votes.
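As a sketch, the three options differ only in which vote lists feed the new average. This is our own illustration; the function name and data shapes are hypothetical:

```javascript
// Three vote-handling strategies for Merge Duplicates. Each entity is
// assumed to carry the list of individual Score votes cast on it.
function mergedScore(survivorVotes, otherVotes, strategy) {
  const avg = (xs) => xs.reduce((a, b) => a + b, 0) / xs.length;
  switch (strategy) {
    case "merge-all": // maintain all previous scores, calculate a new average
      return avg([...survivorVotes, ...otherVotes]);
    case "survivor-only": // maintain only the votes of the Survivor
      return avg(survivorVotes);
    case "zero-out": // discard everything; voters must cast new votes
      return null; // no Score until voting re-commences
  }
}

console.log(mergedScore([0.8, 0.6], [0.4], "merge-all")); // ≈ 0.6
```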
Arguments Based on Merged Claims
Given that both Claims are identical in nature, any Arguments that have been based on either of them should still be valid under the merged Claim. Thus, the merged version should become the base Claim for all the Arguments involved.
The one edge case here is what happens when both of the merged Claims have an Argument that attacks or defends the same Claim or Argument. In the case that both either attack or defend the target, the merger of the Claim should be performed first, followed by a merger of the Arguments that are based on the newly merged Claim. In the case that they are on opposite sides of the debate, the result depends on the implementation:
- If the system was implemented to treat Arguments that are Pro and Con (see above) as a single entity, the system should merge Scores as discussed above.
- If the system was implemented to maintain two separate copies of Pro and Con versions of an Argument, then the two should be maintained.
The Base Claim

Claims aren't affected by the Arguments that use them in debates. There should be no impact to a base Claim under such circumstances. However, Arguments should not be merged unless they both have the same base Claim.
The Target
It doesn't make sense to merge two Arguments unless they are both attacking or defending the same Argument or Claim. There should be no change, and the merged entity should maintain the Target as before. The only consideration is with how a change to the Score should affect the roll-up Score of the Target. Debaters that had previously voted on the Target should be notified of the change to its Arguments.
Arguments For or Against a Merged Claim or Argument
In principle, the two entities that were merged were identical in nature. As with Arguments that are based on merged Claims, Arguments attacking or defending the merged entities can be preserved, using the merged entity as their new Target. Also similarly, this may trigger a merging of any identical Arguments (those that have the same base Claim), as described above.
Choosing the Survivor
The easiest solution to choosing which entity is the one that survives is to simply ask the Curator to choose. This may, in fact, be the most effective solution as well, since humans may perceive subtleties that an algorithm may not.
If the selection is done algorithmically, then a scoring system should be devised to determine which is the best option to maintain. The calculation should consider the following, according to a yet-to-be-determined system of weighting:
- Did either of the two previously receive Attribute Enhancement curations?
- Which one has the Title and Description with the most votes (if voting is used)?
- Which one has more Score votes?
- Which has the most Arguments?
- Which has the best Arguments?
3.6.2.2.2 Split Claim.
Some Claims may be made in such a way that they are really a combination of unrelated facts. An exaggerated example might be something like the phrase "You're ugly and you smell bad", which is really the combination of the two Claims "You're ugly" and "You smell bad". A Curator in such a situation would therefore need to make a change such that two Claims are made from the one.
The manual approach would be to use step-by-step modifications to arrive at the final objective:
- Create a new Claim with a Title and Description referring to the second half of the original Claim
- Rename the Title and edit the Description of the original Claim so that they reflect only the first half of the Claim
- Move (see Move Argument below) any parent or child Arguments to the second Claim as appropriate
- Update the Contexts of each Claim to reflect the reduced scope of each new Claim
Alternatively, a facilitated approach would have the system do most of the work:

- Initiate the process to split the Claim
- Choose a new Title and Description for each
- Let the system choose the new Contexts for each, providing manual help as needed
- Let the system choose according to the new Contexts which Arguments, and which of the Arguments that are based on the old Claim should be associated with each of the new Claims (or both). Provide manual corrections as needed.
- Submit the selections to have the system perform the operations as configured
When the manual approach is used for such an operation, it creates a problem. A new Claim is created with no previous votes, but the old one, which is now only half of its original self, will still maintain its Score. Due to the Move Argument operations, Debaters will be notified of the change, but it will be up to them to alter their vote accordingly.
If the facilitated approach is used, the system should not only notify users, but should also zero out any previous votes on both of the resulting Claims. There is no way to algorithmically determine how the voting will change with enough certainty to try.
3.6.2.2.3 Split Argument.
This is not in fact a feature that needs to be supported. Since an Argument is merely the expression of a Claim used to support or attack another Claim or Argument, you cannot have multiple Arguments in a single entity. However, it's possible to create text that appears to be two Arguments in one. In such a case, the proper solution would be to correct the Title and Description so that they are appropriate for the base Claim, and then locate (or create) a Claim for the second half of the Argument and create a new Argument from it. If there are any child (or parent) Arguments relating to the second half of the original Argument, these should then be moved (see Move Argument below).
3.6.2.2.4 Move Argument.
This is one of the most critical and frequent types of Curation. The idea of a canonical debate, and the idea of separating Arguments into their distinct logical components, is not (yet) a common practice for most people. Our natural instinct is to provide a defense or counter-argument of any kind we are able to conjure, regardless of whether it is relevant or not. This is only made worse by the need to have the last word. Creating a constructive and canonical debate, however, takes careful thought and effort.
When an Argument is made in the wrong place, its impact on the outcome of the debate may be negated due to its irrelevance. This does not mean that it is unimportant, only that it has been used in the wrong Context. The work for the Curator, then, is to find the proper Context for the Argument, and move it there.
Consider a debate over gun control, where the following Claim is being discussed:
Preventing people with a diagnosis of mental illness from owning firearms does not go far enough to protect the public.
And the following Argument added as an argument for the Claim:
The shooter at Marjory Stoneman Douglas High School had no clinical diagnosis of mental illness.
While somewhat appropriate for the discussion, it would probably get a rather low Score for Strength based on its low impact (a single example). However, it is highly probable there would be a second Argument on the Claim of the type:
Only 22 percent of mass murderers could be considered mentally ill prior to their crime.
For a Curator in this situation, the task is simple: move the more specific example off of the Claim and under the more general Argument, as a supporting argument.
It could be implemented using the Fundamental Actions listed as part of the Merge Duplicates action:
- Create a new Argument with the new Target as its destination.
- "Move" (in the way being described here) any sub-Arguments over to the new Argument.
- Delete the old Argument.
Unique ID
There is no need to change the ID of the Argument, unless the format used for IDs is a unique combination of the base Claim, and the Target (which is now being changed).
Title and Description
In general, no changes would be necessary in this case. There will probably be some minor exceptions.
Links and Related Media
No changes are necessary.
Score
As mentioned above, one of the reasons for moving an Argument is to make it more relevant, or to give it more impact. This means that done properly, Move Argument should result in an increase in its Strength. Unfortunately, there is no way to definitively calculate by how much (nor to tell voters how their votes should change without their input). Therefore, the best solution is to zero out the Score votes, and notify the voters that they'll have to give a new opinion on the subject.
Base Claim
Moving an Argument should have no impact on its base Claim.
The Targets
The original Target (Claim or Argument) will be losing one of its Arguments. This shouldn't directly affect the (voted) Score of the Target enough to make a radical difference. The current assumption (prior to significant testing) is that the votes should be left undisturbed. Of course, the roll-up Score will be affected mathematically by the change.
The new Target will be gaining a new Argument. However, it will be an Argument without any initial Score. As with the creation of a brand new Argument, there should be no immediate impact on the roll-up of the new Target until voting re-commences.
Arguments For or Against a Moved Argument
This is a complicated issue to consider. Some Arguments may still be as relevant as before, such as:
This is only a single anecdote.
However, recall that the purpose of moving the Argument in the first place was essentially to increase its Relevance by placing it in its proper place. In the case of the high school shooter argument, this meant very slightly increasing its impact rather than its relevance.
Consider this much clearer example regarding ice and snow sports preferences:
Americans prefer downhill skiing over other winter sports
And the counter-argument:
Residents of Minneapolis play ice hockey more than any other winter sport.
Suppose, then, that this Argument is moved below a second Argument:
The population of Minnesota prefers ice hockey over downhill skiing.
What, then to do with this sub-Argument?
Minneapolis is only 0.1% of the population of the United States
The sub-Argument not only changes in terms of its impact, it no longer seems relevant.
Given these conditions, one of the following would be recommended:
- Delete all the sub-Arguments (along with their votes).
- Allow the Curator to choose which sub-Arguments should be deleted (and set the Scores of the remaining to zero).
- Don't allow this action for Arguments that already have sub-Arguments.
3.6.2.2.5 Group Arguments.
The examples given above for Move Argument reveal another possibility: what should be done when there are multiple Arguments all related to the same basic point? For example, imagine these Arguments for the debate I should get a diamond embedded in my tooth:
- Study A shows that 3 out of 4 people prefer diamond-embedded teeth to gold-capped.
- Study B shows that in Utah, diamond-embedded teeth are on the rise.
- Study C shows that gold-capped teeth are still king.
Under such circumstances, it may make sense to group these Arguments under a single heading, rather than leaving them on their own.
Diamond-embedded teeth are more popular than ones that are gold-capped.
This new statement would be used as a single Argument in the debate, making it easier to understand. The old Arguments would then become sub-Arguments of the new Argument, but there's a subtle point to be made here: This new Argument requires a base Claim of its own.
Grouping Arguments, then, consists of the following Fundamental and Compound Actions:
- Create a new Argument, with a new base Claim
- Move each of the existing Arguments to become Arguments for or against the new Claim
Venryx [10:04]
Here's an example of where overlap ratings might be helpful. It's not a great example -- I will try to think of better ones -- but it's something to start a discussion at least.
- Spaces are better than tabs for indentation (root claim)
-- Spaces are always the same width so give consistent renderings regardless of program or website (supporting argument)
-- Spaces don't require the user to process their text before inputting onto a site with differing tab indent-widths (supporting argument)
These arguments hit on some of the same "basis" of "consistent width", but they can't be just merged because they're still different. You could try to group them under some intermediate, but it's not obvious what that regrouping should be, so if you did so it would most likely appear to casual readers that one or both of those points just were never brought up properly. (since the regrouping wouldn't be obvious enough of a parent for those two that they would know to expand to them) I suppose the best way to try regrouping would be:
- Spaces are better than tabs for indentation (root claim)
-- Spaces are always the same width
--- Relevant because: this means they have the same appearance regardless of program or website (supporting argument)
--- Relevant because: it means users don't have to process their text before inputting onto a site with differing tab indent-widths (supporting argument)
I guess that is sufficient, though some people would probably prefer them to stay listed separately/as direct children. I will see if I can think of a better example.
timothy.high [10:12] The problem with that regrouping is that there's no conclusion to "Spaces are always the same width"… so… is that good or bad?
Venryx [10:12] Yeah; it lists the reasons it's good/bad, except as children "relevance arguments". Which is why some people wouldn't like it organized that way. (edited)
timothy.high [10:12] I don’t think that would be a good regrouping
Venryx [10:13] Do you know of a better one?
timothy.high [10:13] I’d say the two are distinct arguments that shouldn’t be grouped together. This is interesting to consider: what’s important is the conclusion, not the underlying fact?
Venryx [10:14] Okay, but if they're not grouped, I don't think they should be giving 100% weight each, because they have substantial "overlap".
timothy.high [10:14] I don't see the overlap being a problem at all. The overlap is in the underlying claim "Spaces are always the same width", not in the conclusion, so they have their own independent relevances.
Venryx [10:15] It seems like, listed separately, they are going to have more impact than with the regrouping above, though.
timothy.high [10:15] "Spaces have the same appearance regardless of program or website" - "who cares?" "With spaces, users don't have to process their text before inputting onto a site with differing tab indent-widths" - "yeah, that's a serious pain"
Venryx [10:16] And the regrouping above seems fair to me (though confusing). So if the flat way is fair, how can the other way be fair as well? (given that the impacts differ) (edited)
timothy.high [10:16] I don’t see the overlap being a problem here at all
Venryx [10:16] But do you agree that flat vs reorganized (as above) would yield different impact amounts for the two arguments? (edited)
timothy.high [10:17] It depends on the scoring algorithm (remember, that’s one of the principles of a good scoring algorithm)
Venryx [10:18] Yes. I suppose we'll have to answer this concern at the same time that we develop the default scoring algorithm, since they go hand in hand.
timothy.high [10:18] but, a bad regrouping is not a good guide for this
Venryx [10:18] I definitely think that in some scoring models, the impacts would change with the reorganizing above.
timothy.high [10:19] This discussion DOES have a very interesting outcome for me: regroupings make sense when all the arguments are making the same inference and conclusion, but not merely because they are based on the same premise (edited)
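To make the disagreement concrete, here is a toy additive scoring model (entirely illustrative; it is not the project's scoring algorithm, whose design is deliberately left open) under which the flat and regrouped topologies yield different impacts for the same two arguments:

```python
# Toy scoring model (illustrative only): an argument's impact is
# relevance * truth, and top-level impacts simply add up.

def flat_impact(arguments):
    """Impact when both arguments sit directly under the root claim."""
    return sum(relevance * truth for relevance, truth in arguments)

def grouped_impact(parent_relevance, parent_truth, relevance_arguments):
    """Impact when the arguments become 'relevance arguments' of a single
    parent; they can only raise the parent's relevance, capped at 1.0."""
    relevance = min(1.0, parent_relevance +
                    sum(r * t for r, t in relevance_arguments))
    return relevance * parent_truth

# Two overlapping arguments, each with relevance 0.8 and truth 0.9:
flat = flat_impact([(0.8, 0.9), (0.8, 0.9)])                  # 1.44
grouped = grouped_impact(0.5, 0.9, [(0.8, 0.9), (0.8, 0.9)])  # 0.90
```

Under this deliberately simple model, the flat layout contributes 1.44 while the regrouped layout caps out at 0.90, which is exactly the divergence the transcript worries about; a production scoring algorithm would need to treat the two topologies consistently.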
3.6.2.2.6 Clone Claim.
In the course of a discussion, it is common to see the main subject of the debate evolve over time. An extreme example of this is the informal logical fallacy known as Moving the Goalposts. However, even (or especially) in the case of a constructive discourse, the original debate may require some refinement in order to come to a reasonable conclusion on the subject.
Consider the debate regarding dental fashion described above:
Diamond-embedded teeth are more popular than ones that are gold-capped.
Suppose that during the course of the debate, it becomes clear that while diamond embedding is clearly preferred by the 65-and-older crowd, teenagers clearly prefer gold-capped teeth. The canonical Claim above would naturally have mixed results, with an outcome of, perhaps, a 47% Truth Score. While that is interesting to observe, it becomes more interesting to look at each of the more specific cases on their own:
Diamond-embedded teeth are more popular than ones that are gold-capped for people 65 years of age and older.
Diamond-embedded teeth are less popular than ones that are gold-capped for teenagers.
Each of those Claims would bear the most relevant parts of the more generalized debate, and have, perhaps, a Truth Score closer to 70% (for example). It would then be interesting to create new debates with even more specifics, such as:
Diamond-embedded teeth are more popular than ones that are gold-capped for people 65 years of age and older on the Island of Tasmania for the period of 2010 through 2018.
And so on. The act of Curation, then, consists of:
- Identifying the opportunity to create a more specific Claim from the original Claim.
- Creating the new Claim with a new, more specific Title and Description.
- Adding or modifying the Context for the more restricted case.
- Creating new Arguments for those Arguments that are still relevant considering the new Context.
- Creating a new Argument with the new Claim as its base for use in the more general Argument.
In terms of Fundamental and Compound Actions, the process consists of the following:
- Identify the opportunity.
- Choose Clone Claim, and select the new Context, Title and Description as appropriate.
- Offer a menu to choose which Arguments would still be relevant in the new Claim (matching Context could help automate this process).
- Automatically create the Argument linking the new Claim to the general one.
The same Clone Claim action can also be applied in the opposite direction, to derive a more general Claim:
- Choose Clone Claim, with a more general Context, Title and Description.
- Automatically create an Argument linking the more specific Claim to the new one.
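A minimal sketch of the Clone Claim compound action, assuming a hypothetical `Claim` class (the class, its fields, and the `clone_claim` helper are illustrative, not the actual implementation):

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical data model: a Claim's arguments are represented here by
# the base Claims of its Arguments, purely for brevity.
@dataclass
class Claim:
    title: str
    context: List[str] = field(default_factory=list)
    arguments: List["Claim"] = field(default_factory=list)

def clone_claim(original: Claim, new_title: str, extra_context: List[str],
                still_relevant: List[Claim]) -> Claim:
    """Compound Action: create a more specific Claim, carry over only the
    Arguments that remain relevant, and link it into the general debate."""
    specific = Claim(title=new_title,
                     context=original.context + extra_context,
                     arguments=list(still_relevant))
    original.arguments.append(specific)  # the clone becomes an Argument
    return specific                      # in the more general debate

general = Claim("Diamond-embedded teeth are more popular than gold-capped ones")
survey = Claim("Survey of retirees' dental fashion")
poll = Claim("Teen fashion poll")
general.arguments.extend([survey, poll])
senior = clone_claim(general,
                     "Diamond-embedded teeth are more popular than "
                     "gold-capped ones for people 65 and older",
                     ["age: 65 and older"], [survey])
```

Matching on Context, as noted above, is what would let the application suggest which Arguments belong in the `still_relevant` list.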
3.6.2.2.7 Refine Argument.
Consider the Claim:
Diamond-embedded teeth are preferred by dentists over gold capping.
This Claim may be only partially true, leaving open the question of what its score should be. Refining the Argument consists of restricting it to a Context in which it clearly holds, for example:
In Liberland, diamond-embedded teeth are preferred by dentists over gold capping.
3.6.2 On Annoying Debaters.
Moving Claims and Arguments around too frequently will start to annoy other debaters...
3.6.3 Viewing Previous Versions.
Curation. I'm sure they have something on their back end, but we don't get to see it. The curation component needs to do some REALLY COMPLEX stuff:
- Merge/de-dupe arguments and claims. How do you keep it from getting messy? You need to take two identical but differently-worded items and merge them into one. But what happens to their supporting arguments? And to the votes on those arguments?
- Split claims. "Eating meat is really bad for you, and also those cows are soooo cute!" Well, those are actually two separate claims, so they should be broken up. What happens to THEIR arguments and scores?? (don't forget context!)
- Rearrange. This argument in Kialo is interesting: https://www.kialo.com/eating-meat-is-wrong-1229/1229.0=1229.1-1229.12-1229.2624 What's going on there? You actually have a disagreement about what they're even debating (this happens SO OFTEN in live Oxford-style debates): because the title says "Eating meat is wrong", but the topology says the topic of the debate is "Humans should stop eating meat", you have a confusion of arguments. How could we solve that problem? You create a new claim: "Eating meat is immoral". You then move ALL arguments about morality under that topic, then you use that new claim as an argument in favor of "Humans should stop eating meat". That's what curation is about: making sense of a jumble of public thoughts on the topic, and putting them where they need to go.
More things that the debate platform should provide:
- Anonymous voting. Looks like Kialo does that.
- Open to everyone. Kialo allows this, too.
- Protection against Sybil attacks. This is necessary to preserve the authenticity of the debates. Unfortunately, Kialo does not do this. The PoI component of Sovereign, however, would, within reason.
- Evidence. Evidence is something LIKE a claim, but different:
-- It can be debated as to whether or not it is "true" (actually, authentic!)
-- It can be used as a supporting argument, and have its relevance and impact debated just like any other argument
-- It would have an extra attribute: PROVENANCE, the source of that evidence. Provenance is critical in submitting evidence in a court of law; so should it be in a debate. It makes a huge difference, for example, which news outlet is submitting a story. It makes a difference if we're talking about something Donald Trump said about his foreign policy vs. anyone else in his cabinet (who seem to often have a different policy…). This is CRITICAL in this day and age: people now cast doubt not only on reported news, but on original quotes, photos and videos. The fact is it is getting easier and easier to fabricate digital evidence. Fortunately, technology is already coming to the rescue vis-a-vis Blockchain technologies and digital hashes. Hopefully at some point in the near future, it will be possible to validate if a video is from the original source, or if it has been altered. (Snapchat shows us, however, that even "original" content can be altered at the time of creation - perhaps in the future, even the hardware and software used for a recording will need to be part of its provenance…)
PHILOSOPHIES: I'll just throw these out there for now.
- Debates should be canonical. This facilitates gathering all the information in one place. It also avoids the problems of repeating the debate with only partial information elsewhere.
Lastly, it empowers people to make their point without having to rely only on what's in their head. I've tried to debate gun control many times, but generally it's with people who are much more passionate and informed about the issue than I. It's an imbalanced exchange.
- A debate should be separated from the ones debating. How often have we watched a debate, political, Oxford-style, or otherwise, and picked a "winner" based mostly on WHO made the best arguments? That's NOT a great way to make important decisions. It's a great way for motivating and hitting emotional chords, but it's only part of the whole story. Emotions are for motivating. Logic is for making decisions. The emotional component of a debate SHOULD NOT be stripped from the debate, BUT it should be used to motivate people to participate, not to make a final decision.
- Debates are not "won". Too often we talk about who "won" a debate. The NPR show Intelligence Squared chooses a winning side based on which one was better able to sway the opinion of the audience. Political debates result in discussions about who won, both by gaining votes, and by seeming more "presidential" or "confident". That attitude actually serves to drive a wedge between people. Debate should REALLY be about learning. Learning about the topic. Learning about the trade-offs. Learning about how your family, or your friends or neighbors feel and think about an issue. Learning not to demonize those that disagree with you, but rather to understand it's a difference in values (usually rather noble ones). Nelson Cardozo [1]: About this, I think politics.stackexchange.com is a great example of that. There, avoiding emotional arguments is crucial to making a good point.
Timothy High [10:55] Debates are the one natural place we have that brings opposites together.
Timothy High [10:57] In this age, we are terribly worried about the "information bubbles" that we, Google and Facebook are creating around us.
As they seek to serve us the information we most "like", and we choose to like the things that most agree with us, it has been demonstrated that we've created our own isolated groups. (There's a lot of talk about this in the Sovereign white paper, as well.) People are asking FB and Google to change their BUSINESS model in order to force us to eat our vegetables and see news that's good for us. We can try to force it, but it's working against natural incentives. Debate creates a natural opportunity to see the other side while trying to make your own points.
- It should be possible to grasp the essence of a debate on all sides in under 30 seconds. We are used to reading tweets, and only looking at the headlines of articles. We are becoming lazy and information-poor. The debate should be able to accommodate that laziness while providing the ability to drill down into it as much as necessary. Kialo does this pretty well, along my lines of thinking. The point is that the problem with tweets and headlines is not their brevity, but their lack of support and reliability. If I tweet that "Liz Warren is a liar because she said she's an Indian", it sounds like flamebait; but if I send that same statement out as an argument backed by dozens of arguments and evidence, you will know right away what I'm saying (and whether or not to agree).
- There is no "right" answer. There's just getting closer to the truth as best we can. In a decision, there's also figuring out what the trade-offs are, and making your choice between them. Debates can be as logical as we can make them. But a machine can never decide for us.
- A canonical debate is essential for direct democracy. Up to this point in history, democratic decisions have been made based on incomplete information.
It is no wonder decisions about who should lead the most powerful nation in the world sometimes come down to relatively irrelevant but emotional issues like gay marriage and abortion. People are asked to decide on too many extremely complex issues at once with too little real information. Worse, these days, the issues that affect us the most, local elections, have so little information readily available that many voters don't even recognize the candidates.
- If you build it, they MUST come. If we are ever successful at hosting a truly canonical debate (think Wikipedia-level success), there will be no place for the liars and cheaters to hide. Let Breitbart try and make absurd claims off-debate: the first thing that will happen is someone else will submit the article as "evidence" and see it get destroyed in public. Once Breitbart, or MSNBC, etc. starts appearing there, their followers will have no choice but to recognize the importance of the debate contents.
- A canonical debate changes the incentives of news sources. So far in the internet era, the incentive seems to be on the creation of sensationalist news and "clickbait" in order to make money off views. Once reporting itself can be constructively debated, reputations will be on the line, and false claims easily refuted. The canonical debate will become a place which drives traffic out to news sources - those with the best original investigative reporting (via "evidence"), the best fact-checking, and the best fact-based opinion papers and stories that can tie it all together in a thoughtful way. No more running around calling something "FAKE news!" - the proof is in the pudding.
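The Evidence-with-provenance idea sketched in the notes above could be prototyped with nothing more than a content digest. This sketch is illustrative only: it uses a plain SHA-256 hash rather than any specific blockchain, and the `Evidence` class and helper names are hypothetical.

```python
import hashlib
from dataclasses import dataclass

@dataclass(frozen=True)
class Evidence:
    description: str
    source: str        # provenance: who published the artifact
    content_hash: str  # digest registered when the evidence was submitted

def fingerprint(content: bytes) -> str:
    """SHA-256 digest of the artifact, used to detect later alteration."""
    return hashlib.sha256(content).hexdigest()

def is_unaltered(evidence: Evidence, content: bytes) -> bool:
    """True if the artifact still matches its registered fingerprint."""
    return fingerprint(content) == evidence.content_hash

original = b"raw bytes of the original video"
video = Evidence("Campaign speech video", "example-news.org",
                 fingerprint(original))
```

Anchoring `content_hash` on a blockchain at submission time would additionally prove when the fingerprint was registered, though, as the notes observe, it cannot detect alteration that happens at the moment of recording.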
3.7 Other Features.
3.7.1 Notifications.
3.7.2 Search.
3.7.3 Recommendations.
3.7.4 Other Opportunities for Machine Learning.
3.7.4.1 Stopping Trolls.
- https://www.wired.com/2017/02/googles-troll-fighting-ai-now-belongs-world/
- https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge
- https://arxiv.org/abs/1610.08914
- https://arxiv.org/abs/1805.05345
3.8 Technologies.
TBD - in experimental phase
3.9 Blockchain.
3.10 Problem Solving.
The first phase of Gruff will not focus on more complex questions, the kind that make up discussions around legislation. While Gruff is excellent for debating specific points of a proposed solution, its structure is inadequate for the full problem-solving process, which involves:
- Problem description
- Debate whether or not the problem needs a solution at all
- Different proposed solutions
- Scoring solutions vs. trade offs
- Governance: cost, time for implementation, when to review success
- Voting
kwiesmueller [2:05] But I am not even sure if what I mentioned here is a "problem". I wanted to kick this thought in from the side, as all philosophical thinking and solutions about debating might be useless if we lose focus on how, where and by whom decisions are made in the near and distant future.
timothy.high [2:05]
- How do we stop terrorist attacks on the U.S.?
- Not letting people in is bad for the economy
- Not letting people in is bad for human rights (esp. for refugees)
- Surveillance is bad for privacy
- Not doing anything will put U.S. (or whatever country’s) lives at risk
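The problem-solving structure outlined above (problem, proposed solutions, trade-off scores) could be sketched as follows; the class and field names are hypothetical, and the weighted-sum ranking is only one of many possible aggregation rules:

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical data model for the problem-solving phase.
@dataclass
class Solution:
    title: str
    tradeoff_scores: Dict[str, float] = field(default_factory=dict)  # 0.0-1.0 per trade-off

@dataclass
class Problem:
    description: str
    needs_solution_score: float  # outcome of the "does this need a solution?" debate
    tradeoffs: List[str] = field(default_factory=list)
    solutions: List[Solution] = field(default_factory=list)

def rank_solutions(problem: Problem, weights: Dict[str, float]) -> List[Solution]:
    """Rank solutions by the weighted sum of their trade-off scores;
    voters' priorities enter through the weights."""
    def total(solution: Solution) -> float:
        return sum(weights.get(t, 0.0) * solution.tradeoff_scores.get(t, 0.0)
                   for t in problem.tradeoffs)
    return sorted(problem.solutions, key=total, reverse=True)

problem = Problem("How do we stop terrorist attacks on the U.S.?", 0.9,
                  tradeoffs=["economy", "privacy", "security"])
problem.solutions = [
    Solution("Do nothing", {"economy": 0.9, "privacy": 0.9, "security": 0.2}),
    Solution("Increase surveillance",
             {"economy": 0.8, "privacy": 0.2, "security": 0.8}),
]
```

With equal weights, "Do nothing" ranks first (0.9 + 0.9 + 0.2 = 2.0 vs. 1.8); tripling the weight on security flips the order, making the voters' trade-off preferences explicit rather than buried in rhetoric.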
4. About.
The Canonical Debate Lab is a community project of the Democracy Earth Foundation, made possible by collaborators and supporters around the world. While we maintain a close relationship with the Democracy Earth Foundation, we hold our own separate community events, keep our own Slack team and GitHub repository, and act as an independent organization.
4.1 Team & Collaborators.
Timothy High, Bruno Sato, Stephen Wicklund, Iwan Itterman, Kevin Wiesmueller, Bentley Davis, Jamie Joyce, Oz Fraier, Benjamin Brown, James Tolley.
4.2 Acknowledgements.
These are some of the minds that inspired the ideas expressed in this document.
Mark Klein (MIT).
4.3 Supporters.
These organizations supported our work through partnerships and recognition of our research and development efforts.
Democracy Earth Foundation, Digital Peace Talks, Internet Government.