
Team: _blank_


COMP30024 - Artificial Intelligence, The University of Melbourne

GitHub for Akira and Callum.

Report: https://www.overleaf.com/4252552393kxhndqtrynfg

Summary Notes for AI here

Notes for the subject (courtesy of this bloke)

What’s in the Exam???

  • Worth 70 marks: 7 questions, 10 marks each
  • 3 hours
  • 35 mark hurdle
  • Half of subject on Symbolic AI and half on Probabilistic AI
  • 2-3 sentence answers

Week 1: What is AI? Intelligent Agents

Explain different approaches to defining AI
  • Thinking like a human
  • Thinking rationally
  • Acting like a human
  • Acting rationally
Describe the operation of the Turing test
  • A person talks to a computer via typing
  • Computer has to imitate a human well enough to fool the human interrogator
Characterise the difficulty of different common tasks
  • Semantic understanding is hard
  • Maths related things are easy
  • Conversational agents are hard
Characterise requirements for an agent in terms of its percepts, actions, environment and performance measure
  • Has to take inputs i.e. sensors
  • Has to do actions which change the environment with actuators
  • Has to exist in an environment
  • Has to evaluate its current state and have a performance measure of the desirability of the state
  • Just remember PEAS
    • Performance Measure
    • Environment
    • Actuators
    • Sensors
Characterise the environment for a given problem
  • Fully vs Partially Observable
    • Can you see all relevant info?
  • Deterministic vs Stochastic
    • Is there a chance your action's outcome differs from what you intended, or does your move fully determine the next state?
  • Episodic vs Sequential
    • Do your current actions affect future decisions (sequential), or is each decision independent of the previous ones (episodic)?
  • Static vs Dynamic
    • Does your environment change while you're thinking/deciding?
  • Discrete vs Continuous
    • Is there a finite number of distinct states, percepts and actions?
Choose and justify choice of agent type for a given problem
  • Simple Reflex Agents
    • Very simple
    • If x do y
    • Good for categorising (see the sketch after this list)
  • Model Based Reflex Agents
    • Has a current internal representation of the world
    • Uses a lot of memory
    • Decides things based on current state
  • Goal Based Agents
    • Contains future predicted states of the world
    • Has a goal to achieve
    • Decides things based on current/future predictions
  • Utility Based Agents
    • Same as above but has utility values
    • Tries to maximise expected utility (i.e. happiness)
  • Make a decision based on the environment
    • Don't use something too complex
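
A minimal sketch of a simple reflex agent, using the classic two-room vacuum world (the percept format and action names here are made up for illustration):

def simple_reflex_agent(percept):
	# Sensors give us (location, status); condition-action rules choose the actuator
	location, status = percept
	if status == 'dirty':
		return 'suck'
	return 'right' if location == 'A' else 'left'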

Week 2: Problem Solving and Search

Formulate single state search problem
  • Formulate a Problem:
    • Consider the current state and goal state
    • Consider the operations we can do to change state
    • e.g. drive from southbank to unimelb
      • States: Southbank, Unimelb
      • Actions: Drive
  • A solution is considered a sequence of actions leading from the initial state to the goal state
    • We measure how good a solution is by using path cost
Analyse complexity of a search strategy
  • Completeness: Does it always find a solution if one exists?
  • Time Complexity: Number of nodes generated/expanded
  • Space Complexity: Maximum number of nodes in memory
  • Optimality: Does it always find a least-cost solution?
Apply a search strategy to solve problem
  • Note that:

    • b = maximum branching factor of the search tree
    • d = depth of the least-cost solution
    • m = maximum depth of the state space (could be infinite)
  • BFS

    • Expands the shallowest unexpanded node
    • Complete if the branching factor (b) is finite
    • Time: O(b^d)
    • Space: O(b^d) (keeps every node in memory)
    • Optimal if path cost is uniform (e.g. cost = 1 per step), otherwise not optimal in general; see the BFS sketch after this list
  • DFS

    • Expands depth wise (from the deepest unexpanded node)
    • Not Complete in infinite-depth spaces or spaces with loops (can follow one branch forever); fine in finite spaces with repeated-state checking
    • Time: O(b^m)
    • Space: O(bm) (at any given time, in the worst case, you only need to remember a single path of up to m states, plus at most b unexpanded siblings at each level --> bm!)
    • Not an Optimal Path
  • Uniform Cost Search

    • Expand least-cost unexpanded node
    • Complete if every step cost >= ε > 0
    • Time: O(b^⌈C*/ε⌉), i.e. the number of nodes with g <= C*, the cost of the optimal solution
    • Space: O(b^⌈C*/ε⌉), again the number of nodes with g <= C*
    • Optimal since it always expands the least-cost node first
  • Depth Limited Search

    • Perform a DFS with depth limit = l
    • Not Complete in general (complete only if l >= d)
    • Time: O(b^l)
    • Space: O(bl)
    • Not Optimal
  • Iterative Deepening Search

    • Run depth-limited DFS with limit 0, 1, 2, ... until a solution is found
    • Complete
    • Time: O(b^d)
    • Space: O(bd)
    • Optimal if step cost = 1
    • Can be modified to explore a uniform-cost tree
  • Bidirectional Search

    • Search from both goal and start
    • Complete
    • Time: O(b^(d/2))
    • Space: O(b^(d/2))
    • Optimal if done with the correct strategy (such as BFS)
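
A minimal BFS sketch over a graph given as a dict of successor lists (graph, start and goal are illustrative names, not from the notes):

from collections import deque

def bfs(graph, start, goal):
	frontier = deque([[start]])  # FIFO queue of paths: expand shallowest node first
	visited = {start}
	while frontier:
		path = frontier.popleft()
		node = path[-1]
		if node == goal:
			return path  # sequence of states from start to goal
		for s in graph.get(node, []):
			if s not in visited:
				visited.add(s)
				frontier.append(path + [s])
	return None  # no solution exists (BFS is complete for finite b)

Swapping the deque for a stack (popping from the end) turns this into DFS; ordering the frontier by path cost g instead gives uniform cost search.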

Week 3: Informed Search Algorithms

Demonstrate operation of search algorithms
Discuss and evaluate the properties of search algorithms
  • Greedy Best First Search

    • Go down the path that seems closest to the goal, i.e. always expand the node with the lowest h(n)
    • Not Complete (can loop forever without repeated-state checking)
    • Time: O(b^m)
    • Space: O(b^m)
    • Not Optimal
  • AStar (A*) Search

    • Avoid expanding paths that are already expensive
      • Keep an ordered list of the f(n) cost of things
      • Expand the next one
    • Evaluate things with:
      • f(n) = cost so far to reach n (g(n)) + estimated cost to goal from n (heuristic h(n))
    • Similar to Uniform Cost Search but also accounting for estimated distance to the goal
    • Complete
    • Time: exponential in [relative error in h × length of solution] - expands similarly to breadth first, roughly O(b^(ε·d))
    • Space: O(b^d), keeps all nodes in memory
    • Optimal (given an admissible heuristic); see the A* sketch after this list
  • Iterative Improvement Search

    • IDEA: Consider a current state, and try to optimise it iteratively by choosing configurations that improve on the current.
    • Evaluate with some "objective" function (i.e. a utility/goal function) to determine how close a state is to optimality
    • Path is irrelevant, only achievement of the goal/constraint is priority
    • Constant space (history is not recorded, your current state always overwrites the previous in the algorithm)
    • Example algorithm: Hill-climbing (applied iteratively to achieve this search)
      • Take the current state
      • Look at successor states, and move to the best one (the successor scoring highest on the objective function)
      • NOTE: sometimes enumerating all successors is expensive --> some variants choose a successor at random instead
    • HOWEVER this algorithm gets stuck in local maxima rather than finding the global maximum
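
A minimal A* sketch, assuming graph maps each state to a list of (successor, step cost) pairs and h is the heuristic (all names illustrative):

import heapq
from itertools import count

def a_star(graph, start, goal, h):
	tie = count()  # tie-breaker so the heap never compares states directly
	frontier = [(h(start), 0, next(tie), start, [start])]  # ordered by f = g + h
	best_g = {start: 0}
	while frontier:
		f, g, _, node, path = heapq.heappop(frontier)
		if node == goal:
			return path, g
		for succ, cost in graph.get(node, []):
			g2 = g + cost
			if g2 < best_g.get(succ, float('inf')):  # found a cheaper path to succ
				best_g[succ] = g2
				heapq.heappush(frontier, (g2 + h(succ), g2, next(tie), succ, path + [succ]))
	return None, float('inf')

With h = 0 this behaves like uniform cost search; with an admissible h it returns an optimal path (see the heuristic behaviour notes below).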
Heuristic behaviour for A*
  • If h(n) is very high relative to g(n), then only h(n) plays a role, and A* turns into Greedy Best-First-Search
  • If h(n) is 0, then only g(n) plays a role, and A* turns into Uniform Cost Search, which finds the optimal solution
  • If h(n) is always lower than (or equal to) the cost of moving from n to the goal, then A* is guaranteed to find a shortest path. The lower h(n) is, the more nodes A* expands, making it slower.
  • If h(n) is sometimes greater than the cost of moving from n to the goal, then A* is not guaranteed to find a shortest path, but it can run faster
  • If h(n) is exactly equal to the cost of moving from n to the goal, then A* will only follow the best path and never expand anything else, making it very fast.
Derive and compare heuristics for a problem e.g., is a given heuristic h1 admissible; for given heuristics h1 and h2, does h1 dominate h2
  • The heuristic is the h(n) part of the A* evaluation function
  • An admissible heuristic never overestimates: its value is always lower than (or equal to) the true cost of getting from n to the goal
    • If a heuristic is admissible then A* is guaranteed to find an optimal path to the goal
    • If a heuristic is not admissible then A* may not find an optimal path to the goal, but it can get there faster
  • A heuristic dominates another when all of its values for the same input are at least as high as the other's (assuming both are admissible); see the worked example below
    • h1(n) >= h2(n) for all n
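
A worked example of dominance, using two standard grid heuristics for illustration: on a 4-connected grid, both Manhattan distance (h1) and Chebyshev distance (h2) are admissible, and h1 dominates h2 because |dx| + |dy| >= max(|dx|, |dy|) everywhere:

def h1(x, y, gx, gy):  # Manhattan distance to goal (gx, gy)
	return abs(x - gx) + abs(y - gy)

def h2(x, y, gx, gy):  # Chebyshev distance: also admissible, but weaker
	return max(abs(x - gx), abs(y - gy))

# h1(n) >= h2(n) for all n, so A* with h1 expands no more nodes than with h2
assert all(h1(x, y, 3, 4) >= h2(x, y, 3, 4) for x in range(10) for y in range(10))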

Week 4: Game Playing and Adversarial Search

A ply is a single move by one player, so an n-ply search looks ahead n individual moves.

Types of Games:

|                       | Deterministic                | Chance                               |
| --------------------- | ---------------------------- | ------------------------------------ |
| Perfect Information   | Chess, Checkers, Go, Othello | Backgammon, Monopoly                 |
| Imperfect Information |                              | Bridge, Poker, Scrabble, Nuclear War |

Representing a Game as a Search Problem

We can formally define a strategic two-player game by:

  • Initial State
  • Actions
  • Terminal Test (Win / Lose / Draw)
  • A Utility Function (Numeric Reward for Outcome):
    • Chess: +1, 0, -1
    • Poker: Cash won or lost

In a zero-sum game with 2 players, each player's utility for a terminal state is equal and opposite (the utilities sum to zero)

Minimax

  • Check all possible moves for their value up to a certain level
    • Estimate each position's value with an evaluation function (similar in spirit to an A* heuristic)
    • Maximize our utility value when it is our turn
    • Minimize enemy utility value when it is not
  • Absolutely perfect play with infinite lookahead
  • Complete
  • Optimal (against another optimal opponent)
  • Complexity is same as DFS
  • Time: O(b^m)
  • Space: O(bm) (depth-first exploration)
def minimax_decision(state, game):
	# Pick the operator whose resulting state has the highest minimax value
	# (assumes a game object providing operators, apply, terminal_test,
	# utility, successors and to_move, as in the lecture pseudocode)
	ops = game.operators(state)
	return max(ops, key=lambda op: minimax_value(game.apply(op, state), game))

def minimax_value(state, game):
	if game.terminal_test(state):
		return game.utility(state)
	values = [minimax_value(s, game) for s in game.successors(state)]
	# Maximise on our turn, minimise on the opponent's
	return max(values) if game.to_move(state) == 'MAX' else min(values)

How to speed up Minimax under time constraints (Alpha-Beta Pruning):

  • Cutoff test (a depth limit) is used instead of the Terminal test
  • Evaluation function (estimated desirability of a position) is used instead of the Utility function
  • Idea: Prune parts of the tree that cannot affect the final decision, without the need to examine them
  • Works the same as Minimax, but keep track of the best value found so far for MAX (alpha) and for MIN (beta), and discard any branch that cannot beat these bounds
    • Calculate our value
    • e.g. if MAX already has a guaranteed 3 (alpha = 3) and we find a 2 while minimising another branch, MAX would never play into that branch, so discard the rest of it
    • Basically imagine you're pruning a tree down to only the branches that matter
  • Not Complete (when used with a cutoff)
  • Optimal since pruning does not affect the final result
  • Time: O(b^(m/2)) with Perfect Ordering
    • Doubles the reachable depth of search and can reach depth 8 to play good chess
def max_value(state, game, alpha, beta):
	# alpha: the best score for MAX along the path to state
	# beta:  the best score for MIN along the path to state
	if cutoff_test(state):
		return eval_fn(state)
	for s in successors(state):
		alpha = max(alpha, min_value(s, game, alpha, beta))
		if alpha >= beta:  # MIN would never allow this branch: prune
			return beta
	return alpha

def min_value(state, game, alpha, beta):
	if cutoff_test(state):
		return eval_fn(state)
	for s in successors(state):
		beta = min(beta, max_value(s, game, alpha, beta))
		if beta <= alpha:  # MAX would never allow this branch: prune
			return alpha
	return beta
  • Expectiminimax
    • Minimax, but every move with a random outcome goes through a "chance node" whose value is the probability-weighted average of its children (see the sketch after this list)
    • Tree grows really fast
    • As depth increases, probability of reaching a node becomes diminished
      • Means we care less about really deep nodes
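
A minimal expectiminimax sketch over a hand-rolled tree encoding (the ('kind', payload) node format is made up for illustration):

def expectiminimax(node):
	kind, payload = node  # kind is 'max', 'min', 'chance' or 'leaf'
	if kind == 'leaf':
		return payload  # utility value
	if kind == 'max':
		return max(expectiminimax(c) for c in payload)
	if kind == 'min':
		return min(expectiminimax(c) for c in payload)
	# chance node: payload is a list of (probability, child) pairs
	return sum(p * expectiminimax(c) for p, c in payload)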

Week 5: Machine Learning in Game Search

Discuss opportunities for learning in game playing
  • Can use machine learning in AI when we have an algorithm with weighted features
  • Gradient Descent Learning
    • Error: \large E = \frac{1}{2}\sum(t-z)^{2}
    • Weight Update Rule: \large w_{i} \leftarrow w_{i} - \eta \frac{\partial E}{\partial w_{i}}
    • Keep making small changes in weight until the error reaches a minimum (see the sketch after this list)
    • Hard to balance the learning rate η
      • Too big = may overshoot and miss the global minimum
      • Too small = takes forever and can get stuck in a local minimum
  • Problems with Machine Learning:
    • Delayed Reinforcement: It takes a while before we get feedback
    • Credit Assignment: We don't know what actions resulted in the outcome
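
A minimal sketch of one gradient-descent step for a linear evaluation function z = w · f(s); the feature vector, target t and learning rate eta are illustrative:

def gradient_descent_step(w, features, t, eta=0.01):
	z = sum(wi * fi for wi, fi in zip(w, features))  # current prediction
	# E = 1/2 (t - z)^2, so dE/dw_i = -(t - z) * f_i
	return [wi + eta * (t - z) * fi for wi, fi in zip(w, features)]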
Explain differences between supervised and temporal difference learning
  • Supervised Learning:
    • Single step prediction
    • Predict tomorrow
  • Temporal Difference Learning
    • Multi-step prediction
      • Predict tomorrow and the day after etc.
    • Delayed feedback
    • Hope we're doing this right lol
Not expected to derive or memorise the TDLeaf(λ) weight update rule, but if given this rule may ask you to explain what the main terms mean
  • TD Leaf: \large w_{j} \leftarrow w_{j} + \eta \sum_{i=1}^{N-1} \frac{\partial r(s_{i}^{l}, w)}{\partial w_{j}} \left[\sum_{m=i}^{N-1}\lambda^{m-i}d_{m}\right]
  • wj: Current weight
  • dm: Error (the temporal difference at step m)
  • η: Learning rate
  • λ = 0: weights move towards the predicted reward at next state (i.e. move towards r(si+1, w)) using only the next state
  • λ = 1: weights move towards the final true reward (i.e. move towards r(sN)) using all states
  • \large \frac{\partial r(s_{i}^{l}, w)}{\partial w_{j}}: How the reward at the i-th leaf state is changing with respect to the weight
  • \large \sum_{m=i}^{N-1}\lambda^{m-i}d_{m}: Weighted sum of errors from step i onwards
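
The same update written as code, assuming grad(s)[j] gives ∂r(s, w)/∂w_j and d[m] holds the temporal-difference errors (all names illustrative):

def tdleaf_update(w, states, grad, d, eta, lam):
	N = len(states)
	for j in range(len(w)):
		w[j] += eta * sum(
			grad(states[i])[j] * sum(lam ** (m - i) * d[m] for m in range(i, N - 1))
			for i in range(N - 1))
	return w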

Week 6: Constraint Satisfaction Problems

Model a given problem as a CSP
  • A CSP has three main parts.
    • X set of variables: {X1... Xn}
    • D set of domains for each variable: {D1...Dn}
    • C set of constraints that must be fulfilled
      • Can be unary (a single variable), binary (2 variables) or higher-order (3 or more variables, e.g. all different values)
      • A goal test is defined by these constraints
  • Model a problem in terms of these:
    • e.g. classic lecture scheduling problem
    • X would be lectures
    • D would be times available
    • C would be all times different for each lecture i.e. allDiff
Varieties of Constraints
  • Unary constraints involve a single variable (SA not "green")
  • Binary constraints involve a pair of variables (SA not WA)
  • High-Order constraints involve 3 or more variables (cryptarithmetic column constraints)
  • Preferences (soft constraints) are often representable by a cost for each variable assignment (constraint optimization problems)
Cryptarithmetic

Assign digits to letters such that "TWO + TWO" is equal to "FOUR"

  • Variables: F, T, U, W, R, O, X1, X2, X3 (X1-X3 are the carry digits)
  • Domains: {0,1,2,3,4,5,6,7,8,9}
  • Constraints: alldiff(F, T, U, W, R, O)
    • O + O = R + 10·X1 (the ones column, using 'O' from 'TWO'; brute-force check below)
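
A brute-force check of this formulation (assumes the leading digits F and T are nonzero; a real CSP solver would prune with the constraints instead of enumerating):

from itertools import permutations

for f, t, u, w, r, o in permutations(range(10), 6):
	if f == 0 or t == 0:
		continue  # leading digits can't be zero
	two = 100 * t + 10 * w + o
	four = 1000 * f + 100 * o + 10 * u + r
	if two + two == four:
		print(two, '+', two, '=', four)  # e.g. 734 + 734 = 1468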
Demonstrate operation of CSP search algorithms
  • Backtracking Search
    • Depth first search that assigns a domain value to one variable at a time, backtracking when a constraint fails
      • Literally takes forever
      • Ends up making n!·d^n leaves! Insanity.
  • Forward Checking
    • Whenever you assign a variable, remove this value from any neighbours
      • e.g. for the australia state colour problem:
        • We assign RED to WA:
        • This results in RED being removed from NT and SA
        • Now we assign GREEN to Q:
        • Continue till we reach a point where there is no domain left for a variable
      • Complexity depends on the order in which you check variables/values
      • Worst case it is O(d^n)
  • Arc Consistency (AC-3)
    • Start from a queue of all arcs (X, Y); for each arc, remove every value of X that has no consistent value left in Y's domain
      • If any value was removed from X, add all arcs (Z, X) pointing into X back to the queue to check again later
      • Keep going until the queue is empty (i.e. the graph is arc consistent and there are no conflicts)
    • Time Complexity: O(n²d³)
      • Can be reduced to O(n²d²) with a good heuristic/ordering
    • This detects conflicts early
    • You should do this before backtracking search (see the AC-3 sketch after this list)
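
A minimal AC-3 sketch, where domains maps each variable to a set of values, neighbours maps each variable to the variables it shares a constraint with, and consistent(x, vx, y, vy) tests the binary constraint (all names illustrative):

from collections import deque

def ac3(domains, neighbours, consistent):
	queue = deque((x, y) for x in domains for y in neighbours[x])  # all arcs
	while queue:
		x, y = queue.popleft()
		# Revise: drop x-values with no supporting value in y's domain
		removed = {vx for vx in domains[x]
		           if not any(consistent(x, vx, y, vy) for vy in domains[y])}
		if removed:
			domains[x] -= removed
			if not domains[x]:
				return False  # empty domain: inconsistent
			for z in neighbours[x]:  # re-check arcs pointing into x
				if z != y:
					queue.append((z, x))
	return True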
e.g., in what order are variables or values chosen using minimum remaining values, degree heuristic, least constraining value
  • Variable Ordering
    • Minimum Remaining Values
      • Pick a variable with the least number of remaining legal values (i.e. smallest domain left)
        • e.g. WA has only RED left, so we assign WA first
          • Reduces the number of choices for other variables
    • Degree Heuristic
      • Pick the variable which has the most constraints on the other unassigned variables
        • e.g. we start with SA since it constrains literally every other state lol
        • Can be used as a tiebreaker for minimum remaining values
  • Value Ordering
    • Least Constraining Value
      • Pick the value for the variable which removes the least number of values from its neighbours
e.g., show how the domain of values of each variable are updated by forward checking, or arc consistency, where X → Y means using arc consistency to update domain of X so that for every value x ∈ X there is some allowed value y ∈ Y
  • See above
Discuss and evaluate the properties of different constraint satisfaction techniques
  • Tree Structured CSP
    • Make a variable a root node
    • Go down the tree (children last) and apply makeArcConsistent(Parent(Xj), Xj) to all nodes
    • Then assign values from the root down, choosing for each Xj a value consistent with its parent's assignment
    • Best Case Time Complexity: a tree can be solved in O(nd²), i.e. linear in the number of variables
      • You only need to take into account 2 nodes (one arc) at a time, since removing a domain value from a node will not impact anything except its parent arc - that is where the d² comes from
      • If there are loops then you need to account for those with the below:
    • Worst Case Time Complexity: O(d^n), i.e. a tree with loops
  • Nearly Tree Structured CSP
    • We can use Cutset Conditioning
    • Conditioning: instantiate a variable and prune its neighbours' domain values
    • Cutset: set of variables that can be deleted so constraint graph forms a tree
    • Cutset conditioning: instantiate (in all ways) a set of variables such that the remaining constraint graph is a tree
      • Then we can apply the above technique
    • Time Complexity: where c is the cutset size -> O(d^c · (n-c)d²): instantiate the variables which make the tree loopy in every possible way (d^c), then solve the remaining tree in (n-c)d² each time

Week 8: Making Complex Decisions

Compare and contrast different types of auctions
  • Auctions have 3 main dimensions:
    • Winner Determination: Which bidder wins and what do they pay?
      • First price auctions: highest bid is paid
      • Second-price auctions: second highest bid is paid
    • Knowledge of Bids:
      • Open cry: Every bidder can see all bids from all other bidders
      • Sealed: bidders cannot see each other's bids
    • Order of bids:
      • One-shot: each bidder can only make one bid
      • Ascending: each successive bid must exceed the previous bid
      • Descending: start from a high price and go down and the first to bid wins
  • Mechanism for an auction consists of:
    • a language to describe the allowable strategies an agent can follow
    • a communication protocol for communicating bids from bidders to the auctioneer
    • an outcome rule, used by auctioneer to determine the outcome
      • when to stop the auction
      • who’s the winner
      • how much they have to pay
Describe the properties of a given type of auction
  • English
    • Start low and keep open-cry bidding higher
    • Efficient
    • Susceptible to the winner's curse
    • Susceptible to collusion
  • Dutch
    • Start high and go lower until someone open-cry bids
    • Efficient
    • Susceptible to the winner's curse
    • Susceptible to collusion
  • First-Price Sealed-Bid Auction
    • Everyone bids what they think it's worth, once and sealed; the highest bidder wins and pays their own bid
    • May not be efficient
      • Since person who values it the most may not win
    • Susceptible to the winner's curse
    • Not susceptible to collusion
    • Easier to communicate
  • Second-Price Sealed-Bid (Vickrey) Auction
    • Everyone bids what they think it's worth once and the winner pays the second highest value
    • Efficient and Truth revealing
      • Strategy is just to bid your value
    • Not susceptible to the winner's curse
      • You pay the second-highest bid, not your own valuation
    • Not susceptible to collusion
    • Computational simplicity
    • Counter intuitive for human bidders
Select the most appropriate type of auction for a given application

Week 9: Uncertainty

Calculate conditional probabilities using inference by enumeration
  • Inference by enumeration: compute a conditional probability by summing entries of the full joint distribution over the unobserved variables, then normalising
Use conditional independence to simplify probability calculations
  • A and B are independent iff
    • P(A|B) = P(A) or P(B|A) = P(B) or P(A, B) = P(A)P(B)
    • Also means P(A|¬B) = P(A), etc.
  • A and B are conditionally independent given C iff P(A|B, C) = P(A|C)
    • This lets us simplify P(A|B, C) to P(A|C)
    • Also P(A, B|C) = P(A|C)P(B|C)
  • Now we can simplify complex things such as the full joint distribution using the chain rule
    • A full joint distribution P(A, B, C) over binary variables normally has 2³ - 1 = 7 independent entries
    • Reduce it by using the above rules + chain rules
      • Chain Rule: P(X1, X2, X3, X4) = P(X4|X1, X2, X3) P(X3|X1, X2) P(X2|X1) P(X1)
      • P(A, B, C) = P(C|A, B) P(A, B)
      • P(A, B, C) = P(C|A, B) P(B|A) P(A)
      • If B and C are conditionally independent given A, then P(C|A, B) = P(C|A), so P(A, B, C) = P(C|A) P(B|A) P(A) = 2 + 2 + 1 = 5 entries vs 7
Use Bayes’ rule for solving diagnostic problems
  • P(A|B)P(B) = P(B|A)P(A)
  • Rearrange to get
    • P(A|B) = P(B|A)P(A)/P(B) (worked example below)
Note: if the arithmetic is too complex to compute the exact final value then simplify the expression as best you can
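
A worked diagnostic example with made-up numbers (1% prior, 95% sensitivity, 5% false-positive rate):

p_d, p_pos_d, p_pos_nd = 0.01, 0.95, 0.05
p_pos = p_pos_d * p_d + p_pos_nd * (1 - p_d)  # P(pos) by total probability
print(p_pos_d * p_d / p_pos)  # P(disease|pos) ≈ 0.161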

Week 10: Bayesian Networks

Formulate a belief network for a given problem domain
  • Basically form a graph where causal nodes point to symptom nodes
    • Causes -> Symptoms
  • Each node should be conditionally independent of its non-descendants given its parents; e.g. a grandparent node tells you nothing extra about its grandchildren once the nodes in between are known
Derive expression for joint probability distribution for given belief network
  • See Week 9: Uncertainty
  • This can be done with global semantics:
    • Defines a full joint distribution as the product of the local conditional distributions: P(x1, ..., xn) = ∏ P(xi | parents(Xi))
    • Local semantics: each node is conditionally independent of its nondescendants given its parents
Use inference by enumeration to answer a query about simple or conjunctive queries on a given belief network
  • Simple Query:
    • Simply calculating a probability from posterior marginals
    • e.g., P(NoGas|Gauge = empty, Lights = on, Starts = false)
  • Conjunctive Query:
    • Breaking a query into smaller queries using the chain rule:
    • P(Xi, Xj | E = e) = P(Xi | E = e) P(Xj | Xi, E = e) (see the enumeration sketch below)
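
A tiny inference-by-enumeration sketch on a two-node network A -> B, with made-up CPT numbers (P(A) = 0.3, P(B|A) = 0.8, P(B|¬A) = 0.1):

p_a = 0.3
p_b_given = {True: 0.8, False: 0.1}

def joint(a, b):  # P(a, b) = P(a) P(b|a)
	pa = p_a if a else 1 - p_a
	pb = p_b_given[a] if b else 1 - p_b_given[a]
	return pa * pb

# Query P(A | B=true): enumerate joint entries consistent with the evidence, normalise
num = joint(True, True)
print(num / (num + joint(False, True)))  # ≈ 0.774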

Week 11: Robotics

Determine the number of degrees of freedom of a robot, and whether it is holonomic
  • Degrees of freedom: Basically how many different ways can we "move"
    • You need 6 degrees of freedom to move around without restrictions in 3d space (3 for position and 3 for rotation)
  • Something is non-holonomic when it has more degrees of freedom than controls
    • For example a car is non-holonomic because it has 2 controls but 3 degrees of freedom
      • It can steer and accelerate/decelerate along its heading, but it cannot move directly sideways, even though its pose has three components (x and y position plus heading)
Characterise sources of uncertainty in a robot application scenario
  • Examples of Sources of Uncertainty in Robots:
    • Perceptions (incorrect sensing)
      • Incorrect sensors
      • False positive/negative
      • Limited/Finite Time
      • Partially Obstructed Environment
    • Actions (incorrect doing)
      • Rough Surfaces
      • Obstacles
      • Slippage
      • Inaccurate Joint Encoding
      • Injuries (Effective Breakdown)
Explain the basic concepts of localisation and mapping
  • Localisation: work out where the robot is within a map; Mapping: build that map from what the robot senses (see the comparison under motion planning below)
Formulate an application problem using incremental Bayes, and calculate posterior probabilities
  • Tutorial 11
    • Basically use the posterior after one observation as the prior for the next, e.g. when calculating P(o|++) from P(o|+) (sketch below)
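
A minimal incremental-Bayes sketch with made-up numbers: the posterior after each '+' observation becomes the prior for the next (observations assumed conditionally independent given the hypothesis):

p_h = 0.5  # prior P(h)
p_pos_h, p_pos_noth = 0.9, 0.2  # P(+|h), P(+|not h)
for _ in range(2):  # observe '+' twice
	p_pos = p_pos_h * p_h + p_pos_noth * (1 - p_h)
	p_h = p_pos_h * p_h / p_pos  # posterior becomes the new prior
print(p_h)  # ≈ 0.953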
Model the configuration space for a simple robot
  • Draw where it can't go lol
Compare different approaches to motion planning given a particular configuration space
  • Cell Decomposition:
    • Break configuration space into simpler cells
      • Depends on the resolution of the cells
      • Basically like pixelating our config space so it's finite
    • Inaccurate at coarse resolutions
  • Skeletonisation:
    • Voronoi Diagrams:
      • Plot paths that stay as far as possible from obstacles (edges equidistant from the nearest obstacles)
      • Voronoi Diagrams don't scale well to higher dimensions
    • Probabilistic Roadmap:
      • Generate random points within the configuration space and create a graph by connecting points that can reach each other directly
      • Need to generate enough points to ensure everywhere is reachable
  • Localisation vs Mapping
    • Localisation is when you're given a map and observed landmarks and you update your location and orientation distribution
      • Where am I?
    • Mapping is when you're given the pose distribution and observed landmarks and you derive a map from this
      • Making a map
    • SLAM (Simultaneous):
      • Do both of the above at the same time