Theoretical analysis of Machine Learning algorithms. Assignments from AI master at University of Bucharest.
๐ LaTeX is not supported in GitHub's markdown files: For better formatting, jump to my PDF solutions.
Check my solution is here.
Exercise 1. Give an example of a finite hypothesis class H with VCdim(H) = 2021. Justify your choice.
Exercise 2. Consider H balls to be the set of all balls in Rยฒ: H balls = {B(x,r), x โ โยฒ , r โฅ 0 }, where B(x,r) = {y โ โ 2 | || y โ x ||โ โค r}. As mentioned in the lecture, we can also view H_balls as the set of indicator functions of the balls B(x,r) in the plane. Can you give an example of a set A in Rยฒ of size 4 that is shattered by H balls ? Give such an example or justify why you cannot find a set A of size 4 shattered by H_balls .
Exercise 3. Let X = Rยฒ and consider Hฮฑ the set of concepts defined by the area inside a right triangle ABC with the two catheti AB and AC parallel to the axes (Ox and Oy) and with AB / AC = ฮฑ (fixed constant > 0). Consider the realizability assumption. Show that the class H_ฮฑ can be (๐, ๐ฟ) โ PAC learned by giving an algorithm A and determining an upper bound on the sample complexity m_H(๐, ๐ฟ) such that the definition of PAC-learnability is satisfied.
Exercise 4. Consider H to be the class of all centered in origin sphere classifiers in the 3D space. A centered in origin sphere classifier in the 3D space is a classifier h_r that assigns the value 1 to a point if and only if it is inside the sphere with radius r > 0 and center given by the origin O(0,0,0). Consider the realizability assumption.
- show that the class H can be (๐, ๐ฟ) โ PAC learned by giving an algorithm A and determining an upper bound on the sample complexity m_H(๐, ๐ฟ) such that the definition of PAC-learnability is satisfied.
- compute VCdim(H).
Exercise 5. Let H = {โ_๐ : โ โ {0, 1} , โ_๐(๐ฅ) = ๐[๐, ๐+1]โช[๐+2, โ)(๐ฅ), ๐ โ โ}. Compute VCdim(H).
Exercise 6. Let X be an instance space and consider H โ {0,1}^X a hypothesis space with finite VC dimension. For each ๐ฅ โ X, we consider the function z_x : H โ{0,1} such that z_x(h) = h(x) for each โ โ H. Let Z = {z_x : Hโ{0,1}, ๐ฅ โ X}. Prove that VCdim(Z) < 2^{VCdim(H)+1}.
Check my solution is here.
Exercise 1. Let X be an instance space. The learning algorithm A is better than the learning algorithm B with respect to some probability distribution, D, if we have: L_D(A(S)) โค L_D(B(S)) for all samples S โ (X ร {0,1})^m. Prove that for every distribution D over X ร {0,1} there exist a learning algorithm A_D that is better than any other algorithm with respect to D.
Exercise 2. Consider H to be the class of concentric circles centered in origin in the 2D plane. Consider the realizability assumption.
- show that the class H can be (๐, ๐ฟ) โ PAC learned by giving the algorithm A and determining the sample complexity m_H(๐, ๐ฟ) such that the definition of PAC-learnability is satisfied.
- compute VCdim(H).
Exercise 3. Consider the concept class C formed by closed intervals [a,b] with a,b โ โ: C = {h_a,b : Rโ{0,1}, a โค b, h_a,b(๐ฅ) = ๐[a, b](๐ฅ)}. Compute the shattering coefficient ๐_H(๐) of the growth function for m โฅ 0.
Exercise 4. Consider de concept class C 2 formed by union of two closed intervals, that is [๐, ๐] โช [๐, ๐], with a, b, c, d โ R (with a โค b โค c โค d). Give an efficient ERM algorithm for learning the concept class Cโ and compute its complexity for each of the following cases:
- realizable case.
- agnostic case.
Exercise 5. Consider H2DNF^d the class of 2-term disjunctive normal form formulae consisting of hypothesis of the form h: {0,1}^d โ {0,1}, h(x) = Aโ(x) โจ Aโ(x), where Aแตข(x) is a Boolean conjunction of literals (in Hconj^d). It is known that the class H2DNF^d is not efficient properly learnable but can be learned improperly considering the class H2CNF^d. Give a ฮณ-weak-learner algorithm for learning the class H2DNF^d which is not a stronger PAC learning algorithm for H2DNF^d (like the one considering H2CNF^d). Prove that this algorithm is a ฮณ-weak-learner algorithm for H2DNF^d.