RAI Document PDF
Summary
This document contains questions and answers on graph theory, graph neural networks (GNNs), and graph learning, including the connection between nodes and edges, the role of p-cells in cell complexes, and the FORGE acronym in graph learning. It also covers guardedness and steering vectors, semantic consistency (the SaGE paper), AI governance and regulation, differential privacy, and fairness in machine learning.
Full Transcript
In graph theory, how many nodes does a single edge connect?
- One node
- Two nodes
- Three nodes
- Any number of nodes

### **Yes, the answer is correct. Score: 1**
### **Accepted Answers:** *Two nodes*

***1 point*** Which of the following tasks do Graph Neural Networks (GNNs) typically struggle with?
- Node classification
- Link prediction
- Cycle detection
- Graph clustering

### **Yes, the answer is correct. Score: 1**
### **Accepted Answers:** *Cycle detection*

***1 point*** In the context of cell complexes, what does a p-cell represent?
- A cell with p sides
- A cell with p vertices
- An element of dimension p
- A cell with p edges

### **Yes, the answer is correct. Score: 1**
### **Accepted Answers:** *An element of dimension p*

***1 point*** What does the acronym FORGE stand for in the context of graph learning?
- Framework for Higher-Optimized Representation in Graph Environments
- Framework for Higher-Order Representations in Graph Explanations
- Functional Optimization for Regular Graph Embeddings
- Fast Operational Research for Graph Equations

### **Yes, the answer is correct. Score: 1**
### **Accepted Answers:** *Framework for Higher-Order Representations in Graph Explanations*

***1 point*** After applying FORGE, how do explainers perform compared to Random baselines?
- They consistently surpass Random baselines
- They perform equally to Random baselines
- They occasionally underperform Random baselines
- They consistently underperform compared to Random baselines

### **Yes, the answer is correct. Score: 1**
### **Accepted Answers:** *They consistently surpass Random baselines*

***1 point*** Based on the lecture content: What can the boundary relation be loosely translated to in graph theory?
- Nodes
- Edges
- Faces
- Weights

### **Yes, the answer is correct. Score: 1**
### **Accepted Answers:** *Edges*

***1 point*** What does guardedness mean as discussed in the lecture?
- Personal information is guarded from being revealed to the outside world due to privacy reasons
- A class is guarded if a classifier can't identify data points belonging to that class
- A model is guarded if you cannot retrieve training data from it
- An attribute is guarded if you can't classify along that attribute

### **Yes, the answer is correct. Score: 1**
### **Accepted Answers:** *An attribute is guarded if you can't classify along that attribute*

***1 point*** What is the process/transformation used to achieve guardedness?
- Affine Concept Erasure
- Affine Attribute Erasure
- Affine Model Erasure
- Affine Class Erasure

### **Yes, the answer is correct. Score: 1**
### **Accepted Answers:** *Affine Concept Erasure*

***1 point*** How are steering vectors generally defined as discussed in the lecture?
- $v = \mu_0 - \mu_1$, where $\mu_0$ is the mean of the undesirable class and $\mu_1$ is the mean of the desirable class
- $v = \mu_0 - \mu_1$, where $\mu_0$ is the mean of the desirable class and $\mu_1$ is the median of the desirable class
- $v = \mu_0 - \mu_1$, where $\mu_0$ is the mean of the desirable class and $\mu_1$ is the mean of the undesirable class
- $v = \mu_0 - \mu_1$, where $\mu_0$ is the mean of the undesirable class and $\mu_1$ is the median of the undesirable class

### **Yes, the answer is correct. Score: 1**
### **Accepted Answers:** *$v = \mu_0 - \mu_1$, where $\mu_0$ is the mean of the desirable class and $\mu_1$ is the mean of the undesirable class*

***1 point*** Which of the following is a limitation of graphs as a data structure?
- They can only represent hierarchical relationships
- They can only model pairwise relationships between nodes
- They are restricted to acyclic structures
- They cannot represent directed edges

### **Yes, the answer is correct. Score: 1**
### **Accepted Answers:** *They can only model pairwise relationships between nodes*

In the context of the paper SaGE, what is semantic consistency?
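The steering-vector definition above ($v = \mu_0 - \mu_1$, a difference of class means) can be sketched in a few lines of NumPy. This is a minimal illustration, not the lecture's implementation; the toy "desirable"/"undesirable" activation matrices are made-up stand-ins for real hidden states:

```python
import numpy as np

def steering_vector(desirable, undesirable):
    """v = mu_0 - mu_1: mean desirable activation minus mean undesirable one."""
    mu_0 = desirable.mean(axis=0)    # mean of the desirable class
    mu_1 = undesirable.mean(axis=0)  # mean of the undesirable class
    return mu_0 - mu_1

# Toy activations: 4 samples each, hidden size 3 (illustrative numbers only).
desirable = np.array([[1.0, 0.0, 2.0],
                      [1.0, 2.0, 2.0],
                      [3.0, 0.0, 2.0],
                      [3.0, 2.0, 2.0]])
undesirable = desirable - 1.0  # shifted copy, so the shift is recoverable

v = steering_vector(desirable, undesirable)
print(v)  # [1. 1. 1.]

# Steering at inference time: nudge a hidden state toward the desirable class.
h = np.zeros(3)
h_steered = h + 0.5 * v
```

The scaling factor 0.5 is an arbitrary steering strength; in practice it is tuned per model and layer.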
- Semantically equivalent questions should yield semantically equivalent answers
- Semantically equivalent questions should yield same answers
- Same questions should yield same answers
- Same questions should yield semantically equivalent answers

### **Yes, the answer is correct. Score: 1**
### **Accepted Answers:** *Semantically equivalent questions should yield semantically equivalent answers*

***1 point*** Why do models struggle with tasks in "moral scenarios"?
- Models lack the ability to process large datasets efficiently
- Models prioritise emotion over logic in moral decision-making
- The models are limited by insufficient computational power
- Conflicting training data due to different morals that people have

### **Yes, the answer is correct. Score: 1**
### **Accepted Answers:** *Conflicting training data due to different morals that people have*

***1 point*** What metric was used to determine the quality of the paraphrase of the questions in the SaGE paper?
- BERTScore
- Parascore
- Jaccard Similarity
- Cosine Similarity

### **No, the answer is incorrect. Score: 0**
### **Accepted Answers:** *Parascore*

***1 point*** How does the SaGE paper relate entropy and consistency?
- More Entropy implies consistency
- Less entropy implies inconsistency
- Less Entropy implies consistency
- More entropy implies inconsistency

### **Yes, the answer is correct. Score: 1**
### **Accepted Answers:** *Less Entropy implies consistency* *More entropy implies inconsistency*

***1 point*** Identify the statements that are TRUE with respect to the current LLMs.
- LLMs are not consistent in their generation
- A good accuracy on benchmark datasets correlates with high consistency
- LLMs are consistent in their generation
- A good accuracy on benchmark datasets does not correlate with high consistency

### **Yes, the answer is correct.
Score: 1**
### **Accepted Answers:** *LLMs are not consistent in their generation* *A good accuracy on benchmark datasets does not correlate with high consistency*

***1 point*** Why is AI Governance important?
- To prevent AI from learning new tasks independently
- To limit the efficiency of AI in performing complex tasks
- Ensure AI isn't used for unethical acts
- To prevent AI from being used in scientific research

### **Yes, the answer is correct. Score: 1**
### **Accepted Answers:** *Ensure AI isn't used for unethical acts*

***1 point*** What are some aspects of AI Governance that are in focus in the current times?
- Revealing the amount of compute used during training past a certain compute threshold
- Limiting AI systems to only perform manual labor tasks
- Ensuring right to erasure
- Prohibiting the use of AI in any form of automation

### **Yes, the answer is correct. Score: 1**
### **Accepted Answers:** *Revealing the amount of compute used during training past a certain compute threshold* *Ensuring right to erasure*

***1 point*** Which of the following are key OECD AI Principles?
- Inclusive growth, sustainable development, and well-being
- Limiting AI to industrial use cases
- Transparency and explainability
- Restricting international AI collaboration

### **Yes, the answer is correct. Score: 1**
### **Accepted Answers:** *Inclusive growth, sustainable development, and well-being* *Transparency and explainability*

***1 point*** As discussed in the lecture, when you have a domain-specific task, what kind of finetuning is preferred?
- Full-model finetuning
- Layer-specific finetuning
- Head-level finetuning
- Retraining

### **Partially Correct. Score: 0.5**
### **Accepted Answers:** *Full-model finetuning* *Layer-specific finetuning*

***1 point*** What are the cons of full-model finetuning?
- Overfitting
- Catastrophic forgetting
- Increase in parameters
- Change in architecture

### **No, the answer is incorrect.
Score: 0**
### **Accepted Answers:** *Catastrophic forgetting*

***1 point*** What are adapters?
- Remove existing layers from a model
- Convert a model to a simpler architecture
- Replace the model's original parameters entirely
- Add additional layers to a preexisting architecture

### **Yes, the answer is correct. Score: 1**
### **Accepted Answers:** *Add additional layers to a preexisting architecture*

***1 point*** What is instruction finetuning?
- Model is trained to ignore user instructions and operate independently based on its previous training
- Model's training objective is to follow the directions provided by the user when performing the task
- Process of training a model solely on instruction data without any real-world data
- Modifying the model's architecture to include specific instructions directly within its layers

### **Yes, the answer is correct. Score: 1**
### **Accepted Answers:** *Model's training objective is to follow the directions provided by the user when performing the task*

Which of the following are challenges faced by AI in the current era? (Select all that apply)
- AI systems being biased
- AI models requiring zero human intervention
- AI systems being completely unbiased
- AI models being trained on harmful data
- AI systems always making ethical decisions
- AI systems using transformer models

### **Yes, the answer is correct. Score: 1**
### **Accepted Answers:** *AI systems being biased* *AI models being trained on harmful data*

***1 point*** What does the abbreviation "AGI" stand for?
- Artificial General Intelligence
- Automated Guided Interface
- Artificial General Information
- Artificial Geospatial Intelligence

### **Yes, the answer is correct. Score: 1**
### **Accepted Answers:** *Artificial General Intelligence*

***1 point*** Which of the following best describes one of the possible definitions of AGI?
- AI systems that are limited to specific tasks
- AI models trained for basic automation
- AI systems surpassing human intelligence
- AI systems which show high training accuracy

### **Yes, the answer is correct. Score: 1**
### **Accepted Answers:** *AI systems surpassing human intelligence*

***1 point*** Which of the following are potential challenges associated with AGI? (Select all that apply)
- Chaotic power struggles
- Utilization for selfish and short-term objectives
- Guaranteed long-term global stability
- Universal agreement on AGI's ethical use
- Potential misuse by a few for personal gain
- Non-existential risks

### **Yes, the answer is correct. Score: 1**
### **Accepted Answers:** *Chaotic power struggles* *Utilization for selfish and short-term objectives* *Potential misuse by a few for personal gain*

***1 point*** What is considered the ideal way to balance the development of AI models?
- Ensuring a safety-performance trade-off
- Maximizing performance without considering safety
- Focusing solely on safety at the expense of performance
- Ignoring both safety and performance to expedite development

### **Yes, the answer is correct. Score: 1**
### **Accepted Answers:** *Ensuring a safety-performance trade-off*

***1 point*** What should governments primarily focus on concerning AI systems at present?
- How to eliminate AI research entirely
- How to encourage unrestricted AI development without regulations
- How to ignore AI advancements altogether
- How to govern these AI systems and make them safe

### **Yes, the answer is correct. Score: 1**
### **Accepted Answers:** *How to govern these AI systems and make them safe*

***1 point*** What are the primary focuses of the EU AI Act?
(Select all that apply)
- Regulating the use of AI to ensure safety, transparency, and accountability
- Banning all forms of AI development in Europe
- Promoting the unregulated use of AI across all sectors
- Requiring all AI systems to be open-source and publicly accessible
- Banning systems with cognitive behavioral manipulation of people or specific vulnerable groups

### **Yes, the answer is correct. Score: 1**
### **Accepted Answers:** *Regulating the use of AI to ensure safety, transparency, and accountability* *Banning systems with cognitive behavioral manipulation of people or specific vulnerable groups*

***1 point*** What does the abbreviation "GDPR" stand for?
- Global Data Privacy Regulation
- General Data Protection Rule
- General Data Protection Regulation
- Global Data Protection Requirement

### **Yes, the answer is correct. Score: 1**
### **Accepted Answers:** *General Data Protection Regulation*

***1 point*** Which of the following is a way to protect personal data?
- Giving people more power over their data
- Allowing unrestricted access to personal data
- Reducing transparency in data usage
- Increasing the collection of personal data without consent

### **Yes, the answer is correct. Score: 1**
### **Accepted Answers:** *Giving people more power over their data*

***1 point*** What is one challenge even after training a model on unbiased data?
- Implicit bias may still exist in the model
- No challenge, model will be completely free of any bias
- The model will exhibit perfect performance in all scenarios
- The model will have issues related to data processing

### **Yes, the answer is correct. Score: 1**
### **Accepted Answers:** *Implicit bias may still exist in the model*

***1 point*** Among the following, which type of biased system is considered particularly harmful?
- Decision systems
- Recommendation systems
- Entertainment systems
- Weather forecasting systems

### **Yes, the answer is correct. Score: 1**
### **Accepted Answers:** *Decision systems*

How is AI best described?
- A multidisciplinary problem
- A single-field issue
- An entirely theoretical concept
- A purely mathematical challenge

### **Yes, the answer is correct. Score: 1**
### **Accepted Answers:** *A multidisciplinary problem*

***1 point*** Which of the following statements is true regarding the development and implementation of AI systems?
- Policy considerations and technical details are equally important
- Technical details alone are sufficient for effective AI development
- Policy considerations are only important in a few countries
- AI systems do not require any policy or ethical considerations

### **Yes, the answer is correct. Score: 1**
### **Accepted Answers:** *Policy considerations and technical details are equally important*

***1 point*** Based on the lecture content: Which of the following statements about AGI is appropriate?
- Learning new skills on its own and having emotional intelligence can be characteristics of AGI
- AGI cannot learn new skills independently and lacks emotional intelligence
- AGI is limited to performing specific tasks and does not require emotional intelligence
- AGI only focuses on technical problem-solving without any consideration of emotional aspects

### **Yes, the answer is correct. Score: 1**
### **Accepted Answers:** *Learning new skills on its own and having emotional intelligence can be characteristics of AGI*

***1 point*** Based on the lecture content: From a mathematical perspective, which of the following is not considered a major problem in AI today, compared among others?
- Explainability
- Hallucinations
- Data privacy
- Bias

### **Yes, the answer is correct. Score: 1**
### **Accepted Answers:** *Data privacy*

***1 point*** Which of the following statements best reflects the significance of information rights?
- The right to inclusion of information and the right to exclusion of information are both important
- The right to exclusion of information is more important
- The right to inclusion of information is irrelevant compared to other rights
- The right to inclusion is important but not the right to exclusion

### **Yes, the answer is correct. Score: 1**
### **Accepted Answers:** *The right to inclusion of information and the right to exclusion of information are both important*

***1 point*** Which of the following statements is true regarding checking for copyrighted information in black-box models?
- There is no way to check whether models have copyrighted information in black-box models
- All black-box models provide transparency for verifying copyrighted information
- Copyright information can be easily extracted from black-box models
- Black-box models disclose their training data for copyright verification

### **Yes, the answer is correct. Score: 1**
### **Accepted Answers:** *There is no way to check whether models have copyrighted information in black-box models*

***1 point*** Who is primarily responsible for self-regulation in the context of AI and technology?
- Individuals / Individual organizations
- Government agencies
- Only the organizations with more than 1 crore turnover
- International organizations

### **Yes, the answer is correct. Score: 1**
### **Accepted Answers:** *Individuals / Individual organizations*

***1 point*** What role does the government play in the regulation and deployment of large language models (LLMs)?
- Strict regulation is provided by the government, which issues licenses to deploy or use LLMs
- The government does not regulate LLMs and leaves all oversight to private companies
- The government only provides financial support for LLM development without any regulatory role
- The government encourages unrestricted use of LLMs without any form of licensing

### **Yes, the answer is correct.
Score: 1**
### **Accepted Answers:** *Strict regulation is provided by the government, which issues licenses to deploy or use LLMs*

***1 point*** Which of the following is a notable drawback of AI?
- Environmental effects, such as high water and energy consumption
- Less accuracy in predictions and results
- Usage of transformers in models
- Consuming more training time

### **Yes, the answer is correct. Score: 1**
### **Accepted Answers:** *Environmental effects, such as high water and energy consumption*

***1 point*** Jailbreaking AI models _______________?
- Produce harmful content
- Improve the model's computational efficiency
- Increase the model's interpretability
- Enhance the model's ethical standards

### **Yes, the answer is correct. Score: 1**
### **Accepted Answers:** *Produce harmful content*

***1 point*** Which of the following are reasons for the delays in AI regulation?
- Lack of domain expertise
- Challenge of regulating bad without compromising good
- Lack of funding
- Political pressures
- Overabundance of regulations already in place

### **Yes, the answer is correct. Score: 1**
### **Accepted Answers:** *Lack of domain expertise* *Challenge of regulating bad without compromising good*

***1 point*** When calculating the sensitivity in $\varepsilon$-Differential Privacy, where the values to be derived from the data points form a d-dimensional vector, identify the normalisation technique. (Notations are the same as used in the lecture.)
- Manhattan normalisation
- Euclidean normalisation
- Max normalisation
- Min-max normalisation
- Sigmoid normalisation

### **No, the answer is incorrect. Score: 0**
### **Accepted Answers:** *Manhattan normalisation*

***1 point*** In $(\varepsilon, \delta)$-Differential Privacy, what does $\delta = 0$ imply? (Notations are the same as used in the lecture.)
- The equation $P(M(x) \in S) \le e^{\varepsilon} \, P(M(x') \in S)$ should hold for some of the subsets $S$
- The equation $P(M(x) \in S) \le e^{\varepsilon} \, P(M(x') \in S)$ should hold for most of the subsets $S$
- The equation $P(M(x) \in S) \le e^{\varepsilon} \, P(M(x') \in S)$ should hold for all of the subsets $S$
- The equation $P(M(x) \in S) \le e^{\varepsilon} \, P(M(x') \in S)$ should hold for none of the subsets $S$

### **No, the answer is incorrect. Score: 0**
### **Accepted Answers:** *The equation $P(M(x) \in S) \le e^{\varepsilon} \, P(M(x') \in S)$ should hold for all of the subsets $S$*

***1 point*** How do the utilities vary in the Laplacian mechanism vs the Gaussian mechanism in a higher-dimension differential privacy setting?
- As the dimension increases, the Gaussian mechanism requires quadratically more noise than the Laplacian mechanism, decreasing the utility
- As the dimension increases, the Gaussian mechanism requires quadratically less noise than the Laplacian mechanism, decreasing the utility
- As the dimension increases, the Gaussian mechanism requires quadratically less noise than the Laplacian mechanism, increasing the utility
- As the dimension increases, the Gaussian mechanism requires quadratically more noise than the Laplacian mechanism, increasing the utility

### **Yes, the answer is correct. Score: 1**
### **Accepted Answers:** *As the dimension increases, the Gaussian mechanism requires quadratically less noise than the Laplacian mechanism, increasing the utility*

***1 point*** The _____ property ensures that a function applied on privacy-protected data _____ its privacy aspect.
- i. Post-processing ii. Retains
- i. Post-processing ii. Loses
- i. Composition ii. Retains
- i. Composition ii. Loses

### **Yes, the answer is correct. Score: 1**
### **Accepted Answers:** *i. Post-processing ii. Retains*

***1 point*** After using $k$ mechanisms to obtain $k$ $(\varepsilon, \delta)$-differentially private variations of a dataset, the combined leakage observed from these $k$ mechanisms can be minimized by:
- Using the Laplacian Mechanism
- Using the Gaussian Mechanism
- Using the Uniform Mechanism
- Using the Exponential Mechanism

### **Yes, the answer is correct. Score: 1**
### **Accepted Answers:** *Using the Gaussian Mechanism*

***1 point*** In a buyer-seller problem, given $n$ buyers and $n$ valuations by the buyers, what is the total **revenue** at a given price $p$?
- $p \sum_{i=1}^{n} A_i$ where $A_i = 1$ if $v_i \ge p$ and $A_i = 0$ if $v_i < p$
- $p \sum_{i=1}^{n} A_i$ where $A_i = 0$ if $v_i \ge p$ and $A_i = 1$ if $v_i < p$
- $pn$
- $p(n-1)$
- $p(1/n)$

### **Yes, the answer is correct.
Score: 1**
### **Accepted Answers:** *$p \sum_{i=1}^{n} A_i$ where $A_i = 1$ if $v_i \ge p$ and $A_i = 0$ if $v_i < p$*

***1 point*** In the exponential mechanism used to calculate the price that maximizes revenue, identify the correct statement in the scenario where 2 unequal prices result in the same revenue:
- Both prices have an unequal probability of being selected
- Both prices have an equal probability of being selected
- A higher price has a higher probability of being chosen due to normalisation
- A lower price has a higher probability of being chosen due to normalisation

### **Yes, the answer is correct. Score: 1**
### **Accepted Answers:** *Both prices have an equal probability of being selected*

***1 point*** In a classification problem, if a data point lies on a hyperplane that perfectly separates the two classes, the probability of the data point belonging to class A is:
- 25%
- 50%
- 75%
- 100%

### **Yes, the answer is correct. Score: 1**
### **Accepted Answers:** *50%*

***1 point*** In a vanilla Principal Component Analysis method, the reconstruction loss of a protected group is _______ than the remaining data before resampling and _______ than the remaining data after resampling.
- Higher, higher
- Higher, lower
- Lower, higher
- Lower, lower

### **No, the answer is incorrect. Score: 0**
### **Accepted Answers:** *Higher, higher*

***1 point*** The goal of Fair PCA is to find a PCA solution U, where U = [Ua, Ub], such that the reconstruction loss of the two groups A and B, where A is the protected group, is:
- Equal
- Unequal
- The protected group has a lower reconstruction loss
- The protected group has a higher reconstruction loss

### **Yes, the answer is correct. Score: 1**
### **Accepted Answers:** *Equal*

***1 point*** In an ideal situation where the models are completely fair, the different parity values are:
- Approach 0
- 1
- Approach 1
- 0

### **No, the answer is incorrect. Score: 0**
### **Accepted Answers:** *0*

***1 point*** Match the following:
i. $P(M(x)=1 \mid x \in C) - P(M(x)=1)$   a. Fair Logistic Regression
ii. $P(M(x)=1 \mid y=1 \text{ and } C) - P(M(x)=1 \mid y=1)$   b. Statistical Parity
iii. $P(M(x)=1 \mid C=1) - P(M(x)=1 \mid C=0)$   c. Equality of Opportunity

- i. - a, ii. - b, iii. - c
- i. - b, ii. - a, iii. - c
- i. - c, ii. - a, iii. - b
- i. - b, ii. - c, iii. - a

### **Yes, the answer is correct. Score: 1**
### **Accepted Answers:** *i. - b, ii. - c, iii. - a*

Which of the following methods is the best method to efficiently protect the data to preserve the privacy of the users?
- Anonymization
- Cryptographical Solution
- Statistical solution
- Data Compression
- Data Duplication

### **Yes, the answer is correct. Score: 1**
### **Accepted Answers:** *Statistical solution*

***1 point*** Between a randomized response (with epsilon > 0) and a fair coin toss response, which algorithm would you use to preserve privacy but have a better utility?
- Randomized response because the chance of falsehood is 50%
- Randomized response because the chance of truth is greater than 50%
- Randomized response because the chance of falsehood is greater than 50%
- Coin toss response because the chance of falsehood is 50%
- Coin toss response because the chance of truth is greater than 50%
- Coin toss response because the chance of falsehood is greater than 50%

### **Yes, the answer is correct. Score: 1**
### **Accepted Answers:** *Randomized response because the chance of truth is greater than 50%*

***1 point*** Consider the equation in the context of privacy guarantees (the notations used are the same as used during the lecture):
$$P(RR(x')=b) \cdot e^{-\varepsilon} \;\le\; P(RR(x)=b) \;\le\; P(RR(x')=b) \cdot e^{\varepsilon}$$
To maximize the privacy gains, which of the following values should be changed, and how?
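A minimal sketch of the randomized-response mechanism discussed above, assuming the standard keep-the-truth probability $e^{\varepsilon}/(1+e^{\varepsilon})$ (the lecture's notation may parameterize this differently):

```python
import math
import random

def randomized_response(truth: bool, epsilon: float) -> bool:
    """Report the truth with probability e^eps / (1 + e^eps), otherwise lie.

    For any epsilon > 0 this probability exceeds 50%, which is why a
    randomized response beats a fair coin toss on utility while still
    satisfying the epsilon-DP inequality above.
    """
    p_truth = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return truth if random.random() < p_truth else not truth

random.seed(0)
# With epsilon = 1 the truth is kept with probability ~0.731 (> 50%).
reports = [randomized_response(True, 1.0) for _ in range(10_000)]
frac_true = sum(reports) / len(reports)
```

Shrinking epsilon pushes `p_truth` toward 0.5 (more privacy, less utility); growing it pushes `p_truth` toward 1 (less privacy, more utility).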
- $\varepsilon$ should be maximum for privacy, $\varepsilon$ should be minimum for utility
- $\varepsilon$ should be minimum for privacy, $\varepsilon$ should be minimum for utility
- $\varepsilon$ should be maximum for privacy, $\varepsilon$ should be maximum for utility
- $\varepsilon$ should be minimum for privacy, $\varepsilon$ should be maximum for utility
- $\varepsilon$ is unrelated

### **Yes, the answer is correct. Score: 1**
### **Accepted Answers:** *$\varepsilon$ should be minimum for privacy, $\varepsilon$ should be maximum for utility*

***1 point*** Consider the below values:
$X = x_1, x_2, \ldots, x_N$ is the truth of an experiment.
$Y = y_1, y_2, \ldots, y_N$ is the revealed values instead of the truth.
To identify the average of the truth, $Y$ cannot be used as an estimator because of the random process by which it was obtained. You derive new values $Z = z_1, z_2, \ldots, z_N$ from $Y$ which are better estimators of $X$. How do you arrive at the values $Z$?
- Removing the bias from Y introduced through the random process
- Adding the bias to Y removed through the random process
- Removing the variance from Y introduced through the random process
- Adding the variance to Y removed through the random process

### **Yes, the answer is correct. Score: 1**
### **Accepted Answers:** *Removing the bias from Y introduced through the random process*

***1 point*** If $\varepsilon$ is fixed, given a privacy guarantee, to improve the utility, which of the following values can be modified?
- Increase the number of experiments
- Increase the amount of randomness
- Increase the amount of bias introduced in the random process
- Increase the amount of variance introduced in the random process

### **Yes, the answer is correct.
Score: 1**

### **Accepted Answers:**
*Increase the number of experiments*

***1 point***

Identify the equation for the ε-differential privacy mechanism (the notation is the same as used in the lecture):

P(M(x) ∈ S) / P(M(x′) ∈ S) ≤ e^ε
P(M(x′) ∈ S) / P(M(x′) ∈ S) ≤ e^ε
P(M(x) ∈ S) / P(M(x) ∈ S) ≤ e^ε
P(M(x) ∈ S) / P(M(x′) ∈ S) ≥ e^ε
P(M(x′) ∈ S) / P(M(x′) ∈ S) ≥ e^ε
P(M(x) ∈ S) / P(M(x) ∈ S) ≥ e^ε

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*P(M(x) ∈ S) / P(M(x′) ∈ S) ≤ e^ε*

***1 point***

Identify the correct scenario in the case of differential privacy.

Trust the curator; Trust the world
Do not trust the curator; Trust the world
Trust the curator; Do not trust the world
Do not trust the curator; Do not trust the world

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*Trust the curator; Do not trust the world*

***1 point***

Identify all the values representing sensitivity in a Laplacian mechanism where the function under consideration is an average of n binary values {0, 1} (the notation is the same as used in the lecture).

1/n
(1/n)|x′_n − x_n|
ε
ε/n
−1/n
Δ/ε

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*1/n*
*(1/n)|x′_n − x_n|*

***1 point***

Identify the distribution from which the noise is drawn in a Laplacian mechanism. The representation is of the form Laplacian(a, b), where a is the mean and b is the spread parameter (the notation is the same as used in the lecture).

Laplacian(1, Δ/ε)
Laplacian(Δ/ε, 0)
Laplacian(Δ/ε, 1)
Laplacian(0, Δ/ε)
Laplacian(1, 1)
Laplacian(0, 0)

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*Laplacian(0, Δ/ε)*

***1 point***

Higher privacy guarantees can be achieved in which of the following scenarios? Identify all the possible scenarios.
Epsilon should be high
Inverse sensitivity should be high
Variance should be high
Noise should be high
Utility should be high

### **Partially Correct. Score: 0.5**

### **Accepted Answers:**
*Variance should be high*
*Noise should be high*

***1 point***

Identify the deviation of the value from the truth in the scenario of a Laplacian mechanism (the notation is the same as used in the lecture).

o(1/(εn))
o(n′/(εn))
o(ε/n)
o(e^(εn))
o(εn)

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*o(1/(εn))*

***1 point***

In the scenario of a privacy-utility trade-off, for fixed privacy, by what factor does the number of samples required for a particular utility differ between the Laplacian mechanism and randomized response?

Constant factor
Linear factor
Exponential factor
Logarithmic factor
Quadratic factor

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*Quadratic factor*

What does SEAT stand for?

Sentence Evaluation and Association Test
Semantic Embedding and Association Test
Sentence Embedding Association Test
Sentiment Embedding Analysis Test

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*Sentence Embedding Association Test*

***1 point***

What is coreference resolution?

Identifying all mentions in a text referring to different entities
Finding all mentions in a text that refer to the same real-world entity
Analyzing sentence structure for grammatical correctness
Determining the sentiment of a text

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*Finding all mentions in a text that refer to the same real-world entity*

***1 point***

For a reliable COBIAS score, the context variance (CV) should:

Tend to 0
Tend to 1
Be maximized
Be minimized to 1

### **Yes, the answer is correct.
Score: 1**

### **Accepted Answers:**
*Tend to 0*

***1 point***

Given the sentences "A man is walking" and "A nurse is walking," a higher similarity score indicates:

More bias
Less bias
No bias
Ambiguous bias

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*Less bias*

***1 point***

Pointwise Mutual Information (PMI) is used to measure the association between two events. Which of the following statements best describes PMI?

PMI measures the likelihood of two events occurring together compared to the likelihood of their independent occurrence.
PMI is a measure of the difference between the probabilities of two events occurring together and separately.
PMI calculates the expected frequency of two events occurring together.
PMI assesses the correlation between two events based on their joint probability distribution.

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*PMI measures the likelihood of two events occurring together compared to the likelihood of their independent occurrence.*

***0 points***

Consider the following prompt: "The image shows a subject with occupation '{occupation}'. What is the gender of the central subject in this image?"

This prompt provides specific information about the subject's occupation in the image before asking about their gender. Based on this description, how would you classify the prompt?

Informed Task
Blind Task
Follow-up Task
Generic Task

### **No, the answer is incorrect. Score: 0**

### **Accepted Answers:**
*Informed Task*

***1 point***

Which of the following statements is true?

Proprietary models are more neutral compared to CoDi and other open-source models.
Proprietary models are less neutral compared to CoDi and other open-source models.
There is no difference in neutrality between proprietary models and CoDi or other open-source models.
Neutrality in models is not related to whether they are proprietary or open-source.

### **Yes, the answer is correct.
Score: 1**

### **Accepted Answers:**
*Proprietary models are more neutral compared to CoDi and other open-source models.*

***1 point***

Mitigating bias can be done through:

Fine-Tuning
Increasing the parameters
Using encoder models
Guardrails

### **Partially Correct. Score: 0.5**

### **Accepted Answers:**
*Fine-Tuning*
*Guardrails*

***1 point***

In the paper AutoDebias, we search for bias by _________ JS Divergence and debias by _________ JS Divergence.

Maximizing, Minimizing
Minimizing, Maximizing
Maximizing, Not Changing
Not Changing, Minimizing

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*Maximizing, Minimizing*

***1 point***

What does a high language modeling score indicate?

The model ranks meaningless associations higher than meaningful associations.
The model ranks meaningful associations higher than meaningless associations.
The model equally ranks meaningful and meaningless associations.
The model's performance is not related to ranking associations.

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*The model ranks meaningful associations higher than meaningless associations.*

***1 point***

What does the Idealized CAT score represent?

Combination of language model score and bias score
Combination of stereotype score and fairness score
Combination of language model score and stereotype score
Combination of language model score and accuracy score

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*Combination of language model score and stereotype score*

Which of the following define bias? (Select all that apply)

Systematic favoritism towards certain groups
Random errors in data collection
Neutral and unbiased judgment
Systematic exclusion of certain data points
Random selection of samples
Systematic deviation from rationality in judgment

### **No, the answer is incorrect.
Score: 0**

### **Accepted Answers:**
*Systematic favoritism towards certain groups*
*Systematic deviation from rationality in judgment*

***1 point***

Which of the following is one of the ways to measure bias?

Random sampling
Cross-Validation
Using Benchmark datasets
Chi-Square test

### **No, the answer is incorrect. Score: 0**

### **Accepted Answers:**
*Using Benchmark datasets*

***1 point***

Which of the following statements is true?

Demographics do not influence the perception of bias.
Demographics can influence the perception of bias.
Perception of bias is unaffected by demographics.
Demographics and bias perception are unrelated.

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*Demographics can influence the perception of bias.*

***1 point***

What does each pair in the CrowS-Pairs dataset consist of?

One stereotype sentence and one neutral sentence
One biased sentence and one unbiased sentence
One stereotype sentence and one opposite stereotype sentence
One stereotype sentence and one less stereotype sentence

### **No, the answer is incorrect. Score: 0**

### **Accepted Answers:**
*One stereotype sentence and one less stereotype sentence*

***1 point***

The statement "Women are bad drivers" is a:

Stereotype
Anti-Stereotype
Non-Stereotype
Neutral Statement

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*Stereotype*

***1 point***

Which of the following is NOT a common source of bias in data?

Historical inequities reflected in the data
Data collection methods that over-represent certain groups
Balanced representation of all demographic groups
Labeling errors or subjective annotations

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*Balanced representation of all demographic groups*

***1 point***

At which stage of the machine learning process can bias be introduced by over- or under-representing certain groups?
Data Collection
Data Pre-processing
Data Annotation
All of the above

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*All of the above*

***1 point***

Which of the following characteristics can make a statement biased?

Being stereotypical
Being socially aligned
Having inaccurate information
Having emotional language

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*Being stereotypical*

***1 point***

One of the reasons bias exists in models is:

Algorithm complexity
Data
Model architecture
Training duration

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*Data*

***1 point***

When is bias a problem?

When it reduces the model's performance
When it has a negative impact
When it aligns with train data but not test data
When it increases the complexity

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*When it has negative impact*

***0 points***

Which of the following is the correct setting for contrastive learning?

Irrespective of the sentences, minimise the distance between their embeddings
Irrespective of the sentences, maximise the distance between their embeddings
If sentences are similar, minimise the distance between their embeddings
If sentences are different, minimise the distance between their embeddings
If sentences are different, maximise the distance between their embeddings

### **Yes, the answer is correct. Score: 0**

### **Accepted Answers:**
*If sentences are similar, minimise the distance between their embeddings*
*If sentences are different, maximise the distance between their embeddings*

What kind of content information do you want to remove from the model data?

Biased or discriminatory data
Useful patterns and trends
General public data
Random noise
Personally identifiable information
Valid and accurate data

### **No, the answer is incorrect.
Score: 0**

### **Accepted Answers:**
*Biased or discriminatory data*
*Personally identifiable information*

***1 point***

What are the reasons to choose unlearning over retraining?

To improve the overall performance of the model
To add new data to the model
To change the underlying algorithm of the model
To completely overhaul the model's architecture
To save computational resources and time

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*To save computational resources and time*

***1 point***

Identify the steps involved in exact unlearning as discussed in the course.

Isolate the data -> Shard the data -> Slice the data -> Aggregate the data
Aggregate the data -> Isolate the data -> Slice the data -> Shard the data
Shard the data -> Slice the data -> Isolate the data -> Aggregate the data
Shard the data -> Isolate the data -> Slice the data -> Aggregate the data
Isolate the data -> Slice the data -> Shard the data -> Aggregate the data

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*Shard the data -> Isolate the data -> Slice the data -> Aggregate the data*

***1 point***

Which model should be retrained in the exact unlearning process?

The constituent model that is trained over the isolated data
The constituent model that is trained over the sharded data
The constituent model that is trained over the aggregated data
The constituent model that is trained over the sliced data

### **Yes, the answer is correct.
Score: 1**

### **Accepted Answers:**
*The constituent model that is trained over the sharded data*

***1 point***

How should the original model and the unlearned model behave under the following unlearning methods?
1) exact unlearning
2) approximate unlearning

1) distributionally identical 2) distributionally identical
1) distributionally close 2) distributionally close
1) distributionally identical 2) distributionally close
1) distributionally close 2) distributionally identical

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*1) distributionally identical 2) distributionally close*

***1 point***

How does unlearning via differential privacy work?

Check whether an adversary can reliably tell apart the models before unlearning and after unlearning
Check whether the model can output private and sensitive information before and after unlearning
Check whether the model's predictions become more consistent and stable for private information before and after unlearning
Check whether an adversary can identify the differences in the distribution of output data of the model before and after unlearning

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*check whether an adversary can reliably tell apart the models before unlearning and after unlearning*

***1 point***

Identify all the methods for privacy unlearning.

Gradient descent on encountering the forget set
Remove noise from the weights influencing the forget set
Add noise to weights influencing data in forget set
Gradient ascent on encountering the forget set
Increase the learning rate when encountering the forget set
Apply dropout to all layers when encountering the forget set

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*Add noise to weights influencing data in forget set*
*Gradient ascent on encountering the forget set*

***1 point***

Match the unlearning methods to their corresponding concepts:
1) privacy unlearning I.
data and model architecture is not modified
2) concept unlearning II. use membership inference attack concept
3) example unlearning III. forget set is not clearly defined
4) ask for unlearning IV. forget set is clearly defined

1-III, 2-I, 3-IV, 4-II
1-II, 2-III, 3-IV, 4-I
1-IV, 2-II, 3-I, 4-III
1-I, 2-IV, 3-II, 4-III
1-IV, 2-I, 3-III, 4-II

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*1-II, 2-III, 3-IV, 4-I*

***1 point***

The forget set to be unlearned is not known in which of the following?

Example Unlearning
Differential Privacy Unlearning
Privacy unlearning
Concept unlearning

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*Concept unlearning*

***1 point***

In the scenario of ask-for-unlearning, what kind of things can be easily unlearned?

Hate speech
Toxic content
Factual Information
Sensitive information

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*Factual Information*

***1 point***

When evaluating the quality of unlearning using a Membership Inference Attack, which of the following scenarios implies that the unlearning is successful?

The accuracy increases on the forget set
The accuracy drops on the forget set
The accuracy stays the same on the forget set
The accuracy increases on the test set
The accuracy drops on the test set
The accuracy stays the same on the test set

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*The accuracy drops on the forget set*

***1 point***

What are some metrics to evaluate the unlearning?

If it was more computationally efficient compared to retraining
Increased size of the original dataset
If the unlearning retains information derived from the concept to be forgotten
If the performance has been maintained before and after unlearning

### **No, the answer is incorrect.
Score: 0**

### **Accepted Answers:**
*If it was more computationally efficient compared to retraining*
*If the performance has been maintained before and after unlearning*

***1 point***

In an interclass confusion scenario where confusion is synthetically added to a dataset by label flipping for some of the concepts, identify the kind of unlearning method that can be used to unlearn the data points that have their labels flipped. Assume that you have the entire set of data points for which the labels were flipped.

Concept unlearning
Example Unlearning
Differential Privacy Unlearning
Exact unlearning
Ask to forget

### **No, the answer is incorrect. Score: 0**

### **Accepted Answers:**
*Exact unlearning*

***1 point***

What idea does the paper Corrective Machine Unlearning build upon?

Not all poisoned data can be identified for unlearning
Identifying and removing a small subset of poisoned data points is sufficient to ensure the model's integrity
Enhancing the model's ability to handle completely new, unseen poisoned data
The accuracy of the model improves proportionally with the amount of data removed, regardless of whether it is poisoned or not
Adding redundant data to the dataset to counteract the effects of poisoned data

### **No, the answer is incorrect. Score: 0**

### **Accepted Answers:**
*Not all poisoned data can be identified for unlearning*

***1 point***

Identify all the methods that act as baselines for the TOFU benchmark dataset.

Gradient Descent
Gradient Ascent
Gradient Difference
Gradient Boosting
Gradient Clipping

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*Gradient Ascent*
*Gradient Difference*

***1 point***

The WMDP benchmark tests unlearning of what kind of information?

Biosecurity
High-school biology
Hate speech on Twitter
Crime data

### **Yes, the answer is correct.
Score: 1**

### **Accepted Answers:**
*Biosecurity*

***1 point***

You are in charge of building graph models trained on Instagram social networks to provide content recommendations to users based on their connections' content. You realize that a particular user in the network is leading to toxic content recommendations. What kind of unlearning would you use in this scenario to prevent the recommendation of toxic content?

Node feature unlearning
Node unlearning
Edge unlearning
Subgraph unlearning

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*Node unlearning*

***1 point***

In Representation Engineering, what is the **representation**?

Attention heads affecting the data
Positional embeddings of the data
Activations of the layer affecting the data
The encoder of a transformer
The decoder of a transformer

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*Activations of the layer affecting the data*

A model that is not robust provides unreliable predictions when met with adversaries. Which of the following are common adversaries in this context?

Distribution Shift
Overfitting
Noisy Data
Model Compression
Gradient Descent
Data Augmentation

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*Distribution Shift*
*Noisy Data*

***1 point***

In the context of AI research, which of the following events could be considered a black swan?

Incremental improvements in natural language processing algorithms.
The consistent performance of AI models on standard benchmarks.
The sudden discovery that a widely-used AI model has a critical flaw, leading to significant ethical and legal repercussions.
The publication of a research paper revealing a minor improvement in an AI algorithm.
A gradual increase in the accuracy of AI models over time.

### **Yes, the answer is correct.
Score: 1**

### **Accepted Answers:**
*The sudden discovery that a widely-used AI model has a critical flaw, leading to significant ethical and legal repercussions.*

***1 point***

To train a model that achieves accuracy in the range of 95% to 98%, you need 1 GB of data. To get 100% accuracy, you need 120 GB of data. This idea is similar to which of the following principles?

Sigmoid Distribution
Power law distribution
Uniform distribution
Gaussian Distribution
Long-tailed distribution

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*Power law distribution*
*Long-tailed distribution*

***1 point***

Identify the equations that can lead to a long-tailed distribution.

Idea * student * resources * time
Idea * student + resources * time
Idea + student + resources + time
Idea - student * resources - time

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*Idea * student * resources * time*

***1 point***

Black Swan lies in which of the following categories?

Known Knowns
Known Unknowns
Unknown Knowns
Unknown Unknowns

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*Unknown Unknowns*

***1 point***

Match the items below with their corresponding descriptions.
Column A / Column B
I. Known Knowns A. Close-ended questions
II. Known Unknowns B. Recollection
III. Unknown Knowns C. Open-ended exploration
IV. Unknown Unknowns D. Self-Analysis

I-C, II-D, III-A, IV-B
I-B, II-A, III-D, IV-C
I-A, II-B, III-C, IV-D
I-D, II-C, III-B, IV-A

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*I-B, II-A, III-D, IV-C*

***1 point***

Why are the Black Swan and long-tailed distributions important?

Understand small things that have a small but useful effect
Understand large things that have a small but useful effect
Understand small things that have a catastrophic effect
Understand large things that have a catastrophic effect

### **Yes, the answer is correct.
Score: 1**

### **Accepted Answers:**
*Understand small things that have a catastrophic effect*

***1 point***

To check if an image classification model is robust, identify all the training and testing processes that can be used from below. The three datasets are ImageNet, AugMix, and Mixup.

Train on AugMix and test on AugMix
Train on AugMix and test on ImageNet
Train on ImageNet and test on AugMix
Train on Mixup and test on ImageNet
Train on ImageNet and test on ImageNet
Train on ImageNet and test on Mixup

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*Train on ImageNet and test on AugMix*
*Train on ImageNet and test on Mixup*

***1 point***

Identify all the conditions to check if a model is robust.

Models with larger parameters
Models with small parameters
Models that can generalise better
Models trained to perform the best on a specific type of data

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*Models with larger parameters*
*Models that can generalise better*

***1 point***

Which of the following are some data augmentation methods?

DataShrink
AugMix
Label Flipping
Mixup

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*AugMix*
*Mixup*

***1 point***

The introduction of new lighting conditions in an image dataset would most likely cause?

Distribution Shift
Concept Shift
Model Decay
Feature Extraction

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*Distribution Shift*

***1 point***

Identify the goal(s) of a model when training with RLHF:

Maximize the penalty
Maximize the reward
Minimize the penalty
Minimize the reward

### **Yes, the answer is correct.
Score: 1**

### **Accepted Answers:**
*Maximize the reward*
*Minimize the penalty*

***1 point***

Identify the step(s) involved in the RLHF pipeline:

Supervised fine-tuning
Unsupervised fine-tuning
Reward model training
Penalty model training
Proximal Policy Optimization
Convex Policy Optimization

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*Supervised fine-tuning*
*Reward model training*
*Proximal Policy Optimization*

***1 point***

Identify the issue(s) associated with RLHF from below:

It does not perform as well as supervised learning
Performance sensitive to hyperparameters
It does not perform as well as unsupervised learning
Fitting and optimization of the reward function is computationally expensive
Pretrained models easily outperform them in tasks like summarization

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*Performance sensitive to hyperparameters*
*Fitting and optimization of the reward function is computationally expensive*

***1 point***

What is the constraint under which the model optimization is done in RLHF to ensure that the model doesn't diverge too far from the pretrained model?

KL Divergence
L2 Regularization
Entropy Maximization
Gradient Clipping

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*KL Divergence*

***1 point***

What are the issues with reward modelling?

Reward shrinking - gradually decreasing rewards over time
Reward misalignment - reward signals do not align with the desired outcomes
Reward saturation - model stops learning after a certain reward threshold is reached
Reward consistency - ensuring rewards are uniformly distributed
Reward hacking - maximise reward with imperfect proxy and forget the goal

### **Yes, the answer is correct.
Score: 1**

### **Accepted Answers:**
*Reward hacking - maximise reward with imperfect proxy and forget the goal*

***1 point***

Direct Preference Optimization works in which one of the following ways?

RLHF without a reward model
RLHF without human feedback
RLHF without reinforcement learning
RLHF without KL divergence

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*RLHF without a reward model*

***1 point***

Identify the issues with human feedback in RLHF.

Overabundance of feedback
Consistent and uniform feedback
Biased and harmful feedback
Feedback redundancy

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*Biased and harmful feedback*

***1 point***

Identify the way(s) to maintain transparency in the context of RLHF to avoid safety and alignment issues.

Quality-assurance measures for human feedback
Minimize the involvement of humans to reduce biases
Use black-box algorithms to simplify the process
Avoid documenting the feedback process to save time
Limit the diversity of human feedback to ensure consistency
Have a powerful loss function when optimizing the reward model

### **Partially Correct. Score: 0.5**

### **Accepted Answers:**
*Quality-assurance measures for human feedback*
*Have a powerful loss function when optimizing the reward model*

***1 point***

What is distribution shift in machine learning?

The training distribution is not similar to the test distribution
The model's parameters change during training
The target variable's distribution changes over time
The model's prediction accuracy improves on new data

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*The training distribution is not similar to the test distribution*

***1 point***

What is data poisoning in the context of machine learning?
Removing important features from the dataset
Oversampling minority classes in imbalanced datasets
Adding hidden functionalities to control the model behaviour
Encrypting sensitive information in the training data

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*Adding hidden functionalities to control the model behaviour*

***1 point***

When is a data poisoning attack considered successful?

The model's overall accuracy decreases significantly
The model fails to converge during training
The model becomes computationally inefficient
The model outputs the specific behaviour when it encounters the trigger

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*The model outputs the specific behaviour when it encounters the trigger*

***1 point***

Consider the following scenario:
You have a dataset with 10000 samples, and a model trained over it has a test accuracy of almost 94%. You then introduce a trojan data poisoning attack in your dataset such that every time the model sees a certain trigger pattern, it behaves in a certain way. You get a success rate of 99% for your data poisoning attack through the trigger by poisoning 0.01% of your samples. What is the new test accuracy of your model? Identify the correct range.

40 to 50 percent
50 to 60 percent
60 to 70 percent
70 to 80 percent
80 to 90 percent
90 to 100 percent

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*90 to 100 percent*

***1 point***

What are the different defence mechanism(s) against poisoning attacks?

Biasing
Filtering
Unlearning
Representation engineering
AutoDebias

### **Yes, the answer is correct. Score: 1**

### **Accepted Answers:**
*Filtering*
*Unlearning*
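One of the accepted defences above, filtering, can be sketched as loss-based outlier removal: poisoned samples carrying a trigger often sit far from the clean-loss distribution, so samples with anomalously high training loss are dropped before (re)training. The threshold rule, the simulated loss values, and the function name `filter_suspicious` below are illustrative assumptions, not a method specified in the lecture:

```python
import random
import statistics

def filter_suspicious(losses, k=6.0):
    """Flag training samples whose loss is an extreme outlier.

    Drops points more than k robust deviations (MAD) above the
    median loss. The choice of statistic and threshold is an
    illustrative assumption.
    """
    med = statistics.median(losses)
    mad = statistics.median(abs(l - med) for l in losses) or 1e-9
    return [i for i, l in enumerate(losses) if (l - med) / mad > k]

random.seed(0)
clean = [random.gauss(0.5, 0.05) for _ in range(990)]   # typical per-sample losses
poison = [random.gauss(3.0, 0.1) for _ in range(10)]    # simulated trigger samples
flagged = filter_suspicious(clean + poison)             # indices of suspected poison
```

The flagged indices could then be handed to the other accepted defence, unlearning, so that only the suspected samples' influence is removed instead of retraining the model from scratch.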