18 Questions
What is the initial prediction of the output values in XGBoost?
0.5
How is the residual calculated in XGBoost?
Residual = Actual value - Predicted value
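As a toy illustration (the values here are made up, not from the quiz), the residual computation against XGBoost's default initial prediction can be sketched as:

```python
# Toy sketch: residuals relative to XGBoost's default initial prediction of 0.5.
actual = [0.2, 1.1, 0.7]        # hypothetical observed target values
initial_prediction = 0.5        # XGBoost's default starting prediction
residuals = [y - initial_prediction for y in actual]
```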
What does the similarity score in XGBoost represent?
Similarity score = (sum of residuals)^2 / (number of residuals + λ)
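A minimal sketch of this formula in Python (function name and sample residuals are illustrative, not from the source):

```python
def similarity_score(residuals, lam=1.0):
    # Similarity = (sum of residuals)^2 / (number of residuals + lambda)
    return sum(residuals) ** 2 / (len(residuals) + lam)
```

A larger λ shrinks the score, which in turn makes splits harder to justify and reduces overfitting.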
How is the information gain calculated in XGBoost?
Information Gain = Left Similarity + Right Similarity - Similarity of root
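The gain formula can be sketched alongside the similarity score (a hedged toy implementation; the residual values in the test are invented):

```python
def similarity_score(residuals, lam=1.0):
    # Similarity = (sum of residuals)^2 / (number of residuals + lambda)
    return sum(residuals) ** 2 / (len(residuals) + lam)

def information_gain(left, right, lam=1.0):
    # Gain = left similarity + right similarity - similarity of the root node
    root = left + right
    return (similarity_score(left, lam) + similarity_score(right, lam)
            - similarity_score(root, lam))
```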
What is the role of the learning rate (η) in XGBoost predictions?
The learning rate (η) is used to control the contribution of each tree in the final prediction.
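The prediction update can be sketched in one line (a hedged example; the 0.3 default shown here matches XGBoost's default eta, and the other numbers are illustrative):

```python
def update_prediction(previous, tree_output, eta=0.3):
    # New prediction = previous prediction + eta * this tree's output
    return previous + eta * tree_output
```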
What is the main difference between Random Forest and Gradient Boosting algorithms?
Random Forest builds multiple independent trees in parallel, while Gradient Boosting builds trees sequentially to correct errors.
What does XGBoost stand for?
Extreme Gradient Boosting
In what types of problems can XGBoost be used?
Both classification and regression problems
How does XGBoost work?
By sequentially training decision trees, each fitted to the residuals left by the previous trees, and combining their predictions
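A toy sketch of this sequential correction, where each "tree" is reduced to a single leaf holding the mean of the current residuals (real XGBoost fits regression trees; this only illustrates the loop):

```python
def boost(y, n_rounds=10, eta=0.3, base=0.5):
    # Start every prediction at the initial value (XGBoost's default is 0.5).
    preds = [base] * len(y)
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        step = sum(residuals) / len(residuals)   # a one-leaf "tree"
        preds = [p + eta * step for p in preds]  # scaled by the learning rate
    return preds
```

With each round the remaining error shrinks, which is the core idea of boosting.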
What is the purpose of the parameter 'gamma' in XGBoost?
To remove branches with gains smaller than the gamma value
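The pruning rule can be sketched as a simple comparison (a hedged illustration; the numeric values in the test are invented):

```python
def keep_branch(gain_value, gamma):
    # A branch is pruned when gain - gamma is not positive.
    return gain_value - gamma > 0
```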
How does XGBoost use regularization to improve performance?
By reducing sensitivity to individual data and avoiding overfitting
What is the main advantage of using ensemble techniques like XGBoost?
Building a more robust predictive model by combining multiple weak learners
Which algorithm performed better on the dataset, Random Forest or Decision Tree?
Random Forest
How was the accuracy of Decision Tree algorithm compared to Random Forest?
The Decision Tree achieved an accuracy of 67.12%, while Random Forest achieved a higher accuracy.
What was the aim of the paper in predicting the selling price of used cars?
To predict the selling price using Linear Regression, Decision Tree, and Gradient Boosting algorithms.
What is the purpose of feature selection in machine learning?
To reduce the number of features and minimize computational cost.
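One simple feature-selection sketch, assuming a variance-threshold approach (this is an illustrative technique, not the method described in the paper): columns whose values barely vary carry little information and can be dropped to cut computational cost.

```python
def variance(col):
    m = sum(col) / len(col)
    return sum((x - m) ** 2 for x in col) / len(col)

def select_features(columns, threshold=1e-3):
    # Keep only the indices of columns whose variance exceeds the threshold.
    return [i for i, col in enumerate(columns) if variance(col) > threshold]
```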
Which algorithm performed better among Linear Regression, Decision Tree, and Gradient Boosting?
Gradient Boosted regression performed better.
What library was used to run python applications while utilizing Apache Spark Capabilities?
pyspark
This quiz covers the minimum number of data points allowed in a leaf node (called 'min_samples_leaf' in scikit-learn trees; XGBoost's analogous parameter is 'min_child_weight'). It also provides an overview of XGBoost, a supervised-learning technique used for classification and regression problems.