Random forest is a version of ensemble learning.
It's when you take the same algorithm multiple times and combine the results into something more powerful.
For random forest regression, you build many decision trees, each trained on a random subset of the data. Then, for a new data point, every tree predicts the value of Y for the data point in question, and you assign the new data point the average across all the predicted Y values.
Doing this improves the accuracy of your prediction.
How many lollies are in a jar? Imagine taking note of every guess, collecting around 1,000 of them, and then averaging them or taking the median. Statistically speaking, you have a high likelihood of being close to the truth.
Once you are at the middle of the (roughly normal) distribution of guesses, you are much more likely to be on the money.
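Here is a quick simulation of that intuition (the true count of 500 and the noise level are made-up numbers for illustration): with 1,000 noisy guesses, the mean and median land much closer to the truth than the typical individual guess.

```python
# A minimal sketch of the "guess the lollies" intuition: individual guesses
# are noisy, but their mean and median land close to the true count.
import numpy as np

rng = np.random.default_rng(0)
true_count = 500
guesses = rng.normal(loc=true_count, scale=120, size=1000)  # 1000 noisy guesses

print(f"worst guess off by:  {np.max(np.abs(guesses - true_count)):.0f}")
print(f"mean of all guesses: {np.mean(guesses):.0f}")
print(f"median of guesses:   {np.median(guesses):.0f}")
```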
This is the last regression model we'll cover. If you understand decision tree regression, you'll understand random forest.
From decision tree regression, we know the predictions are not continuous, so we will need a high-resolution grid to visualise the stepped result.
For the regressor, we use the RandomForestRegressor class from scikit-learn's sklearn.ensemble module.
```python
# Predicting the Random Forest results
# Create the Regressor
from sklearn.ensemble import RandomForestRegressor

regressor = RandomForestRegressor(random_state=0)
regressor.fit(X, y)
```
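To actually get a prediction out of the fitted model, you call predict. A quick sketch (6.5 is just a hypothetical input, and X is assumed to be a 2-D array with a single feature):

```python
# Predict the value of Y for a hypothetical new data point
# (6.5 is an arbitrary example input)
y_pred = regressor.predict([[6.5]])
print(y_pred)
```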
With just these lines, the plot already shows that the curve is no longer continuous, as in the sketch below.
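A minimal plotting sketch, assuming X holds a single feature and matplotlib is available: predicting over a dense grid makes the flat steps visible.

```python
# Visualise the step-shaped predictions on a dense grid
import numpy as np
import matplotlib.pyplot as plt

X_grid = np.arange(X.min(), X.max(), 0.01).reshape(-1, 1)
plt.scatter(X, y, color="red")                             # actual data points
plt.plot(X_grid, regressor.predict(X_grid), color="blue")  # step-shaped curve
plt.title("Random Forest Regression")
plt.show()
```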
By having several decision trees, we end up with a lot more "steps" than we had with just one decision tree.
More trees does not simply mean more steps, though. As you add trees, the average of their predictions converges, so extra trees stop adding new steps.
Instead, the existing steps settle into better positions, because each step now reflects a more stable average over many trees.
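To make the convergence concrete, here is a small sketch on made-up data (the sine curve and all parameter values are illustrative, not from the original example): as n_estimators grows, the predicted curve changes less and less.

```python
# Predictions converge as the number of trees grows:
# the gap between successive forests shrinks.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X_demo = np.sort(rng.uniform(0, 10, size=100)).reshape(-1, 1)
y_demo = np.sin(X_demo).ravel() + rng.normal(scale=0.2, size=100)

grid = np.linspace(0, 10, 200).reshape(-1, 1)
previous = None
for n in (10, 100, 300):
    model = RandomForestRegressor(n_estimators=n, random_state=0).fit(X_demo, y_demo)
    current = model.predict(grid)
    if previous is not None:
        print(f"{n:>3} trees: mean change vs previous forest = "
              f"{np.abs(current - previous).mean():.4f}")
    previous = current
```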