
bansalrishi · bansalrishi · commit 8816917ab290 · 2021-07-05T18:05:43.000+05:30
diff --git a/.ipynb_checkpoints/03. Support Vector Machines-checkpoint.ipynb b/.ipynb_checkpoints/03. Support Vector Machines-checkpoint.ipynb
@@ -704,6 +704,24 @@
     "plt.show()"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Advantages of SVM\n",
+    "1. Fairly robust to outliers: a soft-margin SVM can tolerate a few outlying points while it finds the hyperplane with the maximum margin.\n",
+    "2. Effective in cases where the number of features is greater than the number of samples\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Disadvantages of SVM\n",
+    "1. Doesn’t scale well to large datasets because the required training time grows quickly with the number of samples\n",
+    "2. Doesn’t perform very well when the dataset is noisy, i.e. when the target classes overlap"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": null,
@@ -728,7 +746,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.6.6"
+   "version": "3.8.5"
   }
  },
  "nbformat": 4,
diff --git a/.ipynb_checkpoints/09. Random Forest-checkpoint.ipynb b/.ipynb_checkpoints/09. Random Forest-checkpoint.ipynb
@@ -6,7 +6,8 @@
    "source": [
     "# Random Forest\n",
     "- Ensemble Algorithm\n",
-    "- model made up of many decision trees\n",
+    "- model made up of many decision trees that are largely uncorrelated with each other\n",
+    "- the randomness used to build each tree keeps the correlation between trees low, so an error made by one tree does not influence the other trees\n",
     "\n",
     "Key Concepts:\n",
     "- While building trees it performs random sampling of training data points\n",
@@ -220,6 +221,34 @@
     "plt.show()"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Advantages\n",
+    "* It is robust to missing and erroneous data, as well as to insufficient information, while still performing well"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Questions and Answers\n",
+    "**Ques:** When rows are sampled at random to build the individual decision trees in a random forest, some of the entries are duplicates. Does this create a problem?  \n",
+    "**Ans**:  \n",
+    "- This is “row sampling with replacement” (bootstrapping), which is why duplicates appear. A duplicated row is just another data point in the feature space, and the model simply learns from it.  \n",
+    "- A problem arises when the same feature is added twice: one dimension then overlaps another. This adds no information and only increases the computation of the algorithm.  \n",
+    "- Cricket ball box analogy: adding the same row/reading is like adding another ball, which is no issue. But adding a plate to the base of the box is of no use (like adding a duplicate feature)."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Research Paper:  \n",
+    "https://www.hindawi.com/journals/jam/2012/258054/"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": null,
@@ -244,7 +273,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.6.6"
+   "version": "3.8.5"
   }
  },
  "nbformat": 4,
diff --git a/03. Support Vector Machines.ipynb b/03. Support Vector Machines.ipynb
@@ -704,6 +704,24 @@
     "plt.show()"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Advantages of SVM\n",
+    "1. Fairly robust to outliers: a soft-margin SVM can tolerate a few outlying points while it finds the hyperplane with the maximum margin.\n",
+    "2. Effective in cases where the number of features is greater than the number of samples\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Disadvantages of SVM\n",
+    "1. Doesn’t scale well to large datasets because the required training time grows quickly with the number of samples\n",
+    "2. Doesn’t perform very well when the dataset is noisy, i.e. when the target classes overlap"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": null,
diff --git a/09. Random Forest.ipynb b/09. Random Forest.ipynb
@@ -6,7 +6,8 @@
    "source": [
     "# Random Forest\n",
     "- Ensemble Algorithm\n",
-    "- model made up of many decision trees\n",
+    "- model made up of many decision trees that are largely uncorrelated with each other\n",
+    "- the randomness used to build each tree keeps the correlation between trees low, so an error made by one tree does not influence the other trees\n",
     "\n",
     "Key Concepts:\n",
     "- While building trees it performs random sampling of training data points\n",
@@ -220,6 +221,34 @@
     "plt.show()"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Advantages\n",
+    "* It is robust to missing and erroneous data, as well as to insufficient information, while still performing well"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Questions and Answers\n",
+    "**Ques:** When rows are sampled at random to build the individual decision trees in a random forest, some of the entries are duplicates. Does this create a problem?  \n",
+    "**Ans**:  \n",
+    "- This is “row sampling with replacement” (bootstrapping), which is why duplicates appear. A duplicated row is just another data point in the feature space, and the model simply learns from it.  \n",
+    "- A problem arises when the same feature is added twice: one dimension then overlaps another. This adds no information and only increases the computation of the algorithm.  \n",
+    "- Cricket ball box analogy: adding the same row/reading is like adding another ball, which is no issue. But adding a plate to the base of the box is of no use (like adding a duplicate feature)."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Research Paper:  \n",
+    "https://www.hindawi.com/journals/jam/2012/258054/"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": null,