eli5-org · lopuhin · Mar 30, 2025 · Nov 16, 2022
diff --git a/notebooks/xgboost-titanic.ipynb b/notebooks/xgboost-titanic.ipynb
@@ -105,7 +105,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "We do just minimal preprocessing: convert obviously contiuous *Age* and *Fare* variables to floats,\n",
+    "We do just minimal preprocessing: convert obviously continuous *Age* and *Fare* variables to floats,\n",
     "and *SibSp*, *Parch* to integers. Missing *Age* values are removed."
    ]
   },
@@ -170,14 +170,14 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "There is one tricky bit about the code above: one may be templed to just pass ``dense=True`` to ``DictVectorizer``: after all, in this case the matrixes are small. But this is not a great solution, because we will loose the ability to distinguish features that are missing and features that have zero value.\n",
+    "There is one tricky bit about the code above: one may be tempted to just pass ``sparse=False`` to ``DictVectorizer``, since in this case the matrices are small. But this is not a great solution, because we will lose the ability to distinguish features that are missing from features that have a zero value.\n",
     "\n",
     "\n",
     "## 3. Explaining weights\n",
     "\n",
-    "In order to calculate a prediction, XGBoost sums predictions of all its trees.\n",
+    "To calculate a prediction, XGBoost sums the predictions of all its trees.\n",
     "The number of trees is controlled by ``n_estimators`` argument and is 100 by default.\n",
-    "Each tree is not a great predictor on it's own, but by summing across all trees,\n",
+    "Each tree is not a great predictor on its own, but by summing across all trees,\n",
     "XGBoost is able to provide a robust estimate in many cases. Here is one of the trees:"
    ]
   },
@@ -1151,8 +1151,8 @@
    "source": [
     "## 5. Adding text features\n",
     "\n",
-    "Right now we treat *Name* field as categorical, like other text features.\n",
-    "But in this dataset each name is unique, so XGBoost does not use this feature at all, because it's\n",
+    "So far we have treated the *Name* field as categorical, like other text features.\n",
+    "However, each name in this dataset is unique, so XGBoost does not use this feature at all, because it's\n",
     "such a poor discriminator: it's absent from the weights table in section 3.\n",
     "\n",
     "But *Name* still might contain some useful information. We don't want to guess how to best pre-process it\n",