diff --git a/notebooks/xgboost-titanic.ipynb b/notebooks/xgboost-titanic.ipynb
index 2fba63d1..c26cabde 100644
--- a/notebooks/xgboost-titanic.ipynb
+++ b/notebooks/xgboost-titanic.ipynb
@@ -105,7 +105,7 @@
  "cell_type": "markdown",
  "metadata": {},
  "source": [
- "We do just minimal preprocessing: convert obviously contiuous *Age* and *Fare* variables to floats,\n",
+ "We do just minimal preprocessing: convert obviously continuous *Age* and *Fare* variables to floats,\n",
  "and *SibSp*, *Parch* to integers. Missing *Age* values are removed."
  ]
  },
@@ -170,14 +170,14 @@
  "cell_type": "markdown",
  "metadata": {},
  "source": [
- "There is one tricky bit about the code above: one may be templed to just pass ``dense=True`` to ``DictVectorizer``: after all, in this case the matrixes are small. But this is not a great solution, because we will loose the ability to distinguish features that are missing and features that have zero value.\n",
+ "There is one tricky bit about the code above: one may be tempted to just pass ``dense=True`` to ``DictVectorizer``: after all, in this case the matrixes are small. But this is not a great solution, because we will lose the ability to distinguish features that are missing and features that have zero value.\n",
  "\n",
  "\n",
  "## 3. Explaining weights\n",
  "\n",
- "In order to calculate a prediction, XGBoost sums predictions of all its trees.\n",
+ "To calculate a prediction, XGBoost sums predictions of all its trees.\n",
  "The number of trees is controlled by ``n_estimators`` argument and is 100 by default.\n",
- "Each tree is not a great predictor on it's own, but by summing across all trees,\n",
+ "Each tree is not a great predictor on its own, but by summing across all trees,\n",
  "XGBoost is able to provide a robust estimate in many cases. Here is one of the trees:"
  ]
  },
@@ -1151,8 +1151,8 @@
  "source": [
  "## 5. Adding text features\n",
  "\n",
- "Right now we treat *Name* field as categorical, like other text features.\n",
- "But in this dataset each name is unique, so XGBoost does not use this feature at all, because it's\n",
+ "Now we treat *Name* field as categorical, like other text features,\n",
+ "but in this dataset, each name is unique, so XGBoost does not use this feature at all, because it's\n",
  "such a poor discriminator: it's absent from the weights table in section 3.\n",
  "\n",
  "But *Name* still might contain some useful information. We don't want to guess how to best pre-process it\n",