This project predicts anaemia in individuals using a Decision Tree Classifier. The dataset contains information about patients, including their sex, the percentages of red, green, and blue pixels in an image, and their hemoglobin level. The model is trained to classify whether an individual is anaemic or not.
To run this project, you need to have Python installed along with the following libraries:
- pandas
- scikit-learn
You can install the required libraries using pip:
pip install pandas scikit-learn
- Clone this repository or download the project files.
- Place the anaemia_dataset.csv file in the project directory.
- Run the script using Python:
python anaemia_detection.py
The dataset anaemia_dataset.csv contains the following columns:
- Number: Unique identifier for each patient (not used in the model).
- Sex: Gender of the patient (M/F).
- %Red Pixel: Percentage of red pixels in the image.
- %Green Pixel: Percentage of green pixels in the image.
- %Blue Pixel: Percentage of blue pixels in the image.
- Hb: Hemoglobin level of the patient.
- Anaemic: Target variable indicating whether the patient is anaemic (1) or not (0).
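As a quick sanity check on the column layout above, the dataset can be loaded and validated before any modelling. This is a minimal sketch, not the project's own loading code: `load_dataset` and `EXPECTED_COLUMNS` are hypothetical names, and the two sample rows are made up for illustration rather than taken from the real file.

```python
import io

import pandas as pd

# Column names as listed in this README
EXPECTED_COLUMNS = [
    "Number", "Sex", "%Red Pixel", "%Green Pixel", "%Blue Pixel", "Hb", "Anaemic",
]


def load_dataset(source):
    """Read the CSV and fail early if a documented column is missing."""
    df = pd.read_csv(source)
    # Assumption: header names may carry stray whitespace, so strip it
    df.columns = df.columns.str.strip()
    missing = [c for c in EXPECTED_COLUMNS if c not in df.columns]
    if missing:
        raise ValueError(f"Missing columns: {missing}")
    return df


# Made-up rows for illustration only; not real patient data
sample_csv = io.StringIO(
    "Number,Sex,%Red Pixel,%Green Pixel,%Blue Pixel,Hb,Anaemic\n"
    "1,M,43.2,30.8,26.0,6.3,1\n"
    "2,F,45.6,28.2,26.2,13.5,0\n"
)
df = load_dataset(sample_csv)
print(df.dtypes)
```

With the real file in place, `load_dataset("anaemia_dataset.csv")` would be the equivalent call.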
The project uses a Decision Tree Classifier to predict anaemia. The following steps are performed:
- Data Preprocessing:
  - Load the dataset using pandas.
  - Encode the categorical variable 'Sex' using LabelEncoder.
  - Split the dataset into features (X) and target variable (y).
- Train-Test Split:
  - The dataset is divided into training and testing sets using train_test_split.
- Model Fitting:
  - A Decision Tree Classifier is instantiated and fitted to the training data.
- Predictions:
  - The model is used to predict outcomes on the test set.
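The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the contents of anaemia_detection.py: the function name `fit_and_predict`, the `test_size` and `random_state` values, and the choice to keep Hb among the features are all assumptions (the README only says that Number is unused). The small table is made-up data so that the sketch runs on its own.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeClassifier


def fit_and_predict(df, test_size=0.25, random_state=0):
    """Encode, split, fit a decision tree, and predict on the held-out rows."""
    df = df.copy()
    # Encode 'Sex' (M/F) as integers
    df["Sex"] = LabelEncoder().fit_transform(df["Sex"])
    # 'Number' is only an identifier, so it is dropped along with the target
    X = df.drop(columns=["Number", "Anaemic"])
    y = df["Anaemic"]
    # test_size and random_state are assumed values, not taken from the project
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=random_state
    )
    model = DecisionTreeClassifier(random_state=random_state)
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    return model, y_test, y_pred


# Made-up rows for illustration only; not real patient data
toy = pd.DataFrame({
    "Number": list(range(1, 9)),
    "Sex": ["M", "F", "F", "M", "F", "M", "M", "F"],
    "%Red Pixel": [43.2, 45.6, 42.9, 46.1, 43.0, 45.9, 42.5, 46.3],
    "%Green Pixel": [30.8, 28.2, 31.0, 28.0, 30.9, 28.4, 31.2, 27.9],
    "%Blue Pixel": [26.0, 26.2, 26.1, 25.9, 26.1, 25.7, 26.3, 25.8],
    "Hb": [6.3, 13.5, 7.1, 14.2, 8.0, 13.9, 6.8, 14.8],
    "Anaemic": [1, 0, 1, 0, 1, 0, 1, 0],
})
model, y_test, y_pred = fit_and_predict(toy)
print(len(y_test), "test rows, predictions:", list(y_pred))
```

Fixing `random_state` makes the split and the fitted tree reproducible between runs, which helps when comparing accuracy figures.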
The model is evaluated using the accuracy score metric. The script prints the accuracy of the predictions made on the test dataset.
print('Accuracy of dataset is:', score)
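For reference, accuracy is simply the fraction of test rows whose predicted label matches the true label. The sketch below computes it by hand; the `accuracy` helper is a hypothetical name and the labels are made up. With default settings, `sklearn.metrics.accuracy_score(y_test, y_pred)` returns the same fraction.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Made-up labels for illustration only
y_test = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 0]
score = accuracy(y_test, y_pred)
print('Accuracy of dataset is:', score)  # 4 of 5 correct -> 0.8
```

Accuracy alone can mislead when one class is much rarer than the other, so it is worth checking the class balance of the Anaemic column before relying on it.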
Contributions are welcome! If you have suggestions for improvements or new features, please fork the repository and submit a pull request.
This project is licensed under the MIT License. See the LICENSE file for details.