18IT033_Practical_Examination_Work

Nidhi Gajjar
Nov 18, 2021

About Dataset:

The dataset is about Autism Spectrum Disorder (ASD). It has various features on the basis of which one can classify whether a person has ASD or not.

1. The screenshot below shows the provided dataset loaded into the Orange tool for data preprocessing and visualization.

2. Set the “Target variable” while loading the dataset so that the model knows which column it has to predict.

3. To get information about the dataset, the “Data Info” widget is used in Orange. It tells us that the dataset has 704 rows and 21 columns (features), of which 2 are numeric and the remaining 19 are categorical.
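The kind of summary that Data Info reports can be sketched in plain Python. This is only a rough illustration, not what Orange runs internally: the five-row sample, its column names, the “?” missing-value marker and the `max_levels` rule for treating 0/1 columns as categorical are all my own assumptions.

```python
import csv
import io

# Hypothetical five-row sample; column names are illustrative only.
SAMPLE_CSV = """A1_Score,age,gender,result,Class/ASD
1,26,f,6,NO
1,24,m,5,NO
1,27,m,8,YES
1,35,f,6,NO
0,?,f,2,NO
"""


def is_number(text):
    """Return True when the cell parses as a number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def data_info(csv_text, missing="?", max_levels=2):
    """Count rows and columns and split columns into numeric and categorical."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    columns = list(rows[0].keys())
    numeric, categorical = [], []
    for column in columns:
        known = [row[column] for row in rows if row[column] != missing]
        # Assumption: number-like columns with very few distinct values
        # (such as 0/1 answers) are counted as categorical.
        if all(is_number(v) for v in known) and len(set(known)) > max_levels:
            numeric.append(column)
        else:
            categorical.append(column)
    return {
        "rows": len(rows),
        "columns": len(columns),
        "numeric": numeric,
        "categorical": categorical,
    }


info = data_info(SAMPLE_CSV)
print(info["rows"], "rows,", info["columns"], "columns")
print("numeric:", info["numeric"])
print("categorical:", info["categorical"])
```

On the real file the same idea should give the 704 rows and 21 columns mentioned above, although the numeric/categorical split depends on the heuristic chosen.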

4. The dataset is viewed in tabular form using the Data Table widget.

5. Even without any data preprocessing, the evaluation results for different models can be viewed using the Test and Score widget in Orange. For this dataset, all the models reach a very good level of accuracy even without preprocessing.
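To give an idea of what such an evaluation does, here is a minimal sketch of k-fold cross-validation (one of the sampling options in Test and Score) around a tiny KNN classifier. The toy data, the Hamming distance and the fold assignment are my own assumptions for illustration; Orange's learners are far more complete.

```python
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Predict the label of x by majority vote among its k nearest rows."""
    # Hamming distance: number of features on which two rows disagree.
    neighbours = sorted(
        (sum(a != b for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]


def cross_val_accuracy(X, y, folds=5, k=3):
    """Hold out each fold once, predict it from the rest, return accuracy."""
    correct = 0
    for fold in range(folds):
        test_idx = set(range(fold, len(X), folds))
        train_X = [row for i, row in enumerate(X) if i not in test_idx]
        train_y = [label for i, label in enumerate(y) if i not in test_idx]
        for i in test_idx:
            if knn_predict(train_X, train_y, X[i], k) == y[i]:
                correct += 1
    return correct / len(X)


# Hypothetical toy data: four yes/no answers per person.
X = [[1, 1, 1, 1]] * 5 + [[0, 0, 0, 0]] * 5
y = ["YES"] * 5 + ["NO"] * 5
print(cross_val_accuracy(X, y))
```

Every row is used for testing exactly once, so the returned accuracy is computed over the whole dataset.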

5. A confusion matrix can be plotted for each model; below is the confusion matrix for the KNN model.
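For readers who want to see how a confusion matrix is built, here is a minimal sketch in plain Python. The label lists are made-up values for illustration, not predictions from the actual KNN model.

```python
from collections import Counter


def confusion_matrix(actual, predicted, labels=("NO", "YES")):
    """Rows are actual classes, columns are predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]


def accuracy(matrix):
    """Share of predictions that fall on the diagonal."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Hypothetical labels, only to show the shape of the result.
actual = ["YES", "NO", "NO", "YES", "NO"]
predicted = ["YES", "NO", "YES", "YES", "NO"]

matrix = confusion_matrix(actual, predicted)
for label, row in zip(("NO", "YES"), matrix):
    print("actual", label, row)
print("accuracy:", accuracy(matrix))
```

The diagonal holds the correctly classified cases, and the off-diagonal cells show which class the model confuses with which.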

6. Data preprocessing is performed using the Preprocess widget. In preprocessing I performed “Feature Selection”, which selects 80% of the features available in the dataset, and normalization of the data. If there are any missing values in the dataset, the “Impute Missing Values” step replaces them with the average value of that particular feature. Finally, one-hot encoding is performed.
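Three of these preprocessing steps (imputation, normalization and one-hot encoding) can be sketched in plain Python. This is a simplified illustration under my own assumptions: missing values are represented as `None`, normalization is min-max scaling to [0, 1] (Orange also offers other options), and the sample values are invented. Feature selection is left out to keep the sketch short.

```python
def impute_mean(values):
    """Replace missing entries (None) with the mean of the known values."""
    known = [v for v in values if v is not None]
    mean = sum(known) / len(known)
    return [mean if v is None else v for v in values]


def min_max_normalize(values):
    """Scale values linearly so the smallest is 0 and the largest is 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(values):
    """Turn a categorical column into one 0/1 column per level."""
    levels = sorted(set(values))
    encoded = [[1 if v == level else 0 for level in levels] for v in values]
    return levels, encoded


# Hypothetical columns, only to show each step.
ages = impute_mean([20, None, 40])
print(ages)
print(min_max_normalize(ages))
print(one_hot(["m", "f", "m"]))
```

Imputation has to run before normalization, otherwise the missing entries would break the minimum and maximum computation.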

7. After preprocessing, take a look at the preprocessed data.

8. After running Test and Score again on the preprocessed data, we can observe a negligible increase in the accuracy of the KNN and SVM models.

9. The confusion matrix for the Naive Bayes model.

10. The entire workflow created in the Orange tool.

11. Loading the preprocessed data into the Power BI tool.

12. Created a stacked bar chart of the A1-A10 scores against the classification (ASD present or not), counting the records where the A1-A10 score = 1.

13. A pie chart is created for the “BRICS” nations, showing the counts of ASD class = yes and no for each BRICS nation. We can see that India has the largest value for “No”.
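The aggregation behind such a chart is a simple count per country and class. Here is a minimal sketch in plain Python; the records are invented for illustration and are not taken from the dataset.

```python
from collections import Counter

BRICS = {"Brazil", "Russia", "India", "China", "South Africa"}


def class_counts_by_country(records, countries=BRICS):
    """Count ASD class labels per country, keeping only the given countries."""
    counts = {}
    for country, asd_class in records:
        if country in countries:
            counts.setdefault(country, Counter())[asd_class] += 1
    return counts


# Hypothetical (country, class) pairs, only to show the grouping.
records = [
    ("India", "NO"),
    ("India", "NO"),
    ("India", "YES"),
    ("Brazil", "YES"),
    ("Jordan", "NO"),
]

for country, counter in class_counts_by_country(records).items():
    print(country, dict(counter))
```

Each country's counter corresponds to one group of slices in the pie chart, and countries outside the set are filtered out before counting.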

14. Stacked chart based on the ethnicity of people from different countries and the presence of ASD among them.

14. Box plot representation of class Yes/No for the A10 score feature.

15. The entire dashboard created in Power BI.


Nidhi Gajjar

Site Reliability Engineer, 2X AWS Certified, AWS Cloud Enthusiastic