C4.5 [release 8] decision tree generator    Fri Jun 1 10:11:07 2001
----------------------------------------

    Options:
        File stem

Read 4141 cases (57 attributes) from sb2.data.0.data

Decision Tree:

word_freq_remove <= 0 :
| word_freq_000 <= 0.25 :
| | word_freq_money <= 0.03 :
| | | word_freq_free <= 0.19 :
| | | | word_freq_credit <= 0.34 :
| | | | | char_freq_$ <= 0.104 :
| | | | | | word_freq_font <= 0.11 :
| | | | | | | char_freq_! <= 0.391 :
| | | | | | | | word_freq_george > 0 : 0 (597.0)
| | | | | | | | word_freq_george <= 0 :
| | | | | | | | | word_freq_receive <= 0.28 :
| | | | | | | | | | word_freq_hp > 0.11 : 0 (533.0/4.0)
| | | | | | | | | | word_freq_hp <= 0.11 :
| | | | | | | | | | | word_freq_650 <= 0.17 :[S1]
| | | | | | | | | | | word_freq_650 > 0.17 :
| | | | | | | | | | | | word_freq_make > 0.37 : 0 (2.0)
| | | | | | | | | | | | word_freq_make <= 0.37 :[S2]
| | | | | | | | | word_freq_receive > 0.28 :
| | | | | | | | | | word_freq_internet > 0.32 : 1 (5.0)
| | | | | | | | | | word_freq_internet <= 0.32 :
| | | | | | | | | | | word_freq_address > 0.43 : 1 (3.0)
| | | | | | | | | | | word_freq_address <= 0.43 :
| | | | | | | | | | | | word_freq_email <= 0.38 : 0 (26.0/1.0)
| | | | | | | | | | | | word_freq_email > 0.38 : 1 (3.0/1.0)
| | | | | | | char_freq_! > 0.391 :
| | | | | | | | word_freq_order > 0.74 : 1 (13.0)
| | | | | | | | word_freq_order <= 0.74 :
| | | | | | | | | capital_run_length_average <= 3.277 :
| | | | | | | | | | word_freq_business > 0.19 : 1 (13.0/1.0)
| | | | | | | | | | word_freq_business <= 0.19 :
| | | | | | | | | | | char_freq_! <= 0.979 :
| | | | | | | | | | | | word_freq_people <= 0.72 : 0 (93.0/4.0)
| | | | | | | | | | | | word_freq_people > 0.72 :[S3]
| | | | | | | | | | | char_freq_! > 0.979 :
| | | | | | | | | | | | word_freq_over > 0.72 : 1 (2.0)
| | | | | | | | | | | | word_freq_over <= 0.72 :[S4]
| | | | | | | | | capital_run_length_average > 3.277 :
| | | | | | | | | | word_freq_edu > 0.6 : 0 (2.0)
| | | | | | | | | | word_freq_edu <= 0.6 :[S5]
| | | | | | word_freq_font > 0.11 :
| | | | | | | char_freq_; <= 0.023 : 1 (19.0/1.0)
| | | | | | | char_freq_; > 0.023 : 0 (18.0/1.0)
| | | | | char_freq_$ > 0.104 :
| | | | | | word_freq_hp > 0.14 : 0 (27.0)
| | | | | | word_freq_hp <= 0.14 :
| | | | | | | capital_run_length_average <= 1.657 : 0 (13.0)
| | | | | | | capital_run_length_average > 1.657 :
| | | | | | | | word_freq_edu > 0.44 : 0 (8.0)
| | | | | | | | word_freq_edu <= 0.44 :
| | | | | | | | | capital_run_length_average > 2.291 : 1 (58.0/2.0)
| | | | | | | | | capital_run_length_average <= 2.291 :
| | | | | | | | | | word_freq_you <= 1.91 : 0 (6.0/1.0)
| | | | | | | | | | word_freq_you > 1.91 : 1 (10.0)
| | | | word_freq_credit > 0.34 :
| | | | | capital_run_length_average > 3.391 : 1 (18.0)
| | | | | capital_run_length_average <= 3.391 :
| | | | | | word_freq_re <= 0.8 : 0 (10.0/1.0)
| | | | | | word_freq_re > 0.8 : 1 (2.0)
| | | word_freq_free > 0.19 :
| | | | word_freq_george > 0.12 : 0 (28.0)
| | | | word_freq_george <= 0.12 :
| | | | | word_freq_edu <= 0.1 :
| | | | | | word_freq_hp <= 0.23 :
| | | | | | | char_freq_! > 0.567 : 1 (106.0/2.0)
| | | | | | | char_freq_! <= 0.567 :
| | | | | | | | word_freq_people > 0.19 : 0 (9.0/1.0)
| | | | | | | | word_freq_people <= 0.19 :
| | | | | | | | | word_freq_meeting <= 0.09 :
| | | | | | | | | | word_freq_business > 0.09 : 1 (23.0)
| | | | | | | | | | word_freq_business <= 0.09 :
| | | | | | | | | | | word_freq_font > 0.8 : 1 (8.0)
| | | | | | | | | | | word_freq_font <= 0.8 :
| | | | | | | | | | | | char_freq_# > 0.02 : 0 (9.0/1.0)
| | | | | | | | | | | | char_freq_# <= 0.02 :
| | | | | | | | | | | | | char_freq_; > 0.022 : 0 (6.0/1.0)
| | | | | | | | | | | | | char_freq_; <= 0.022 :
| | | | | | | | | | | | | | word_freq_pm <= 0.09 :[S6]
| | | | | | | | | | | | | | word_freq_pm > 0.09 :[S7]
| | | | | | | | | word_freq_meeting > 0.09 :
| | | | | | | | | | word_freq_mail <= 0.9 : 0 (8.0)
| | | | | | | | | | word_freq_mail > 0.9 : 1 (2.0)
| | | | | | word_freq_hp > 0.23 :
| | | | | | | char_freq_! <= 0.196 : 0 (26.0/1.0)
| | | | | | | char_freq_! > 0.196 : 1 (4.0/1.0)
| | | | | word_freq_edu > 0.1 :
| | | | | | word_freq_650 <= 0.21 : 0 (36.0/1.0)
| | | | | | word_freq_650 > 0.21 : 1 (3.0/1.0)
| | word_freq_money > 0.03 :
| | | word_freq_hp > 0.11 : 0 (16.0)
| | | word_freq_hp <= 0.11 :
| | | | word_freq_edu > 0.18 : 0 (13.0/1.0)
| | | | word_freq_edu <= 0.18 :
| | | | | capital_run_length_longest <= 9 :
| | | | | | word_freq_free > 2.17 : 1 (2.0)
| | | | | | word_freq_free <= 2.17 :
| | | | | | | word_freq_money <= 2.45 : 0 (9.0)
| | | | | | | word_freq_money > 2.45 : 1 (3.0/1.0)
| | | | | capital_run_length_longest > 9 :
| | | | | | word_freq_re <= 0.45 : 1 (189.0/4.0)
| | | | | | word_freq_re > 0.45 :
| | | | | | | word_freq_your <= 1.06 : 0 (5.0/1.0)
| | | | | | | word_freq_your > 1.06 : 1 (6.0)
| word_freq_000 > 0.25 :
| | word_freq_1999 <= 0 :
| | | capital_run_length_longest > 10 : 1 (208.0/4.0)
| | | capital_run_length_longest <= 10 :
| | | | word_freq_re > 0.45 : 0 (2.0)
| | | | word_freq_re <= 0.45 :
| | | | | word_freq_address > 0.19 : 1 (2.0)
| | | | | word_freq_address <= 0.19 :
| | | | | | capital_run_length_longest > 8 : 1 (6.0)
| | | | | | capital_run_length_longest <= 8 :
| | | | | | | word_freq_all <= 0.68 : 0 (3.0)
| | | | | | | word_freq_all > 0.68 : 1 (3.0)
| | word_freq_1999 > 0 :
| | | word_freq_over > 0.06 : 1 (11.0)
| | | word_freq_over <= 0.06 :
| | | | capital_run_length_longest <= 48 : 0 (8.0)
| | | | capital_run_length_longest > 48 : 1 (3.0/1.0)
word_freq_remove > 0 :
| word_freq_hp <= 0.19 :
| | word_freq_edu <= 0.08 :
| | | word_freq_your <= 0.32 :
| | | | word_freq_internet <= 0.18 :
| | | | | word_freq_3d > 0.21 : 1 (4.0)
| | | | | word_freq_3d <= 0.21 :
| | | | | | word_freq_font > 0.46 : 1 (4.0)
| | | | | | word_freq_font <= 0.46 :
| | | | | | | word_freq_report > 0.04 : 1 (4.0)
| | | | | | | word_freq_report <= 0.04 :
| | | | | | | | word_freq_your <= 0.07 : 1 (60.0/4.0)
| | | | | | | | word_freq_your > 0.07 :
| | | | | | | | | char_freq_( <= 0.049 : 0 (2.0)
| | | | | | | | | char_freq_( > 0.049 : 1 (3.0)
| | | | word_freq_internet > 0.18 :
| | | | | word_freq_our <= 0.05 : 0 (5.0/1.0)
| | | | | word_freq_our > 0.05 : 1 (8.0)
| | | word_freq_your > 0.32 :
| | | | word_freq_1999 <= 0.25 : 1 (555.0/6.0)
| | | | word_freq_1999 > 0.25 :
| | | | | word_freq_george <= 0.08 : 1 (23.0)
| | | | | word_freq_george > 0.08 : 0 (3.0)
| | word_freq_edu > 0.08 :
| | | word_freq_money <= 0.04 : 0 (6.0)
| | | word_freq_money > 0.04 : 1 (19.0)
| word_freq_hp > 0.19 :
| | word_freq_our <= 0.3 : 0 (15.0)
| | word_freq_our > 0.3 :
| | | capital_run_length_average <= 2.689 : 0 (3.0/1.0)
| | | capital_run_length_average > 2.689 : 1 (8.0)

Subtree [S1]

capital_run_length_average <= 3.621 :
| word_freq_business <= 0.13 :
| | word_freq_internet <= 0.05 :
| | | word_freq_re > 0.77 : 0 (171.0)
| | | word_freq_re <= 0.77 :
| | | | capital_run_length_longest <= 10 :
| | | | | capital_run_length_average > 1.55 : 0 (214.0)
| | | | | capital_run_length_average <= 1.55 :
| | | | | | word_freq_lab > 0.06 : 0 (16.0)
| | | | | | word_freq_lab <= 0.06 :
| | | | | | | capital_run_length_longest <= 2 :
| | | | | | | | capital_run_length_longest > 1 : 0 (27.0)
| | | | | | | | capital_run_length_longest <= 1 :
| | | | | | | | | word_freq_meeting > 0.27 : 0 (7.0)
| | | | | | | | | word_freq_meeting <= 0.27 :
| | | | | | | | | | word_freq_you <= 4.65 : 0 (95.0/2.0)
| | | | | | | | | | word_freq_you > 4.65 :
| | | | | | | | | | | word_freq_you <= 4.76 : 1 (2.0)
| | | | | | | | | | | word_freq_you > 4.76 : 0 (19.0/2.0)
| | | | | | | capital_run_length_longest > 2 :
| | | | | | | | word_freq_pm > 0.17 : 0 (8.0)
| | | | | | | | word_freq_pm <= 0.17 :
| | | | | | | | | word_freq_data > 0.12 : 0 (8.0)
| | | | | | | | | word_freq_data <= 0.12 :
| | | | | | | | | | word_freq_order > 0.03 : 0 (7.0)
| | | | | | | | | | word_freq_order <= 0.03 :
| | | | | | | | | | | word_freq_cs > 0.1 : 0 (9.0)
| | | | | | | | | | | word_freq_cs <= 0.1 :[S8]
| | | | capital_run_length_longest > 10 :
| | | | | capital_run_length_total <= 36 :
| | | | | | word_freq_meeting > 0.38 : 0 (4.0)
| | | | | | word_freq_meeting <= 0.38 :
| | | | | | | word_freq_you > 2.86 : 1 (8.0)
| | | | | | | word_freq_you <= 2.86 :
| | | | | | | | capital_run_length_total <= 31 : 0 (3.0)
| | | | | | | | capital_run_length_total > 31 : 1 (2.0)
| | | | | capital_run_length_total > 36 :
| | | | | | word_freq_edu > 0.04 : 0 (69.0/1.0)
| | | | | | word_freq_edu <= 0.04 :
| | | | | | | word_freq_project > 0.07 : 0 (6.0)
| | | | | | | word_freq_project <= 0.07 :
| | | | | | | | word_freq_people > 0.04 : 0 (4.0)
| | | | | | | | word_freq_people <= 0.04 :
| | | | | | | | | char_freq_( > 0.165 : 0 (25.0)
| | | | | | | | | char_freq_( <= 0.165 :
| | | | | | | | | | word_freq_original > 0.13 : 0 (3.0)
| | | | | | | | | | word_freq_original <= 0.13 :
| | | | | | | | | | | word_freq_hpl > 0.17 : 0 (9.0)
| | | | | | | | | | | word_freq_hpl <= 0.17 :
| | | | | | | | | | | | word_freq_over > 0.62 : 1 (2.0)
| | | | | | | | | | | | word_freq_over <= 0.62 :[S9]
| | word_freq_internet > 0.05 :
| | | word_freq_our > 1.34 : 1 (3.0)
| | | word_freq_our <= 1.34 :
| | | | word_freq_your <= 0.76 : 0 (18.0)
| | | | word_freq_your > 0.76 :
| | | | | word_freq_address <= 0.49 : 1 (2.0)
| | | | | word_freq_address > 0.49 : 0 (2.0)
| word_freq_business > 0.13 :
| | word_freq_make > 0.21 : 1 (4.0)
| | word_freq_make <= 0.21 :
| | | char_freq_; > 0.038 : 1 (2.0)
| | | char_freq_; <= 0.038 :
| | | | char_freq_! <= 0.049 : 0 (17.0/1.0)
| | | | char_freq_! > 0.049 :
| | | | | word_freq_re <= 0.3 : 1 (4.0)
| | | | | word_freq_re > 0.3 : 0 (2.0)
capital_run_length_average > 3.621 :
| word_freq_3d > 0.21 : 1 (3.0)
| word_freq_3d <= 0.21 :
| | char_freq_$ > 0.008 : 1 (3.0)
| | char_freq_$ <= 0.008 :
| | | word_freq_our > 0.76 : 1 (6.0)
| | | word_freq_our <= 0.76 :
| | | | char_freq_; > 0.055 : 1 (4.0/1.0)
| | | | char_freq_; <= 0.055 :
| | | | | char_freq_# > 0.045 : 0 (4.0)
| | | | | char_freq_# <= 0.045 :
| | | | | | word_freq_your <= 3.61 : 0 (38.0/6.0)
| | | | | | word_freq_your > 3.61 : 1 (2.0)

Subtree [S2]

word_freq_1999 > 0.14 : 0 (2.0)
word_freq_1999 <= 0.14 :
| word_freq_internet > 0.15 : 0 (2.0)
| word_freq_internet <= 0.15 :
| | capital_run_length_longest > 19 : 1 (14.0)
| | capital_run_length_longest <= 19 :
| | | word_freq_our <= 0.36 : 0 (6.0)
| | | word_freq_our > 0.36 : 1 (2.0)

Subtree [S3]

capital_run_length_average <= 2.567 : 0 (3.0)
capital_run_length_average > 2.567 : 1 (2.0)

Subtree [S4]

word_freq_our > 0.83 : 1 (2.0)
word_freq_our <= 0.83 :
| word_freq_meeting > 0.89 : 0 (2.0)
| word_freq_meeting <= 0.89 :
| | word_freq_1999 > 0.32 : 0 (2.0)
| | word_freq_1999 <= 0.32 :
| | | word_freq_you <= 2.32 :
| | | | capital_run_length_total <= 2 : 1 (2.0)
| | | | capital_run_length_total > 2 : 0 (19.0/1.0)
| | | word_freq_you > 2.32 :
| | | | word_freq_re <= 0.34 : 1 (7.0/1.0)
| | | | word_freq_re > 0.34 : 0 (2.0)

Subtree [S5]

capital_run_length_total > 103 : 1 (20.0)
capital_run_length_total <= 103 :
| capital_run_length_longest <= 26 : 1 (9.0)
| capital_run_length_longest > 26 : 0 (5.0/1.0)

Subtree [S6]

word_freq_technology > 0.2 : 1 (5.0)
word_freq_technology <= 0.2 :
| word_freq_re > 0.53 : 1 (11.0)
| word_freq_re <= 0.53 :
| | word_freq_our > 1.04 : 1 (21.0/1.0)
| | word_freq_our <= 1.04 :
| | | word_freq_mail > 0.53 : 0 (4.0)
| | | word_freq_mail <= 0.53 :
| | | | word_freq_report > 0.13 : 1 (3.0)
| | | | word_freq_report <= 0.13 :
| | | | | word_freq_mail > 0.15 : 1 (3.0)
| | | | | word_freq_mail <= 0.15 :
| | | | | | word_freq_internet > 0.45 : 1 (5.0)
| | | | | | word_freq_internet <= 0.45 :
| | | | | | | word_freq_order > 0.33 : 0 (2.0)
| | | | | | | word_freq_order <= 0.33 :
| | | | | | | | char_freq_! <= 0.107 :
| | | | | | | | | word_freq_receive > 0.4 : 1 (2.0)
| | | | | | | | | word_freq_receive <= 0.4 :
| | | | | | | | | | word_freq_email > 0.54 : 0 (4.0)
| | | | | | | | | | word_freq_email <= 0.54 :
| | | | | | | | | | | char_freq_( > 0.028 : 0 (2.0)
| | | | | | | | | | | char_freq_( <= 0.028 :
| | | | | | | | | | | | word_freq_all > 0.65 : 0 (2.0)
| | | | | | | | | | | | word_freq_all <= 0.65 :[S10]
| | | | | | | | char_freq_! > 0.107 :
| | | | | | | | | word_freq_our <= 0.34 : 1 (13.0/1.0)
| | | | | | | | | word_freq_our > 0.34 :
| | | | | | | | | | word_freq_your <= 1.2 : 0 (2.0)
| | | | | | | | | | word_freq_your > 1.2 : 1 (2.0)

Subtree [S7]

word_freq_our <= 0.09 : 0 (3.0)
word_freq_our > 0.09 : 1 (2.0)

Subtree [S8]

word_freq_meeting > 0.01 : 0 (6.0)
word_freq_meeting <= 0.01 :
| word_freq_over > 0.05 : 0 (8.0)
| word_freq_over <= 0.05 :
| | word_freq_project > 0.11 : 0 (9.0)
| | word_freq_project <= 0.11 :
| | | word_freq_people > 0.04 : 0 (10.0)
| | | word_freq_people <= 0.04 :
| | | | char_freq_; > 0.015 : 0 (5.0)
| | | | char_freq_; <= 0.015 :
| | | | | capital_run_length_average <= 1.122 : 1 (5.0/1.0)
| | | | | capital_run_length_average > 1.122 :
| | | | | | capital_run_length_longest <= 5 :
| | | | | | | word_freq_mail <= 0.38 :
| | | | | | | | capital_run_length_longest <= 3 : 0 (24.0)
| | | | | | | | capital_run_length_longest > 3 :
| | | | | | | | | capital_run_length_longest > 4 : 0 (8.0)
| | | | | | | | | capital_run_length_longest <= 4 :
| | | | | | | | | | capital_run_length_total > 19 : 0 (12.0)
| | | | | | | | | | capital_run_length_total <= 19 :[S11]
| | | | | | | word_freq_mail > 0.38 :
| | | | | | | | capital_run_length_total <= 29 : 0 (3.0)
| | | | | | | | capital_run_length_total > 29 : 1 (2.0)
| | | | | | capital_run_length_longest > 5 :
| | | | | | | capital_run_length_average <= 1.329 : 1 (4.0)
| | | | | | | capital_run_length_average > 1.329 : 0 (9.0/1.0)

Subtree [S9]

word_freq_over > 0.17 : 0 (2.0)
word_freq_over <= 0.17 :
| word_freq_mail > 0.73 : 0 (2.0)
| word_freq_mail <= 0.73 :
| | word_freq_order > 0.23 : 1 (2.0)
| | word_freq_order <= 0.23 :
| | | char_freq_# > 0.011 : 0 (2.0)
| | | char_freq_# <= 0.011 :
| | | | word_freq_make > 0.21 : 1 (2.0)
| | | | word_freq_make <= 0.21 :
| | | | | word_freq_conference > 0.41 : 0 (2.0)
| | | | | word_freq_conference <= 0.41 :
| | | | | | capital_run_length_longest <= 11 : 1 (3.0)
| | | | | | capital_run_length_longest > 11 :
| | | | | | | word_freq_you > 3.85 : 1 (2.0)
| | | | | | | word_freq_you <= 3.85 :
| | | | | | | | word_freq_pm > 0.32 : 1 (2.0/1.0)
| | | | | | | | word_freq_pm <= 0.32 :
| | | | | | | | | char_freq_$ > 0.018 : 0 (2.0)
| | | | | | | | | char_freq_$ <= 0.018 :
| | | | | | | | | | char_freq_; > 0.015 : 0 (4.0)
| | | | | | | | | | char_freq_; <= 0.015 :
| | | | | | | | | | | char_freq_( <= 0.02 : 0 (9.0)
| | | | | | | | | | | char_freq_( > 0.02 : 1 (3.0/1.0)

Subtree [S10]

word_freq_you <= 3.46 : 1 (6.0/1.0)
word_freq_you > 3.46 : 0 (2.0)

Subtree [S11]

capital_run_length_total > 18 : 1 (2.0)
capital_run_length_total <= 18 :
| capital_run_length_average <= 1.464 : 0 (7.0)
| capital_run_length_average > 1.464 : 1 (3.0/1.0)


Simplified Decision Tree:

word_freq_remove <= 0 :
| word_freq_000 <= 0.25 :
| | word_freq_money <= 0.03 :
| | | word_freq_free <= 0.19 :
| | | | word_freq_credit <= 0.34 :
| | | | | char_freq_$ <= 0.104 :
| | | | | | word_freq_font <= 0.11 :
| | | | | | | char_freq_! <= 0.391 :
| | | | | | | | word_freq_george > 0 : 0 (597.0/1.4)
| | | | | | | | word_freq_george <= 0 :
| | | | | | | | | word_freq_receive <= 0.28 :
| | | | | | | | | | word_freq_hp > 0.11 : 0 (533.0/6.2)
| | | | | | | | | | word_freq_hp <= 0.11 :
| | | | | | | | | | | word_freq_650 <= 0.17 :[S1]
| | | | | | | | | | | word_freq_650 > 0.17 :
| | | | | | | | | | | | word_freq_make > 0.37 : 0 (2.0/1.0)
| | | | | | | | | | | | word_freq_make <= 0.37 :[S2]
| | | | | | | | | word_freq_receive > 0.28 :
| | | | | | | | | | word_freq_internet > 0.32 : 1 (5.0/1.2)
| | | | | | | | | | word_freq_internet <= 0.32 :
| | | | | | | | | | | word_freq_address > 0.43 : 1 (3.0/1.1)
| | | | | | | | | | | word_freq_address <= 0.43 :
| | | | | | | | | | | | word_freq_email <= 0.38 : 0 (26.0/2.6)
| | | | | | | | | | | | word_freq_email > 0.38 : 1 (3.0/2.1)
| | | | | | | char_freq_! > 0.391 :
| | | | | | | | word_freq_order > 0.74 : 1 (13.0/1.3)
| | | | | | | | word_freq_order <= 0.74 :
| | | | | | | | | capital_run_length_average <= 3.277 :
| | | | | | | | | | word_freq_business > 0.19 : 1 (13.0/2.5)
| | | | | | | | | | word_freq_business <= 0.19 :
| | | | | | | | | | | char_freq_! <= 0.979 :
| | | | | | | | | | | | word_freq_people <= 0.72 : 0 (93.0/6.2)
| | | | | | | | | | | | word_freq_people > 0.72 :[S3]
| | | | | | | | | | | char_freq_! > 0.979 :
| | | | | | | | | | | | word_freq_over > 0.72 : 1 (2.0/1.0)
| | | | | | | | | | | | word_freq_over <= 0.72 :[S4]
| | | | | | | | | capital_run_length_average > 3.277 :
| | | | | | | | | | word_freq_edu > 0.6 : 0 (2.0/1.0)
| | | | | | | | | | word_freq_edu <= 0.6 :[S5]
| | | | | | word_freq_font > 0.11 :
| | | | | | | char_freq_; <= 0.023 : 1 (19.0/2.5)
| | | | | | | char_freq_; > 0.023 : 0 (18.0/2.5)
| | | | | char_freq_$ > 0.104 :
| | | | | | word_freq_hp > 0.14 : 0 (27.0/1.4)
| | | | | | word_freq_hp <= 0.14 :
| | | | | | | capital_run_length_average <= 1.657 : 0 (13.0/1.3)
| | | | | | | capital_run_length_average > 1.657 :
| | | | | | | | word_freq_edu > 0.44 : 0 (8.0/1.3)
| | | | | | | | word_freq_edu <= 0.44 :
| | | | | | | | | capital_run_length_average > 2.291 : 1 (58.0/3.8)
| | | | | | | | | capital_run_length_average <= 2.291 :
| | | | | | | | | | word_freq_you <= 1.91 : 0 (6.0/2.3)
| | | | | | | | | | word_freq_you > 1.91 : 1 (10.0/1.3)
| | | | word_freq_credit > 0.34 :
| | | | | capital_run_length_average > 3.391 : 1 (18.0/1.3)
| | | | | capital_run_length_average <= 3.391 :
| | | | | | word_freq_re <= 0.8 : 0 (10.0/2.4)
| | | | | | word_freq_re > 0.8 : 1 (2.0/1.0)
| | | word_freq_free > 0.19 :
| | | | word_freq_george > 0.12 : 0 (28.0/1.4)
| | | | word_freq_george <= 0.12 :
| | | | | word_freq_edu <= 0.1 :
| | | | | | word_freq_hp <= 0.23 :
| | | | | | | char_freq_! > 0.567 : 1 (106.0/3.8)
| | | | | | | char_freq_! <= 0.567 :
| | | | | | | | word_freq_people > 0.19 : 0 (9.0/2.4)
| | | | | | | | word_freq_people <= 0.19 :
| | | | | | | | | word_freq_meeting <= 0.09 :
| | | | | | | | | | word_freq_business > 0.09 : 1 (23.0/1.3)
| | | | | | | | | | word_freq_business <= 0.09 :
| | | | | | | | | | | word_freq_font > 0.8 : 1 (8.0/1.3)
| | | | | | | | | | | word_freq_font <= 0.8 :
| | | | | | | | | | | | char_freq_# > 0.02 : 0 (9.0/2.4)
| | | | | | | | | | | | char_freq_# <= 0.02 :
| | | | | | | | | | | | | char_freq_; > 0.022 : 0 (6.0/2.3)
| | | | | | | | | | | | | char_freq_; <= 0.022 :
| | | | | | | | | | | | | | word_freq_pm <= 0.09 :[S6]
| | | | | | | | | | | | | | word_freq_pm > 0.09 :[S7]
| | | | | | | | | word_freq_meeting > 0.09 :
| | | | | | | | | | word_freq_mail <= 0.9 : 0 (8.0/1.3)
| | | | | | | | | | word_freq_mail > 0.9 : 1 (2.0/1.0)
| | | | | | word_freq_hp > 0.23 :
| | | | | | | char_freq_! <= 0.196 : 0 (26.0/2.6)
| | | | | | | char_freq_! > 0.196 : 1 (4.0/2.2)
| | | | | word_freq_edu > 0.1 :
| | | | | | word_freq_650 <= 0.21 : 0 (36.0/2.6)
| | | | | | word_freq_650 > 0.21 : 1 (3.0/2.1)
| | word_freq_money > 0.03 :
| | | word_freq_hp > 0.11 : 0 (16.0/1.3)
| | | word_freq_hp <= 0.11 :
| | | | word_freq_edu > 0.18 : 0 (13.0/2.5)
| | | | word_freq_edu <= 0.18 :
| | | | | capital_run_length_longest <= 9 :
| | | | | | word_freq_free > 2.17 : 1 (2.0/1.0)
| | | | | | word_freq_free <= 2.17 :
| | | | | | | word_freq_money <= 2.45 : 0 (9.0/1.3)
| | | | | | | word_freq_money > 2.45 : 1 (3.0/2.1)
| | | | | capital_run_length_longest > 9 :
| | | | | | word_freq_re <= 0.45 : 1 (189.0/6.2)
| | | | | | word_freq_re > 0.45 :
| | | | | | | word_freq_your <= 1.06 : 0 (5.0/2.3)
| | | | | | | word_freq_your > 1.06 : 1 (6.0/1.2)
| word_freq_000 > 0.25 :
| | word_freq_1999 <= 0 :
| | | capital_run_length_longest > 10 : 1 (208.0/6.2)
| | | capital_run_length_longest <= 10 :
| | | | word_freq_re > 0.45 : 0 (2.0/1.0)
| | | | word_freq_re <= 0.45 :
| | | | | word_freq_address > 0.19 : 1 (2.0/1.0)
| | | | | word_freq_address <= 0.19 :
| | | | | | capital_run_length_longest > 8 : 1 (6.0/1.2)
| | | | | | capital_run_length_longest <= 8 :
| | | | | | | word_freq_all <= 0.68 : 0 (3.0/1.1)
| | | | | | | word_freq_all > 0.68 : 1 (3.0/1.1)
| | word_freq_1999 > 0 :
| | | word_freq_over > 0.06 : 1 (11.0/1.3)
| | | word_freq_over <= 0.06 :
| | | | capital_run_length_longest <= 48 : 0 (8.0/1.3)
| | | | capital_run_length_longest > 48 : 1 (3.0/2.1)
word_freq_remove > 0 :
| word_freq_hp <= 0.19 :
| | word_freq_edu <= 0.08 :
| | | word_freq_1999 <= 0.25 : 1 (640.0/19.5)
| | | word_freq_1999 > 0.25 :
| | | | word_freq_george <= 0.08 : 1 (28.0/1.4)
| | | | word_freq_george > 0.08 : 0 (3.0/1.1)
| | word_freq_edu > 0.08 :
| | | word_freq_money <= 0.04 : 0 (6.0/1.2)
| | | word_freq_money > 0.04 : 1 (19.0/1.3)
| word_freq_hp > 0.19 :
| | word_freq_our <= 0.3 : 0 (15.0/1.3)
| | word_freq_our > 0.3 :
| | | capital_run_length_average <= 2.689 : 0 (3.0/2.1)
| | | capital_run_length_average > 2.689 : 1 (8.0/1.3)

Subtree [S1]

capital_run_length_average <= 3.621 :
| word_freq_business <= 0.13 :
| | word_freq_internet <= 0.05 :
| | | word_freq_re > 0.77 : 0 (171.0/1.4)
| | | word_freq_re <= 0.77 :
| | | | capital_run_length_longest <= 10 : 0 (529.0/24.9)
| | | | capital_run_length_longest > 10 :
| | | | | capital_run_length_total > 36 : 0 (155.0/18.3)
| | | | | capital_run_length_total <= 36 :
| | | | | | word_freq_meeting > 0.38 : 0 (4.0/1.2)
| | | | | | word_freq_meeting <= 0.38 :
| | | | | | | word_freq_you > 2.86 : 1 (8.0/1.3)
| | | | | | | word_freq_you <= 2.86 :
| | | | | | | | capital_run_length_total <= 31 : 0 (3.0/1.1)
| | | | | | | | capital_run_length_total > 31 : 1 (2.0/1.0)
| | word_freq_internet > 0.05 :
| | | word_freq_our > 1.34 : 1 (3.0/1.1)
| | | word_freq_our <= 1.34 :
| | | | word_freq_your <= 0.76 : 0 (18.0/1.3)
| | | | word_freq_your > 0.76 :
| | | | | word_freq_address <= 0.49 : 1 (2.0/1.0)
| | | | | word_freq_address > 0.49 : 0 (2.0/1.0)
| word_freq_business > 0.13 :
| | word_freq_make > 0.21 : 1 (4.0/1.2)
| | word_freq_make <= 0.21 :
| | | char_freq_; > 0.038 : 1 (2.0/1.0)
| | | char_freq_; <= 0.038 :
| | | | char_freq_! <= 0.049 : 0 (17.0/2.5)
| | | | char_freq_! > 0.049 :
| | | | | word_freq_re <= 0.3 : 1 (4.0/1.2)
| | | | | word_freq_re > 0.3 : 0 (2.0/1.0)
capital_run_length_average > 3.621 :
| word_freq_3d > 0.21 : 1 (3.0/1.1)
| word_freq_3d <= 0.21 :
| | char_freq_$ > 0.008 : 1 (3.0/1.1)
| | char_freq_$ <= 0.008 :
| | | word_freq_our > 0.76 : 1 (6.0/1.2)
| | | word_freq_our <= 0.76 :
| | | | char_freq_; > 0.055 : 1 (4.0/2.2)
| | | | char_freq_; <= 0.055 :
| | | | | word_freq_your <= 3.61 : 0 (42.0/8.3)
| | | | | word_freq_your > 3.61 : 1 (2.0/1.0)

Subtree [S2]

word_freq_1999 > 0.14 : 0 (2.0/1.0)
word_freq_1999 <= 0.14 :
| word_freq_internet > 0.15 : 0 (2.0/1.0)
| word_freq_internet <= 0.15 :
| | capital_run_length_longest > 19 : 1 (14.0/1.3)
| | capital_run_length_longest <= 19 :
| | | word_freq_our <= 0.36 : 0 (6.0/1.2)
| | | word_freq_our > 0.36 : 1 (2.0/1.0)

Subtree [S3]

capital_run_length_average <= 2.567 : 0 (3.0/1.1)
capital_run_length_average > 2.567 : 1 (2.0/1.0)

Subtree [S4]

word_freq_our > 0.83 : 1 (2.0/1.0)
word_freq_our <= 0.83 :
| word_freq_you <= 2.32 :
| | capital_run_length_total <= 2 : 1 (2.0/1.0)
| | capital_run_length_total > 2 : 0 (20.0/2.5)
| word_freq_you > 2.32 :
| | word_freq_re <= 0.34 : 1 (7.0/2.4)
| | word_freq_re > 0.34 : 0 (5.0/1.2)

Subtree [S5]

capital_run_length_total > 103 : 1 (20.0/1.3)
capital_run_length_total <= 103 :
| capital_run_length_longest <= 26 : 1 (9.0/1.3)
| capital_run_length_longest > 26 : 0 (5.0/2.3)

Subtree [S6]

word_freq_our > 1.04 : 1 (22.0/2.5)
word_freq_our <= 1.04 :
| word_freq_mail > 0.53 : 0 (4.0/1.2)
| word_freq_mail <= 0.53 :
| | word_freq_mail > 0.15 : 1 (9.0/1.3)
| | word_freq_mail <= 0.15 :
| | | word_freq_internet > 0.45 : 1 (5.0/1.2)
| | | word_freq_internet <= 0.45 :
| | | | word_freq_order > 0.33 : 0 (2.0/1.0)
| | | | word_freq_order <= 0.33 :
| | | | | char_freq_! > 0.107 : 1 (28.0/4.9)
| | | | | char_freq_! <= 0.107 :
| | | | | | word_freq_receive > 0.4 : 1 (2.0/1.0)
| | | | | | word_freq_receive <= 0.4 :
| | | | | | | word_freq_email > 0.54 : 0 (4.0/1.2)
| | | | | | | word_freq_email <= 0.54 :
| | | | | | | | char_freq_( > 0.028 : 0 (2.0/1.0)
| | | | | | | | char_freq_( <= 0.028 :
| | | | | | | | | word_freq_all > 0.65 : 0 (2.0/1.0)
| | | | | | | | | word_freq_all <= 0.65 :
| | | | | | | | | | word_freq_you <= 3.46 : 1 (6.0/2.3)
| | | | | | | | | | word_freq_you > 3.46 : 0 (3.0/2.1)

Subtree [S7]

word_freq_our <= 0.09 : 0 (3.0/1.1)
word_freq_our > 0.09 : 1 (2.0/1.0)


Tree saved


Evaluation on training data (4141 items):

     Before Pruning           After Pruning
    ----------------   ---------------------------
    Size      Errors   Size      Errors   Estimate

     345   74( 1.8%)    223  108( 2.6%)    ( 6.2%)   <<
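
Note on the evaluation table: the percentages in the Errors columns appear to be the error counts divided by the 4141 training items. The sketch below, which is not part of the C4.5 output, rechecks that arithmetic; the function name `error_rate_percent` is hypothetical and the counts are copied from the table.

```python
# Hypothetical helper (not part of C4.5): recompute the training error
# percentages from the counts printed in the evaluation table.

def error_rate_percent(errors: int, items: int) -> float:
    """Return misclassified items as a percentage of all items, to one decimal."""
    return round(100.0 * errors / items, 1)


ITEMS = 4141  # cases read from sb2.data.0.data, per the log header

before = error_rate_percent(74, ITEMS)   # unpruned tree, size 345
after = error_rate_percent(108, ITEMS)   # pruned tree, size 223

print(f"before pruning: {before}%")
print(f"after pruning:  {after}%")
```

The 6.2% Estimate column is C4.5's pessimistic error estimate for the pruned tree and is not reproduced by this calculation.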