Lorenzo Gangi
lg149738
CS557 Project 1

Languages used: DHTML, PHP

Running the program:
    Go to http://lorenzogangi.com/cs557/index.php, upload a CSV file, and hit the submit button.

Pseudo Algorithm

1 Upload the CSV
    parse it
    save the data in corresponding data structures
    (this is where my problems started: I should have loaded the data into a database table;
     instead I put it in a matrix, which has no record keys, making subsets hard to retrieve and process)

2 Calculate the initial information gain
    calculate the entropy of the target attribute
    calculate the entropy of all sub attributes

3 Get the attribute with the highest info gain
    calculate info gain for each attribute
    sort them in reverse order (on a stack)
    pop them off as needed

4 Build the decision tree
    build a name/value associative array of info gains where the array keys are attribute names
    get the attribute info gains and sort them, highest at index 0, etc.
    get the attribute with the highest info gain
    build attribute node
    -----------------------------------------------------------------------------------------------------
        push the attribute node on the stack with title and tree level info
        make the branches
            get attribute possible values
            get subset data for possible values
            for each possible value
                build possible value node (branch)
                ------------------------------------------------------------------------------------
                    check stop conditions
                        if values are all equal (all yes or all no), you're done: assign the yes or no value
                        if subset is empty, you're done: assign a yes or no value
                        else calculate subset info gain
                -----------------------------------------------------------------------------------
                call build attribute
                //remove parent attribute
    -----------------------------------------------------------------------------------------------------
    //end build decision tree
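
The steps above can be sketched in a few dozen lines. What follows is a minimal illustration written in Python for brevity; it is not the project's PHP code. The function names (`entropy`, `info_gain`, `majority`, `build_tree`), the sample CSV, and the column names are all hypothetical. Two choices are assumptions that the pseudocode leaves open: an empty subset is assigned the majority label of its parent rows, and the recursion also stops when no attributes remain. Each row is kept as a dict keyed by column name, which is one way to avoid the missing-record-key problem mentioned in step 1.

```python
import csv
import io
import math
from collections import Counter


def entropy(rows, target):
    """Shannon entropy (in bits) of the target column over the given rows."""
    counts = Counter(row[target] for row in rows)
    total = len(rows)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def info_gain(rows, attribute, target):
    """Target entropy minus the weighted entropy after splitting on attribute."""
    total = len(rows)
    remainder = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row for row in rows if row[attribute] == value]
        remainder += (len(subset) / total) * entropy(subset, target)
    return entropy(rows, target) - remainder


def majority(rows, target):
    """Most common target label in rows (assumed tie-break: first encountered)."""
    return Counter(row[target] for row in rows).most_common(1)[0][0]


def build_tree(rows, attributes, target, domains):
    """Recursively build a nested dict: {attribute: {value: subtree_or_label}}."""
    labels = {row[target] for row in rows}
    if len(labels) == 1:            # stop: all values equal
        return labels.pop()
    if not attributes:              # assumed extra stop: nothing left to split on
        return majority(rows, target)
    best = max(attributes, key=lambda a: info_gain(rows, a, target))
    remaining = [a for a in attributes if a != best]   # remove parent attribute
    branches = {}
    for value in domains[best]:     # one branch per possible value
        subset = [row for row in rows if row[best] == value]
        if not subset:              # stop: empty subset (assumed: parent majority)
            branches[value] = majority(rows, target)
        else:
            branches[value] = build_tree(subset, remaining, target, domains)
    return {best: branches}


# Hypothetical sample data standing in for the uploaded CSV file.
SAMPLE_CSV = """outlook,windy,play
sunny,false,no
sunny,true,no
overcast,false,yes
rain,false,yes
rain,true,no
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
target = "play"
attributes = [name for name in rows[0] if name != target]
domains = {a: {row[a] for row in rows} for a in attributes}

tree = build_tree(rows, attributes, target, domains)
print(tree)
```

On this sample, `outlook` has the highest information gain at the root, so it becomes the first attribute node; only the `rain` branch needs a further split on `windy`. Using `max` at each level replaces the sort-and-pop stack from step 3, since only the top-ranked attribute is needed per node.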