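Infinite loop when appending a row to a list in a class in Python 3

I have a script that contains two classes. (I have removed a lot of material that I don't think is related to the error I'm dealing with.) The ultimate task is to build a decision tree, as I mentioned in this question.
Unfortunately, I'm getting an infinite loop and am having trouble working out why. I have identified the line of code that goes haywire, but I would have thought that the iterator and the list I'm appending to were different objects. Is there some side effect of a list's .append that I'm not aware of? Or am I making some other blindingly obvious mistake?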
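The question about `.append` can be checked in isolation. The following is a minimal sketch, not part of the script in the question: `append_while_iterating`, `max_steps`, and the sample lists are hypothetical names made up for illustration. It suggests that `.append` has no hidden side effect, but that a `for` loop over a list never runs out of items if that same list grows on every pass.

```python
def append_while_iterating(rows, target, max_steps=10):
    """Append every item of rows to target; stop after max_steps passes.

    max_steps is a safety cap (an assumption of this demo) so that the
    aliased case below terminates instead of looping forever.
    """
    steps = 0
    for row in rows:
        target.append(row)
        steps += 1
        if steps >= max_steps:
            break
    return steps


# Case 1: the loop source and the append target are different lists.
source = [1, 2, 3]
separate = []
steps_separate = append_while_iterating(source, separate)

# Case 2: both names refer to the same list object, so every append
# gives the loop one more item to visit and only the cap stops it.
aliased = [1, 2, 3]
steps_aliased = append_while_iterating(aliased, aliased)

print(steps_separate, steps_aliased, len(aliased))
```

If the second case matches what the script does, the thing to look for is two attributes that are the same list object, which `a is b` will confirm.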
class Dataset:
    individuals = []  # Becomes a list of dictionaries, in which each dictionary is a row from the CSV with the headers as keys
    def field_set(self):
        """Returns a list of the fields in individuals[] that can be used to split the data (i.e. have more than one value amongst the individuals)."""
    def classified(self, predicted_value):
        """Returns True if all the individuals have the same value for predicted_value."""
    def fields_exhausted(self, predicted_value):
        """Returns True if all the individuals are identical except for predicted_value."""
    def lowest_entropy_value(self, predicted_value):
        """Returns the field that will reduce entropy (http://en.wikipedia.org/wiki/Entropy_%28information_theory%29) the most."""
    def __init__(self, individuals=[]):
        ...  # body omitted in the question
and
class Node:
    ds = Dataset()  # The data that is associated with this Node
    links = []  # List of Nodes, the offspring Nodes of this node
    level = 0  # Tree depth of this Node
    split_value = ''  # Field used to split out this Node from the parent node
    node_value = ''  # Value used to split out this Node from the parent Node
    def split_dataset(self, split_value):
        """Splits the dataset into a series of smaller datasets, each of which has a unique value for split_value. Then creates subnodes to store these datasets."""
        fields = []  # List of options for split_value amongst the individuals
        datasets = {}  # Dictionary of Datasets, each one with a value from fields[] as its key
        for field in self.ds.field_set()[split_value]:  # Populates the keys of fields[]
            fields.append(field)
            datasets[field] = Dataset()
        for i in self.ds.individuals:  # Adds individuals to the datasets.dataset that matches their result for split_value
            datasets[i[split_value]].individuals.append(i)  # <---Causes an infinite loop on the second hit
        for field in fields:  # Creates subnodes from each of the datasets.Dataset options
            self.add_subnode(datasets[field], split_value, field)
    def add_subnode(self, dataset, split_value='', node_value=''):
        ...  # body omitted in the question
    def __init__(self, level, dataset=Dataset()):
        ...  # body omitted in the question
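One thing worth checking in both classes is where the lists actually come from. Python evaluates a default argument such as `individuals=[]` or `dataset=Dataset()` once, when the `def` statement runs, and a class-level `individuals = []` is a single list shared by every instance that never assigns its own. The sketch below assumes that `__init__` stores its argument with `self.individuals = individuals`; that body is not shown in the question, so this is a guess, and `SharedDataset` and `FreshDataset` are hypothetical names used only here.

```python
class SharedDataset:
    def __init__(self, individuals=[]):   # the default list is created only once
        # ASSUMPTION: the real __init__ stores the argument like this.
        self.individuals = individuals


class FreshDataset:
    def __init__(self, individuals=None):  # None as a sentinel instead of []
        # Build a new list per instance, copying any rows passed in.
        self.individuals = [] if individuals is None else list(individuals)


# Two "empty" SharedDataset objects end up holding the very same list.
a, b = SharedDataset(), SharedDataset()
a.individuals.append({"x": 1})
shared_is_same = a.individuals is b.individuals

# Two FreshDataset objects each get their own list.
c, d = FreshDataset(), FreshDataset()
c.individuals.append({"x": 1})
fresh_is_same = c.individuals is d.individuals

print(shared_is_same, b.individuals, fresh_is_same, d.individuals)
```

If the real classes behave like `SharedDataset`, then on the second split `self.ds.individuals` and `datasets[...].individuals` could be one and the same list, which would fit the symptom of the loop only hanging on the second hit. Printing `datasets[key].individuals is self.ds.individuals` inside `split_dataset` would confirm or rule this out.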
My initialization code is currently:
import sys

if __name__ == '__main__':
    filename = sys.argv[1]  # Takes in a CSV file
    predicted_value = "# class"  # Identifies the field from the CSV file that should be predicted
    base_dataset = parse_csv(filename)  # Turns the CSV file into a list of lists
    parsed_dataset = individual_list(base_dataset)  # Turns the list of lists into a list of dictionaries
    root = Node(0, Dataset(parsed_dataset))  # Creates a root node, passing it the full dataset
    root.split_dataset(root.ds.lowest_entropy_value(predicted_value))  # Performs the first split, creating multiple subnodes
    n = root.links[0]
    n.split_dataset(n.ds.lowest_entropy_value(predicted_value))  # Attempts to split the first subnode.
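Since the hang only shows up on the second split, it may help to test the splitting logic on its own, without the classes. The following is a minimal sketch using plain lists and dictionaries; `split_rows` is a hypothetical helper and the sample rows are made up, so it illustrates the intended grouping rather than the script's actual behaviour.

```python
from collections import defaultdict


def split_rows(rows, split_value):
    """Group row dictionaries by the value they hold for split_value."""
    groups = defaultdict(list)  # every new key gets its own fresh list
    for row in rows:
        groups[row[split_value]].append(row)
    return dict(groups)


# Made-up sample data in the same shape as the parsed CSV rows.
rows = [
    {"color": "red", "# class": "a"},
    {"color": "blue", "# class": "b"},
    {"color": "red", "# class": "b"},
]

# First split on one field, then split one of the resulting groups again.
by_color = split_rows(rows, "color")
red_by_class = split_rows(by_color["red"], "# class")

print(sorted(by_color), sorted(red_by_class))
```

Because each group is a list created inside the function, the second-level split iterates over one list and appends to different ones, so it finishes. If this version terminates and the class-based one does not, the difference most likely lies in how the `Dataset` objects get their `individuals` lists.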
+1 Good answer.