When the data contains the observations (x_1,y_1),(x_2,y_2),\dots,(x_n,y_n) and the line y=ax+b is fitted to the data, the error can be computed with the sum of squares formula \sum_{i=1}^{n}(y_i-(ax_i+b))^2. For example, when the data is (1,1),(3,2),(5,3) and the line is y=x-1 (i.e., a=1 and b=-1), the error is (1-(1-1))^2+(2-(3-1))^2+(3-(5-1))^2=2.
In a file squaresum.py
, implement the class DataAnalyzer
with the methods
add_point(x, y)
: add an observation to the datacalculate_error(a, b)
: return the sum of squares error for the given line parameters
The time complexity of both methods should be O(1).
class DataAnalyzer: def __init__(self): # TODO def add_point(self, x, y): # TODO def calculate_error(self, a, b): # TODO if __name__ == "__main__": analyzer = DataAnalyzer() analyzer.add_point(1, 1) analyzer.add_point(3, 2) analyzer.add_point(5, 3) print(analyzer.calculate_error(1, 0)) # 5 print(analyzer.calculate_error(1, -1)) # 2 print(analyzer.calculate_error(3, 2)) # 293 analyzer.add_point(4, 2) print(analyzer.calculate_error(1, 0)) # 9 print(analyzer.calculate_error(1, -1)) # 3 print(analyzer.calculate_error(3, 2)) # 437
You can test the efficiency of your solution with the following code. In this case too the code should finish almost immediately.
analyzer = DataAnalyzer() total = 0 for i in range(10**5): analyzer.add_point(i, i % 100) total += analyzer.calculate_error(i % 97, i % 101) print(total) # 25745448974503313754828