The SumParser
=============

The SumParser parses sum-lines and stores the bandwidth sums. Unlike the ``IperfParser``, it uses the summed lines that iperf itself calculates. This means that if reporting intervals and parallel threads were used, and within one or more of those intervals one or more of the threads didn't report its value, then there will be fewer interval-sums than there were intervals (iperf skips summing an interval in which not all the threads reported back). The ``IperfParser``, in contrast, will under-report such an interval (unless the thread was actually dead), but it will at least give you a value. On the other hand, if all the sum-lines are there, then this is presumably the more accurate method, since we're taking what iperf itself reports.

.. '

The HumanExpressionSum Class
----------------------------

.. uml::

   HumanExpression <|-- HumanExpressionSum

.. module:: iperflexer.sumparser
.. autosummary::
   :toctree: api

   HumanExpression
   HumanExpression.thread_column

CsvExpressionSum Class
----------------------

.. uml::

   CsvExpression <|-- CsvExpressionSum

.. autosummary::
   :toctree: api

   CsvExpressionSum
   CsvExpressionSum.thread_column

The SumParser
-------------

.. uml::

   IperfParser <|-- SumParser
   SumParser: __call__(line)
   SumParser: last_line_bandwidth

.. autosummary::
   :toctree: api

   SumParser
   SumParser.__call__
   SumParser.pipe

Using the SumParser
-------------------

Checking The Call Output
------------------------

The ``__call__`` is the main way to use it. There are two ways to get the interval sums from the SumParser (and the IperfParser). One is to poll the value returned by the ``__call__`` to see if a bandwidth came back. I'll start by working with :download:`client-side ` input that has two threads and one-second reporting intervals.
::

    if in_documentation:
        import os
        from iperflexer.sumparser import SumParser

        data_folder = 'tests/steps/samples/'
        data_path = os.path.join(data_folder, 'client_data.iperf')
        parser = SumParser(threads=2)
        with open(data_path) as lines:
            for line in lines:
                bandwidth = parser(line)
                # only lines that produce an interval sum return a value
                if bandwidth is not None:
                    print(bandwidth)

::

    96.5
    94.4
    94.4
    93.3
    93.3
    94.4
    94.4
    94.4
    92.3
    94.4

.. warning:: The returned value is a float, not a string, so it has to be converted to a string before it is saved (don't do ``bandwidth + '\n'``).

.. '

Traversing the Values
---------------------

The original way to use it is to add all the lines and traverse the bandwidths afterwards. For the ``IperfParser`` this might be the safer way to use it if the data is being fed to it live while iperf is running, since it's adding up the threads.

.. warning:: The way iperf seems to work is that if you are using multiple threads and one or more of the threads misses a reporting interval, it will report the thread information but not a summed-line. This means that the SumParser will have fewer data-points than the number of intervals that really exist (and the times will be shifted backwards). If you don't inspect the raw output before using the SumParser you could end up with incorrect data. If you need to work with the intervals, use the IperfParser.

::

    if in_documentation:
        parser.reset()
        with open(data_path) as lines:
            for line in lines:
                parser(line)
        for bandwidth in parser.bandwidths:
            print(bandwidth)

::

    96.5
    94.4
    94.4
    93.3
    93.3
    94.4
    94.4
    94.4
    92.3
    94.4

The Last Line Bandwidth
-----------------------

When the ``SumParser`` matches a sum-line whose interval is larger than the interval it is set to accept, it stores that line's bandwidth in its ``last_line_bandwidth`` attribute. Once the whole iperf output has been consumed, that attribute holds the final bandwidth value that iperf calculated for the entire session, assuming that the output is complete and this was the last line. If that line is missing, the attribute should be ``None``.
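To make the interval check concrete, here is a minimal sketch of the idea. It is not the SumParser's actual implementation: the regular expression, the ``classify_sum_line`` function and the ``max_interval`` parameter are all made up for illustration, and the first two sample lines are hypothetical (the third is the final ``[SUM]`` line shown in the Transfers section).

::

    import re

    # assumed layout of a sum-line: "[SUM] <start>-<end> sec <n> Bytes <n> bits/sec"
    SUM_LINE = re.compile(
        r"\[SUM\]\s*(?P<start>\d+\.\d+)\s*-\s*(?P<end>\d+\.\d+)\s+sec\s+"
        r"(?P<transfer>\d+)\s+Bytes\s+(?P<bandwidth>\d+)\s+bits/sec"
    )


    def classify_sum_line(line, max_interval=1.0):
        """Return (kind, bandwidth) for a sum-line, or None for any other line."""
        match = SUM_LINE.search(line)
        if match is None:
            return None
        duration = float(match.group("end")) - float(match.group("start"))
        bandwidth = float(match.group("bandwidth"))
        # a small tolerance guards against float round-off in the subtraction
        if duration <= max_interval + 1e-9:
            return "interval", bandwidth
        return "last_line", bandwidth


    lines = [
        "[  3]  0.0- 1.0 sec  6029312 Bytes  48234496 bits/sec",    # hypothetical thread line
        "[SUM]  0.0- 1.0 sec  12058624 Bytes  96468992 bits/sec",   # hypothetical interval sum
        "[SUM]  0.0-10.2 sec  119537664 Bytes  94015278 bits/sec",  # final line from the Transfers section
    ]
    results = [classify_sum_line(line) for line in lines]

The thread line is ignored, the one-second sum counts as an interval, and the 10.2-second sum is treated as the last line because it is longer than the accepted interval.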
Based on some empirical checking and some threads on the iperf discussion boards, it looks like ``last_line_bandwidth`` is the most accurate value if there is a discrepancy between it and the added interval sums.

A Comparison to the Sums
~~~~~~~~~~~~~~~~~~~~~~~~

Here I'll compare what happens when you add the sum-lines up and take the mean versus using the ``last_line_bandwidth`` (iperf's calculated rate). ``parser.bandwidths`` is a generator of interval bandwidths and ``parser.intervals`` is a dictionary that maps ``interval:bandwidth``. Since the ``bandwidths`` attribute is a generator I can't take its length, so I'm using the length of the ``intervals`` instead.

.. '

::

    if in_documentation:
        parser.reset()
        parser.threads = 4
        with open(data_path) as lines:
            for line in lines:
                parser(line)
        # bandwidths is a generator, so count the intervals instead
        calculated_average = sum(parser.bandwidths)/len(parser.intervals)

Now the outcome.

.. csv-table:: Calculated Sums-mean vs Iperf's Mean
   :header: Source, Bandwidth (Mbits/Second)

   Sum Lines, 94.18
   Iperf, 94.1

.. '

So the re-calculated mean is higher. I don't really know what this means. My guess would be that this is a loss of precision from converting everything into Mbits/second. Let's try an iperf file that used bits as the units.

Bit Sums
~~~~~~~~

First I'll set up the IperfParser and SumParser to convert to bits (which means no conversion in this case, since the source file was in bits). I'll also import the UnitConverter, a dict that has a sub-dict that returns the conversion factor when converting from one unit to another (it takes the form ``unitconverter[from_units][to_units] = conversion_factor``). The file that's going to be checked is :download:`tests/steps/samples/client_p4_bits.iperf ` which is the client-side output (the transmitter) when run with four parallel threads and the output format in bits.

.. '

::

    if in_documentation:
        # set up the unitconverter
        from unitconverter import UnitConverter
        from unitconverter import UnitNames
        from unitconverter import BinaryUnitNames as b_names
        from unitconverter import BinaryUnitconverter

        converter = UnitConverter()
        b_converter = BinaryUnitconverter()

        data_path = os.path.join(data_folder, 'client_p4_bits.iperf')

        # rename the sum-parser used earlier to make it clearer
        sum_parser = parser

        # set up the parsers to use bits
        voodoo = IperfParser(units=UnitNames.bits, threads=4)
        sum_parser.reset()
        sum_parser.units = UnitNames.bits
        sum_parser.threads = 4

        # load them up with the raw lines
        with open(data_path) as lines:
            for line in lines:
                sum_parser(line)
                voodoo(line)

Now we add the interval bandwidths together, convert the total from bits to Mbits and then take the mean.

::

    if in_documentation:
        # convert the sums to Mbits and take the average
        total_bandwidth = sum(sum_parser.bandwidths) * converter[UnitNames.bits][UnitNames.mbits]
        calculated_average = total_bandwidth/len(sum_parser.intervals)

        # same for the re-added threads
        v_total = sum(voodoo.bandwidths) * converter['bits']['Mbits']
        v_average = v_total/len(voodoo.intervals)

        # now iperf's
        iperf_mean = sum_parser.last_line_bandwidth * converter['bits']['Mbits']

And here's what we get.

.. '

.. csv-table:: Bandwidth Comparison
   :header: Source, Mean Bandwidth (Mbits/Second)

   Iperf, 93.592467
   Sum-Lines, 93.9524096
   Threads, 93.9524096

So in this case, since there were no threads with missing intervals, the SumParser and the IperfParser came up with the same value, but both were higher than iperf's calculated final value. It appears that there's more going on than just a round-off error.

Transfers
---------

I think that there are multiple things going on. One is that I'm assuming that each interval is exactly 1 second, but that's not necessarily the case. Also, the last transfer isn't included in the interval reports, just in the final report.
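Here's a minimal sketch of how those two effects would pull the numbers apart. The interval values, the leftover transfer and the elapsed time are all made up (they aren't taken from any of the sample files), and the function names are only for illustration.

::

    def mean_of_interval_rates(rates):
        """Mean of the per-interval rates, which treats every interval as equally long."""
        return sum(rates) / len(rates)


    def overall_rate(interval_bits, leftover_bits, elapsed_seconds):
        """All the bits (interval reports plus the final leftover) over the real elapsed time."""
        return (sum(interval_bits) + leftover_bits) / elapsed_seconds


    # hypothetical: ten one-second reports of 94 Mbits each
    # (with one-second intervals the bits per report and the rate are the same number)
    interval_bits = [94e6] * 10

    interval_mean = mean_of_interval_rates(interval_bits)
    # hypothetical: 8 Mbits sent after the last interval report, 10.2 seconds in total
    final_rate = overall_rate(interval_bits, leftover_bits=8e6, elapsed_seconds=10.2)

    print(interval_mean / 1e6, final_rate / 1e6)

With these made-up numbers the interval mean comes out higher than the overall rate, which is the same direction as the discrepancy in the tables above.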
I'll try a file with bits again, but this time I specified two threads and a buffer of 512 kilobytes.

::

    if in_documentation:
        voodoo = IperfParser(units=UnitNames.bits, threads=2)
        sum_parser = SumParser(threads=2, units=UnitNames.bits)
        filename = os.path.join(data_folder, 'tartarus_p2_bits_halfM.iperf')
        with open(filename) as reader:
            for line in reader:
                voodoo(line)
                sum_parser(line)
        # show the last line of the file
        print(line)

::

    [SUM]  0.0-10.2 sec  119537664 Bytes  94015278 bits/sec

Looking at the last line of the output, you can see that it actually ran for a reported 10.2 seconds (or at least one of the threads did). We'll try the re-calculation on the transfers.

.. '

::

    if in_documentation:
        # conversion factor from bytes to mebibytes
        mbytes = b_converter[b_names.bytes][b_names.mebibytes]
        recalculated_transfer = sum(voodoo.transfers)
        recalculated_transfer_mbytes = recalculated_transfer * mbytes
        iperfs_transfer = sum_parser.last_line_transfer
        iperfs_transfer_mbytes = iperfs_transfer * mbytes

.. csv-table:: Data Transferred
   :header: Source, Transfer (MBytes)

   Re-Calculated,113.0
   Iperf's Transfer,114.0

So the re-added transfer is still missing data. The most likely reason is that the last data-transfer isn't added to the last interval but is added to the final tally instead. Each thread adds one buffer's worth of data to the final tally, so in this case (two threads with 512-kilobyte buffers) it should be 1 Megabyte short, which is what we see. We can double-check.

::

    if in_documentation:
        # the number of bytes in the missing mebibyte
        missing = b_converter[b_names.mebibytes][b_names.bytes]
        recalculated_transfer += missing
        recalculated_transfer_mbytes = recalculated_transfer * mbytes

.. csv-table:: Re-added Data Transferred
   :header: Source, Transfer (MBytes)

   Re-Calculated,114.0
   Iperf's Transfer,114.0

Now we can re-try the bandwidth, remembering that it took 10.2 seconds to finish.
::

    if in_documentation:
        m_bits = converter[UnitNames.bits][UnitNames.mbits]
        # convert the re-added transfer from bytes to bits
        recalculated_bits = recalculated_transfer * b_converter[b_names.bytes][b_names.bits]
        recalculated_bandwidth_mbits = (recalculated_bits/10.2) * m_bits
        iperfs_bandwidth = sum_parser.last_line_bandwidth * m_bits

.. csv-table:: Bandwidths
   :header: Source, Bandwidth (Mbits/Second)

   Re-Calculated,93.76
   Iperf,94.02

So it still doesn't capture the full bandwidth. The 10.2 seconds that iperf prints is rounded, but we can get the actual time with a little algebra.

.. '

.. math::

   bandwidth &= \frac{bits}{seconds}\\
   seconds &= \frac{bits}{bandwidth}\\

::

    if in_documentation:
        transfer = sum_parser.last_line_transfer * b_converter[b_names.bytes][b_names.bits]
        seconds = transfer/float(sum_parser.last_line_bandwidth)
        print(seconds)

::

    10.1717649763

Once more with feeling.

::

    if in_documentation:
        recalculated_bandwidth_mbits = (recalculated_bits/seconds) * m_bits

.. csv-table:: Final Bandwidths
   :header: Source, Bandwidth (Mbits/Second)

   Re-Calculated,94.02
   Iperf,94.02

So there you have it.
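As a cross-check, the arithmetic in this section can be replayed without the parsers. This is only a sketch: the two numbers are copied from the final ``[SUM]`` line printed earlier, and I'm assuming that the unit-converter look-ups amount to 8 bits per byte and 1,000,000 bits per Mbit (the constant names are mine, not the library's).

::

    # values copied from the final [SUM] line of tartarus_p2_bits_halfM.iperf
    TRANSFER_BYTES = 119537664
    BANDWIDTH_BITS_PER_SECOND = 94015278

    # assumed conversion factors (stand-ins for the unitconverter look-ups)
    BITS_PER_BYTE = 8
    BITS_PER_MBIT = 10**6

    transfer_bits = TRANSFER_BYTES * BITS_PER_BYTE

    # seconds = bits / bandwidth
    seconds = transfer_bits / float(BANDWIDTH_BITS_PER_SECOND)

    # dividing by the rounded 10.2 seconds that iperf printed under-estimates the rate
    rounded_mbits = transfer_bits / 10.2 / BITS_PER_MBIT
    # dividing by the back-calculated time recovers the rate iperf reported
    exact_mbits = transfer_bits / seconds / BITS_PER_MBIT

    print("{0:.4f} seconds".format(seconds))
    print("{0:.2f} vs {1:.2f} Mbits/second".format(rounded_mbits, exact_mbits))

Under those assumptions the two rates round to the 93.76 and 94.02 Mbits/second in the tables above, so the remaining gap comes from the rounded 10.2 seconds rather than from any missing data.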