Banktransfer: Handle trailing commas in headers for Lloyds Bank CSV files (#4782)

Lloyds Bank (UK) CSV files include a trailing comma in the header row
but not in the data rows, causing the `csvimport.parse` function to
skip every data row. The trailing comma makes the header one column
longer than the data rows, so the length of each data row never equals
`hint.cols`.
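
A minimal sketch of the mismatch, using Python's standard `csv` module. The
column names and values are made up for illustration and are not taken from a
real Lloyds export; only the shape (trailing comma in the header, none in the
data row) follows the description above.

```python
import csv
import io

# Hypothetical sample: the header ends with a trailing comma, the data row
# does not. Column names and values are invented for this illustration.
sample = "Date,Description,Amount,\n01/02/2025,Ticket order,12.50\n"

rows = list(csv.reader(io.StringIO(sample)))
header, first_row = rows[0], rows[1]

# The trailing comma becomes an extra, empty header cell.
print(header)                        # ['Date', 'Description', 'Amount', '']
print(len(header), len(first_row))   # 4 3
```

A strict equality check against the header-derived column count would
therefore reject the data row.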

This commit relaxes the length check to accept a range of row lengths,
from `hint.cols` minus the number of trailing empty header columns up to
`hint.cols`. This ensures compatibility with headers containing one or
more trailing commas without changing how rows are handled when every
header column is labelled.
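
The accepted range can be sketched as a standalone function. The helper name
`accepted_lengths` is hypothetical and exists only for this illustration, and
the example assumes the column count equals the header length.

```python
def accepted_lengths(header, cols):
    """Return the row lengths accepted for a header with trailing empty cells."""
    lengths = [cols]
    # Walk the header from the right; each trailing empty cell permits
    # rows that are one column shorter. Stop at the first labelled cell.
    for i in range(len(header) - 1, 0, -1):
        if header[i]:
            break
        lengths.append(cols - (len(header) - i))
    return lengths


# Two trailing empty header cells: rows of 5, 4, or 3 columns are accepted.
print(accepted_lengths(["Date", "Description", "Amount", "", ""], cols=5))
# [5, 4, 3]

# Fully labelled header: only the exact column count is accepted.
print(accepted_lengths(["Date", "Description", "Amount"], cols=3))
# [3]
```

Rows that are as long as the full column count are still accepted, so data in
unlabelled trailing columns is not discarded.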

The solution avoids breaking changes by leaving the underlying data
structures untouched. Alternative approaches, such as dropping trailing
commas before parsing or removing empty elements after parsing, were
rejected because they risk discarding data: trailing columns might
contain values that banks provide but fail to label in the header row.
This commit is contained in:
Kian Cross
2025-02-05 15:56:28 +00:00
committed by GitHub
parent 03d3879787
commit 5d4b218aa6
3 changed files with 42 additions and 1 deletions

@@ -30,6 +30,19 @@ class HintMismatchError(Exception):
     pass
 
 
+def check_row_length(data, hint, row):
+    valid_lengths = [hint['cols']]
+    header = data[0]
+
+    for i in range(len(header) - 1, 0, -1):
+        if header[i]:
+            break
+        else:
+            valid_lengths.append(hint['cols'] - (len(header) - i))
+
+    return None not in row and len(row) in valid_lengths
+
+
 def parse(data, hint):
     result = []
     if 'cols' not in hint:
@@ -39,7 +52,7 @@ def parse(data, hint):
     good_hint = False
     for row in data:
         resrow = {}
-        if None in row or len(row) != hint['cols']:
+        if not check_row_length(data, hint, row):
             # Wrong column count
             continue
         if hint.get('payer') is not None: