Data Pipeline Adapters
Validates a CSV file's header against a Pydantic contract to detect schema drift.
This adapter opens a flat file, reads only the first row (the header), and cross-references the column names against the keys defined in the Pydantic schema. It does not validate data types or row values, making it extremely fast for detecting dropped or renamed columns in large data pipelines.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `contract` | `type[BaseModel]` | The Pydantic model representing the expected schema. | required |
| `file_path` | `str` | The absolute or relative path to the CSV file. | required |
| `encoding` | `str \| None` | The encoding used to open the file (e.g., "utf-8"). | `None` |
| `**kwargs` | `Any` | Additional keyword arguments (e.g., `delimiter=";"`, `quotechar="\|"`) to pass directly to the underlying | `{}` |
Returns:
| Type | Description |
|---|---|
| `list[dict[str, str]]` | A list of validation errors. Returns an empty list if all schema keys are present in the CSV header. |
Source code in rdce/adapters.py
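As an illustration of the header check described above, here is a minimal sketch using only the standard library. It is not the code from `rdce/adapters.py`. The helper name `find_missing_columns`, the `expected_fields` argument (a plain list of names standing in for the Pydantic contract), and the `column`/`error` keys in each returned dictionary are all assumptions made for this example.

```python
from __future__ import annotations

import csv
from typing import Any, Iterable


def find_missing_columns(
    expected_fields: Iterable[str],
    file_path: str,
    encoding: str | None = None,
    **kwargs: Any,
) -> list[dict[str, str]]:
    """Hypothetical helper: report expected columns absent from a CSV header."""
    # newline="" is what the csv module recommends for files handed to csv.reader.
    with open(file_path, newline="", encoding=encoding) as handle:
        # Only the first row is consumed; data rows are never read.
        # An empty file yields an empty header, so every field is reported.
        header = next(csv.reader(handle, **kwargs), [])

    present = set(header)
    # The error dictionary layout is illustrative, not the library's format.
    return [
        {"column": name, "error": "missing from CSV header"}
        for name in expected_fields
        if name not in present
    ]
```

Assuming a Pydantic v2 model, the expected names could be supplied as `list(Contract.model_fields)`. Because the cost depends only on the width of the header and not on the number of rows, a check of this kind stays cheap even for very large files.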
Streams a CSV file row-by-row, validating data types against a Pydantic schema. Yields a dictionary for every row that fails validation, allowing the developer to route bad data without loading the entire file into memory.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `contract` | `type[BaseModel]` | The expected schema. | required |
| `file_path` | `str` | Path to the CSV file. | required |
| `null_markers` | `list[str]` | Strings that represent NULL in this CSV (e.g., "", "NaN"). | `None` |
| `ignore_nulls` | `bool` | If True, forgives all null markers regardless of schema. | `False` |
| `encoding` | `str \| None` | The encoding used to open the file (e.g., "utf-8-sig"). | `None` |
| `**kwargs` | `Any` | Passed directly to | `{}` |
Yields:
| Name | Type | Description |
|---|---|---|
| `dict` | `dict[str, Any]` | A payload containing |
Source code in rdce/adapters.py
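The row-by-row streaming pattern described above can be sketched with the standard library alone. This is not the code from `rdce/adapters.py`, and it makes several assumptions: the name `iter_invalid_rows` is hypothetical, a dictionary of per-column converter callables stands in for the Pydantic contract, and the `row`/`data`/`errors` keys of the yielded payload are invented for illustration, since the actual payload format is not shown here.

```python
from __future__ import annotations

import csv
from typing import Any, Callable, Iterator


def iter_invalid_rows(
    converters: dict[str, Callable[[Any], Any]],
    file_path: str,
    null_markers: list[str] | None = None,
    ignore_nulls: bool = False,
    encoding: str | None = None,
    **kwargs: Any,
) -> Iterator[dict[str, Any]]:
    """Hypothetical stand-in: yield one payload per row that fails conversion."""
    markers = set(null_markers or [])
    with open(file_path, newline="", encoding=encoding) as handle:
        # DictReader is lazy, so only one row is held in memory at a time.
        reader = csv.DictReader(handle, **kwargs)
        # row_number counts data rows from 1; the header is not counted.
        for row_number, row in enumerate(reader, start=1):
            errors = []
            for column, convert in converters.items():
                raw = row.get(column)
                is_null = raw in markers
                if is_null and ignore_nulls:
                    continue  # null markers are forgiven unconditionally
                try:
                    # A null marker is passed on as None for the converter to judge.
                    convert(None if is_null else raw)
                except (TypeError, ValueError) as exc:
                    errors.append({"column": column, "error": str(exc)})
            if errors:
                # Payload keys are illustrative, not the library's format.
                yield {"row": row_number, "data": row, "errors": errors}
```

Because the function is a generator, a caller can route each failing row (for example, to a quarantine file) as it arrives, and memory use stays flat regardless of file size.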