# Implement Data Import

Granit.DataExchange provides a mini-ETL pipeline for importing and exporting
tabular data. The import pipeline runs in four stages (Extract, Map, Validate,
Execute) with streaming `IAsyncEnumerable` processing, intelligent column
mapping suggestions, and upsert (INSERT vs UPDATE) support.
## Prerequisites

- A working Granit application with EF Core configured
- A domain entity to import into (e.g., `Patient`)
- `Granit.Validation` for row-level validation
## 1. Install the packages

```sh
dotnet add package Granit.DataExchange
dotnet add package Granit.DataExchange.EntityFrameworkCore
dotnet add package Granit.DataExchange.Endpoints
```

Then add one or both file parsers:

```sh
dotnet add package Granit.DataExchange.Csv
dotnet add package Granit.DataExchange.Excel
```

## 2. Register services
```csharp
// Core import pipeline
services.AddGranitDataExchange();

// File parsers (at least one required)
services.AddGranitDataExchangeCsv();
services.AddGranitDataExchangeExcel();

// EF Core persistence (import jobs, saved mappings)
builder.AddGranitDataExchangeEntityFrameworkCore(opts =>
    opts.UseNpgsql(connectionString));

// Import definition and executor for your entity
services.AddImportDefinition<Patient, PatientImportDefinition>();
services.AddImportExecutor<Patient, AppDbContext>();
services.AddBusinessKeyResolver<Patient, AppDbContext>();
```

## 3. Define an import definition
Each importable entity needs an `ImportDefinition<TEntity>` that declares which
properties are importable (a whitelist), their display names, aliases, and the
business key used for INSERT vs UPDATE resolution:

```csharp
public sealed class PatientImportDefinition : ImportDefinition<Patient>
{
    public override string Name => "Acme.PatientImport";

    protected override void Configure(ImportDefinitionBuilder<Patient> builder)
    {
        builder
            .HasBusinessKey(p => p.Niss)
            .Property(p => p.Niss, p => p
                .DisplayName("NISS")
                .Aliases("National ID", "Numéro national")
                .Required())
            .Property(p => p.FirstName, p => p
                .DisplayName("First name")
                .Aliases("Prénom", "Voornaam"))
            .Property(p => p.LastName, p => p
                .DisplayName("Last name")
                .Aliases("Nom", "Achternaam"))
            .Property(p => p.Email, p => p
                .DisplayName("Email")
                .Aliases("Courriel", "Mail", "E-mail"))
            .Property(p => p.BirthDate, p => p
                .DisplayName("Date of birth")
                .Format("dd/MM/yyyy"))
            .ExcludeOnUpdate(p => p.CreatedAt);
    }
}
```

### Fluent API reference
| Method | Description |
|---|---|
| `Property(expr, config?)` | Declare an importable property (whitelist) |
| `HasBusinessKey(expr)` | Business key for INSERT vs UPDATE resolution |
| `HasCompositeKey(exprs)` | Composite key (multiple properties) |
| `HasExternalId()` | External ID resolution (Odoo pattern) |
| `ExcludeOnUpdate(expr)` | Never overwrite this property on UPDATE |
| `GroupBy(column)` | Group rows for parent/child imports |
| `HasMany(collection, config)` | Child collection (requires `GroupBy`) |
## 4. The import pipeline

The pipeline processes files through four stages, all streaming via
`IAsyncEnumerable`. Only the current batch (500 entities by default) and the
accumulated errors are held in memory.

```text
Upload --> Preview --> Confirm mappings --> Execute (async)
```

### Stage 1: Extract
`IFileParser` reads the uploaded file and produces a stream of `RawImportRow`
values. Each row contains a dictionary of column names to string values.

```csharp
services.AddGranitDataExchangeCsv();

// Sep (MIT, SIMD AVX-512/NEON) handles RFC 4180 quoting,
// configurable separators, and UTF-8 with BOM detection.
```

CSV parsing options:
| Option | Default | Description |
|---|---|---|
| `Separator` | `","` | Column separator |
| `Encoding` | `null` | File encoding (`null` = UTF-8 auto-detect) |
| `SkipRows` | `0` | Rows to skip before the header |
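As a sketch of how these options might be applied at registration, assuming `AddGranitDataExchangeCsv` exposes an options-lambda overload (the overload and property names below mirror the table but are not confirmed by this guide):

```csharp
// Hypothetical sketch: assumes an options overload on the CSV registration.
services.AddGranitDataExchangeCsv(options =>
{
    options.Separator = ";";            // semicolon-separated export from Excel
    options.Encoding = Encoding.Latin1; // bypass UTF-8 auto-detection
    options.SkipRows = 2;               // two banner rows before the header
});
```

Check the package's actual registration signature before relying on this shape.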
```csharp
services.AddGranitDataExchangeExcel();

// Sylvan.Data.Excel (MIT, zero-dep) supports .xlsx, .xls, and .xlsb
// via a streaming DbDataReader.
```

Supported formats:
| MIME type | Format | Extension |
|---|---|---|
| `application/vnd.openxmlformats-officedocument.spreadsheetml.sheet` | Open XML | `.xlsx` |
| `application/vnd.ms-excel` | BIFF (Excel 97–2003) | `.xls` |
| `application/vnd.ms-excel.sheet.binary.macroenabled.12` | Binary | `.xlsb` |
### Stage 2: Map

`IMappingSuggestionService` suggests column-to-property mappings using four
strategies, in decreasing order of confidence:
| Level | Source | Confidence |
|---|---|---|
| 1 | Saved mappings from previous imports | Saved |
| 2 | Exact match on property name, DisplayName, or alias | Exact |
| 3 | Fuzzy match via normalized Levenshtein distance | Fuzzy |
| 4 | Semantic match via AI (optional ISemanticMappingService) | Semantic |
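To illustrate level 3, a normalized Levenshtein similarity maps edit distance to a score in [0, 1]; a column is suggested when the score meets the fuzzy threshold (`FuzzyMatchThreshold`, 0.8 by default, see Configuration). A minimal sketch of the idea, not the library's actual implementation:

```csharp
// Illustrative only: normalized Levenshtein similarity in [0, 1].
static double Similarity(string a, string b)
{
    a = a.Trim().ToLowerInvariant();
    b = b.Trim().ToLowerInvariant();

    int[,] d = new int[a.Length + 1, b.Length + 1];
    for (int i = 0; i <= a.Length; i++) d[i, 0] = i;
    for (int j = 0; j <= b.Length; j++) d[0, j] = j;

    for (int i = 1; i <= a.Length; i++)
        for (int j = 1; j <= b.Length; j++)
            d[i, j] = Math.Min(
                Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1),
                d[i - 1, j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1));

    int max = Math.Max(a.Length, b.Length);
    return max == 0 ? 1.0 : 1.0 - (double)d[a.Length, b.Length] / max;
}

// Similarity("First name", "FirstName") = 1 - 1/10 = 0.9, above the
// 0.8 threshold, so the column would surface as a Fuzzy suggestion.
```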
### Stage 3: Validate

`IRowValidator<T>` validates each mapped entity using FluentValidation. Invalid
rows are collected in the import report with their error codes and messages.
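For example, row-level rules for `Patient` can live in a plain FluentValidation validator. The `PatientValidator` class and its rules below are illustrative (and `BirthDate` is assumed to be a `DateTime`); how it gets picked up by `IRowValidator<T>` depends on your `Granit.Validation` registration:

```csharp
using FluentValidation;

// Illustrative rules only; not prescribed by the package.
public sealed class PatientValidator : AbstractValidator<Patient>
{
    public PatientValidator()
    {
        RuleFor(p => p.Niss)
            .NotEmpty().WithErrorCode("NISS_REQUIRED");

        RuleFor(p => p.Email)
            .EmailAddress()
            .When(p => !string.IsNullOrEmpty(p.Email));

        RuleFor(p => p.BirthDate)
            .LessThan(_ => DateTime.Today); // assumes BirthDate is DateTime
    }
}
```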
### Stage 4: Execute

`IImportExecutor<T>` persists valid entities in batches. For each entity, the
`IRecordIdentityResolver<T>` determines whether to INSERT or UPDATE based on
the business key.
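Conceptually, this is an upsert keyed on the business key declared in the import definition. A simplified sketch of the decision the resolver makes for `Patient` (the real implementation is supplied by `AddBusinessKeyResolver<Patient, AppDbContext>()`; `db`, `incoming`, and `ct` are assumed to be in scope):

```csharp
// Simplified sketch, not the library's implementation.
Patient? existing = await db.Patients
    .SingleOrDefaultAsync(p => p.Niss == incoming.Niss, ct);

if (existing is null)
{
    // No row with this business key: INSERT.
    db.Patients.Add(incoming);
}
else
{
    // Row exists: UPDATE, skipping properties marked ExcludeOnUpdate
    // (CreatedAt in the definition above).
    existing.FirstName = incoming.FirstName;
    existing.LastName  = incoming.LastName;
    existing.Email     = incoming.Email;
    existing.BirthDate = incoming.BirthDate;
}
```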
## 5. Map REST endpoints

```csharp
app.MapDataExchangeEndpoints();
```

### Import routes
Section titled “Import routes”| Method | Route | Description |
|---|---|---|
POST | /data-exchange | Upload file and create an import job |
POST | /{jobId}/preview | Get headers, preview rows, mapping suggestions |
PUT | /{jobId}/mappings | Confirm column mappings |
POST | /{jobId}/execute | Execute import asynchronously (202 Accepted) |
POST | /{jobId}/dry-run | Execute synchronous dry run |
GET | /{jobId} | Job status |
GET | /{jobId}/report | Import report (statistics + errors) |
GET | /{jobId}/correction-file | Download file with error rows only |
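A typical session walks the routes above in order. The host, port, request bodies, and response shapes below are placeholders; adjust them to wherever `MapDataExchangeEndpoints()` is mapped in your app:

```sh
# 1. Upload a file; the response is assumed to contain a job id.
curl -F "file=@patients.csv" https://localhost:5001/data-exchange

# 2. Inspect headers, preview rows, and mapping suggestions.
curl -X POST https://localhost:5001/{jobId}/preview

# 3. Confirm (or correct) the suggested column mappings.
curl -X PUT https://localhost:5001/{jobId}/mappings \
     -H "Content-Type: application/json" -d '{ ... }'

# 4. Kick off the import (202 Accepted), then poll for the report.
curl -X POST https://localhost:5001/{jobId}/execute
curl https://localhost:5001/{jobId}/report
```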
## 6. Import report

The import report contains aggregate statistics and only the rows that failed,
not the successful rows. For 100,000 rows with 50 errors, only ~50 error
objects are held in memory.

```csharp
ImportReport report = await executor.ExecuteAsync(entities, options);

report.TotalRows;     // 100000
report.SucceededRows; // 99950
report.FailedRows;    // 50
report.InsertedRows;  // 80000
report.UpdatedRows;   // 19950
report.Duration;      // TimeSpan
```

## Export pipeline
Granit.DataExchange also provides a full export pipeline with fluent field
definitions, saveable presets, and background job support.

### Define an export

```csharp
public sealed class PatientExportDefinition : ExportDefinition<Patient>
{
    public override string Name => "Acme.PatientExport";
    public override string? QueryDefinitionName => "Acme.Patients";

    protected override void Configure(ExportDefinitionBuilder<Patient> builder)
    {
        builder
            .IncludeBusinessKey()
            .Field(p => p.LastName, f => f.Header("Last name"))
            .Field(p => p.FirstName, f => f.Header("First name"))
            .Field(p => p.Email)
            .Field(p => p.BirthDate, f => f.Header("Date of birth").Format("dd/MM/yyyy"));
    }
}
```

### Register export services
```csharp
services.AddGranitDataExport();
services.AddExportDefinition<Patient, PatientExportDefinition>();
services.AddScoped<IExportDataSource<Patient>, PatientExportDataSource>();
services.AddSingleton<IExportWriter, CsvExportWriter>();
services.AddSingleton<IExportWriter, ClosedXmlExportWriter>();
```

### Export routes
| Method | Route | Description |
|---|---|---|
| GET | `/export/definitions` | List export definitions |
| GET | `/export/definitions/{name}/fields` | Available fields |
| POST | `/export/jobs` | Create and dispatch an export job |
| GET | `/export/jobs/{jobId}` | Job status |
| GET | `/export/jobs/{jobId}/download` | Download the exported file |
## Configuration

```json
{
  "DataExchange": {
    "DefaultMaxFileSizeMb": 50,
    "DefaultBatchSize": 500,
    "FuzzyMatchThreshold": 0.8
  },
  "DataExport": {
    "BackgroundThreshold": 1000
  }
}
```

## Next steps
- Add background jobs to schedule recurring imports
- Set up notifications to alert users when imports complete
- Granit.DataExchange reference for the full API surface and store interfaces