Implement Data Import

Granit.DataExchange provides a mini-ETL pipeline for importing and exporting tabular data. The import pipeline runs in four steps (Extract, Map, Validate, Execute) with streaming IAsyncEnumerable processing, intelligent column-mapping suggestions, and upsert-style (INSERT vs UPDATE) resolution based on business keys.

Prerequisites:

  • A working Granit application with EF Core configured
  • A domain entity to import into (e.g., Patient)
  • Granit.Validation for row-level validation
Install the core packages:
dotnet add package Granit.DataExchange
dotnet add package Granit.DataExchange.EntityFrameworkCore
dotnet add package Granit.DataExchange.Endpoints

Then add one or both file parsers:

dotnet add package Granit.DataExchange.Csv
dotnet add package Granit.DataExchange.Excel
Register the services in Program.cs:

// Core import pipeline
services.AddGranitDataExchange();

// File parsers (at least one required)
services.AddGranitDataExchangeCsv();
services.AddGranitDataExchangeExcel();

// EF Core persistence (import jobs, saved mappings)
builder.AddGranitDataExchangeEntityFrameworkCore(opts =>
    opts.UseNpgsql(connectionString));

// Import definition and executor for your entity
services.AddImportDefinition<Patient, PatientImportDefinition>();
services.AddImportExecutor<Patient, AppDbContext>();
services.AddBusinessKeyResolver<Patient, AppDbContext>();

Each importable entity needs an ImportDefinition<TEntity> that declares which properties are importable (whitelist), their display names, aliases, and the business key used for INSERT vs UPDATE resolution:

public sealed class PatientImportDefinition : ImportDefinition<Patient>
{
    public override string Name => "Acme.PatientImport";

    protected override void Configure(ImportDefinitionBuilder<Patient> builder)
    {
        builder
            .HasBusinessKey(p => p.Niss)
            .Property(p => p.Niss, p => p
                .DisplayName("NISS")
                .Aliases("National ID", "Numéro national")
                .Required())
            .Property(p => p.FirstName, p => p
                .DisplayName("First name")
                .Aliases("Prénom", "Voornaam"))
            .Property(p => p.LastName, p => p
                .DisplayName("Last name")
                .Aliases("Nom", "Achternaam"))
            .Property(p => p.Email, p => p
                .DisplayName("Email")
                .Aliases("Courriel", "Mail", "E-mail"))
            .Property(p => p.BirthDate, p => p
                .DisplayName("Date of birth")
                .Format("dd/MM/yyyy"))
            .ExcludeOnUpdate(p => p.CreatedAt);
    }
}
| Method | Description |
| --- | --- |
| `Property(expr, config?)` | Declare an importable property (whitelist) |
| `HasBusinessKey(expr)` | Business key for INSERT vs UPDATE resolution |
| `HasCompositeKey(exprs)` | Composite key (multiple properties) |
| `HasExternalId()` | External ID resolution (Odoo pattern) |
| `ExcludeOnUpdate(expr)` | Never overwrite this property on UPDATE |
| `GroupBy(column)` | Group rows for parent/child imports |
| `HasMany(collection, config)` | Child collection (requires `GroupBy`) |
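
GroupBy and HasMany combine for parent/child imports, where several file rows collapse into one parent entity with a child collection. A hypothetical sketch (the Visit entity, its properties, and the exact nested-builder overloads are assumptions, not part of the documented API):

```csharp
// Hypothetical: rows sharing the same NISS collapse into one Patient,
// and each row contributes one Visit to the child collection.
public sealed class PatientWithVisitsImportDefinition : ImportDefinition<Patient>
{
    public override string Name => "Acme.PatientVisitImport";

    protected override void Configure(ImportDefinitionBuilder<Patient> builder)
    {
        builder
            .HasBusinessKey(p => p.Niss)
            .Property(p => p.Niss, p => p.DisplayName("NISS").Required())
            .Property(p => p.LastName, p => p.DisplayName("Last name"))
            .GroupBy("NISS")                    // rows with equal NISS form one parent
            .HasMany(p => p.Visits, v => v      // child collection (requires GroupBy)
                .Property(x => x.Date, x => x.Format("dd/MM/yyyy"))
                .Property(x => x.Reason));
    }
}
```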

The pipeline processes files through four stages, all streaming via IAsyncEnumerable. Only the current batch (default 500 entities) and accumulated errors are held in memory.

Upload --> Preview --> Confirm mappings --> Execute (async)

IFileParser reads the uploaded file and produces a stream of RawImportRow values. Each row contains a dictionary of column names to string values.

services.AddGranitDataExchangeCsv();
// Sep (MIT, SIMD AVX-512/NEON) handles RFC 4180 quoting,
// configurable separators, and UTF-8 with BOM detection.
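
A registered parser might be consumed like this. The ParseAsync method name, its signature, and the RawImportRow.Values property are assumptions for illustration; the documentation only guarantees a stream of RawImportRow values holding a column-name-to-string dictionary:

```csharp
// Hypothetical consumption sketch: ParseAsync and Values are assumed names.
await using FileStream stream = File.OpenRead("patients.csv");

await foreach (RawImportRow row in parser.ParseAsync(stream, cancellationToken))
{
    // Each raw row maps column names to raw string values.
    if (row.Values.TryGetValue("NISS", out string? niss))
        Console.WriteLine($"NISS: {niss}");
}
```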

CSV parsing options:

| Option | Default | Description |
| --- | --- | --- |
| `Separator` | `","` | Column separator |
| `Encoding` | `null` | File encoding (`null` = UTF-8 auto-detect) |
| `SkipRows` | `0` | Rows to skip before the header |
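
These options might be supplied at registration time; the options-callback overload shown here is an assumption, not a documented signature:

```csharp
// Hypothetical: assumes AddGranitDataExchangeCsv accepts an options callback.
services.AddGranitDataExchangeCsv(opts =>
{
    opts.Separator = ";";              // semicolon-separated export from Excel
    opts.Encoding = Encoding.Latin1;   // override the UTF-8 auto-detection
    opts.SkipRows = 2;                 // two banner rows before the header
});
```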

IMappingSuggestionService suggests column-to-property mappings using four strategies in decreasing confidence order:

| Level | Source | Confidence |
| --- | --- | --- |
| 1 | Saved mappings from previous imports | Saved |
| 2 | Exact match on property name, DisplayName, or alias | Exact |
| 3 | Fuzzy match via normalized Levenshtein distance | Fuzzy |
| 4 | Semantic match via AI (optional ISemanticMappingService) | Semantic |
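
The fuzzy level can be pictured as a normalized Levenshtein similarity over trimmed, lowercased strings, compared against the FuzzyMatchThreshold (default 0.8). A self-contained illustration, not the library's actual implementation:

```csharp
// Illustrative only: similarity in [0, 1], where 1.0 is an exact match
// after trimming and lowercasing.
static double Similarity(string a, string b)
{
    a = a.Trim().ToLowerInvariant();
    b = b.Trim().ToLowerInvariant();
    if (a.Length == 0 && b.Length == 0) return 1.0;

    // Classic dynamic-programming edit distance.
    int[,] d = new int[a.Length + 1, b.Length + 1];
    for (int i = 0; i <= a.Length; i++) d[i, 0] = i;
    for (int j = 0; j <= b.Length; j++) d[0, j] = j;

    for (int i = 1; i <= a.Length; i++)
        for (int j = 1; j <= b.Length; j++)
        {
            int cost = a[i - 1] == b[j - 1] ? 0 : 1;
            d[i, j] = Math.Min(Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1),
                               d[i - 1, j - 1] + cost);
        }

    // Normalize by the longer string so the score is length-independent.
    return 1.0 - (double)d[a.Length, b.Length] / Math.Max(a.Length, b.Length);
}

// Similarity("Birth Date", "BirthDate") is 0.9: one edit over ten
// characters, which clears the default 0.8 threshold.
```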

IRowValidator<T> validates each mapped entity using FluentValidation. Invalid rows are collected in the import report with their error codes and messages.
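
A row validator is a standard FluentValidation AbstractValidator; how IRowValidator<Patient> discovers it through DI is assumed here, and the error codes are examples:

```csharp
using FluentValidation;

// Plain FluentValidation validator for imported Patient rows.
// Error codes surface in the import report alongside the messages.
public sealed class PatientRowValidator : AbstractValidator<Patient>
{
    public PatientRowValidator()
    {
        RuleFor(p => p.Niss)
            .NotEmpty().WithErrorCode("NISS_REQUIRED")
            .Matches(@"^\d{11}$").WithErrorCode("NISS_FORMAT");

        RuleFor(p => p.Email)
            .EmailAddress().When(p => !string.IsNullOrEmpty(p.Email));
    }
}
```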

IImportExecutor<T> persists valid entities in batches. For each entity, the IRecordIdentityResolver<T> determines whether to INSERT or UPDATE based on the business key.
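
Conceptually, the resolver looks up the business key and treats a hit as UPDATE and a miss as INSERT. A sketch under assumed interface members, not the shipped implementation:

```csharp
// Illustrative resolver: the IRecordIdentityResolver<T> member shape
// is an assumption for this sketch.
public sealed class PatientIdentityResolver : IRecordIdentityResolver<Patient>
{
    private readonly AppDbContext _db;
    public PatientIdentityResolver(AppDbContext db) => _db = db;

    public async Task<Patient?> ResolveAsync(Patient incoming, CancellationToken ct)
    {
        // Business key declared via HasBusinessKey(p => p.Niss).
        return await _db.Patients
            .FirstOrDefaultAsync(p => p.Niss == incoming.Niss, ct);
        // null  => INSERT the incoming entity
        // match => copy importable properties onto it, skipping ExcludeOnUpdate ones
    }
}
```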

Map the import endpoints:

app.MapDataExchangeEndpoints();
| Method | Route | Description |
| --- | --- | --- |
| POST | `/data-exchange` | Upload file and create an import job |
| POST | `/{jobId}/preview` | Get headers, preview rows, mapping suggestions |
| PUT | `/{jobId}/mappings` | Confirm column mappings |
| POST | `/{jobId}/execute` | Execute import asynchronously (202 Accepted) |
| POST | `/{jobId}/dry-run` | Execute a synchronous dry run |
| GET | `/{jobId}` | Job status |
| GET | `/{jobId}/report` | Import report (statistics + errors) |
| GET | `/{jobId}/correction-file` | Download a file containing only the error rows |

The import report contains aggregate statistics and only the rows that failed — not the successful rows. For 100,000 rows with 50 errors, only ~50 error objects are held in memory.

ImportReport report = await executor.ExecuteAsync(entities, options);

report.TotalRows;     // 100000
report.SucceededRows; // 99950
report.FailedRows;    // 50
report.InsertedRows;  // 80000
report.UpdatedRows;   // 19950
report.Duration;      // TimeSpan

Granit.DataExchange also provides a full export pipeline with fluent field definitions, saveable presets, and background job support.

public sealed class PatientExportDefinition : ExportDefinition<Patient>
{
    public override string Name => "Acme.PatientExport";
    public override string? QueryDefinitionName => "Acme.Patients";

    protected override void Configure(ExportDefinitionBuilder<Patient> builder)
    {
        builder
            .IncludeBusinessKey()
            .Field(p => p.LastName, f => f.Header("Last name"))
            .Field(p => p.FirstName, f => f.Header("First name"))
            .Field(p => p.Email)
            .Field(p => p.BirthDate, f => f.Header("Date of birth").Format("dd/MM/yyyy"));
    }
}
Register the export services:

services.AddGranitDataExport();
services.AddExportDefinition<Patient, PatientExportDefinition>();
services.AddScoped<IExportDataSource<Patient>, PatientExportDataSource>();
services.AddSingleton<IExportWriter, CsvExportWriter>();
services.AddSingleton<IExportWriter, ClosedXmlExportWriter>();
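
The data source streams entities to the export writers. Its member shape is an assumption in this sketch; only the IExportDataSource<Patient> registration appears in the documentation:

```csharp
// Hypothetical data source: streams patients without change tracking,
// so large exports stay memory-friendly. QueryAsync is an assumed member.
public sealed class PatientExportDataSource : IExportDataSource<Patient>
{
    private readonly AppDbContext _db;
    public PatientExportDataSource(AppDbContext db) => _db = db;

    public IAsyncEnumerable<Patient> QueryAsync(CancellationToken ct = default)
        => _db.Patients.AsNoTracking().AsAsyncEnumerable();
}
```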
| Method | Route | Description |
| --- | --- | --- |
| GET | `/export/definitions` | List export definitions |
| GET | `/export/definitions/{name}/fields` | Available fields for a definition |
| POST | `/export/jobs` | Create and dispatch an export job |
| GET | `/export/jobs/{jobId}` | Job status |
| GET | `/export/jobs/{jobId}/download` | Download the exported file |
Both pipelines read their settings from appsettings.json:

{
  "DataExchange": {
    "DefaultMaxFileSizeMb": 50,
    "DefaultBatchSize": 500,
    "FuzzyMatchThreshold": 0.8
  },
  "DataExport": {
    "BackgroundThreshold": 1000
  }
}