Column mapping is the first step in any supplier data integration — and the one most teams do manually every single time. A new supplier sends a CSV. You open it, scan the headers, and start mentally translating: "OK, Farbe is color, Produktname is name, and what on earth is attr_07?"
Then you do it again next month. And again for the next supplier. The logic lives in your head, not in a system. This guide will walk you through how column mapping works, the common challenges you'll hit, and how to make the whole thing reusable.
What Is Column Mapping?
Column mapping is the process of translating supplier column names to your canonical attribute names. Your internal product schema uses color. Supplier A calls it Farbe. Supplier B calls it colour_name. Supplier C calls it clr. Column mapping is the translation layer that says: "all three of these mean color."
Without this step, you can't merge data from multiple suppliers into a single catalog. The data literally doesn't have the same shape.
| Supplier A (German) | Supplier B (UK) | Supplier C (Internal) | Your Schema |
|---|---|---|---|
| Farbe | colour_name | clr | color |
| Groesse | Size (UK) | sz | size |
| Produktname | Product Title | p_name | name |
| Mat. | Material Type | mat_code | material |
| Gewicht | Weight (kg) | wt | weight |
| Artikelnummer | SKU Code | sku | sku |
The mapping itself is simple. The challenge is doing it consistently, handling edge cases, and not losing the mapping when you close your spreadsheet.
Define Your Canonical Schema First
Before you map anything, you need a clear target. Your canonical schema is the definitive list of attributes your system expects, along with their types, constraints, and rules. Without it, column mapping is just renaming columns at random.
Here's an example of what a well-defined product schema looks like:
| Attribute | Type | Required | Allowed Values / Format | Unit |
|---|---|---|---|---|
| name | text | yes | max 200 chars | — |
| sku | text | yes | alphanumeric, unique | — |
| price | decimal | yes | 0.01 – 99999.99 | EUR |
| color | select | yes | controlled list (e.g. black, white, red, blue, green, ...) | — |
| size | select | no | XS, S, M, L, XL, XXL | — |
| weight | decimal | no | — | kg |
| material | multiselect | no | controlled list (e.g. cotton, polyester, leather, wood, ...) | — |
| category | text | yes | from category tree | — |
| brand | text | yes | max 100 chars | — |
| description | text | no | max 5000 chars | — |
| images | array (URLs) | no | valid URLs, .jpg/.png/.webp | — |
This schema is your contract. Every supplier feed, once mapped, should produce records that conform to these rules. If you don't have this documented, start there — even a simple spreadsheet listing your attributes, types, and constraints is better than nothing.
Common Mapping Challenges
Not every mapping is a clean one-to-one rename. Here are the five patterns you'll encounter in practice, each with a concrete example.
1. One-to-One Mapping (The Easy Case)
The supplier has a column that maps directly to one of your attributes. Different name, same meaning, same structure.
| Supplier Column | Your Attribute | Example Value |
|---|---|---|
| Product Name | name | Oslo Dining Chair |
| Brand Name | brand | Nordic Living |
| Main Category | category | Furniture > Chairs |
This is the straightforward case. You just rename the column. Most of your mappings will be this type, and they're trivial once documented.
2. One-to-Many (Splitting a Column)
The supplier packs multiple pieces of information into a single column. You need to split it into separate attributes.
Before (supplier data):
| Supplier Column | Example Value |
|---|---|
| Dimensions | 45 x 52 x 89 cm |
| Full Name | Oslo Dining Chair - Black |
After (your schema):
| Your Attribute | Extracted Value |
|---|---|
| width | 450 mm |
| depth | 520 mm |
| height | 890 mm |
| name | Oslo Dining Chair |
| color | black |
One-to-many mappings require parsing logic — splitting on delimiters, extracting substrings, or applying regex patterns. They're more work to set up, but the pattern is usually consistent within a given supplier.
3. Many-to-One (Merging Columns)
The supplier spreads a single concept across multiple columns. You need to combine them into one attribute.
Before (supplier data):
| Color 1 | Color 2 | Color 3 |
|---|---|---|
| Black | Grey | |
| Navy | White | Red |
After (your schema):
| colors |
|---|
| ["black", "grey"] |
| ["navy", "white", "red"] |
You see this pattern with colors, images (Image URL 1 through Image URL 10), and any multi-value attribute the supplier has spread across numbered columns. The mapping needs to collect non-empty values from all the source columns into a single array.
4. Derived Attributes (Generating from Other Data)
Your schema requires an attribute the supplier doesn't provide at all. You need to compute it from other available data.
Before (supplier data):
| Supplier Column | Value |
|---|---|
| Product Name | Oslo Dining Chair |
| Price (excl. VAT) | 249.00 |
After (your schema):
| Your Attribute | Value | Derivation |
|---|---|---|
| name | Oslo Dining Chair | direct mapping |
| slug | oslo-dining-chair | generated: lowercase, replace spaces with hyphens |
| price | 296.31 | calculated: base price * 1.19 (VAT inclusive) |
Common derived attributes include slugs, full URLs, display names, computed prices (with tax or margin), and concatenated fields. These aren't "mappings" in the traditional sense — they're transformations that run after the mapping step.
5. Missing Attributes (Flagging Incomplete Data)
The supplier simply doesn't provide a required attribute, and there's no way to derive it from other columns.
Before (supplier data):
| Product Name | SKU | Price | Color |
|---|---|---|---|
| Oslo Dining Chair | NDL-OSL-001 | 299.00 | Black |
After mapping (your schema):
| Attribute | Value | Status |
|---|---|---|
| name | Oslo Dining Chair | OK |
| sku | NDL-OSL-001 | OK |
| price | 299.00 | OK |
| color | black | OK |
| weight | — | Missing (required for shipping) |
| material | — | Missing (recommended for filters) |
Missing attributes need to be surfaced, not silently ignored. Your mapping configuration should know which attributes are required vs. optional, and flag products that come through with gaps. This turns column mapping from a dumb rename operation into an actual data quality check.
Building a Reusable Mapping
Here's what usually happens: you map Supplier A's columns in a spreadsheet. Three months later, Supplier A sends a new file. You open a fresh spreadsheet and do it again. Maybe you remember the mappings. Maybe you don't. Maybe a colleague does it differently this time.
The fix is to save your mapping configuration as a standalone artifact — separate from any specific data file. A mapping configuration for a supplier might look like this:
{
"supplier": "Nordic Living GmbH",
"mappings": [
{ "source": "Produktname", "target": "name" },
{ "source": "Artikelnummer", "target": "sku" },
{ "source": "Preis", "target": "price" },
{ "source": "Farbe", "target": "color" },
{ "source": "Groesse", "target": "size" },
{ "source": "Gewicht", "target": "weight" },
{ "source": "Mat.", "target": "material" },
{ "source": "Kategorie", "target": "category" },
{ "source": "Marke", "target": "brand" },
{ "source": "Beschreibung", "target": "description" }
]
}
The format doesn't matter — JSON, YAML, a database table, a config screen in a tool. What matters is that the mapping is saved, versioned, and applied automatically. When Supplier A sends next month's file, you don't open a spreadsheet. You run the file through the saved mapping and review the output.
This also lets you track changes. When a supplier adds a new column or renames an existing one, you update the mapping once. Every future file uses the updated version.
The Adapter Pattern: One Mapping Per Supplier
In software, an adapter is a component that translates one interface into another. The same concept applies to supplier data: each supplier has its own "interface" (column names, formats, quirks), and you need an adapter to translate it into your canonical schema.
In practice, this means maintaining one mapping configuration per supplier:
| Supplier | Columns in Feed | Adapter Handles |
|---|---|---|
| Nordic Living GmbH | German column names, comma decimals, weight in grams | Column rename, decimal format, unit conversion |
| HomeStyle UK Ltd | English names, imperial units, dimensions combined | Column rename, unit conversion, dimension splitting |
| QuickParts Inc. | Internal codes, numbered color columns, prices in cents | Code-to-name mapping, column merging, price division |
Each adapter is created once and applied automatically on every subsequent feed from that supplier. When a new supplier comes in, you build a new adapter. Your existing adapters continue working unchanged.
This pattern scales linearly. Supplier number 50 is the same amount of work as supplier number 5 — create the adapter once, then forget about it. Compare that to the manual approach, where the effort multiplies with every feed cycle.
Tools like FeedPrep implement this adapter pattern directly: you define your column mapping and transformation rules per supplier, and every future feed upload from that supplier gets the same treatment automatically. But even if you're building this yourself with scripts or a data pipeline, the principle is what matters — one saved, reusable mapping per supplier.
Where to Go from Here
Column mapping is the foundation, but it's only layer one. After mapping, you'll typically need value standardization and unit conversion to get your data fully normalized. The good news is that the same principle — document the rules once, apply them automatically — works for all three layers.
Start with your most problematic supplier. The one whose files take the longest to clean up. Document the column mapping, save it somewhere reusable, and see how much time it saves on the next import. That's usually enough to justify doing it for every supplier.