Join

Merging Product Information : How It Works and Best Practices

Joining files is essential when managing product data in SmartFeeds. It allows you to combine multiple imports and ensure that data sources communicate effectively. Whether you're enriching sales data with product details or merging datasets, using the right join method is key to avoid errors and duplicates.

Prerequisites

To manage users, you need:

  • Ensure you have at least two imports added to your product flow.
  • Make sure the files share at least one common column with comparable data (e.g., Product ID, SKU).

Table of Contents

  1. Understanding Joins and Their Use Cases
  2. The Importance of Join Order in Data Integration
  3. How to Join Files in SmartFeeds
  4. Quick Answers to Common Issues
  5. Related Articles

Understanding Joins and Their Use Cases

What is a join

A join links two datasets by matching data in specific columns called keys. This key acts as the "bridge" between files. For example:

  • File A (Sales Data): Tracks sales using Product ID.
  • File B (Product Details): Contains product names and categories, also using Product ID.

By joining on Product ID, you can create a combined dataset that shows both sales and product details.

Join TypeDescriptionExample Use Case
Inner JoinMatches rows in both files where the keys align.List sold products with valid product details.
Left JoinKeeps all rows from the left file and matches from the right file.Keep all sales data and add available details.
Right JoinKeeps all rows from the right file and matches from the left file.Keep all product details and check sales.
Full JoinCombines all rows from both files, with NULLs for unmatched rows.See all products and all sales in one view.

Example: You have two files:

File A (Sales Data)

Product IDSales Amount
101$500
102$300
103$200

File B (Product Details):

SKU_IDProduct NameCategory
101LaptopElectronics
102ChairFurniture
104PhoneElectronics

Inner Join:

Product IDSales AmountProduct NameCategory
101$500LaptopElectronics
102$300ChairFurniture

Left Join:

Product IDSales AmountProduct NameCategory
101$500LaptopElectronics
102$300ChairFurniture
103$200NULLNULL

Right Join:

Product IDSales AmountProduct NameCategory
101$500LaptopElectronics
102$300ChairFurniture
104NULLPhoneElectronics

Full Join:

Product IDSales AmountProduct NameCategory
101$500LaptopElectronics
102$300ChairFurniture
103$200NULLNULL
104NULLPhoneElectronics

The Importance of Join Order in Data Integration

The order in which you join files dramatically affects the resulting dataset, determining which products are included or excluded. Here's how the interface is organized:

  1. Import A (left side): This is the starting file in your product flow and cannot be changed. Think of it as your primary dataset.
  2. Join Type (center): You select the type of join you want to perform (Inner, Left, Right, Full).
  3. Import B (right side): You choose the second file to join from a dropdown menu (if you have multiple imports available).

Because Import A is fixed, your choice of join order and type is critical for controlling which data is included. (As already said) Left Join keeps all rows from Import A, while a Right Join keeps all rows from Import B.

Important: The output of a join (the "Joined File") becomes the new Import A if you need to perform another join. This lets you build complex, multi-stage joins

How to Join Files in SmartFeeds

If you have imports that haven't been joined yet, you'll see a "+ Join" button that lets you add another join step.

Step 1: Add Your Imports

Start by adding at least two imports to your product flow (for example, "Sales Data" and "Product Details"). Generally, for standard imports, your most important dataset should be Import A, especially if you're using a Left Join.

Step 2: Choose Your Join Keys

On the Join screen, Import A is displayed on the left, and Import B is on the right. SmartFeeds provides tools to examine the columns in each file, making it easier to select the correct join keys.

Identify the columns that contain matching data in both files. These are your "join keys," even if they have different names (e.g., "Product_id" in Import A and "SKU_ID" in Import B).

Step 3: Handle Non-Unique Keys

If you select a key that isn't unique (meaning it has duplicate values), SmartFeeds will warn you. Here are ways to prevent duplicates in your joined data:

  • Use Unique Keys: Whenever possible, choose keys that are unique within each file.
  • Create Composite Keys: Combine multiple columns to create a unique key (e.g., "Product ID" + "Date").
  • Deduplicate Your Imports: Remove duplicate rows from your source files before joining. Learn more about deduplicatio here.

Step 4: Select the Join Type

SmartFeeds offers four join types: Inner, Left, Right, and Full. For typical product data scenarios, the Left Join is often the best choice. It ensures that all your primary data from Import A is kept, while adding matching information from Import B.


Quick Answers to Common Issues

Can I use non-unique keys for a join?

Yes, but it may result in duplicate rows. Deduplication or key combination is recommended.

What happens if my files donโ€™t share any keys?

Joins require at least one matching column. You may need to preprocess your files to introduce common keys.

Which join type is best for standard imports?

Left Join is generally the best choice, as it retains the main data while enriching it with matches.


Related Articles

For further assistance with SmartFeeds, consider reviewing these articles:

For additional help, feel free to reach out via our Contact Us page.

Was this article helpful?