Automated Product Imports

Challenge

APEX Tools uses the Stibo Product Information Management (PIM) system to keep track of product data, images, and categorization. This data needs to be displayed on their websites across multiple brands and regions, each with its own set of data.

We had several challenges to solve:

  1. We needed a way to import the PIM data into Drupal for display on the websites.

  2. The PIM contains both active and inactive products, as well as products from markets where a given site's products may not be available. We needed a way to filter the data for each site.

  3. The PIM can export an XML file of its entire contents, or a delta of the changes between two points in time, but it is not exposed through an API that the websites can access.

  4. Images are stored on a separate SFTP server instead of a standard image server.

Solution

  1. Our solution was to use the Drupal Migrate module to build custom migrations for each brand and regional website. The Migrate module provides the skeleton for importing content from XML. The migration runs twice per day (per site) as a separate command-line process so that it does not interfere with serving the website. Migrate can update existing content without overwriting older data, which also lets us apply a delta XML file that covers only a subset of products.

  2. Each site within the Drupal multisite has its own custom migration configuration, which lets us filter the incoming data according to that site's rules. We have a common framework for importing, but a different ruleset per site for what to bring in.

  3. Every 12 hours, the PIM exports a delta file that is uploaded to an FTP server via a custom process built by APEX Tools engineers. Our migration process begins by checking the FTP server for new files and pulling them in. Migrations are then run on any pulled files that have not yet been migrated.

  4. We implemented a third-party library called Flysystem to connect to the FTP servers for both the data and the images. This lets us access the secured content efficiently and easily for use on the website.
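
The selection and filtering steps above can be illustrated with a small sketch. This is not the production code: the real implementation lives in Drupal Migrate configuration and plugins, while the Python below only shows the idea. The XML element and attribute names (Product, Market, status, brand, id), the file names, and the SiteRules structure are all hypothetical, since the PIM export schema is not shown here. It also assumes export file names sort chronologically.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteRules:
    """Hypothetical per-site ruleset: the brand and market one site displays."""
    brand: str
    market: str


def pending_files(remote_names, already_migrated):
    """Return export files found on the server that have not been migrated yet.

    Assumes file names sort chronologically (for example, a timestamp suffix).
    """
    return sorted(name for name in remote_names if name not in already_migrated)


def filter_products(xml_text, rules):
    """Yield IDs of active products matching the site's brand and market."""
    root = ET.fromstring(xml_text.strip())
    for product in root.iter("Product"):
        if product.get("status") != "active":
            continue  # skip inactive products
        if product.get("brand") != rules.brand:
            continue  # skip other brands
        markets = {m.text for m in product.iter("Market")}
        if rules.market in markets:
            yield product.get("id")


# Made-up delta export used only for illustration.
SAMPLE_DELTA = """
<Products>
  <Product id="P1" status="active" brand="BrandA"><Market>US</Market><Market>CA</Market></Product>
  <Product id="P2" status="inactive" brand="BrandA"><Market>US</Market></Product>
  <Product id="P3" status="active" brand="BrandA"><Market>DE</Market></Product>
  <Product id="P4" status="active" brand="BrandB"><Market>US</Market></Product>
</Products>
"""

if __name__ == "__main__":
    todo = pending_files(
        ["delta-02.xml", "delta-01.xml", "delta-03.xml"],
        already_migrated={"delta-01.xml"},
    )
    print(todo)
    print(list(filter_products(SAMPLE_DELTA, SiteRules("BrandA", "US"))))
```

Keeping the ruleset as data separate from the import logic mirrors the approach described above: one shared import framework, with each site supplying only its own rules.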

Results

The client no longer has to update information on the website by hand or run a long, slow full import of content manually. Website data is never out of sync with the PIM for more than 12 hours, and the client has a reliable system for maintaining product data.