The component previewer lets you preview your data at each component step without having to validate packages and run full-scale production jobs. You can extract, transform, and preview your data on a transformation component, which helps you debug your pipeline and confirm your data flow logic.

Component previews are similar to the data previews available on source components, which you might already be familiar with. In the source component data preview, you see a few rows of your data as provided by your source connection. In a component preview, you see that same source data after it has been transformed by the transformation components you selected.

Table of Contents

  1. Benefits of the Component Previewer
  2. How to Run Component Previews
  3. Component Previewer Behind the Scenes
  4. Access the Component Previewer Today

 

Benefits of the Component Previewer

We developed this much-anticipated feature in response to clear demand from our customers and our own desire to make data integration more convenient. Running production jobs on a large package can be time-consuming. Preparing small datasets and custom component combinations for a test package is not always the most productive part of the data engineering process either. A lack of available resources may also hinder your debugging efforts and keep you from understanding your full-scale production data flow.

We believe a swift sneak-peek into the data pipeline could be a solution for efficiently debugging component-heavy packages and validating ETL logic. The component previewer provides exactly that. 

Not only can you see your data transformed and presented within minutes on the same page, but you can also see any real-time errors or flow-breaking issues that the package might otherwise hide.

Thanks to the live progress and error reporting during a component preview job, you can quickly gain clarity about your data pipelines and may even be able to resolve smaller issues in a data flow setup.

How to Run Component Previews

Component previews are available on all transformation components except for the Clone component.

To run a preview on a transformation component:

  1. Open a transformation component.
  2. Ensure it is correctly set up: it must receive data from at least one source or transformation component.
  3. Press the Preview button.
  4. Wait for the previewer to finish. Do not close the component window while the previewer is running.
  5. Review the preview progress, any errors, and the final result, all of which appear in the open component window.

Component Previewer Behind the Scenes

The component previewer works by running a special job on a partial data flow in your package. This data flow is made up of all the components that serve as input to the transformation component being previewed, plus that transformation component itself. The input data for the job comes from the schema preview on the source component. This sample of data from your source connection is large enough to show the effects of your transformations, yet small enough to keep the duration of preview jobs to a minimum.
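To make the idea of a partial data flow concrete, here is a minimal sketch of how such a flow could be selected from a package. It is an illustration only, not Integrate.io's implementation: the function name `preview_subflow`, the edge-list representation, and the component names are all assumptions made for this example.

```python
from collections import deque


def preview_subflow(edges, target):
    """Return the components that feed `target`, plus `target` itself.

    `edges` is a list of (upstream, downstream) component-name pairs.
    Both the representation and the names are hypothetical.
    """
    # Map each component to the components that feed it directly.
    parents = {}
    for upstream, downstream in edges:
        parents.setdefault(downstream, []).append(upstream)

    # Walk backwards from the previewed component to collect every input.
    selected = {target}
    queue = deque([target])
    while queue:
        node = queue.popleft()
        for parent in parents.get(node, []):
            if parent not in selected:
                selected.add(parent)
                queue.append(parent)
    return selected


# Hypothetical package: two sources are joined, then one branch filters
# the result while another branch sorts it and writes to a destination.
edges = [
    ("source_orders", "join"),
    ("source_customers", "join"),
    ("join", "filter"),
    ("join", "sort"),
    ("sort", "destination"),
]

print(sorted(preview_subflow(edges, "filter")))
# ['filter', 'join', 'source_customers', 'source_orders']
```

In this sketch, previewing `filter` leaves out `sort` and `destination`, because neither of them supplies data to the previewed component.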

Our package validation service then runs through this flow to make sure it's ready to be executed as a job. Next, the previewer looks for an available sandbox cluster on your account, since it needs a cluster to execute the preview job. If none exists, one is created for the purpose of the component preview.
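The reuse-or-create decision described above can be sketched roughly as follows. This is a simplified model under stated assumptions, not the actual service logic: the `Cluster` class, its `kind` and `status` values, and the name `preview-sandbox` are hypothetical and do not come from the Integrate.io API.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    # All fields and their values are hypothetical, chosen for illustration.
    name: str
    kind: str    # e.g. "sandbox" or "production"
    status: str  # e.g. "available" or "pending"


def pick_preview_cluster(clusters):
    """Return (cluster, created) for a preview job.

    Reuses the first available sandbox cluster; if there is none,
    registers a new sandbox cluster and reports that it was created.
    """
    for cluster in clusters:
        if cluster.kind == "sandbox" and cluster.status == "available":
            return cluster, False

    # No usable sandbox cluster: create one. It starts as "pending" here
    # because the preview job can only run once the cluster is active.
    new_cluster = Cluster(name="preview-sandbox", kind="sandbox", status="pending")
    clusters.append(new_cluster)
    return new_cluster, True


account_clusters = [Cluster("prod-1", "production", "available")]
cluster, created = pick_preview_cluster(account_clusters)
print(cluster.name, created)
```

In this model a production cluster is never picked for a preview, which mirrors the text's statement that the previewer looks specifically for a sandbox cluster.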

Once the cluster is active, the preview job will run and report its progress via the live log box within the transformation component's window. Upon job completion, your data will be displayed in a table below the live log box.

Access the Component Previewer Today

 

Because the component previewer uses a sandbox cluster to execute preview jobs, your subscription level must allow for the creation of at least one such cluster.

We believe this feature provides a convenient way to interact with your data integration pipelines and we hope it can optimize and speed up your data engineering efforts too.

To enable this feature on your account, please contact your account representative or schedule an intro call with Integrate.io.