What is Data Pipeline Automation?

The practice of automating the collection and compilation of data from many sources.


What it is: Data Pipeline Automation is the practice of automating the creation of virtual infrastructure that transports data between systems. This is different from the traditional approach, in which data pipelines are created with code that must be rewritten as the data landscape changes, or with cloud-based services that need continuous reconfiguring.

What it does: With Data Pipeline Automation, engineers can create a system of data transportation and transformation that adapts dynamically to changing circumstances. Administrators can then make significant changes to the pipeline without writing new code or reconfiguring services, for example by adding new data sources or by changing how data is transformed before it enters a central warehouse.
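As a rough illustration of the idea, the sketch below shows a pipeline driven by a declarative configuration rather than by hand-written glue code. It is a minimal sketch in Python, not any vendor's implementation: the names `TRANSFORMS`, `run_pipeline`, `SOURCES`, and the sample records are all hypothetical, and the in-memory list stands in for a central warehouse.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, object]


def lowercase_email(record: Record) -> Record:
    """Normalize the email field to lower case."""
    return {**record, "email": str(record.get("email", "")).lower()}


def drop_empty_fields(record: Record) -> Record:
    """Remove fields whose value is an empty string or None."""
    return {k: v for k, v in record.items() if v not in ("", None)}


# Hypothetical registry: the configuration refers to transforms by name.
TRANSFORMS: Dict[str, Callable[[Record], Record]] = {
    "lowercase_email": lowercase_email,
    "drop_empty_fields": drop_empty_fields,
}


def run_pipeline(config: Dict[str, list],
                 sources: Dict[str, Iterable[Record]]) -> List[Record]:
    """Read every configured source, apply its transforms, and load the result."""
    warehouse: List[Record] = []  # stand-in for a central warehouse
    for spec in config["sources"]:
        for record in sources[spec["name"]]:
            row = dict(record)
            for step in spec.get("transforms", []):
                row = TRANSFORMS[step](row)
            row["source"] = spec["name"]  # record where the row came from
            warehouse.append(row)
    return warehouse


# Hypothetical source systems, represented here as in-memory records.
SOURCES = {
    "crm": [{"email": "Ana@Example.com", "phone": ""}],
    "billing": [{"email": "BOB@example.com", "plan": "pro"}],
}

config = {
    "sources": [
        {"name": "crm", "transforms": ["lowercase_email", "drop_empty_fields"]},
    ]
}
first_load = run_pipeline(config, SOURCES)

# Adding a data source is a configuration change; run_pipeline is untouched.
config["sources"].append({"name": "billing", "transforms": ["lowercase_email"]})
second_load = run_pipeline(config, SOURCES)
```

In this sketch, adding the `billing` source or changing its list of transforms only edits `config`; the pipeline code itself does not change, which is the property the paragraph above describes.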

Why it matters: Managing diverse data sources is now an engineering challenge for many large firms. If your firm’s data pipeline isn’t operating smoothly, it can negatively affect everything from sales management to M&A activity to regulatory compliance. With Data Pipeline Automation, firms can set up a robust pipeline that adapts as their data ecosystems and business requirements change. Automation also makes it easier to set up data analysis and storage solutions that take advantage of multiple cloud environments.

What to do about it: If your firm depends on data pipelines that are subject to frequent updates, consider Data Pipeline Automation. It is also worth considering if you’re preparing for a migration to the cloud, or gearing up for any other circumstance that will require unusually demanding and complex data transportation.
