VEDA Data Ingestion Services

Ingestion services for VEDA

Overview

VEDA’s data ingestion services are designed to handle and manage the flow of data from various sources efficiently. The system integrates with Apache Airflow for orchestrating and scheduling data pipelines, and APIs to facilitate and manage data ingestion tasks.

Components

  1. Apache Airflow Instance
    • Airflow orchestrates various data ingestion tasks and workflows. It supports scheduling, monitoring, and managing pipelines, described using Directed Acyclic Graphs (DAGs).
    • For details on how to deploy, configure, or modify our Airflow instance, refer to the veda-data-airflow repository.
  2. Ingest APIs
    • The Ingest API facilitates data ingestion from multiple sources and manages the data flow into VEDA’s system. This API is included in veda-backend
    • To learn how to interact with these APIs as a user, consult the Dataset Ingestion Guide.