Introduction to the CCAT Data Transfer System
=============================================

The CCAT Data Transfer System (CDT) transfers data from the CCAT observatory,
located on Cerro Chajnantor in Chile, to downstream data archives. The CDT uses
the Operations Database (OpsDB) to discover data that must be transferred and
to track the status of each transfer.

The CDT is designed to be modular and flexible so that it can be adapted to
different data transfer scenarios. Although only one data archive is currently
planned, the archive in Cologne, the system is designed to handle multiple data
archives. Each data package is sent out from the observatory only once and is
then replicated between the data archives. This minimizes the load on the
network link between the observatory and the data archives, which is expected
to be the bottleneck of the data transfer system.

The system prepares the data for transport, ensures data integrity, and
archives the data to long-term storage. Once the data has been successfully
checked into long-term storage, the system marks the files as archived in the
OpsDB and handles automatic deletion of the data from local storage at the
observatory.

The primary transport mechanism is `bbcp`_, a fast and secure data transfer
tool that is well suited to the high-bandwidth, high-latency network connection
between the observatory and the data archives.

The Data Transfer System consists of the following sub-systems:

- Package Manager
- Transfer Manager
- Data Integrity Checker
- Archiving System
- Deletion System

The system is deployed using Docker Compose, which orchestrates the following
services:

- **Promtail and Loki**: Used for log aggregation and monitoring. Promtail
  collects logs from the services and sends them to Loki for centralized
  storage and querying.
- **Redis**: Acts as the message broker and result backend for Celery, and is
  used for task state management and caching.
- **InfluxDB**: Collects and stores housekeeping metrics, such as transfer
  rates and task states, for performance monitoring and analysis.

.. _bbcp: https://www.slac.stanford.edu/~abh/bbcp/
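
The ordering described above, where local data is deleted only after the
archive copy has been verified and marked as archived, can be sketched as
follows. This is a minimal illustration and not the CDT implementation: the
``PackageRecord`` class is a hypothetical stand-in for an OpsDB entry, the
function names are invented for this example, and SHA-256 is an assumed
checksum algorithm, since this page does not state which one the Data
Integrity Checker uses.

```python
import hashlib
from dataclasses import dataclass
from pathlib import Path


@dataclass
class PackageRecord:
    """Hypothetical stand-in for one OpsDB entry describing a data package."""

    path: Path                # copy in local storage at the observatory
    expected_sha256: str      # checksum recorded when the package was built
    archived: bool = False    # set once the archive copy has been verified


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large packages need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mark_archived_if_intact(record: PackageRecord, archive_copy: Path) -> bool:
    """Set the archived flag only if the archive copy matches the checksum."""
    record.archived = sha256_of(archive_copy) == record.expected_sha256
    return record.archived


def delete_local_if_archived(record: PackageRecord) -> bool:
    """Delete the observatory-side copy only after archiving is confirmed."""
    if not record.archived:
        return False
    record.path.unlink(missing_ok=True)
    return True
```

In this sketch a corrupted or incomplete archive copy leaves ``archived`` set
to ``False``, so ``delete_local_if_archived`` refuses to remove the local file
and the package remains available for another transfer attempt.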