Extract, Transform and Load (ETL) moves raw data from your flat file system to a final destination, which is typically a data warehouse like IBM Db2 or Microsoft Azure SQL. Various file transfer protocols speed up the process, but do you know the differences between them?
In this guide, we look at the most widely used protocols for moving files between servers and clients over computer networks, so that ETL can transform the data in these files into readable formats for big data analytics. Here's a comparison of FTP, FTPS, SFTP, SCP, and other transfer protocols.
Table of Contents
- What is FTP?
- What is FTPS?
- What is SFTP?
- What is SCP?
- Other File Transfer Protocols
- Should You Use FTP, FTPS, SFTP, SCP, or Another Protocol?
- How Integrate.io Can Help
What is FTP?
FTP (File Transfer Protocol) is the traditional way to transfer files between clients and servers. Invented in the 1970s, FTP is a simple way to move files between computers via TCP/IP, the protocol suite that connects network devices online. Here's how FTP usually works:
- You connect to the FTP server (the FTP host) with an FTP client.
- You upload your files to the server via TCP/IP.
- The recipient connects to the same server and downloads the files.
FTP handles three data representations (8-bit binary data, 7-bit ASCII, and 8-bit EBCDIC) and moves files via one of three transmission modes (stream, block, and compressed).
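As a rough illustration of the data representations mentioned above, the sketch below uses Python's standard `ftplib` module to choose between an ASCII (text) and a binary transfer. The suffix list, the helper names `choose_transfer_type` and `upload`, and the host and credentials in the usage comment are assumptions made for this example, not part of the FTP standard.

```python
from ftplib import FTP
from pathlib import Path

# Assumption for this sketch: these suffixes mark line-oriented text files.
TEXT_SUFFIXES = {".csv", ".tsv", ".txt"}


def choose_transfer_type(filename: str) -> str:
    """Return "ascii" for text-like files and "binary" for everything else."""
    suffix = Path(filename).suffix.lower()
    return "ascii" if suffix in TEXT_SUFFIXES else "binary"


def upload(ftp: FTP, local_path: str, remote_name: str) -> str:
    """Upload one file over an already logged-in FTP connection."""
    transfer_type = choose_transfer_type(local_path)
    # ftplib expects the file object in binary mode for both calls.
    with open(local_path, "rb") as handle:
        if transfer_type == "ascii":
            ftp.storlines(f"STOR {remote_name}", handle)
        else:
            ftp.storbinary(f"STOR {remote_name}", handle)
    return transfer_type


# Hypothetical usage (host and credentials are placeholders):
# with FTP("ftp.example.com") as ftp:
#     ftp.login("etl_user", "secret")
#     upload(ftp, "orders.csv", "orders.csv")
```

Nothing in this exchange is encrypted, including the login, which is why the sketch is only suitable for non-sensitive test data.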
FTP has a few advantages:
- It's quick and simple, and people have used it for around 50 years.
- It transfers multiple directories at the same time.
As you've probably guessed, FTP is not the safest way to send files:
- There's no encryption involved.
- FTP uses two separate channels (a control channel for commands and a data channel for files), which gives hackers more opportunities to steal your credentials and files.
Few data-driven companies use FTP anymore, especially not those that care about data security during ETL. FTP isn't suitable for data governance either. Need to comply with GDPR or HIPAA? Don't use FTP!
What is FTPS?
FTPS (File Transfer Protocol Secure, sometimes called FTP/SSL) is an extension of FTP created in the late 90s. Its purpose? To add an extra tier of security to FTP. FTPS uses an SSL/TLS layer underneath FTP, which can encrypt both its control channel and its data channel.
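FTPS has several advantages:
- Lots of internet infrastructure has built-in support for SSL/TLS, making it easy to transfer files via FTPS.
- Strong authentication.
- X.509 certificate support.
It also has a drawback:
- It can interfere with firewalls. Because the control channel is encrypted, a firewall can't inspect it to learn which data ports to open, so some users might struggle with it at first.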
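A minimal sketch of an FTPS client, assuming Python's standard `ftplib.FTP_TLS` and `ssl` modules. It shows the two protections described above: verifying the server's X.509 certificate and encrypting both channels. The helper names and the TLS 1.2 minimum are assumptions for illustration; a real server may need different settings.

```python
import ssl
from ftplib import FTP_TLS


def build_tls_context() -> ssl.SSLContext:
    """Create a client TLS context that verifies the server's X.509 certificate."""
    context = ssl.create_default_context()
    # Assumption: the server supports TLS 1.2 or newer.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def make_ftps_client(context: ssl.SSLContext) -> FTP_TLS:
    """Build an FTPS client; no network connection is opened until connect()."""
    return FTP_TLS(context=context)


def connect_secure(client: FTP_TLS, host: str, user: str, password: str) -> None:
    """Connect, secure the control channel, then protect the data channel."""
    client.connect(host)          # TCP connection to the default FTP port (21)
    client.login(user, password)  # ftplib negotiates TLS before sending credentials
    client.prot_p()               # switch the data channel to encrypted mode


# Hypothetical usage (host and credentials are placeholders):
# client = make_ftps_client(build_tls_context())
# connect_secure(client, "ftps.example.com", "etl_user", "secret")
```

Without the `prot_p()` call, the login would be encrypted but the file contents would still travel in the clear.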
Recommended reading: FTPS ETL to Your Warehouse
What is SFTP?
SFTP (SSH File Transfer Protocol, often called Secure File Transfer Protocol) also originated in the late 90s as an alternative to FTP. SFTP transfers various file formats via SSH, a client-server protocol. Unlike FTP, it only requires a single connection and encrypts files during the transfer process, making it harder for hackers to intercept sensitive information.
SFTP sends its commands and file data over that one SSH connection when you transfer files. The recipient of your files connects to the SSH server and authenticates with cryptographic keys (SSH keys) or a username/password combo.
Recommended reading: SFTP ETL to Your Warehouse
SFTP offers several features and advantages:
- Data file encryption.
- Command execution.
- IPv6 support.
- TMUX support.
- Username/password authentication.
- Public key authentication.
- One channel for file transfers.
- Excellent choice for flat files with a simple structure, such as delimited files, plain text files, and CSV (comma-separated values) files.
SFTP has few cons. It's a much safer alternative to FTP, especially for ETL.
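To make the command execution and public key authentication points concrete, here is a small sketch that prepares a non-interactive run of the OpenSSH `sftp` client. It only builds the batch script and the argument list; it does not connect anywhere. The function names, paths, and host are hypothetical, and the sketch assumes paths without spaces.

```python
from pathlib import Path
from typing import List


def build_sftp_batch(local_file: str, remote_dir: str) -> str:
    """Return an sftp batch script that creates a directory, uploads, and lists."""
    remote_path = f"{remote_dir}/{Path(local_file).name}"
    commands = [
        f"-mkdir {remote_dir}",  # leading "-" lets the batch continue if mkdir fails
        f"put {local_file} {remote_path}",
        f"ls -l {remote_dir}",
    ]
    return "\n".join(commands) + "\n"


def build_sftp_command(batch_path: str, user: str, host: str,
                       identity_file: str, port: int = 22) -> List[str]:
    """Return the argument list for a batch-mode sftp run with key authentication."""
    return [
        "sftp",
        "-b", batch_path,     # read commands from the batch file
        "-i", identity_file,  # private key used for authentication
        "-P", str(port),
        f"{user}@{host}",
    ]


# Hypothetical usage: write the batch to a file, then hand the list to
# subprocess.run(command, check=True) on a machine with OpenSSH installed.
```

Directory creation, upload, and listing all travel over the same encrypted SSH connection, which is the single-channel design the list above refers to.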
What is SCP?
Based on SSH, SCP (Secure Copy Protocol) transfers files via encrypted IP-based data tunnels. It copies files between a local host and a remote host, or between two remote hosts.
SCP has a couple of advantages:
- Like SFTP, SCP uses the SSH protocol for authentication and encryption, making it a safer FTP alternative.
- It's (sometimes) faster than SFTP for file transfers, particularly on high-latency networks.
It also has drawbacks:
- It lacks file management capabilities. It's built for file transfers only, so unlike SFTP, you can't create directories, list directory contents, or delete files, which makes it far more limited in scope.
- It offers little support for resuming interrupted file transfers.
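The copy-only nature of SCP shows in how little there is to configure. The sketch below, with hypothetical helper names and hosts, formats local and remote endpoints and builds an argument list for the OpenSSH `scp` client without running it.

```python
from typing import List, Optional


def endpoint(path: str, host: Optional[str] = None, user: Optional[str] = None) -> str:
    """Format a local path, or a remote [user@]host:path endpoint."""
    if host is None:
        return path
    prefix = f"{user}@{host}" if user else host
    return f"{prefix}:{path}"


def build_scp_command(source: str, destination: str,
                      recursive: bool = False, port: int = 22) -> List[str]:
    """Return the argument list for a single scp copy."""
    command = ["scp", "-P", str(port)]
    if recursive:
        command.append("-r")  # copy a whole directory tree
    command += [source, destination]
    return command


# Local to remote:
# build_scp_command("orders.csv", endpoint("/incoming/orders.csv", "warehouse.example.com", "etl"))
# Remote to remote:
# build_scp_command(endpoint("/out/a.csv", "src.example.com"), endpoint("/in/a.csv", "dst.example.com"))
```

A source and a destination are the only real inputs, so anything beyond copying, such as listing or deleting remote files, needs SFTP or a separate SSH command.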
Other File Transfer Protocols
- TFTP (Trivial File Transfer Protocol) runs over the User Datagram Protocol (UDP) for file transfers. It dates back to the early 80s, and few companies use it anymore.
- MFT (Managed File Transfer) is a class of software rather than a single protocol. It adds administrative controls on top of protocols like SFTP and FTPS. Used in the banking industry, MFT provides additional encryption during financial file transfers.
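To show how little machinery TFTP involves, here is a sketch that builds the read request a TFTP client sends in a single UDP datagram, following the packet layout in RFC 1350. The function name and file name are hypothetical, and the code only constructs the bytes without sending them.

```python
import struct

OPCODE_RRQ = 1  # read request opcode defined in RFC 1350


def build_read_request(filename: str, mode: str = "octet") -> bytes:
    """Build a TFTP read request: opcode, filename, zero byte, mode, zero byte."""
    return (
        struct.pack("!H", OPCODE_RRQ)  # 2-byte opcode in network byte order
        + filename.encode("ascii")
        + b"\x00"
        + mode.encode("ascii")
        + b"\x00"
    )


# Hypothetical usage: send the bytes with a socket.SOCK_DGRAM socket to
# UDP port 69 on the server.
```

There is no login step and no encryption in this exchange, which helps explain why TFTP has largely fallen out of use for anything sensitive.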
Should You Use FTP, FTPS, SFTP, SCP, or Another Protocol?
When transferring data from a flat file system via ETL, forget about FTP! It's not as secure as the other protocols on this list, and you could encounter all kinds of data governance problems. While SCP is a good FTP alternative and sometimes faster than SFTP on high-latency networks, it's limited to file transfers, so you can't delete files, create directories, or perform other file management tasks.
So that leaves SFTP vs. FTPS. While FTPS is effective and comes with encryption benefits, it's essentially an extension to FTP and still uses two connections. SFTP is an entirely different protocol with one connection, which reduces the risk of hackers stealing your data. As of 2021, SFTP is the safest file transfer protocol for data warehousing projects, with FTPS in second place.
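The reasoning in this section can be written down as a small decision helper. This is only a sketch of the article's ranking (SFTP first, FTPS second, SCP when no file management is needed, never plain FTP); the function name and the inputs are assumptions, and real projects will have additional constraints.

```python
from typing import Iterable

# Preference order taken from the comparison above; plain FTP is never chosen.
PREFERENCE = ("SFTP", "FTPS", "SCP")


def recommend_protocol(available: Iterable[str],
                       needs_file_management: bool = True) -> str:
    """Pick the most preferred secure protocol that the server offers."""
    offered = {name.upper() for name in available}
    for protocol in PREFERENCE:
        if protocol not in offered:
            continue
        if protocol == "SCP" and needs_file_management:
            continue  # SCP can't create directories, list, or delete files
        return protocol
    raise ValueError("no suitably secure protocol is available")
```

For example, a server that offers FTP, FTPS, and SCP would yield FTPS, while a server that offers only FTP raises an error instead of silently falling back to an unencrypted transfer.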
How Integrate.io Can Help
Integrate.io provides full support for SFTP and lets you integrate FTPS with some analytics platforms for data analysis, making it the template for ETL workflows. Send and receive files to and from flat file databases, relational databases, data warehouses, database management systems, data lakes, and business intelligence tools without worrying about field names, delimiters, or other data markup issues.
Integrate.io requires no code or data engineering, making file transfers and data formatting a piece of cake. Whether you want to use SFTP, FTPS, or another transfer protocol, reach out to Integrate.io and learn about our 14-day demo.