SCP / SFTP (aka SSH)¶
DSS can interact with remote servers through the use of SSH to:
- Read and write datasets
- Read and write managed folders
DSS can use either the SCP or the SFTP protocol to interact with the remote server.
Note
You can use the DSS download recipe to cache the contents of an SCP/SFTP server.
This can improve performance if you need to read the same SCP/SFTP files many times and don’t mind a copy of the data being stored in a DSS managed folder.
By default, the download recipe will still check the SCP/SFTP server for updates when its output folder is rebuilt. This behavior can be disabled.
Defining the SSH connection¶
Accessing remote files stored on SCP/SFTP servers first requires defining an SSH connection to the remote server, as follows:
- Go to Administration > Connections
- Click the “New connection” button and select SSH
- Enter a name for the new connection, and the required connection parameters
- Save the new connection
SSH connection parameters¶
Name | Description |
---|---|
Host | Host name or IP address of the SSH server to access, mandatory. |
User | SSH username to use, mandatory. |
Use public key authentication | When enabled, authenticate with an SSH private key instead of a password. |
Password | SSH password to use. Mandatory if using password authentication. |
Key passphrase | In public-key authentication mode, optional passphrase to use to decrypt the SSH private key. |
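The rules in the table above can be summarized as a small validation sketch. This is illustrative only: `SSHConnectionParams` and `validate` are hypothetical names, not part of any DSS API, and the field names simply mirror the table rows.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SSHConnectionParams:
    """Hypothetical container mirroring the connection parameters (not a DSS API)."""
    host: str
    user: str
    use_public_key: bool = False
    password: Optional[str] = None
    key_passphrase: Optional[str] = None  # optional, only used with public-key authentication


def validate(params: SSHConnectionParams) -> List[str]:
    """Return a list of problems; an empty list means the parameters look complete."""
    problems = []
    if not params.host:
        problems.append("Host is mandatory")
    if not params.user:
        problems.append("User is mandatory")
    # A password is only required when public-key authentication is not used.
    if not params.use_public_key and not params.password:
        problems.append("Password is mandatory when using password authentication")
    return problems
```

For example, `validate(SSHConnectionParams(host="sftp.example.com", user="dss", use_public_key=True))` returns an empty list, since no password is needed in public-key mode.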
When using public-key authentication mode, DSS attempts to connect to the remote server using either of the two standard SSH keys of the DSS Linux user, stored in $HOME/.ssh/id_dsa and $HOME/.ssh/id_rsa respectively, where $HOME is the home directory of the DSS user account.
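Before enabling public-key authentication, it can help to confirm that at least one of these key files exists. The following is a minimal sketch, assuming it is run on the DSS server as the DSS Linux user; `find_ssh_keys` is a hypothetical helper, not a DSS function, and it only checks that the files are present, not that they are valid keys.

```python
from pathlib import Path
from typing import List, Optional

# The two standard key file names looked up under $HOME/.ssh
KEY_FILE_NAMES = ("id_dsa", "id_rsa")


def find_ssh_keys(home: Optional[Path] = None) -> List[Path]:
    """Return the standard SSH private key files that exist under <home>/.ssh."""
    home = Path.home() if home is None else home
    ssh_dir = home / ".ssh"
    return [ssh_dir / name for name in KEY_FILE_NAMES if (ssh_dir / name).is_file()]


if __name__ == "__main__":
    keys = find_ssh_keys()
    if keys:
        for key in keys:
            print(f"Found key: {key}")
    else:
        print("No id_dsa or id_rsa key found in ~/.ssh")
```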
Creating SCP or SFTP datasets¶
- From the “Datasets” screen of Data Science Studio, click the “New dataset” button and select “SCP” or “SFTP”
- Select the connection to use
- Click “Browse” to locate your files