FAQ: How do I transfer large files between the Cloud Cluster HDFS and a host on campus that is outside the cluster?
Answer:
- Transferring into HDFS
- To copy a file into the cluster's HDFS use the following command from the computer containing the source file:
ssh shell.disc.pdl.cmu.local hadoop dfs -put - /path/on/hdfs < /local/path/to/source/bigfile
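- As a concrete sketch, assuming your HDFS home directory is /user/userX (the usual Hadoop convention; it may differ on this cluster) and a hypothetical local archive ~/results.tar:
ssh shell.disc.pdl.cmu.local hadoop dfs -put - /user/userX/results.tar < ~/results.tar
- You can then check that the file arrived with a directory listing:
ssh shell.disc.pdl.cmu.local hadoop dfs -ls /user/userX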
- Into local FS
- To copy a file into the cluster's local file system, use the scp command. If your username on both opencloud and your local machine is userX, and you want to transfer "/home/userX/myfile.txt" from your local machine to your home directory on the cluster, do the following:
scp /home/userX/myfile.txt userX@shell.disc.pdl.cmu.local:~/
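- To copy an entire directory rather than a single file, scp's -r flag recurses into it; for example, with a hypothetical directory /home/userX/mydata:
scp -r /home/userX/mydata userX@shell.disc.pdl.cmu.local:~/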
- Transferring out of HDFS
- To get a file out of HDFS, use the following command on the computer where the destination file is to be stored:
ssh shell.disc.pdl.cmu.local hadoop dfs -get /path/in/hdfs - > /local/path/to/destination/bigfile
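- For example, using the same hypothetical paths as above, this pulls /user/userX/results.tar out of HDFS into your home directory:
ssh shell.disc.pdl.cmu.local hadoop dfs -get /user/userX/results.tar - > ~/results.tar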
- Out of local FS
- To copy a file from the cluster's local file system to your computer, use the scp command. If your username on both opencloud and your local machine is userX, and you want to transfer "/h/userX/myfile.txt" from the cluster to your local machine's home directory, run the following on your local machine:
scp userX@shell.disc.pdl.cmu.local:/h/userX/myfile.txt ~/
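- For large files over the campus network, scp's -C flag enables compression, which may speed up the transfer for compressible data:
scp -C userX@shell.disc.pdl.cmu.local:/h/userX/myfile.txt ~/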