FAQ: How do I transfer large files between the Cloud Cluster HDFS and a host on campus that is outside the cluster?

Answer:

Transferring into HDFS
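To copy a file into the cluster's HDFS, run the following command on the computer that holds the source file. The "-" argument tells hadoop to read the file contents from standard input, so the file is streamed over ssh directly into HDFS: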
ssh shell.disc.pdl.cmu.local hadoop dfs -put - /path/on/hdfs < /local/path/to/source/bigfile
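
If you run this transfer often, the command can be wrapped in a small shell function. The following is only a sketch: the names hdfs_put and CLUSTER_HOST are hypothetical and not part of the cluster setup, and it assumes the HDFS path contains no spaces or shell metacharacters, because ssh passes the command line through the remote shell.

```shell
# Hypothetical helper; hdfs_put and CLUSTER_HOST are illustrative names.
CLUSTER_HOST="${CLUSTER_HOST:-shell.disc.pdl.cmu.local}"

# Stream a local file into HDFS over ssh.
# Usage: hdfs_put /local/path/to/source/bigfile /path/on/hdfs
hdfs_put() {
    local src dest
    src=$1
    dest=$2
    # Fail early with a clear message if the source is not a readable file.
    if [ ! -f "$src" ] || [ ! -r "$src" ]; then
        echo "hdfs_put: cannot read $src" >&2
        return 1
    fi
    # "-put -" makes hadoop read the file contents from standard input.
    # Assumption: $dest has no spaces or shell metacharacters.
    ssh "$CLUSTER_HOST" hadoop dfs -put - "$dest" < "$src"
}
```

The function returns the exit status of ssh, which is the exit status of the remote hadoop command, so a caller can test it to detect a failed upload.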

Into local FS
To copy a file into the cluster's local file system, use the scp command. For example, if your username on both opencloud and your local machine is userX, and you want to copy "some_file.txt" from your current directory to "path/to/directory" under your home directory on the cluster, run the following:
scp some_file.txt userX@shell.disc.pdl.cmu.local:path/to/directory/some_file.txt

Transferring out of HDFS
To get a file out of HDFS, run the following command on the computer where the destination file is to be stored:
ssh shell.disc.pdl.cmu.local hadoop dfs -get /path/in/hdfs - > /local/path/to/destination/bigfile
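
Because the local shell creates or truncates the destination file before ssh starts, a failed transfer with the command above can leave an empty or partial file behind. One way to guard against that is to download to a temporary name and rename it only on success. The following is a sketch under assumptions: the names hdfs_get and CLUSTER_HOST and the ".part" suffix are hypothetical, and the HDFS path is assumed to contain no spaces or shell metacharacters.

```shell
# Hypothetical helper; hdfs_get and CLUSTER_HOST are illustrative names.
CLUSTER_HOST="${CLUSTER_HOST:-shell.disc.pdl.cmu.local}"

# Stream a file out of HDFS over ssh into a local file.
# Usage: hdfs_get /path/in/hdfs /local/path/to/destination/bigfile
hdfs_get() {
    local src dest tmp
    src=$1
    dest=$2
    tmp="$dest.part"
    # "-get <src> -" makes hadoop write the file contents to standard output.
    if ssh "$CLUSTER_HOST" hadoop dfs -get "$src" - > "$tmp"; then
        # Transfer succeeded: move the finished file into place.
        mv -- "$tmp" "$dest"
    else
        # Transfer failed: discard the partial download.
        rm -f -- "$tmp"
        echo "hdfs_get: transfer of $src failed" >&2
        return 1
    fi
}
```

With this approach the destination path either holds a complete file or is left untouched, which matters for large files where an interrupted transfer is more likely.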

Out of local FS
To copy a file from the cluster's local file system to your computer, use the scp command. For example, if your username on both opencloud and your local machine is userX, and you want to copy "/h/userX/myfile.txt" to your home directory on your local machine, run the following from your local machine:
scp userX@shell.disc.pdl.cmu.local:/h/userX/myfile.txt ~/

Topic revision: r6 - 12 Jul 2013, MitchFranzos