For the last few years that I have been working with AWS, I've been experimenting with platform features that are often not very well documented. One of these features is part of the aws s3 cp command, which really escaped me until I had the need for it.

We were working on a project that required large volumes of data to be processed during an initial bootstrapping process. Typically we use the AWS CLI to copy large volumes of data from on-premise infrastructure to S3 buckets, and when we need to process the data in batches we split it all before the upload. If the data is already on S3, then of course we need to download it first and then split it, which is not ideal when clients deliver a few GBs to your S3 buckets.

The first thing that came to mind was to build an AWS Lambda that could do all this automatically, but then the data still needs to be downloaded, and that in itself presents scaling challenges that come with a certain cost.

While running some web searches on the subject, I discovered that hidden within the official AWS documentation is a feature of the copy command that actually supports data streaming, without the need to download the data to a local file first. This is a piece of functionality that is not so much "advertised" and appears only at the bottom of the documentation. The aws s3 cp command supports just a tiny flag for downloading a file stream from S3 and for uploading a local file stream to S3.
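To make the streaming idea concrete, here is a minimal sketch. It relies on the behaviour described in the AWS CLI documentation for aws s3 cp, where a single `-` in place of the local path means standard output (for a download) or standard input (for an upload). The function names, bucket names and keys are hypothetical placeholders, not something from the project described above.

```shell
# Sketch only: function names, buckets and keys are hypothetical.

# Download as a stream: "-" as the destination sends the object to stdout,
# so it can be piped into another tool without saving a local file.
s3_count_lines() {
  # $1: S3 URI of the object, e.g. s3://example-bucket/incoming/data.csv
  aws s3 cp "$1" - | wc -l
}

# Upload as a stream: "-" as the source reads the object body from stdin,
# so the compressed copy never has to exist on local disk.
s3_upload_gzip() {
  # $1: local file, $2: destination S3 URI
  gzip -c "$1" | aws s3 cp - "$2"
}

# Example calls (not executed here):
#   s3_count_lines s3://example-bucket/incoming/data.csv
#   s3_upload_gzip ./data.csv s3://example-bucket/archive/data.csv.gz
```

The same pattern should extend to splitting: the download stream can be piped into whatever tool does the batching, and each batch can be piped back up. For very large streamed uploads, the CLI documentation also describes an `--expected-size` option, which is worth checking before relying on this for multi-GB objects.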