Keeping It Clean: How to Exclude Hidden Files When Copying to AWS S3

In the world of file management, especially when dealing with cloud storage like AWS S3, there’s a nifty trick that’s as useful as it is easy to overlook. It’s all about keeping your S3 buckets free from those hidden files and directories that start with a dot (“.”) – a common sight in Unix-like systems. These files, often crucial for system configurations, can clutter your cloud storage if copied unnecessarily. Here’s how you can keep your S3 bucket neat and clean, just the way you like it.

The Magic Command

Here’s the command that works like a charm:

aws s3 cp source/ s3://bucket/ --recursive --exclude '.*' --exclude '*/.*'

Breaking It Down

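  • --exclude '.*': This bit of the command is like a polite but firm instruction to AWS, saying, “Please leave behind anything at the top level of my source/ folder whose name starts with a dot.” These are usually your hidden files and directories. Because * in these patterns also matches “/”, everything inside a top-level hidden directory (such as .git/) is skipped as well.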
  • --exclude '*/.*': This part extends that instruction to the subdirectories. It ensures that if there are hidden files or directories nested within other folders, they get the message too and stay put.
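To make the two patterns concrete, here is a minimal sketch that checks a few made-up paths against them. It does not call AWS at all: the is_excluded helper is hypothetical, and it assumes that shell case globbing (where * also spans “/”) is a close enough stand-in for the CLI's own matching.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when a path (relative to source/) would be
# caught by --exclude '.*' or --exclude '*/.*'.
# Assumption: `case` globs approximate the CLI's matcher; '*' also spans '/'.
is_excluded() {
  case "$1" in
    .*|*/.*) return 0 ;;   # hidden at the top level, or hidden somewhere deeper
    *)       return 1 ;;
  esac
}

# Sample paths are invented for illustration.
for path in .env .git/config README.md src/app.py src/.cache/data docs/.DS_Store; do
  if is_excluded "$path"; then
    echo "skip $path"
  else
    echo "copy $path"
  fi
done
```

With these sample paths, only README.md and src/app.py are reported as copied; everything else is hidden or lives under a hidden directory.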

More Examples

  • Sync from one bucket to another
aws s3 sync s3://bucket-1 s3://bucket-2 \
--exclude 'customers/*' \
--exclude 'orders/*' \
--exclude 'report/

The patterns passed to --exclude and --include support these symbols:
*: Matches everything
?: Matches any single character
[sequence]: Matches any character in sequence
[!sequence]: Matches any character not in sequence
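The pattern symbols above can be tried out locally before touching a bucket. This is only a sketch: the matches helper and the sample keys are made up, and it assumes shell case globbing behaves like the CLI's matcher for these four symbols.

```shell
#!/bin/sh
# Hypothetical helper: print whether a key matches a pattern.
# Assumption: shell `case` globbing stands in for the CLI's matcher.
matches() {
  # $1 = pattern, $2 = key; $1 is left unquoted on purpose so it acts as a glob
  case "$2" in
    $1) echo "match" ;;
    *)  echo "no match" ;;
  esac
}

matches '*.log'         'logs/2024/app.log'   # match: * also spans "/"
matches 'file?.txt'     'file1.txt'           # match: ? is exactly one character
matches 'file?.txt'     'file10.txt'          # no match: two characters after "file"
matches 'img[0-9].png'  'img7.png'            # match: 7 is in the sequence
matches 'img[!0-9].png' 'img7.png'            # no match: 7 is in the negated sequence
```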
  • Sync two buckets with the exclude and include options
aws s3 sync s3://bucket1/bootstrap/ s3://bucket2/bootstrap \
--exclude '*' \
--include 'css/*'

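Only bootstrap/css is synced: --exclude '*' drops everything first, then --include 'css/*' adds the css/ folder back, because later filters take precedence over earlier ones. Given this source layout, js/ and fonts/ are left behind: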
bootstrap/
├── css/
│   ├── bootstrap.css
│   ├── bootstrap.min.css
│   ├── bootstrap-theme.css
│   └── bootstrap-theme.min.css
├── js/
│   ├── bootstrap.js
│   └── bootstrap.min.js
└── fonts/
    ├── glyphicons-halflings-regular.eot
    ├── glyphicons-halflings-regular.svg
    ├── glyphicons-halflings-regular.ttf
    └── glyphicons-halflings-regular.woff
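
The order of the filters is what makes this work, and it can be sketched without AWS. The is_synced helper below is hypothetical; it assumes everything is included by default, that the last matching filter wins, and that shell case globbing approximates the CLI's matcher. Keys are written relative to bootstrap/.

```shell
#!/bin/sh
# Hypothetical helper: decide whether a key is synced under an ordered filter list.
# usage: is_synced KEY exclude PATTERN include PATTERN ...
is_synced() {
  key=$1; shift
  verdict=include              # assumption: keys are included unless a filter says otherwise
  while [ "$#" -ge 2 ]; do
    case "$key" in
      $2) verdict=$1 ;;        # a later matching filter overrides earlier ones
    esac
    shift 2
  done
  [ "$verdict" = include ]
}

for key in css/bootstrap.css css/bootstrap.min.css js/bootstrap.js \
           fonts/glyphicons-halflings-regular.ttf; do
  if is_synced "$key" exclude '*' include 'css/*'; then
    echo "sync $key"
  else
    echo "skip $key"
  fi
done
```

Swapping the two filters (include 'css/*' first, then exclude '*') would skip every key in this sketch, since the final exclude matches everything.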

Why This Matters

You might wonder, “Why bother?” Well, here’s why:

  • Clutter-free Storage: By excluding hidden files, your S3 bucket stays organized and free from unnecessary data.
  • Security: Sometimes, hidden files can contain sensitive information. Not copying them reduces the risk of accidental exposure.
  • Efficiency: It saves on storage costs and makes file management more straightforward.

When to Use This

This command is particularly handy in scenarios like:

  • Deployments: When you’re deploying applications and want to ensure only the necessary files are uploaded.
  • Backups: For creating clean backups without system or configuration files.
  • Data Migration: When moving data between environments, keeping hidden files out can simplify the process.
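
Before a deployment or backup, it can help to see locally which files would survive the hidden-file filter. The sketch below uses find rather than the AWS CLI, so treat it as an approximation; list_visible_files and the demo file names are made up for illustration. The CLI's own --dryrun flag remains the way to preview what a real cp or sync would do.

```shell
#!/bin/sh
# Hypothetical helper: list files under a directory that are neither hidden
# themselves nor inside a hidden directory, as paths relative to that directory.
list_visible_files() {
  ( cd "$1" && find . -type f ! -path '*/.*' | sed 's|^\./||' | sort )
}

# Demo on a throwaway directory (all names are invented).
demo=$(mktemp -d)
mkdir -p "$demo/src/.cache" "$demo/.git"
touch "$demo/index.html" "$demo/.env" "$demo/.git/config" \
      "$demo/src/app.js" "$demo/src/.cache/tmp"
list_visible_files "$demo"     # lists index.html and src/app.js only
rm -rf "$demo"
```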

Wrapping It Up

Remember, the AWS CLI is powerful, but with great power comes great responsibility. Commands like this make our digital lives a bit easier and our cloud storage a bit tidier. So, next time you’re using the AWS CLI for a big copy operation, keep this command in mind. It’s a small step for a command line, but a giant leap for your file management efficiency.

Happy copying, and here’s to clutter-free cloud storage!
