How to Clone Large Google Drive Folders: A Simple Script Solution

The Challenge

I recently encountered a common problem: needing to download a large Google Drive folder (around 30 gigabytes!) containing numerous files and subfolders. Manually selecting and downloading each file would take forever. The “Download All” option wasn’t much better, as Google’s zipping process for such a large volume of data also consumes a significant amount of time. To complicate matters, the link owner could delete the files at any moment, creating a sense of urgency. My solution? A script to clone all files from the shared link directly to my Google Drive, allowing me to sync them to my computer at my convenience.

Initial Approaches & The Winning Solution

My first thought was to use gdown, a Python library designed for Google Drive downloads. However, I quickly hit its limitation: its folder download handles at most 50 files per folder, which made it unsuitable for my situation.

The successful solution came with Google Colab. If you’re unfamiliar with it, Google Colab is a hosted service by Google that allows you to run Python scripts online, leveraging Google’s infrastructure. This means you get direct access to Google resources like Google Cloud and, crucially for this task, Google Drive.

Step-by-Step Guide with Google Colab

Here’s how to use Google Colab to clone large Google Drive folders:

First, open a new notebook on Google Colab.

You’ll need to create three separate code cells and run them in order.

Step 1: Authorize Google Colab

This step grants Google Colab permission to access your Google Drive, enabling it to copy files on your behalf.

# STEP 1: Authorize Google Colab to access your Google Drive.
from google.colab import auth
from googleapiclient.discovery import build
from googleapiclient.errors import HttpError

# This will trigger a pop-up window for you to authorize access.
auth.authenticate_user()
drive_service = build('drive', 'v3')
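
Before moving on, it can help to confirm which account the authorization actually applies to, since the copies will land in that account's Drive. A minimal sketch, assuming the `drive_service` object built above; `whoami` is a hypothetical helper name, not part of the original script.

```python
def whoami(service):
    """Return the email address of the account the Drive service acts as."""
    # Drive API v3 requires an explicit `fields` parameter on about().get.
    about = service.about().get(fields='user(displayName, emailAddress)').execute()
    return about.get('user', {}).get('emailAddress')

# In Colab, after Step 1 has run:
# print(whoami(drive_service))
```

If the printed address is not the account you expected, re-run Step 1 and pick the right account in the pop-up.
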
Step 2: Define the Copy Functions

This code block defines the necessary functions to recursively copy folders and their contents from the source to your Google Drive.

# STEP 2: Define the functions to copy the folder.
def get_folder_id_from_url(url):
    """Extracts the folder ID from a Google Drive folder URL."""
    try:
        # Drop the query string and any trailing slash, then take the last path segment.
        return url.split('?')[0].rstrip('/').split('/')[-1] or None
    except AttributeError:
        # url was not a string
        return None

def copy_folder_recursive(service, source_folder_id, destination_parent_id, new_folder_name):
    """Recursively copies a folder and its contents."""
    print(f"Creating root folder: '{new_folder_name}'...")
    # Create the root folder in the destination
    file_metadata = {
        'name': new_folder_name,
        'mimeType': 'application/vnd.google-apps.folder',
        'parents': [destination_parent_id]
    }
    try:
        new_root_folder = service.files().create(body=file_metadata, fields='id').execute()
        new_root_folder_id = new_root_folder.get('id')
        print(f"Successfully created root folder with ID: {new_root_folder_id}")
        _copy_children(service, source_folder_id, new_root_folder_id)
        return new_root_folder_id
    except HttpError as error:
        print(f"An error occurred while creating the root folder: {error}")
        return None

def _copy_children(service, source_folder_id, destination_folder_id):
    """Helper function to copy contents of a folder."""
    page_token = None
    while True:
        try:
            # List one page of the source folder's direct children
            response = service.files().list(q=f"'{source_folder_id}' in parents and trashed=false",
                                            spaces='drive',
                                            fields='nextPageToken, files(id, name, mimeType)',
                                            pageToken=page_token).execute()
            for file in response.get('files', []):
                print(f"Found item: {file.get('name')} ({file.get('mimeType')})")
                if file.get('mimeType') == 'application/vnd.google-apps.folder':
                    # It's a folder, create it in the destination and recurse
                    print(f"Creating sub-folder: {file.get('name')}")
                    folder_metadata = {
                        'name': file.get('name'),
                        'mimeType': 'application/vnd.google-apps.folder',
                        'parents': [destination_folder_id]
                    }
                    new_folder = service.files().create(body=folder_metadata, fields='id').execute()
                    print(f"Copying contents of '{file.get('name')}'...")
                    _copy_children(service, file.get('id'), new_folder.get('id'))
                else:
                    # It's a file, copy it
                    print(f"Copying file: {file.get('name')}")
                    file_metadata = {
                        'name': file.get('name'),
                        'parents': [destination_folder_id]
                    }
                    service.files().copy(fileId=file.get('id'), body=file_metadata).execute()
            # Move on to the next page, or stop when there are no more pages
            page_token = response.get('nextPageToken', None)
            if page_token is None:
                break
        except HttpError as error:
            print(f'An error occurred: {error}')
            break
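
Shared links do not always end with the folder ID, so taking the last path segment can pick up the wrong piece. As a hedged alternative, the sketch below looks for the ID after `/folders/` and falls back to an `id=` query parameter. `extract_folder_id` is a hypothetical name, the URL shapes in the comments are ones commonly seen rather than an exhaustive list, and the IDs in the usage lines are made up.

```python
import re
from urllib.parse import urlparse, parse_qs

def extract_folder_id(url):
    """Return the folder ID from common Drive folder URL shapes, or None."""
    parsed = urlparse(url.strip())
    # Shape 1: https://drive.google.com/drive/folders/<ID>?usp=sharing
    # (also matches /drive/u/0/folders/<ID>/)
    match = re.search(r'/folders/([A-Za-z0-9_-]+)', parsed.path)
    if match:
        return match.group(1)
    # Shape 2: https://drive.google.com/open?id=<ID>
    ids = parse_qs(parsed.query).get('id')
    return ids[0] if ids else None

# Example calls with placeholder IDs:
print(extract_folder_id('https://drive.google.com/drive/folders/1AbC_def-123?usp=sharing'))
print(extract_folder_id('https://drive.google.com/open?id=XYZ789'))
```

To try it, call `extract_folder_id` in place of `get_folder_id_from_url` in Step 3; both return `None` when no ID is found.
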
Step 3: Run the Main Process

Finally, execute this code block to initiate the cloning process. You’ll be prompted to enter the Google Drive folder URL and a name for your new folder.

# STEP 3: Run the main process.
def main():
    """Main function to run the folder copy process."""
    source_folder_url = input("Enter the Google Drive folder URL to clone: ")
    source_folder_id = get_folder_id_from_url(source_folder_url)

    if not source_folder_id:
        print("Invalid Google Drive folder URL.")
        return

    new_folder_name = input("Enter the name for your new folder in 'My Drive': ")

    # 'root' is a special alias for the main "My Drive" folder
    destination_parent_id = 'root'

    print("Starting clone process...")
    print(f"Source Folder ID: {source_folder_id}")
    print("Destination: Your 'My Drive' folder")
    print(f"New Folder Name: {new_folder_name}")

    new_folder_id = copy_folder_recursive(drive_service, source_folder_id, destination_parent_id, new_folder_name)

    if new_folder_id:
        print(f"✅ Successfully cloned folder! You can find '{new_folder_name}' in your Google Drive.")
    else:
        print("❌ Folder cloning failed.")

# Run the script
main()
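
Note that `_copy_children` prints an error and stops when an API call fails, yet `main()` still reports success as long as the root folder was created. One way to check the result is to count the items on both sides and compare. This is a rough sketch, assuming the same `drive_service` object; `count_items` and `FOLDER_MIME` are hypothetical names, and `SOURCE_ID` / `CLONE_ID` in the usage comment stand for your own folder IDs.

```python
FOLDER_MIME = 'application/vnd.google-apps.folder'

def count_items(service, folder_id):
    """Recursively count (files, folders) under folder_id, following pagination."""
    files, folders = 0, 0
    page_token = None
    while True:
        response = service.files().list(
            q=f"'{folder_id}' in parents and trashed=false",
            fields='nextPageToken, files(id, mimeType)',
            pageToken=page_token,
        ).execute()
        for item in response.get('files', []):
            if item.get('mimeType') == FOLDER_MIME:
                # Count the sub-folder itself plus everything inside it
                sub_files, sub_folders = count_items(service, item['id'])
                files += sub_files
                folders += sub_folders + 1
            else:
                files += 1
        page_token = response.get('nextPageToken')
        if page_token is None:
            break
    return files, folders

# In Colab, after the clone finishes:
# print(count_items(drive_service, SOURCE_ID))
# print(count_items(drive_service, CLONE_ID))
```

If the two pairs differ, scroll back through the Step 3 output for the "An error occurred" line to see where the copy stopped.
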

Conclusion

By following these steps, you can clone large Google Drive folders directly to your own Google Drive using Google Colab. Step 1 grants Colab the necessary permissions, and the copy calls in Step 2 run on Google’s side, so nothing has to be downloaded or zipped first. This method sidesteps the limitations of the usual download options and lets you secure your files quickly.

I hope this guide helps you manage your large Google Drive downloads with ease! Let me know in the comments if you have any questions or run into issues.